ABSTRACT: Motivation
Pooling multiple samples increases the efficiency and lowers the cost of DNA sequencing. One approach to multiplexing is to use short DNA indices to uniquely identify each sample. After sequencing, reads must be assigned in silico to the sample of origin, a process referred to as demultiplexing. Demultiplexing software typically identifies the sample of origin using a fixed number of mismatches between the read index and a reference index set. This approach may fail or misassign reads when the sequencing quality of the indices is poor.
Results
We introduce deML, a maximum likelihood algorithm that demultiplexes Illumina sequences. deML computes the likelihood of an observed index sequence being derived from a specified sample. A quality score which reflects the probability of the assignment being correct is generated for each read. Using these quality scores, even very problematic datasets can be demultiplexed and an error threshold can be set.
Availability and implementation
deML is freely available for use under the GPL (http://bioinf.eva.mpg.de/deml/).
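The likelihood-based assignment described in the abstract can be illustrated with a minimal sketch. This is not deML's implementation: it assumes a single index per read, independent per-base errors derived from Phred scores, equal probability for each of the three wrong bases, and a uniform prior over samples. All function names and the example indices are hypothetical.

```python
import math

def phred_to_error(q):
    """Convert a Phred score to an error probability.

    Clamped to 0.75 (assumption): at that value a base carries no information.
    """
    return min(10 ** (-q / 10), 0.75)

def index_log_likelihood(observed, quals, reference):
    """log10 likelihood of the observed index given one reference index."""
    ll = 0.0
    for base, q, ref in zip(observed, quals, reference):
        e = phred_to_error(q)
        # A match happens with probability 1 - e; a specific wrong base with e / 3.
        ll += math.log10(1 - e) if base == ref else math.log10(e / 3)
    return ll

def assign_read(observed, quals, samples):
    """Return (most likely sample, probability that this assignment is correct)."""
    lls = {name: index_log_likelihood(observed, quals, idx)
           for name, idx in samples.items()}
    best = max(lls, key=lls.get)
    # Normalise relative to the best likelihood to avoid numeric underflow.
    total = sum(10 ** (ll - lls[best]) for ll in lls.values())
    return best, 1 / total

# Hypothetical 7 bp indices for two samples.
samples = {"A": "ACGTACG", "B": "TTGCATG"}

# One low-quality mismatch to sample A's index: still assigned confidently.
sample, p_correct = assign_read("ACGTACC", [30] * 6 + [2], samples)
print(sample, p_correct)

# Uninformative qualities: the assignment probability falls to 1 / number of samples.
sample_lowq, p_lowq = assign_read("ACGTACC", [0] * 7, samples)
print(sample_lowq, p_lowq)
```

In this sketch, an error threshold of the kind mentioned in the abstract would correspond to discarding reads whose `p_correct` falls below a chosen cutoff, whereas a fixed-mismatch rule would treat a low-quality and a high-quality mismatch identically.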
SUBMITTER: Renaud G
PROVIDER: S-EPMC4341068 | biostudies-literature | 2015 Mar
REPOSITORIES: biostudies-literature
Renaud G, Stenzel U, Maricic T, Wiebe V, Kelso J
Bioinformatics (Oxford, England), 2014-10-30, issue 5