Dataset Information

Incorporating sequence quality data into alignment improves DNA read mapping.


ABSTRACT: New DNA sequencing technologies have achieved breakthroughs in throughput, at the expense of higher error rates. The primary way of interpreting biological sequences is via alignment, but standard alignment methods assume the sequences are accurate. Here, we describe how to incorporate the per-base error probabilities reported by sequencers into alignment. Unlike existing tools for DNA read mapping, our method models both sequencer errors and real sequence differences. This approach consistently improves mapping accuracy, even when the rate of real sequence difference is only 0.2%. Furthermore, when mapping Drosophila melanogaster reads to the Drosophila simulans genome, it increased the proportion of correctly mapped reads from 49% to 66%. This approach enables more effective use of DNA reads from organisms that lack reference genomes, are extinct, or are highly polymorphic.
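
The idea of scoring an aligned base pair with both sequencer error and real sequence difference in mind can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the quality values are Phred scores (error probability 10^(-Q/10)), that errors and real substitutions are each spread uniformly over the three other bases, and that background base frequencies are uniform (0.25). The function names and the `divergence` parameter are hypothetical, and the default of 0.002 simply reuses the 0.2% figure from the abstract.

```python
import math

def phred_to_error_prob(q):
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)

def observed_base_prob(ref_base, read_base, error_prob, divergence):
    """P(read base | aligned reference base), summing over the unknown
    true base in the sequenced genome.

    Assumptions (illustrative only): a real substitution occurs with
    probability `divergence`, a sequencer error with probability
    `error_prob`, and each goes to any of the 3 other bases equally.
    """
    d, e = divergence, error_prob
    if ref_base == read_base:
        # Either nothing changed, or a real difference was reverted by an error.
        return (1 - d) * (1 - e) + d * e / 3
    # Error only, real difference only, or both landing on this read base.
    return (1 - d) * e / 3 + d * (1 - e) / 3 + 2 * d * e / 9

def quality_aware_score(ref_base, read_base, q, divergence=0.002):
    """Log-odds score (in bits) against a uniform background of 0.25."""
    e = phred_to_error_prob(q)
    return math.log2(observed_base_prob(ref_base, read_base, e, divergence) / 0.25)

if __name__ == "__main__":
    for q in (10, 20, 40):
        print(q,
              round(quality_aware_score("A", "A", q), 3),
              round(quality_aware_score("A", "C", q), 3))
```

Under these assumptions, a mismatch at a low-quality position is penalized less than one at a high-quality position, because a sequencer error becomes the more plausible explanation.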

SUBMITTER: Frith MC 

PROVIDER: S-EPMC2853142 | biostudies-literature | 2010 Apr

REPOSITORIES: biostudies-literature

Publications

Incorporating sequence quality data into alignment improves DNA read mapping.

Martin C. Frith, Raymond Wan, Paul Horton

Nucleic Acids Research, 2010-01-27, issue 7


Similar Datasets

| S-EPMC2335322 | biostudies-literature
| S-EPMC5860160 | biostudies-literature
| S-EPMC3534618 | biostudies-literature
| S-EPMC6602515 | biostudies-literature
2013-07-15 | E-MTAB-1728 | biostudies-arrayexpress
| S-EPMC3585936 | biostudies-literature
| S-EPMC6657586 | biostudies-literature