
Dataset Information

Coval: improving alignment quality and variant calling accuracy for next-generation sequencing data.


ABSTRACT: Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in 'targeted' alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/.
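
The two steps named in the abstract, error correction driven by base quality and allele frequency, followed by filtering of reads that still carry mismatches, can be illustrated with a minimal Python sketch. This is not Coval's implementation: the `Read` class, the function `correct_and_filter`, and the thresholds `min_qual`, `min_freq`, and `max_mismatches` are all hypothetical names and values chosen for illustration, and the sketch handles ungapped reads only (no indels, no local realignment).

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Read:
    start: int        # 0-based position of the first base on the reference
    seq: str          # read bases (ungapped, same strand as the reference)
    quals: List[int]  # Phred base qualities, one per base


def correct_and_filter(reads, ref, min_qual=20, min_freq=0.2, max_mismatches=2):
    """Illustrative only: thresholds are assumptions, not Coval defaults."""
    depth = Counter()  # read depth per reference position
    alt = Counter()    # high-quality non-reference allele counts per (pos, base)
    for r in reads:
        for i, (base, qual) in enumerate(zip(r.seq, r.quals)):
            pos = r.start + i
            depth[pos] += 1
            if base != ref[pos] and qual >= min_qual:
                alt[(pos, base)] += 1

    kept = []
    for r in reads:
        bases = list(r.seq)
        for i, (base, qual) in enumerate(zip(r.seq, r.quals)):
            pos = r.start + i
            if base == ref[pos]:
                continue
            # A mismatch counts as a candidate variant only if the base is of
            # high quality and the allele is frequent enough at this position.
            supported = qual >= min_qual and alt[(pos, base)] / depth[pos] >= min_freq
            if not supported:
                bases[i] = ref[pos]  # treat as a sequencing error and correct it
        corrected = "".join(bases)
        mismatches = sum(
            1 for i, base in enumerate(corrected) if base != ref[r.start + i]
        )
        # Reads that still carry many mismatches are treated as likely mismapped.
        if mismatches <= max_mismatches:
            kept.append(Read(r.start, corrected, r.quals))
    return kept


ref = "ACGTACGTAC"
reads = [
    Read(0, "ACGTACGT", [30] * 8),                        # perfect match
    Read(0, "ACTTACGT", [30, 30, 5, 30, 30, 30, 30, 30]),  # one low-quality mismatch
    Read(2, "TTTTTTTT", [30] * 8),                        # many mismatches
]
kept = correct_and_filter(reads, ref)
```

In this toy input the low-quality mismatch in the second read is corrected back to the reference base, while the third read retains more mismatches than `max_mismatches` allows and is dropped. A real pipeline would operate on SAM/BAM records and would also have to account for indels and paired-end information.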

SUBMITTER: Kosugi S 

PROVIDER: S-EPMC3792961 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature

Publications

Coval: improving alignment quality and variant calling accuracy for next-generation sequencing data.

Kosugi Shunichi, Natsume Satoshi, Yoshida Kentaro, MacLean Daniel, Cano Liliana, Kamoun Sophien, Terauchi Ryohei

PLoS ONE, 2013-10-08, issue 10


Similar Datasets

| S-EPMC5324109 | biostudies-literature
| S-EPMC10794290 | biostudies-literature
| S-EPMC4129436 | biostudies-literature
| S-EPMC3563481 | biostudies-literature
| S-EPMC3201884 | biostudies-literature
| S-EPMC5374681 | biostudies-literature
| S-EPMC3907006 | biostudies-literature
| S-EPMC3218665 | biostudies-literature
| S-EPMC4265454 | biostudies-literature
| S-EPMC3493122 | biostudies-literature