
Dataset Information


Pluribus-Exploring the Limits of Error Correction Using a Suffix Tree.


ABSTRACT: Next generation sequencing technologies enable efficient and cost-effective genome sequencing. However, sequencing errors increase the complexity of the de novo assembly process, and reduce the quality of the assembled sequences. Many error correction techniques utilizing substring frequencies have been developed to mitigate this effect. In this paper, we present a novel and effective method called Pluribus, for correcting sequencing errors using a generalized suffix trie. Pluribus utilizes multiple manifestations of an error in the trie to accurately identify errors and suggest corrections. We show that Pluribus produces the least number of false positives across a diverse set of real sequencing datasets when compared to other methods. Furthermore, Pluribus can be used in conjunction with other contemporary error correction methods to achieve higher levels of accuracy than either tool alone. These increases in error correction accuracy are also realized in the quality of the contigs that are generated during assembly. We explore, in-depth, the behavior of Pluribus, to explain the observed improvement in accuracy and assembly performance. Pluribus is freely available at http://compbio.case.edu/pluribus/.
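
The abstract refers to substring frequencies stored in a generalized suffix trie. The following is a minimal illustrative sketch of that general idea only, not Pluribus's actual algorithm: it builds a depth-bounded trie over all suffixes of a set of reads, stores an occurrence count at each node, and lists substrings whose count falls below a threshold as candidate error sites. The names `build_suffix_trie`, `substring_count`, `rare_substrings`, and the parameters `max_depth` and `min_count` are hypothetical and chosen for illustration.

```python
class TrieNode:
    """One node of the trie; `count` is how often the path to it occurs."""
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}
        self.count = 0


def build_suffix_trie(reads, max_depth):
    """Insert every suffix of every read, truncated to max_depth characters."""
    root = TrieNode()
    for read in reads:
        for start in range(len(read)):
            node = root
            for ch in read[start:start + max_depth]:
                node = node.children.setdefault(ch, TrieNode())
                node.count += 1
    return root


def substring_count(root, s):
    """Number of occurrences of s across all reads (requires len(s) <= max_depth)."""
    node = root
    for ch in s:
        node = node.children.get(ch)
        if node is None:
            return 0
    return node.count


def rare_substrings(root, depth, min_count):
    """Substrings of exactly `depth` characters seen fewer than min_count times."""
    found = []
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if len(prefix) == depth:
            if node.count < min_count:
                found.append((prefix, node.count))
            continue
        for ch, child in node.children.items():
            stack.append((child, prefix + ch))
    return sorted(found)


# Toy input (made up): three identical reads and one read with a substitution.
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGAAC"]
root = build_suffix_trie(reads, max_depth=4)
suspects = rare_substrings(root, depth=4, min_count=2)
```

In this toy input, the low-count substrings all overlap the single substituted base in the fourth read, which loosely illustrates how one error can show up in several places in the trie. A real corrector would also need to propose and validate a replacement base, which this sketch does not attempt.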

SUBMITTER: Savel D 

PROVIDER: S-EPMC5754272 | biostudies-literature | 2017 Nov-Dec

REPOSITORIES: biostudies-literature


Publications

Pluribus-Exploring the Limits of Error Correction Using a Suffix Tree.

Daniel Savel, Thomas LaFramboise, Ananth Grama, Mehmet Koyuturk

IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2016-06-29, issue 6



Similar Datasets

| S-EPMC3526801 | biostudies-literature
| S-EPMC7197101 | biostudies-literature
| S-EPMC4393519 | biostudies-literature
| S-EPMC5704532 | biostudies-literature
| S-EPMC8292750 | biostudies-literature
| S-EPMC10213493 | biostudies-literature
| S-EPMC3273826 | biostudies-other
| S-EPMC4118789 | biostudies-literature
| S-EPMC9206251 | biostudies-literature
2021-01-31 | GSE162053 | GEO