Dataset Information

DUDE-Seq: Fast, flexible, and robust denoising for targeted amplicon sequencing.


ABSTRACT: We consider the correction of errors in nucleotide sequences produced by next-generation targeted amplicon sequencing. Next-generation sequencing (NGS) platforms produce large volumes of sequencing data thanks to their high throughput, but their error rates are often high. Denoising in high-throughput sequencing has thus become a crucial step for improving the reliability of downstream analyses. Our methodology, named DUDE-Seq, is derived from a general setting of reconstructing finite-valued source data corrupted by a discrete memoryless channel. It effectively corrects substitution and homopolymer indel errors, the two major types of sequencing errors in most high-throughput targeted amplicon sequencing platforms. Our experimental studies with real and simulated datasets suggest that DUDE-Seq not only outperforms existing alternatives in error-correction capability and time efficiency, but also improves the reliability of downstream analyses. Further, the flexibility of DUDE-Seq enables its robust application to different sequencing platforms and analysis pipelines through simple updates of the noise model. DUDE-Seq is available at http://data.snu.ac.kr/pub/dude-seq.
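
The "general setting" mentioned in the abstract (finite-valued data corrupted by a discrete memoryless channel) can be illustrated with a minimal two-pass sketch in the style of the discrete universal denoiser (DUDE). This is not the authors' implementation. The function names (`invert`, `symmetric_channel`, `dude_denoise`), the symmetric substitution channel, the Hamming loss, and the context length `k` are all assumptions made for illustration. The sketch covers substitution errors only, not homopolymer indels, and assumes reads contain only A, C, G, and T.

```python
from collections import defaultdict

ALPHABET = "ACGT"
IDX = {c: i for i, c in enumerate(ALPHABET)}


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("channel matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def symmetric_channel(delta):
    """Hypothetical noise model: keep a base w.p. 1 - delta, else substitute uniformly.

    Entry [x][z] is the probability of observing base z when the true base is x.
    """
    n = len(ALPHABET)
    return [[1.0 - delta if x == z else delta / (n - 1) for z in range(n)]
            for x in range(n)]


def dude_denoise(reads, channel, k=2):
    """Two-pass DUDE-style denoising of substitution errors under Hamming loss."""
    n = len(ALPHABET)
    inv = invert(channel)

    # Pass 1: for every two-sided context of length k, count the observed center base.
    counts = defaultdict(lambda: [0] * n)
    for read in reads:
        for i in range(k, len(read) - k):
            ctx = (read[i - k:i], read[i + 1:i + k + 1])
            counts[ctx][IDX[read[i]]] += 1

    # Pass 2: re-estimate each center base from its (noisy) context statistics.
    denoised = []
    for read in reads:
        chars = list(read)
        for i in range(k, len(read) - k):
            ctx = (read[i - k:i], read[i + 1:i + k + 1])
            m = counts[ctx]
            z = IDX[read[i]]
            # q[x] = (m^T Pi^{-1})[x]: estimated count of clean base x in this context.
            q = [sum(m[a] * inv[a][x] for a in range(n)) for x in range(n)]
            # Weight of clean base x given that z was observed.
            weights = [q[x] * channel[x][z] for x in range(n)]
            total = sum(weights)
            # Under Hamming loss, the estimated cost of guessing xh is total - weights[xh].
            best = min(range(n), key=lambda xh: total - weights[xh])
            chars[i] = ALPHABET[best]
        denoised.append("".join(chars))
    return denoised
```

For example, calling `dude_denoise(reads, symmetric_channel(0.05), k=2)` on a toy input of 200 copies of `ACGTACGTAC` plus one copy with a single substituted base (`ACGTTCGTAC`) restores the substituted base, because the context counts overwhelmingly favor the original. Swapping in a different channel matrix is the only change needed to model a different substitution profile, which mirrors the "simple updates of the noise model" described in the abstract.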

SUBMITTER: Lee B 

PROVIDER: S-EPMC5531809 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC5876661 | biostudies-literature
| S-EPMC6765106 | biostudies-literature
| S-EPMC4178461 | biostudies-literature
| S-EPMC5324676 | biostudies-literature
| S-EPMC6820467 | biostudies-literature
| S-EPMC4302139 | biostudies-literature
| S-EPMC3827295 | biostudies-literature
| S-EPMC4850673 | biostudies-literature
| S-EPMC5540949 | biostudies-literature
| S-EPMC7672904 | biostudies-literature