
Dataset Information


PANDAseq: paired-end assembler for illumina sequences.


ABSTRACT: BACKGROUND: Illumina paired-end reads are used to analyse microbial communities by targeting amplicons of the 16S rRNA gene. Publicly available tools are needed to assemble overlapping paired-end reads while correcting mismatches and uncalled bases; many errors could be corrected to obtain higher sequence yields using quality information. RESULTS: PANDAseq assembles paired-end reads rapidly and with the correction of most errors. Uncertain error corrections come from reads with many low-quality bases identified by upstream processing. Benchmarks were done using real error masks on simulated data, a pure source template, and a pooled template of genomic DNA from known organisms. PANDAseq assembled reads more rapidly and with reduced error incorporation compared to alternative methods. CONCLUSIONS: PANDAseq rapidly assembles sequences and scales to billions of paired-end reads. Assembly of control libraries showed a 4-50% increase in the number of assembled sequences over naïve assembly with negligible loss of "good" sequence.
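The general idea of assembling an overlapping read pair while using quality scores to resolve mismatches and uncalled bases can be illustrated with a minimal sketch. This is not PANDAseq's actual algorithm, which the abstract does not detail; it is a simplified greedy merge under stated assumptions. The function names (`reverse_complement`, `assemble_pair`), the parameters `min_overlap` and `max_mismatch_frac`, and the rule "keep the higher-quality base on disagreement" are all hypothetical choices made for illustration.

```python
# Illustrative only: a naive quality-aware merge of one read pair.
# All names and thresholds here are assumptions, not PANDAseq's API or method.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


def assemble_pair(fwd, fwd_qual, rev, rev_qual,
                  min_overlap=5, max_mismatch_frac=0.1):
    """Merge a forward read with the reverse complement of its mate.

    fwd_qual and rev_qual are lists of integer quality scores, one per base.
    Returns the merged sequence, or None if no acceptable overlap exists.
    """
    rc = reverse_complement(rev)
    rc_qual = rev_qual[::-1]  # qualities follow the reversed base order

    # Try overlaps from longest to shortest; keep the one with the lowest
    # mismatch fraction (ties go to the longer overlap, seen first).
    best = None  # (mismatch_fraction, overlap_length)
    for overlap in range(min(len(fwd), len(rc)), min_overlap - 1, -1):
        a, b = fwd[-overlap:], rc[:overlap]
        # An uncalled base (N) is not counted as a mismatch.
        mismatches = sum(1 for x, y in zip(a, b)
                         if x != y and "N" not in (x, y))
        frac = mismatches / overlap
        if frac <= max_mismatch_frac and (best is None or frac < best[0]):
            best = (frac, overlap)
    if best is None:
        return None

    overlap = best[1]
    start = len(fwd) - overlap
    merged = []
    for i in range(overlap):
        x, qx = fwd[start + i], fwd_qual[start + i]
        y, qy = rc[i], rc_qual[i]
        if x == "N":
            merged.append(y)          # fill an uncalled base from the mate
        elif y == "N":
            merged.append(x)
        elif x == y or qx >= qy:
            merged.append(x)          # agreement, or forward base is better
        else:
            merged.append(y)          # mate has the higher-quality base
    return fwd[:start] + "".join(merged) + rc[overlap:]
```

A call such as `assemble_pair(fwd, fwd_qual, rev, rev_qual)` returns a single merged sequence when the two reads overlap by at least `min_overlap` bases with few enough disagreements, and `None` otherwise. A production tool would score overlaps statistically rather than with a fixed mismatch threshold.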

SUBMITTER: Masella AP 

PROVIDER: S-EPMC3471323 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature


Publications

PANDAseq: paired-end assembler for illumina sequences.

Andre P Masella, Andrea K Bartram, Jakub M Truszkowski, Daniel G Brown, Josh D Neufeld

BMC Bioinformatics, 2012-02-14



Similar Datasets

| S-EPMC4583709 | biostudies-literature
| S-EPMC7071698 | biostudies-literature
| PRJEB4878 | ENA
| S-EPMC7672897 | biostudies-literature
| S-EPMC3483553 | biostudies-literature
| S-EPMC4281075 | biostudies-literature
| S-EPMC3585167 | biostudies-literature
| S-EPMC6393434 | biostudies-literature
| S-EPMC4023940 | biostudies-literature
| S-EPMC4545970 | biostudies-literature