
Dataset Information


Hybrid-denovo: a de novo OTU-picking pipeline integrating single-end and paired-end 16S sequence tags.


ABSTRACT:

Background: Illumina paired-end sequencing has become increasingly popular for 16S rRNA gene-based microbiota profiling. It provides higher phylogenetic resolution than single-end reads because of its longer read length. However, the reverse read (R2) often has markedly lower base quality, and a large proportion of R2s are discarded after quality control, leaving a mixture of paired-end and single-end reads. A typical 16S analysis pipeline processes either paired-end or single-end reads, but not a mixture. Quantification accuracy and statistical power are therefore reduced by the loss of a large number of reads: rare taxa may go undetected with the paired-end approach, while the single-end approach yields low taxonomic resolution.

Results: To obtain both the higher phylogenetic resolution of paired-end reads and the higher sequence coverage of single-end reads, we propose a novel OTU-picking pipeline, hybrid-denovo, that can process a hybrid of single-end and paired-end reads. Using high-quality paired-end reads as a gold standard, we show that hybrid-denovo achieved the highest correlation with the gold standard and performed better than approaches based on paired-end or single-end reads alone in quantifying microbial diversity and taxonomic abundances. By applying our method to a rheumatoid arthritis (RA) data set, we demonstrated that hybrid-denovo captured more microbial diversity and identified more RA-associated taxa than a paired-end or single-end approach.

Conclusions: Hybrid-denovo utilizes both paired-end and single-end 16S sequencing reads and is recommended for 16S rRNA gene-targeted paired-end sequencing data.
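The mixture of paired-end and single-end reads described in the Background can be illustrated with a minimal sketch. This is not the hybrid-denovo implementation: the function names (`mean_phred`, `split_reads`), the mean-quality filter, the Q20 threshold, and the Phred+33 encoding are all assumptions chosen for illustration only.

```python
from statistics import mean


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return mean(ord(ch) - offset for ch in quality)


def split_reads(pairs, min_mean_q: float = 20.0):
    """Partition read pairs into paired-end and single-end (R1-only) sets.

    `pairs` is an iterable of (read_id, r1_quality, r2_quality) tuples.
    The mean-quality rule and the default threshold are illustrative
    assumptions, not the quality control used by the authors.
    """
    paired, single = [], []
    for read_id, r1_qual, r2_qual in pairs:
        if mean_phred(r1_qual) < min_mean_q:
            continue  # R1 itself fails QC: the read is lost entirely
        if mean_phred(r2_qual) >= min_mean_q:
            paired.append(read_id)  # both mates pass: keep as paired-end
        else:
            single.append(read_id)  # only R1 passes: keep as single-end
    return paired, single


# Toy input: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
example = [
    ("a", "IIII", "IIII"),  # both mates high quality
    ("b", "IIII", "####"),  # low-quality R2
    ("c", "####", "IIII"),  # low-quality R1
]
paired_ids, single_ids = split_reads(example)
```

In this toy input, read "a" stays paired-end, read "b" survives only as a single-end R1, and read "c" is dropped. A pipeline that accepts only paired-end input would discard "b" as well, which is the read loss the abstract describes.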

SUBMITTER: Chen X 

PROVIDER: S-EPMC5841375 | biostudies-literature | 2018 Mar

REPOSITORIES: biostudies-literature


Publications

Hybrid-denovo: a de novo OTU-picking pipeline integrating single-end and paired-end 16S sequence tags.

Xianfeng Chen, Stephen Johnson, Patricio Jeraldo, Junwen Wang, Nicholas Chia, Jean-Pierre A. Kocher, Jun Chen

GigaScience, 2018 Mar 1, issue 3



Similar Datasets

| S-EPMC3599145 | biostudies-literature
| S-EPMC6476724 | biostudies-literature
| S-EPMC4593230 | biostudies-literature
2012-07-19 | GSE39495 | GEO
| S-EPMC3158087 | biostudies-literature
2012-07-19 | E-GEOD-39495 | biostudies-arrayexpress
| S-EPMC4266640 | biostudies-literature
| S-EPMC4179863 | biostudies-literature
| S-EPMC3635884 | biostudies-literature
| S-EPMC3076424 | biostudies-literature