Dataset Information

Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacency.


ABSTRACT:

BACKGROUND: Investigation of metagenomes provides greater insight into uncultured microbial communities. The improvement in sequencing technology, which yields a large amount of sequence data, has led to major breakthroughs in the field. However, at present, taxonomic binning tools for metagenomes discard 30-40% of Sanger sequencing data due to the stringency of BLAST cut-offs. In an attempt to provide a comprehensive overview of metagenomic data, we re-analyzed the discarded metagenomes by using less stringent cut-offs. Additionally, we introduced a new criterion, namely, the evolutionary conservation of adjacency between neighboring genes. To evaluate the feasibility of our approach, we re-analyzed discarded contigs and singletons from several environments with different levels of complexity. We also compared the consistency between our taxonomic binning and those reported in the original studies.

RESULTS: Among the discarded data, we found that 23.7 ± 3.9% of singletons and 14.1 ± 1.0% of contigs were assigned to taxa. The recovery rates for singletons were higher than those for contigs. The Pearson correlation coefficient revealed a high degree of similarity (0.94 ± 0.03 at the phylum rank and 0.80 ± 0.11 at the family rank) between the proposed taxonomic binning approach and those reported in original studies. In addition, an evaluation using simulated data demonstrated the reliability of the proposed approach.

CONCLUSIONS: Our findings suggest that taking account of conserved neighboring gene adjacency improves taxonomic assignment when analyzing metagenomes using Sanger sequencing. In other words, utilizing the conserved gene order as a criterion will reduce the amount of data discarded when analyzing metagenomes.
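The consistency check mentioned in the abstract (a Pearson correlation between the re-analyzed taxonomic binning and the binning reported in the original study) can be illustrated with a minimal sketch. This is not the authors' code: the function names `pearson` and `profile_correlation`, the dictionary layout, and the read counts are all hypothetical and chosen only for illustration. It assumes each binning is summarized as a count of assigned reads per taxon at one rank.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def profile_correlation(original, reanalyzed):
    """Correlate two taxon -> read-count profiles over the union of their taxa.

    A taxon missing from one profile is counted as 0 reads there
    (an assumption of this sketch, not something stated in the abstract).
    """
    taxa = sorted(set(original) | set(reanalyzed))
    x = [original.get(t, 0) for t in taxa]
    y = [reanalyzed.get(t, 0) for t in taxa]
    return pearson(x, y)


# Hypothetical phylum-level read counts, for illustration only.
original_binning = {
    "Proteobacteria": 520,
    "Bacteroidetes": 210,
    "Firmicutes": 95,
    "Actinobacteria": 40,
}
recovered_binning = {
    "Proteobacteria": 130,
    "Bacteroidetes": 61,
    "Firmicutes": 30,
    "Cyanobacteria": 4,
}

r = profile_correlation(original_binning, recovered_binning)
print(f"Pearson r at phylum rank: {r:.3f}")
```

A value near 1 would indicate that the reads recovered from the discarded data follow roughly the same taxonomic distribution as the original assignment; repeating the calculation per rank (phylum, family) mirrors the per-rank figures quoted in the Results.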

SUBMITTER: Weng FC 

PROVIDER: S-EPMC3098102 | biostudies-literature | 2010

REPOSITORIES: biostudies-literature

Publications

Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacency.

Francis C Weng, Chien-Hao Su, Ming-Tsung Hsu, Tse-Yi Wang, Huai-Kuang Tsai, Daryi Wang

BMC Bioinformatics, 2010-11-18



Similar Datasets

| S-EPMC7214025 | biostudies-literature
| S-EPMC307580 | biostudies-literature
| S-EPMC2874554 | biostudies-literature
| S-EPMC3018814 | biostudies-other
| S-EPMC4479572 | biostudies-literature
| S-EPMC2841443 | biostudies-literature
| S-EPMC7299307 | biostudies-literature
| S-EPMC3232206 | biostudies-literature
| S-EPMC6676585 | biostudies-literature
| S-EPMC9245163 | biostudies-literature