
Dataset Information


Common contaminants in next-generation sequencing that hinder discovery of low-abundance microbes.


ABSTRACT: Unbiased high-throughput sequencing of whole metagenome shotgun DNA libraries is a promising new approach to identifying microbes in clinical specimens, which, unlike other techniques, is not limited to known sequences. Unlike most sequencing applications, it is highly sensitive to laboratory contaminants as these will appear to originate from the clinical specimens. To assess the extent and diversity of sequence contaminants, we aligned 57 "1000 Genomes Project" sequencing runs from six centers against the four largest NCBI BLAST databases, detecting reads of diverse contaminant species in all runs and identifying the most common of these contaminant genera (Bradyrhizobium) in assembled genomes from the NCBI Genome database. Many of these microorganisms have been reported as contaminants of ultrapure water systems. Studies aiming to identify novel microbes in clinical specimens will greatly benefit from not only preventive measures such as extensive UV irradiation of water and cross-validation using independent techniques, but also a concerted effort to sequence the complete genomes of common contaminants so that they may be subtracted computationally.

SUBMITTER: Laurence M 

PROVIDER: S-EPMC4023998 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature


Publications

Common contaminants in next-generation sequencing that hinder discovery of low-abundance microbes.

Martin Laurence, Christos Hatzis, Douglas E. Brash

PLoS ONE, 2014-05-16, issue 5



Similar Datasets

| S-EPMC3933208 | biostudies-other
| S-EPMC3250192 | biostudies-literature
2017-04-03 | PXD003804 | Pride
| S-EPMC3562067 | biostudies-literature
| S-EPMC4148959 | biostudies-literature
| S-EPMC8479661 | biostudies-literature
| S-EPMC4011907 | biostudies-other
| S-EPMC4239701 | biostudies-literature
| S-EPMC8112455 | biostudies-literature