Dataset Information

Within-species contamination of bacterial whole-genome sequence data has a greater influence on clustering analyses than between-species contamination.


ABSTRACT: Although it is assumed that contamination in bacterial whole-genome sequencing causes errors, the influences of contamination on clustering analyses, such as single-nucleotide polymorphism discovery, phylogenetics, and multi-locus sequence typing, have not been quantified. By developing and analyzing 720 Listeria monocytogenes, Salmonella enterica, and Escherichia coli short-read datasets, we demonstrate that within-species contamination causes errors that confound clustering analyses, while between-species contamination generally does not. Contaminant reads mapping to references or becoming incorporated into chimeric sequences during assembly are the sources of those errors. Contamination sufficient to influence clustering analyses is present in public sequence databases.

SUBMITTER: Pightling AW 

PROVIDER: S-EPMC6918607 | biostudies-literature | 2019 Dec

REPOSITORIES: biostudies-literature

Publications

Within-species contamination of bacterial whole-genome sequence data has a greater influence on clustering analyses than between-species contamination.

Arthur W. Pightling, James B. Pettengill, Yu Wang, Hugh Rand, Errol Strain

Genome Biology, 2019-12-18, issue 1


Similar Datasets

| S-EPMC6546082 | biostudies-literature
2016-09-22 | GSE78756 | GEO
| S-EPMC3110597 | biostudies-literature
| S-EPMC4228153 | biostudies-literature
| S-EPMC6838224 | biostudies-literature
| S-EPMC3406127 | biostudies-literature
| S-EPMC7525026 | biostudies-literature
| S-EPMC8023143 | biostudies-literature
| S-EPMC4300949 | biostudies-literature
| S-EPMC3008232 | biostudies-literature