Dataset Information

Pan-genome Analyses of the Species Salmonella enterica, and Identification of Genomic Markers Predictive for Species, Subspecies, and Serovar.


ABSTRACT: Food safety is a global concern, with upward of 2.2 million deaths due to enteric disease every year. Current whole-genome sequencing platforms allow routine sequencing of enteric pathogens for surveillance and during outbreaks; however, a remaining challenge is the identification of genomic markers that are predictive of strain groups that pose the most significant health threats to humans, or that can persist in specific environments. We have previously developed the software program Panseq, which identifies the pan-genome among a group of sequences, and the SuperPhy platform, which utilizes this pan-genome information to identify biomarkers that are predictive of groups of bacterial strains. In this study, we examined the pan-genome of 4893 genomes of Salmonella enterica, an enteric pathogen responsible for the loss of more disability-adjusted life years than any other enteric pathogen. We identified a pan-genome of 25.3 Mbp, a strict core of 1.5 Mbp present in all genomes, and a conserved core of 3.2 Mbp found in at least 96% of these genomes. We also identified 404 genomic regions of 1000 bp that were specific to the species S. enterica. These species-specific regions were found to encode mostly hypothetical proteins, effectors, and other proteins related to virulence. For each of the six S. enterica subspecies, unique markers were identified. No serovar had pan-genome regions that were present in all of its genomes and absent in all other serovars; however, each serovar did have genomic regions that were universally present among all constituent members, and statistically predictive of the serovar. The phylogeny based on SNPs within the conserved core genome was found to be highly concordant with the phylogeny based on the presence/absence of 1000 bp regions of the entire pan-genome.
Future studies could use these predictive regions as components of a vaccine to prevent salmonellosis, as well as in simple and rapid diagnostic tests for both in silico and wet-lab applications, with uses ranging from food safety to public health. Lastly, the tools and methods described in this study could be applied as a pan-genomics framework to other population genomic studies seeking to identify markers for other bacterial species and their sub-groups.
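
The categories used in the abstract (strict core, conserved core, group-unique regions, and regions that are universal within a group and statistically predictive of it) can be illustrated with a small presence/absence sketch. This is not the Panseq or SuperPhy implementation. The function names, the toy data layout (a dict mapping each region to the set of genomes carrying it), and the choice of a one-sided Fisher's exact test as the "statistically predictive" measure are all assumptions made for illustration; the abstract does not name the statistic.

```python
from math import comb


def classify_regions(presence, genomes, conserved_fraction=0.96):
    """Label each region 'strict_core', 'conserved_core', or 'accessory'.

    presence: dict of region id -> set of genome ids carrying the region
    (hypothetical layout). Labels are mutually exclusive here, so a
    strict-core region is not also reported as conserved core.
    """
    total = len(genomes)
    labels = {}
    for region, carriers in presence.items():
        count = len(carriers & genomes)
        if count == total:
            labels[region] = "strict_core"
        elif count / total >= conserved_fraction:
            labels[region] = "conserved_core"
        else:
            labels[region] = "accessory"
    return labels


def group_markers(presence, group, genomes):
    """Return (unique, universal) region lists for one group of genomes.

    universal: present in every member of the group.
    unique: universal and absent from every genome outside the group.
    """
    others = genomes - group
    unique, universal = [], []
    for region, carriers in presence.items():
        if group <= carriers:
            universal.append(region)
            if not (carriers & others):
                unique.append(region)
    return unique, universal


def fisher_enrichment_p(carriers, group, genomes):
    """One-sided Fisher's exact p-value that a region is enriched in group.

    Assumed statistic, computed from the hypergeometric tail.
    """
    n_total = len(genomes)
    n_group = len(group)
    n_carriers = len(carriers & genomes)
    observed = len(carriers & group)
    upper = min(n_group, n_carriers)
    tail = sum(
        comb(n_carriers, k) * comb(n_total - n_carriers, n_group - k)
        for k in range(observed, upper + 1)
    )
    return tail / comb(n_total, n_group)
```

With thousands of genomes, a region can be universal in a serovar and strongly enriched there without being unique to it, which is the situation the abstract reports for serovars.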

SUBMITTER: Laing CR 

PROVIDER: S-EPMC5534482 | biostudies-literature | 2017

REPOSITORIES: biostudies-literature

Publications

Pan-genome Analyses of the Species Salmonella enterica, and Identification of Genomic Markers Predictive for Species, Subspecies, and Serovar.

Laing CR, Whiteside MD, Gannon VPJ

Frontiers in Microbiology, 2017-07-31


Similar Datasets

| S-EPMC3764930 | biostudies-literature
| S-EPMC3828162 | biostudies-literature
2017-10-31 | GSE93686 | GEO
| S-EPMC5664820 | biostudies-literature
| S-EPMC5985583 | biostudies-literature
| S-EPMC2695014 | biostudies-literature
2005-08-06 | GSE2242 | GEO
| S-EPMC4959494 | biostudies-literature
| S-EPMC8538453 | biostudies-literature
2004-07-21 | GSE1500 | GEO