
Dataset Information


Large-Scale Genomics Reveals the Genetic Characteristics of Seven Species and Importance of Phylogenetic Distance for Estimating Pan-Genome Size.


ABSTRACT: For more than a decade, pan-genome analysis has been applied as an effective method for explaining variation in the genetic content of prokaryotic species. However, the genomic characteristics and detailed structures of gene pools have not been fully clarified, because most studies have used a small number of genomes. Here, we constructed pan-genomes of seven species in order to elucidate variations in the genetic contents of >27,000 genomes belonging to Streptococcus pneumoniae, Staphylococcus aureus subsp. aureus, Salmonella enterica subsp. enterica, Escherichia coli and Shigella spp., Mycobacterium tuberculosis complex, Pseudomonas aeruginosa, and Acinetobacter baumannii. This work showed that the pan-genomes of all seven species are open. Additionally, systematic evaluation of the characteristics of their pan-genomes revealed that phylogenetic distance provided valuable information for estimating the parameters for pan-genome size among several models, including Heaps' law. Our results provide a better understanding of the species and a solution to minimize sampling biases associated with genome-sequencing preferences for pathogenic strains.
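
The abstract names Heaps' law as one of the models used to estimate pan-genome size. As a rough illustration only, the sketch below fits one common power-law formulation, P(N) = kappa * N^gamma, by least squares in log-log space. The function name, the synthetic data, and the rule of thumb that a positive gamma suggests an open pan-genome are assumptions made for this example; they are not taken from the paper, which may use a different fitting procedure and does not describe its method in this record.

```python
import math


def fit_heaps_law(genome_counts, pan_sizes):
    """Fit P(N) = kappa * N**gamma by linear regression on log-transformed data.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(p) for p in pan_sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Ordinary least-squares slope and intercept of log(P) against log(N).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    kappa = math.exp(mean_y - gamma * mean_x)
    return kappa, gamma


# Synthetic accumulation curve (made-up numbers, not data from the study).
counts = [1, 2, 4, 8, 16, 32, 64]
sizes = [2000.0 * n ** 0.3 for n in counts]

kappa, gamma = fit_heaps_law(counts, sizes)
# Assumed rule of thumb: the pan-genome keeps growing as genomes are added
# when gamma is positive, which is usually described as "open".
looks_open = gamma > 0
```

In practice the accumulation curve would be averaged over many random orderings of the genomes, since a single ordering can bias the fitted exponent.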

SUBMITTER: Park SC 

PROVIDER: S-EPMC6491781 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature


Publications

Large-Scale Genomics Reveals the Genetic Characteristics of Seven Species and Importance of Phylogenetic Distance for Estimating Pan-Genome Size.

Park SC, Lee K, Kim YO, Won S, Chun J

Frontiers in Microbiology, 2019-04-24



Similar Datasets

| S-EPMC9647015 | biostudies-literature
| S-EPMC5010905 | biostudies-literature
| S-EPMC9257239 | biostudies-literature
| S-EPMC2889949 | biostudies-literature
| S-EPMC7788065 | biostudies-literature
| S-EPMC9728902 | biostudies-literature
| S-EPMC6030631 | biostudies-literature
| S-EPMC9968032 | biostudies-literature
| S-EPMC5066063 | biostudies-literature
| S-EPMC3788369 | biostudies-other