Dataset Information

Modeling of the GC content of the substituted bases in bacterial core genomes.


ABSTRACT:

Background

The purpose of the present study was to examine the GC content of substituted bases (sbGC) in the core genomes of 35 bacterial species. Each species, or core genome, constituted genomes from at least 10 strains. We also wanted to explore whether sbGC for each strain was associated with the corresponding species' core genome GC content (cgGC). We present a simple mathematical model that estimates sbGC from cgGC. The model assumes only that the estimated sbGC is a function of cgGC proportional to fixed AT → GC (α) and GC → AT (β) mutation rates. Non-linear regression was used to estimate parameters α and β from the empirical data described above.
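The two quantities compared in the study can be illustrated with a minimal Python sketch. It makes assumptions that go beyond the abstract: substitutions are called against a single reference sequence, cgGC is taken from that reference, and the function names and toy sequences are hypothetical. The abstract does not state how the authors called substitutions, and the sketch does not reproduce their regression model.

```python
def gc_content(bases):
    """Fraction of G/C among unambiguous A/C/G/T bases (None if there are none)."""
    counted = [b for b in bases.upper() if b in "ACGT"]
    if not counted:
        return None
    return sum(b in "GC" for b in counted) / len(counted)


def substituted_bases(reference, strain):
    """Bases of `strain` that differ from `reference` at aligned sites.

    Assumption: a substitution is any aligned site where both sequences
    carry an unambiguous base and the two bases differ.
    """
    if len(reference) != len(strain):
        raise ValueError("aligned sequences must have equal length")
    return "".join(
        s
        for r, s in zip(reference.upper(), strain.upper())
        if r in "ACGT" and s in "ACGT" and r != s
    )


# Toy aligned core genome (hypothetical data, for illustration only).
reference = "ATGCATGCAT"
strains = {
    "strain_a": "GTGCACGCAT",
    "strain_b": "ATGAATGCTT",
    "strain_c": "CTGCATGAAT",
}

cg_gc = gc_content(reference)  # stands in for cgGC in this sketch
sb_gc = {
    name: gc_content(substituted_bases(reference, seq))
    for name, seq in strains.items()
}

for name, value in sb_gc.items():
    # Under the null hypothesis sbGC is approximately equal to cgGC.
    print(name, "sbGC =", value, "cgGC =", cg_gc, "GC-biased:", value > cg_gc)
```

With real data, the per-strain pairs of cgGC and sbGC produced this way would be the input to a regression such as the one described in the abstract.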

Results

We found that sbGC for each strain showed a non-linear association with the corresponding cgGC, with a bias towards higher GC content for most core genomes (66.3% of the strains), assuming as a null hypothesis that sbGC should be approximately equal to cgGC. The most GC-rich core genomes (i.e. approximately %GC > 60), on the other hand, exhibited slightly less GC-biased sbGC than expected. The best fitted regression model indicates that GC → AT mutation rates β = (1.91 ± 0.13) p [...]

Conclusion

Not only did our mathematical model give reasonable estimates of sbGC, it also provides further support to previous observations that mutation rates in prokaryotes exhibit a universal GC → AT bias that appears to be remarkably consistent between taxa.

SUBMITTER: Bohlin J 

PROVIDER: S-EPMC6080486 | biostudies-literature | 2018 Aug

REPOSITORIES: biostudies-literature

Publications

Modeling of the GC content of the substituted bases in bacterial core genomes.

Jon Bohlin, Vegard Eldholm, Ola Brynildsrud, John H-O Petterson, Kristian Alfsnes

BMC Genomics, 2018-08-06, issue 1



Similar Datasets

| S-EPMC8153448 | biostudies-literature
| S-EPMC4450053 | biostudies-literature
| S-EPMC3053387 | biostudies-literature
| S-EPMC1976330 | biostudies-literature
| S-EPMC3274465 | biostudies-literature
2015-01-23 | GSE56639 | GEO
2015-01-23 | E-GEOD-56639 | biostudies-arrayexpress
| S-EPMC4349097 | biostudies-literature
| S-EPMC3720884 | biostudies-literature
| S-EPMC2615428 | biostudies-literature