Dataset Information

Predicting the evolution of Escherichia coli by a data-driven approach.


ABSTRACT: A tantalizing question in evolutionary biology is whether evolution can be predicted from past experiences. To address this question, we created a coherent compendium of more than 15,000 mutation events for the bacterium Escherichia coli under 178 distinct environmental settings. Compendium analysis provides a comprehensive view of the explored environments, mutation hotspots and mutation co-occurrence. While the mutations shared across all replicates decrease with the number of replicates, our results argue that the pairwise overlapping ratio remains the same, regardless of the number of replicates. An ensemble of predictors trained on the mutation compendium and tested in forward validation over 35 evolution replicates achieves a 49.2 ± 5.8% (mean ± std) precision and 34.5 ± 5.7% recall in predicting mutation targets. This work demonstrates how integrated datasets can be harnessed to create predictive models of evolution at a gene level and elucidate the effect of evolutionary processes in well-defined environments.
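
As a reading aid for the precision and recall figures quoted in the abstract, the following is a minimal sketch of how the two metrics could be computed for a single evolution replicate when predictions and observations are both sets of mutated genes. The function name `precision_recall` and the two gene sets are hypothetical illustrations, not data or code from the compendium, and the authors' actual evaluation procedure may differ.

```python
def precision_recall(predicted, observed):
    """Return (precision, recall) for predicted vs. observed mutation targets."""
    predicted, observed = set(predicted), set(observed)
    # Genes that were both predicted and actually observed to mutate.
    true_positives = len(predicted & observed)
    # Precision: fraction of predicted targets that were observed.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of observed targets that were predicted.
    recall = true_positives / len(observed) if observed else 0.0
    return precision, recall


# Made-up example sets for one replicate (assumption, not compendium data).
predicted_genes = {"rpoB", "rpoC", "pykF", "spoT"}
observed_genes = {"rpoB", "pykF", "nadR"}

precision, recall = precision_recall(predicted_genes, observed_genes)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Averaging these per-replicate values over all replicates, and taking their standard deviation, would give summary figures of the same form as the mean ± std values reported above.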

SUBMITTER: Wang X 

PROVIDER: S-EPMC6120903 | biostudies-other | 2018 Sep

REPOSITORIES: biostudies-other

Publications

Predicting the evolution of Escherichia coli by a data-driven approach.

Xiaokang Wang, Violeta Zorraquino, Minseung Kim, Athanasios Tsoukalas, Ilias Tagkopoulos

Nature Communications, 2018-09-03, issue 1


Similar Datasets

| S-EPMC7509812 | biostudies-literature
| S-EPMC5074862 | biostudies-literature
2011-10-22 | GSE33147 | GEO
| S-EPMC6829643 | biostudies-literature
| S-EPMC8088855 | biostudies-literature
| S-EPMC7193553 | biostudies-literature
| S-EPMC3772739 | biostudies-literature
| S-EPMC6022653 | biostudies-literature
2011-10-21 | E-GEOD-33147 | biostudies-arrayexpress
2006-07-07 | GSE5239 | GEO