Dataset Information

Multi-omics integration accurately predicts cellular state in unexplored conditions for Escherichia coli.


ABSTRACT: A significant obstacle in training predictive cell models is the lack of integrated data sources. We develop semi-supervised normalization pipelines and perform experimental characterization (growth, transcriptional, proteome) to create Ecomics, a consistent, quality-controlled multi-omics compendium for Escherichia coli with cohesive meta-data information. We then use this resource to train a multi-scale model that integrates four omics layers to predict genome-wide concentrations and growth dynamics. The genetic and environmental ontology reconstructed from the omics data is substantially different from, and complementary to, the genetic and chemical ontologies. The integration of different layers confers an incremental increase in the prediction performance, as does the information about the known gene regulatory and protein-protein interactions. The predictive performance of the model ranges from 0.54 to 0.87 for the various omics layers, which far exceeds various baselines. This work provides an integrative framework of omics-driven predictive modelling that is broadly applicable to guide biological discovery.

SUBMITTER: Kim M 

PROVIDER: S-EPMC5059772 | biostudies-literature | 2016 Oct

REPOSITORIES: biostudies-literature

Publications

Multi-omics integration accurately predicts cellular state in unexplored conditions for Escherichia coli.

Minseung Kim, Navneet Rai, Violeta Zorraquino, Ilias Tagkopoulos

Nature Communications, 2016-10-07


Similar Datasets

| S-EPMC4121545 | biostudies-literature
| S-EPMC2758844 | biostudies-literature
| S-EPMC6879803 | biostudies-literature
| S-EPMC3416263 | biostudies-literature
| S-EPMC6158460 | biostudies-literature
2012-06-01 | E-MTAB-984 | biostudies-arrayexpress
| S-EPMC9259796 | biostudies-literature
| S-EPMC6050171 | biostudies-literature
2007-12-06 | GSE9755 | GEO
2013-11-04 | GSE49296 | GEO