Dataset Information

Accelerated knowledge discovery from omics data by optimal experimental design.


ABSTRACT: How to design experiments that accelerate knowledge discovery on complex biological landscapes remains a tantalizing question. We present an optimal experimental design method (coined OPEX) to identify informative omics experiments using machine learning models for both experimental space exploration and model training. OPEX-guided exploration of Escherichia coli populations exposed to biocide and antibiotic combinations led to more accurate predictive models of gene expression with 44% less data. Analysis of the proposed experiments shows that broad exploration of the experimental space followed by fine-tuning emerges as the optimal strategy. Additionally, analysis of the experimental data reveals 29 cases of cross-stress protection and 4 cases of cross-stress vulnerability. Further validation reveals the central role of chaperones, stress response proteins and transport pumps in cross-stress exposure. This work demonstrates how active learning can be used to guide omics data collection for training predictive models, making evidence-driven decisions and accelerating knowledge discovery in life sciences.

SUBMITTER: Wang X 

PROVIDER: S-EPMC7538421 | biostudies-literature | 2020 Oct

REPOSITORIES: biostudies-literature

Publications

Accelerated knowledge discovery from omics data by optimal experimental design.

Xiaokang Wang, Navneet Rai, Beatriz Merchel Piovesan Pereira, Ameen Eetemadi, Ilias Tagkopoulos

Nature Communications, 2020-10-06, issue 1


Similar Datasets

2020-07-09 | GSE144604 | GEO
| PRJNA604190 | ENA
| S-EPMC3974819 | biostudies-other
| S-EPMC8760642 | biostudies-literature
| S-EPMC10701104 | biostudies-literature
| S-EPMC7879761 | biostudies-literature
| S-EPMC5831141 | biostudies-literature
| S-EPMC9055065 | biostudies-literature
| S-EPMC8906444 | biostudies-literature
| S-EPMC9721483 | biostudies-literature