Dataset Information

Identification of metabolic network models from incomplete high-throughput datasets.

ABSTRACT:

Motivation

High-throughput measurement techniques for metabolism and gene expression provide a wealth of information for the identification of metabolic network models. Yet, missing observations scattered over the dataset restrict the number of effectively available datapoints and make classical regression techniques inaccurate or inapplicable. Thorough exploitation of the data by identification techniques that explicitly cope with missing observations is therefore of major importance.

Results

We develop a maximum-likelihood approach for estimating the unknown parameters of metabolic network models that relies on the integration of statistical priors to compensate for the missing data. In the context of the linlog metabolic modeling framework, we implement the identification method by an Expectation-Maximization (EM) algorithm and by a simpler direct numerical optimization method. We evaluate the performance of our methods by comparison to existing approaches, and show that our EM method provides the best results over a variety of simulated scenarios. We then apply the EM algorithm to a real problem, the identification of a model of Escherichia coli central carbon metabolism, based on challenging experimental data from the literature. This leads to promising results and allows us to highlight critical identification issues.
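
To give a concrete feel for EM-based estimation with missing observations, here is a minimal, generic sketch. It is not the authors' algorithm: it assumes that all variables (for example, log-concentrations and a normalized rate in a linlog-type linear relation) are jointly Gaussian, runs the textbook EM iteration for the mean and covariance of a multivariate normal with missing entries marked as NaN, and then reads regression coefficients off the estimated moments. The function names `em_gaussian_missing` and `regression_from_moments`, the Gaussian assumption, and the synthetic data are all illustrative choices, not taken from the paper.

```python
import numpy as np


def em_gaussian_missing(data, n_iter=200, tol=1e-8):
    """EM estimate of the mean and covariance of a multivariate normal.

    `data` is an (n, d) array in which missing entries are NaN.
    Hypothetical helper for illustration only.
    """
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    missing = np.isnan(data)
    # Initialise from the observed entries of each column.
    mu = np.nanmean(data, axis=0)
    sigma = np.diag(np.nanvar(data, axis=0) + 1e-6)
    for _ in range(n_iter):
        # E-step: replace missing entries by their conditional means and
        # accumulate the conditional covariances of the missing blocks.
        filled = np.where(missing, mu, data)
        corr = np.zeros((d, d))
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed: the conditional law is the marginal one.
                corr += sigma
                continue
            s_oo = sigma[np.ix_(o, o)]
            s_mo = sigma[np.ix_(m, o)]
            gain = np.linalg.solve(s_oo, s_mo.T).T  # equals s_mo @ inv(s_oo)
            filled[i, m] = mu[m] + gain @ (data[i, o] - mu[o])
            corr[np.ix_(m, m)] += sigma[np.ix_(m, m)] - gain @ s_mo.T
        # M-step: update the moments from the expected sufficient statistics.
        new_mu = filled.mean(axis=0)
        centered = filled - new_mu
        new_sigma = (centered.T @ centered + corr) / n
        delta = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if delta < tol:
            break
    return mu, sigma


def regression_from_moments(mu, sigma):
    """Intercept and slopes for regressing the last column on the others."""
    slope = np.linalg.solve(sigma[:-1, :-1], sigma[:-1, -1])
    intercept = mu[-1] - slope @ mu[:-1]
    return intercept, slope


if __name__ == "__main__":
    # Synthetic example: two regressors with about 20% of entries missing.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(500, 2))
    y = 1.0 + 2.0 * x[:, 0] - 0.5 * x[:, 1] + 0.1 * rng.normal(size=500)
    table = np.column_stack([x, y])
    table[:, :2][rng.random((500, 2)) < 0.2] = np.nan
    mu_hat, sigma_hat = em_gaussian_missing(table)
    print(regression_from_moments(mu_hat, sigma_hat))
```

Unlike row-wise deletion, this kind of iteration uses every partially observed row, which is the practical appeal of likelihood-based handling of missing data when gaps are scattered over the dataset.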

SUBMITTER: Berthoumieux S

PROVIDER: S-EPMC3117355 | biostudies-literature | 2011 Jul

REPOSITORIES: biostudies-literature

Publications

Identification of metabolic network models from incomplete high-throughput datasets.

Sara Berthoumieux, Matteo Brilli, Hidde de Jong, Daniel Kahn, Eugenio Cinquemani

Bioinformatics (Oxford, England), 2011 Jul 1, issue 13

PMID: 21685069

Similar Datasets

High Throughput Identification of Antihypertensive Peptides from Fish Proteome Datasets. | S-EPMC6212880 | biostudies-literature

Rapid identification of non-human sequences in high-throughput sequencing datasets. | S-EPMC3324519 | biostudies-literature

SAMNet: a network-based approach to integrate multi-dimensional high throughput datasets. | S-EPMC3501250 | biostudies-literature

Microalgal Metabolic Network Model Refinement through High-Throughput Functional Metabolic Profiling. | S-EPMC4261833 | biostudies-literature

Network-Based Interpretation of Diverse High-Throughput Datasets through the Omics Integrator Software Package. | S-EPMC4838263 | biostudies-literature

Estimating the success of re-identifications in incomplete datasets using generative models. | S-EPMC6650473 | biostudies-literature

Reconciling high-throughput gene essentiality data with metabolic network reconstructions. | S-EPMC6478342 | biostudies-literature

Chemical Graph-Based Transformer Models for Yield Prediction of High-Throughput Cross-Coupling Reaction Datasets. | S-EPMC11447720 | biostudies-literature

Predictive models for anti-tubercular molecules using machine learning on high-throughput biological screening datasets. | S-EPMC3228709 | biostudies-literature