Metabolomics

Dataset Information

Optimization of Imputation Strategies for High-Resolution Gas Chromatography-Mass Spectrometry (HR GC-MS) Metabolomics Data


ABSTRACT: Gas chromatography-coupled mass spectrometry (GC-MS) has been used in biomedical research to analyze volatile, non-polar, and polar metabolites in a wide array of sample types. Despite advances in technology, missing values are still common in metabolomics datasets and must be properly handled. We evaluated the performance of ten commonly used missing value imputation methods with metabolites analyzed on an HR GC-MS instrument. By introducing missing values into the complete (i.e., data without any missing values) NIST plasma dataset we demonstrate that Random Forest (RF), Glmnet Ridge Regression (GRR), and Bayesian Principal Component Analysis (BPCA) shared the lowest Root Mean Squared Error (RMSE) in technical replicate data. Further examination of these three methods in data from baboon plasma and liver samples demonstrated they all maintained high accuracy. Overall, our analysis suggests that any of the three imputation methods can be applied effectively to untargeted metabolomics datasets with high accuracy. However, it is important to note that imputation will alter the correlation structure of the dataset, and bias downstream regression coefficients and p-values.
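
The evaluation described in the abstract (remove values from a complete matrix, impute them, then score the imputed entries by RMSE) can be sketched roughly as follows. This is an illustrative outline only, not the authors' pipeline: the data are synthetic, the function names are hypothetical, values are removed completely at random (the abstract does not state the missingness mechanism), and a simple column-mean imputer stands in for RF, GRR, or BPCA.

```python
import numpy as np

def mask_at_random(X, frac, rng):
    """Return a copy of X with roughly `frac` of entries set to NaN, plus the mask."""
    mask = rng.random(X.shape) < frac
    X_missing = X.astype(float).copy()
    X_missing[mask] = np.nan
    return X_missing, mask

def impute_column_mean(X_missing):
    """Stand-in imputer (assumption): fill each NaN with its column (metabolite) mean."""
    col_means = np.nanmean(X_missing, axis=0)
    X_imputed = X_missing.copy()
    rows, cols = np.where(np.isnan(X_imputed))
    X_imputed[rows, cols] = col_means[cols]
    return X_imputed

def rmse_on_masked(X_true, X_imputed, mask):
    """RMSE computed only over the entries that were artificially removed."""
    diff = X_true[mask] - X_imputed[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

rng = np.random.default_rng(0)
# Synthetic "complete" matrix: 30 samples x 12 metabolites (not real data).
X_complete = rng.lognormal(mean=2.0, sigma=0.5, size=(30, 12))

X_missing, mask = mask_at_random(X_complete, frac=0.1, rng=rng)
X_imputed = impute_column_mean(X_missing)
score = rmse_on_masked(X_complete, X_imputed, mask)
print(f"RMSE on masked entries: {score:.3f}")
```

Comparing methods would amount to swapping `impute_column_mean` for another imputer and repeating the masking over several random seeds and missingness fractions.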

ORGANISM(S): Papio hamadryas (baboon)

TISSUE(S): Liver, Blood

SUBMITTER: Isaac Ampong  

PROVIDER: ST002132 | MetabolomicsWorkbench | Fri Apr 01 00:00:00 BST 2022

REPOSITORIES: MetabolomicsWorkbench

Similar Datasets

2010-05-19 | E-GEOD-15370 | biostudies-arrayexpress
2021-05-07 | PXD023012 | Pride
2021-05-07 | PXD022996 | Pride
2021-05-07 | PXD023040 | Pride
2009-11-24 | GSE15370 | GEO
2024-07-23 | MODEL2407230001 | BioModels
2023-01-24 | PXD025439 | Pride
2023-11-21 | PXD041421 | Pride
2023-11-21 | PXD041391 | Pride
2022-05-23 | PXD027467 | Pride