
Dataset Information


Improving peak detection in high-resolution LC/MS metabolomics data using preexisting knowledge and machine learning approach.


ABSTRACT: Peak detection is a key step in the preprocessing of untargeted metabolomics data generated from high-resolution liquid chromatography-mass spectrometry (LC/MS). The common practice is to use filters with predetermined parameters to select peaks in the LC/MS profile. This rigid approach can cause suboptimal performance when the choice of peak model and parameters does not suit the data characteristics. Here we present a method that learns directly from various data features of the extracted ion chromatograms (EICs) to differentiate true peak regions from noise regions in the LC/MS profile. It utilizes the knowledge of known metabolites, as well as robust machine learning approaches. Unlike currently available methods, this new approach does not assume a parametric peak shape model and allows maximum flexibility. We demonstrate the superiority of the new approach using real data. Because matching to known metabolites entails uncertainties and cannot be considered a gold standard, we also developed a probabilistic receiver-operating characteristic (pROC) approach that can incorporate uncertainties. The new peak detection approach is implemented as part of the apLCMS package available at http://web1.sph.emory.edu/apLCMS/ CONTACT: tyu8@emory.edu. Supplementary data are available at Bioinformatics online.
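
The abstract does not define the pROC computation, so the following is only an illustrative sketch of one plausible reading: each candidate peak carries a probability of being a true match instead of a hard 0/1 label, and the ROC curve is built from expected counts. The function names `soft_roc` and `auc`, the inputs, and the weighting scheme are assumptions made for illustration; they are not taken from the paper or from the apLCMS package.

```python
import numpy as np

def soft_roc(scores, p_true):
    """Expected ROC curve when labels are probabilities, not 0/1 (hypothetical helper).

    scores: detector score per candidate peak (higher = more peak-like).
    p_true: assumed probability that each candidate is a true peak.
    Tied scores are not grouped, and p_true must not be all 0 or all 1.
    """
    scores = np.asarray(scores, dtype=float)
    p_true = np.asarray(p_true, dtype=float)
    order = np.argsort(-scores, kind="stable")  # rank candidates by descending score
    p = p_true[order]
    # Expected true/false positives after accepting the top-k candidates, k = 0..n.
    tp = np.concatenate(([0.0], np.cumsum(p)))
    fp = np.concatenate(([0.0], np.cumsum(1.0 - p)))
    return fp / (1.0 - p).sum(), tp / p.sum()

def auc(fpr, tpr):
    """Trapezoidal area under the curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# With certain labels this reduces to an ordinary ROC curve.
fpr, tpr = soft_roc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
print(auc(fpr, tpr))
```

When every probability is exactly 0 or 1, the sketch coincides with the standard ROC construction; intermediate probabilities split each candidate's weight between the true-positive and false-positive tallies.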

SUBMITTER: Yu T 

PROVIDER: S-EPMC4184266 | biostudies-literature | 2014 Oct

REPOSITORIES: biostudies-literature


Publications

Improving peak detection in high-resolution LC/MS metabolomics data using preexisting knowledge and machine learning approach.

Yu Tianwei, Jones Dean P

Bioinformatics (Oxford, England) 20140707 20



Similar Datasets

S-EPMC8969107 | biostudies-literature
S-EPMC7895495 | biostudies-literature
S-EPMC8642397 | biostudies-literature
S-EPMC6501219 | biostudies-literature
S-EPMC3982975 | biostudies-literature
S-EPMC3624888 | biostudies-literature
S-EPMC3637833 | biostudies-other
S-EPMC8125400 | biostudies-literature
S-EPMC3792097 | biostudies-literature
S-EPMC8878835 | biostudies-literature