Dataset Information

Deep Neural Networks for Classification of LC-MS Spectral Peaks.


ABSTRACT: Liquid chromatography-mass spectrometry (LC-MS)-based metabolomics has emerged as a valuable tool for biological discovery, capable of assaying thousands of diverse chemical entities in a single biospecimen. Processing of nontargeted LC-MS spectral data requires identification and isolation of true spectral features from the random, false noise peaks that comprise a significant portion of total signals, using inexact peak selection algorithms and time-consuming visual inspection of data. To increase the fidelity and speed of data processing, herein we establish, optimize, and evaluate a machine learning pipeline employing deep neural networks as well as a simpler multiple logistic regression model for classification of spectral features from nontargeted LC-MS metabolomics data. Machine learning-based approaches were found to remove up to 90% of false peaks from complex nontargeted LC-MS data sets without reducing true positive signals and exhibit excellent reproducibility across multiple data sets. Application of machine learning for nontargeted LC-MS-based peak selection provides for robust and scalable peak classification and data filtering, enabling handling and processing of large scale, complex metabolomics data sets.

SUBMITTER: Kantz ED 

PROVIDER: S-EPMC7089603 | biostudies-literature | 2019 Oct

REPOSITORIES: biostudies-literature

Publications

Deep Neural Networks for Classification of LC-MS Spectral Peaks.

Kantz ED, Tiwari S, Watrous JD, Cheng S, Jain M

Analytical Chemistry, 2019-09-19, issue 19


Similar Datasets

2019-08-08 | MSV000084186 | MassIVE
2019-08-08 | MSV000084186 | GNPS
| S-EPMC6010233 | biostudies-other
| S-EPMC7878786 | biostudies-literature
| S-EPMC7407934 | biostudies-literature
| S-EPMC6662992 | biostudies-literature
| S-EPMC9483455 | biostudies-literature
| S-EPMC8248543 | biostudies-literature
| S-EPMC11355344 | biostudies-literature
| S-EPMC6929458 | biostudies-literature