Dataset Information

A learned embedding for efficient joint analysis of millions of mass spectra.


ABSTRACT: Computational methods that aim to exploit publicly available mass spectrometry repositories rely primarily on unsupervised clustering of spectra. Here we trained a deep neural network in a supervised fashion on the basis of previous assignments of peptides to spectra. The network, called 'GLEAMS', learns to embed spectra in a low-dimensional space in which spectra generated by the same peptide are close to one another. We applied GLEAMS for large-scale spectrum clustering, detecting groups of unidentified, proximal spectra representing the same peptide. We used these clusters to explore the dark proteome of repeatedly observed yet consistently unidentified mass spectra.
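The idea that spectra from the same peptide lie close together in the embedding space, so that proximal spectra can be grouped, can be illustrated with a minimal sketch. This is not the GLEAMS implementation or its clustering algorithm: the two-dimensional vectors, the `cluster_by_distance` helper, and the `threshold` value are all hypothetical, and simple single-linkage grouping stands in for whatever large-scale method the authors used.

```python
import math
from itertools import combinations


def cluster_by_distance(embeddings, threshold):
    """Group vectors linked by a Euclidean distance below `threshold`.

    Hypothetical helper: single-linkage grouping via union-find,
    used here only to illustrate clustering in an embedding space.
    """
    parent = list(range(len(embeddings)))

    def find(i):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of points that lies closer than the threshold.
    for i, j in combinations(range(len(embeddings)), 2):
        if math.dist(embeddings[i], embeddings[j]) < threshold:
            parent[find(i)] = find(j)

    # Collect point indices by cluster representative.
    clusters = {}
    for i in range(len(embeddings)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Made-up 2-D "embeddings": two tight pairs and one isolated point.
toy_embeddings = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0), (9.0, 0.0)]
clusters = cluster_by_distance(toy_embeddings, threshold=0.5)
print(clusters)
```

In this toy setting, a cluster made up only of unidentified spectra would correspond to the repeatedly observed yet unidentified groups that the abstract describes. The all-pairs comparison is quadratic in the number of spectra, so it would not scale to millions of spectra as written.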

SUBMITTER: Bittremieux W 

PROVIDER: S-EPMC9189069 | biostudies-literature | 2022 Jun

REPOSITORIES: biostudies-literature

Publications

A learned embedding for efficient joint analysis of millions of mass spectra.

Bittremieux W, May DH, Bilmes J, Noble WS

Nature Methods, 2022 May 30, issue 6



Similar Datasets

| S-EPMC2533155 | biostudies-literature
| S-EPMC7419517 | biostudies-literature
| S-EPMC9229217 | biostudies-literature
| S-EPMC4968634 | biostudies-literature
| S-EPMC4593959 | biostudies-literature
| S-EPMC2648731 | biostudies-literature
| S-EPMC6465123 | biostudies-literature
| S-EPMC11340721 | biostudies-literature
| S-EPMC5647224 | biostudies-literature
| S-EPMC3855366 | biostudies-literature