Proteomics

Dataset Information

Extremely fast and accurate open modification spectral library searching of high-resolution mass spectra using feature hashing and graphics processing units


ABSTRACT: Open modification searching (OMS) is a powerful search strategy to identify peptides with any type of modification. OMS works by using a very wide precursor mass window to allow modified spectra to match against their unmodified variants, after which the modification types can be inferred from the corresponding precursor mass differences. A disadvantage of this strategy, however, is its large computational cost, because each query spectrum has to be compared against a multitude of candidate peptides. We have previously introduced the ANN-SoLo tool for fast and accurate open spectral library searching. ANN-SoLo uses approximate nearest neighbor indexing to speed up OMS by selecting only a limited number of the most relevant library spectra to compare to an unknown query spectrum. Here we demonstrate how this candidate selection procedure can be further optimized using graphics processing units. Additionally, we introduce a feature hashing scheme to convert high-resolution spectra to low-dimensional vectors. Based on these algorithmic advances, along with low-level code optimizations, the new version of ANN-SoLo is up to an order of magnitude faster than its initial version. This makes it possible to efficiently perform open searches on a large scale to gain a deeper understanding of the protein modification landscape. We demonstrate the computational efficiency and identification performance of ANN-SoLo based on a large data set of the draft human proteome.
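The two ideas named in the abstract, feature hashing of high-resolution spectra and inferring a modification from the precursor mass difference, can be illustrated with a minimal Python sketch. This is not ANN-SoLo's code: the function names, the bin width, the vector length, and the use of CRC32 as the hash function are all assumptions chosen for illustration only.

```python
import math
import zlib


def hash_spectrum(mz, intensity, bin_width=0.05, dim=800):
    """Hash (m/z, intensity) peaks into a fixed-length, unit-norm vector.

    bin_width and dim are illustrative assumptions, not values taken
    from the dataset record.
    """
    vec = [0.0] * dim
    for m, i in zip(mz, intensity):
        # Assign the peak to a narrow bin suited to high-resolution data.
        bin_index = int(m / bin_width)
        # Map the (potentially huge) bin index to one of `dim` slots.
        # CRC32 is a stand-in hash function, assumed for this sketch.
        slot = zlib.crc32(str(bin_index).encode("ascii")) % dim
        vec[slot] += i
    # L2-normalize so that a dot product between vectors acts as a cosine.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def precursor_mass_difference(query_mz, library_mz, charge):
    """Mass shift (Da) between a query and a library match of equal charge."""
    return (query_mz - library_mz) * charge


# Hypothetical peaks, for illustration only.
vector = hash_spectrum([100.01, 250.50, 300.123], [10.0, 20.0, 5.0])
delta = precursor_mass_difference(508.0, 500.0, charge=2)
```

Hashing keeps the vector length fixed regardless of how fine the m/z bins are, at the price of occasional collisions between unrelated bins. The mass difference computed for a match is what an open search would then interpret as a candidate modification.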

INSTRUMENT(S): TripleTOF 5600

ORGANISM(S): Homo sapiens (human), Saccharomyces cerevisiae (baker's yeast)

TISSUE(S): Cell Culture

SUBMITTER: Wout Bittremieux  

LAB HEAD: Kris Laukens

PROVIDER: PXD013641 | Pride | 2019-12-06

REPOSITORIES: Pride

Dataset's files

File | Type
bf_oms_shifted.mztab | mzTab
bf_std.mztab | mzTab
human_yeast_targetdecoy.splib | Other
iPRG2012.mgf | MGF
iPRG_2012_wiff.zip | Other
(5 of 36 files listed)

Publications

Extremely Fast and Accurate Open Modification Spectral Library Searching of High-Resolution Mass Spectra Using Feature Hashing and Graphics Processing Units.

Wout Bittremieux, Kris Laukens, William Stafford Noble

Journal of Proteome Research, 2019-08-30, issue 10


Similar Datasets

2021-05-25 | PXD009861 | Pride
2021-12-21 | MSV000088598 | MassIVE
2021-12-21 | MSV000088598 | GNPS
2012-11-02 | E-GEOD-33466 | biostudies-arrayexpress
2016-08-04 | E-GEOD-83315 | biostudies-arrayexpress
2016-08-17 | PXD002803 | Pride
2018-10-04 | GSE112623 | GEO
2018-07-11 | PXD008782 | Pride
2018-07-11 | PXD008783 | Pride
2012-09-28 | E-GEOD-40684 | biostudies-arrayexpress