Dataset Information

A Fast and Memory-Efficient Spectral Library Search Algorithm Using Locality-Sensitive Hashing.


ABSTRACT: With the accumulation of MS/MS spectra in spectral libraries, spectral library searching has emerged as an important approach for peptide identification in proteomics, complementary to the commonly used protein database searching, in particular for proteomic analyses of well-studied model organisms, such as human. Existing spectral library searching algorithms compare a query MS/MS spectrum with each library spectrum that has a matching precursor mass and charge state, which may become computationally intensive as libraries grow rapidly. Here, the software msSLASH, which implements a fast spectral library searching algorithm based on the Locality-Sensitive Hashing (LSH) technique, is presented. The algorithm first converts the library and query spectra into bit-strings using LSH functions, and then computes the similarity between spectra with highly similar bit-strings. In spectral library searches of large real-world MS/MS datasets, the algorithm significantly reduced the number of spectral comparisons and, as a result, achieved a 2-9X speedup compared with the existing spectral library searching algorithm SpectraST. The algorithm is implemented in C/C++ and is ready to be used in proteomic data analyses.
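
The two-step idea in the abstract (hash spectra to bit-strings, then score only spectra whose bit-strings agree) can be illustrated with a minimal Python sketch. It is not the msSLASH implementation: the abstract does not say which LSH family is used, so random-hyperplane (sign) hashing is assumed here, "highly similar" is simplified to an exact bit-string match, and spectra are assumed to be pre-binned into fixed-length intensity vectors. All function names and the toy peptide labels are hypothetical.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes (assumed LSH family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """Pack one bit per hyperplane: 1 if the vector lies on its positive side."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def cosine(a, b):
    """Cosine similarity between two binned spectra; 0.0 if either is empty."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_index(library, hyperplanes):
    """Group library spectra by bit-string so a query only visits one bucket."""
    index = defaultdict(list)
    for name, vec in library.items():
        index[signature(vec, hyperplanes)].append(name)
    return index


def search(query, library, index, hyperplanes):
    """Score only the candidates sharing the query's bit-string.

    Returns (similarity, name) for the best candidate, or None if the
    bucket is empty.
    """
    candidates = index.get(signature(query, hyperplanes), [])
    scored = [(cosine(query, library[name]), name) for name in candidates]
    return max(scored) if scored else None


# Toy library of pre-binned spectra (hypothetical data, for illustration only).
library = {
    "pepA": [1.0, 0.0, 3.0, 0.0],
    "pepB": [0.0, 2.0, 0.0, 1.0],
}
planes = make_hyperplanes(num_bits=16, dim=4, seed=0)
index = build_index(library, planes)
best = search([1.0, 0.0, 3.0, 0.0], library, index, planes)
print(best)
```

In this sketch the number of bits controls the trade-off: more bits give smaller buckets and fewer comparisons, but a true match is more likely to land in a different bucket. A practical search would typically also restrict candidates by precursor mass and charge state, as the abstract describes for existing algorithms.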

SUBMITTER: Wang L 

PROVIDER: S-EPMC7669687 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC4393915 | biostudies-other
| S-EPMC8340999 | biostudies-literature
| S-EPMC6612865 | biostudies-other
| S-EPMC9301846 | biostudies-literature
| S-EPMC10538361 | biostudies-literature
| S-EPMC10985673 | biostudies-literature
| S-EPMC9587027 | biostudies-literature
| S-EPMC10313348 | biostudies-literature
| S-EPMC6886738 | biostudies-literature
| S-EPMC5773183 | biostudies-literature