Dataset Information

Calibrating E-values for MS2 database search methods.

ABSTRACT: The key to mass-spectrometry-based proteomics is peptide identification, which relies on software analysis of tandem mass spectra. Although each search engine has its strengths, combining the strengths of various search engines is not yet realizable, largely due to the lack of a unified statistical framework that is applicable to any method. We have developed a universal scheme for statistical calibration of peptide identifications. The protocol can be used for both de novo approaches and database search methods. We demonstrate the protocol using only the database search methods. Seven methods were calibrated: SEQUEST (v27 rev12), ProbID (v1.0), InsPecT (v20060505), Mascot (v2.1), X!Tandem (v1.0), OMSSA (v2.0) and RAId_DbS. Except for X!Tandem and RAId_DbS, most methods require a rescaling according to the database size searched. We demonstrate that our calibration protocol indeed produces unified statistics, both in terms of the average number of false positives and in terms of the probability for a peptide hit to be a true positive. Although both the calibration protocols and the statistics thus calibrated are universal, the calibration formulas obtained by one laboratory with data collected in either centroid or profile format may not be directly usable by other laboratories. Thus each laboratory is encouraged to calibrate the search methods it intends to use. We also address the importance of using spectrum-specific statistics and possible improvements to the current calibration protocol. The spectra used for statistical (E-value) calibration are freely available upon request.
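
The abstract's criterion of unified statistics "in terms of average number of false positives" can be illustrated with a small check. The sketch below is not the authors' calibration protocol or any of their formulas; it only relies on the standard meaning of an E-value: for a well-calibrated method, the expected number of false hits per spectrum with E-value at or below a cutoff is roughly that cutoff. The function names, the uniform null model, and the relation E = P x (database size) used to simulate data are all assumptions made for illustration.

```python
import random


def false_hits_per_spectrum(evalues_per_spectrum, cutoff):
    """Average number of hits per spectrum whose E-value is <= cutoff.

    `evalues_per_spectrum` is a list with one inner list of E-values per
    spectrum; every hit is assumed to be a known false positive.
    """
    total = sum(1 for hits in evalues_per_spectrum for e in hits if e <= cutoff)
    return total / len(evalues_per_spectrum)


def simulate_null_search(n_spectra, db_size, rng):
    """Simulate ideally calibrated E-values for false hits (hypothetical model).

    Assumption: each of the `db_size` candidates gets a uniform P-value,
    and the E-value is that P-value rescaled by the database size.
    """
    return [
        [rng.random() * db_size for _ in range(db_size)]
        for _ in range(n_spectra)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    evalues = simulate_null_search(n_spectra=2000, db_size=500, rng=rng)
    for cutoff in (0.1, 1.0, 5.0):
        observed = false_hits_per_spectrum(evalues, cutoff)
        # For calibrated E-values, `observed` should be close to `cutoff`.
        print(f"cutoff={cutoff}: observed false hits per spectrum={observed:.3f}")
```

In practice the inner lists would hold E-values reported by a search engine for hits known to be false, and a systematic gap between the observed average and the cutoff would suggest that the reported E-values need recalibration.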

SUBMITTER: Alves G

PROVIDER: S-EPMC2206012 | biostudies-literature | 2007 Nov

REPOSITORIES: biostudies-literature

Publications

Calibrating E-values for MS2 database search methods.

Gelio Alves, Aleksey Y. Ogurtsov, Wells W. Wu, Guanghui Wang, Rong-Fong Shen, Yi-Kuo Yu

Biology Direct, 2007-11-05

PMID: 17983478

Similar Datasets

A hierarchical MS2/MS3 database search algorithm for automated analysis of phosphopeptide tandem mass spectra.
| S-EPMC2775432 | biostudies-literature

Detection of co-eluted peptides using database search methods.
| S-EPMC2483259 | biostudies-literature

Calibrating variant-scoring methods for clinical decision making.
| S-EPMC8023678 | biostudies-literature

Bayesian Methods for Calibrating Health Policy Models: A Tutorial.
| S-EPMC5448142 | biostudies-literature

Global Identification of Protein PTMs in a Single-pass Database Search
2015-07-31 | GSE59956 | GEO

Global Identification of Protein PTMs in a Single-pass Database Search
2015-07-31 | E-GEOD-59956 | biostudies-arrayexpress

Power spectrum and Allan variance methods for calibrating single-molecule video-tracking instruments.
| S-EPMC3306435 | biostudies-other

Efficient Database Search via Tensor Distribution Bucketing
| S-EPMC7206332 | biostudies-literature

WGDB: Wood Gene Database with search interface.
| S-EPMC3916818 | biostudies-literature

ProteoStorm: An Ultrafast Metaproteomics Database Search Framework.
| S-EPMC6231400 | biostudies-literature