Dataset Information

RT-SVR+q: a strategy for post-Mascot analysis using retention time and q value metric to improve peptide and protein identifications.


ABSTRACT: Shotgun proteomics commonly uses database search engines such as Mascot to identify proteins from tandem mass (MS/MS) spectra. The false discovery rate (FDR) is often used to assess the confidence of peptide identifications. However, the widely accepted FDR threshold of 1% improves accuracy at the cost of sensitivity. This article details a machine learning approach that combines a retention time based support vector regressor (RT-SVR) with q value based statistical analysis to improve peptide and protein identifications with high sensitivity and accuracy. The use of confident peptide identifications as training examples, together with careful feature selection, ensures high R values (>0.900) for all models. Applying the RT-SVR model to Mascot results (p=0.10) increases the sensitivity of peptide identifications. The q value, as a function of the deviation between predicted and experimental RTs (ΔRT), is used to assess the significance of peptide identifications. We demonstrate that peptide and protein identifications increase by up to 89.4% and 83.5%, respectively, at a specified q value of 0.01 when the method is applied to proteomic analysis of the natural killer leukemia cell line (NKL). This study establishes an effective methodology and provides a platform for profiling confident proteomes in more relevant species, as well as for future investigation of accurate protein quantification.
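
As an illustration of the q value step described in the abstract, the sketch below ranks identifications by the absolute deviation between predicted and experimental retention time and assigns each one a q value. It is not the authors' implementation. It assumes that the predicted retention times are already available (for example from a trained regressor), that false identifications are estimated with a target-decoy search, and that the q value is the minimum estimated FDR at which an identification would still be accepted. The function name `q_values_from_delta_rt` and the tuple layout are hypothetical.

```python
from typing import List, Tuple


def q_values_from_delta_rt(psms: List[Tuple[float, float, bool]]) -> List[float]:
    """Assign a q value to each identification, in input order.

    Each item is (predicted_rt, observed_rt, is_decoy). The decoy flag and
    the decoys/targets FDR estimate are assumptions of this sketch.
    """
    # Rank by absolute RT deviation, smallest (most plausible) first.
    order = sorted(range(len(psms)), key=lambda i: abs(psms[i][0] - psms[i][1]))

    fdrs = []
    targets = decoys = 0
    for i in order:
        if psms[i][2]:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR if everything up to this deviation cutoff is accepted.
        fdrs.append(decoys / max(targets, 1))

    # q value: minimum FDR at this cutoff or any looser one (monotone in rank).
    q = [0.0] * len(psms)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q[order[rank]] = min(running_min, 1.0)
    return q
```

Identifications would then be retained when their q value does not exceed the chosen cutoff, such as the 0.01 cited in the abstract; decoy entries are dropped after filtering.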

SUBMITTER: Cao W 

PROVIDER: S-EPMC3225640 | biostudies-literature | 2011 Dec

REPOSITORIES: biostudies-literature

Publications

RT-SVR+q: a strategy for post-Mascot analysis using retention time and q value metric to improve peptide and protein identifications.

Weifeng Cao, Di Ma, Arvinder Kapur, Manish S. Patankar, Yadi Ma, Lingjun Li

Journal of Proteomics, 2011-08-24, issue 2


Similar Datasets

| S-EPMC2720604 | biostudies-other
| S-EPMC6079931 | biostudies-literature
| S-EPMC8163845 | biostudies-literature
| S-EPMC7473422 | biostudies-literature
| S-EPMC5939896 | biostudies-literature
| S-EPMC3076744 | biostudies-literature
| S-EPMC5685437 | biostudies-literature
| S-EPMC7442733 | biostudies-literature
| S-EPMC6119199 | biostudies-other
| S-EPMC6642524 | biostudies-literature