Dataset Information

Statistical learning of peptide retention behavior in chromatographic separations: a new kernel-based approach for computational proteomics.

ABSTRACT:

Background

High-throughput peptide and protein identification technologies have benefited tremendously from strategies based on tandem mass spectrometry (MS/MS) in combination with database searching algorithms. A major problem with existing methods lies in the significant number of false positive and false negative annotations. So far, standard algorithms for protein identification do not use the information gained from the separation processes usually involved in peptide analysis, such as retention time information, which is readily available from chromatographic separation of the sample. Identification can thus be improved by comparing measured retention times to predicted retention times. Current prediction models are derived from a set of measured test analytes, but they usually require large amounts of training data.

Results

We introduce a new kernel function which can be applied in combination with support vector machines to a wide range of computational proteomics problems. We show the performance of this new approach by applying it to the prediction of peptide adsorption/elution behavior in strong anion-exchange solid-phase extraction (SAX-SPE) and ion-pair reversed-phase high-performance liquid chromatography (IP-RP-HPLC). Furthermore, the predicted retention times are used to improve spectrum identifications by a p-value-based filtering approach. The approach was tested on a number of different datasets and shows excellent performance while requiring only very small training sets (about 40 peptides instead of thousands). Using the retention time predictor in our retention time filter improves the fraction of correctly identified peptide mass spectra significantly.
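
The retention time filter described above can be illustrated with a minimal sketch. The abstract states only that the filter is p-value-based; the sketch assumes, purely for illustration, that the prediction errors (observed minus predicted retention time) are roughly normally distributed, and it is not the authors' implementation. All function names, the `alpha` threshold, and the example peptides and retention times are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_error_model(observed, predicted):
    """Estimate mean and standard deviation of the prediction errors.

    Assumption (not from the source): errors are approximately normal.
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    return mean(residuals), stdev(residuals)


def rt_p_value(observed_rt, predicted_rt, mu, sigma):
    """Two-sided p-value for the deviation under the normal error model."""
    z = (observed_rt - predicted_rt - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2.0))


def filter_identifications(candidates, mu, sigma, alpha=0.05):
    """Keep (peptide, observed_rt, predicted_rt) tuples whose retention
    time deviation is not significant at level alpha."""
    return [
        c for c in candidates
        if rt_p_value(c[1], c[2], mu, sigma) >= alpha
    ]


# Hypothetical training data: observed vs. predicted retention times (min).
mu, sigma = fit_error_model([10.0, 20.0, 30.0, 40.0],
                            [11.0, 19.0, 31.0, 39.0])

# Hypothetical candidate identifications from a database search.
candidates = [
    ("PEPTIDEA", 25.0, 25.2),  # small deviation, plausible
    ("PEPTIDEB", 35.0, 25.0),  # large deviation, likely a false positive
]
kept = filter_identifications(candidates, mu, sigma)
print([c[0] for c in kept])
```

In this sketch the predicted retention times would come from a trained regression model, such as the kernel-based support vector approach the abstract describes; here they are simply given as numbers.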

Conclusion

The proposed kernel function is well suited to the prediction of chromatographic separation in computational proteomics and requires only a limited amount of training data. The performance of this new method is demonstrated by applying it to peptide retention time prediction in IP-RP-HPLC and to the prediction of peptide sample fractionation in SAX-SPE. Finally, we incorporate the predicted chromatographic behavior in a p-value-based filter to improve peptide identifications based on liquid chromatography-tandem mass spectrometry.

SUBMITTER: Pfeifer N

PROVIDER: S-EPMC2254445 | biostudies-literature | 2007 Nov

REPOSITORIES: biostudies-literature


Publications

Statistical learning of peptide retention behavior in chromatographic separations: a new kernel-based approach for computational proteomics.

Pfeifer N, Leinenbach A, Huber CG, Kohlbacher O

BMC Bioinformatics, 2007 Nov 30


PMID: 18053132

Similar Datasets

A statistical learning approach to the modeling of chromatographic retention of oligonucleotides incorporating sequence and secondary structure data.
| S-EPMC1919494 | biostudies-literature

Retention Behavior of Anticancer Thiosemicarbazides in Biomimetic Chromatographic Systems and In Silico Calculations
| S-EPMC10608985 | biostudies-literature

Selection of adequate optimization criteria in chromatographic separations.
| S-EPMC2694924 | biostudies-other

Locus-specific Retention Predictor (LsRP): A Peptide Retention Time Predictor Developed for Precision Proteomics.
| S-EPMC5356008 | biostudies-literature

Exploring the Interactions Between RHAU Peptide and G-Quadruplex Dimers Based on Chromatographic Retention Behaviors
| S-EPMC11676799 | biostudies-literature

High-performance proteomics using nano-, capillary- and micro-flow chromatographic separations
2025-09-03 | PXD062536 | Pride

Nanocapillaries for open tubular chromatographic separations of proteins in femtoliter to picoliter samples.
| S-EPMC2802834 | biostudies-literature

Estimation of low-level components lost through chromatographic separations with finite detection limits.
| S-EPMC7748966 | biostudies-literature

Amphiphilic Block Copolymer PCL-PEG-PCL as Stationary Phase for Capillary Gas Chromatographic Separations.
| S-EPMC6749289 | biostudies-literature

Retention Database for Prediction, Simulation, and Optimization of GC Separations.
| S-EPMC10249385 | biostudies-literature