Dataset Information

A semi-supervised boosting SVM for predicting hot spots at protein-protein interfaces.

ABSTRACT:

Background

Hot spots are residues that contribute most of the binding free energy yet account for only a small portion of a protein interface. Experimental approaches to identifying hot spots, such as alanine scanning mutagenesis, are expensive and time-consuming, while computational methods are emerging as effective alternatives.

Results

In this study, we propose a semi-supervised boosting SVM, called sbSVM, to computationally predict hot spots at protein-protein interfaces by combining protein sequence and structure features. Feature selection is performed using random forests to avoid over-fitting. Because positive samples are scarce, our approach iteratively samples useful unlabeled data to boost the performance of hot spot prediction. The method is evaluated on a dataset generated from the ASEdb database for cross-validation and on a dataset from the BID database for independent testing. Furthermore, a balanced dataset with similar numbers of hot spots and non-hot spots (65 and 66, respectively), derived from the first training dataset, is used to further validate our method. All results show that our method yields good sensitivity, accuracy and F1 score compared with existing methods.
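For readers unfamiliar with the reported metrics, the sketch below shows how sensitivity, accuracy and F1 score are computed from confusion-matrix counts. The function name and the counts are hypothetical and chosen only to match the size of the balanced dataset (65 hot spots, 66 non-hot spots); they are not results from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fn: hot spots predicted correctly / missed
    tn/fp: non-hot spots predicted correctly / wrongly flagged as hot spots
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # also called recall
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts (NOT from the paper): 52 + 13 = 65 hot spots,
# 48 + 18 = 66 non-hot spots.
metrics = classification_metrics(tp=52, fp=18, tn=48, fn=13)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and F1 are the more informative figures when positive samples are scarce, since accuracy alone can look high for a predictor that labels almost everything as a non-hot spot.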

Conclusion

Our method boosts hot spot prediction performance by using unlabeled data to overcome the shortage of available training data. Experimental results show that our approach is more effective than traditional supervised algorithms and major existing hot spot prediction methods.
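The general idea of drawing on unlabeled data can be illustrated with a generic self-training loop. The sketch below is not sbSVM: it swaps in a nearest-centroid classifier for the SVM, omits the boosting and random-forest feature selection steps, and uses made-up two-dimensional points. All names (`self_train`, `centroid`, `per_round`) are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def self_train(labeled, unlabeled, rounds=3, per_round=1):
    """Generic self-training with a nearest-centroid stand-in classifier.

    labeled:   list of (features, label) pairs, label in {0, 1};
               must contain at least one example of each class
    unlabeled: list of feature vectors
    Returns the enlarged labeled set and the samples left unlabeled.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        c0 = centroid([x for x, y in labeled if y == 0])
        c1 = centroid([x for x, y in labeled if y == 1])
        scored = []
        for x in pool:
            d0, d1 = math.dist(x, c0), math.dist(x, c1)
            # Confidence: how much closer the sample is to one centroid.
            scored.append((abs(d0 - d1), x, 1 if d1 < d0 else 0))
        scored.sort(key=lambda item: item[0], reverse=True)
        # Move the most confidently predicted samples into the labeled set.
        for _, x, y in scored[:per_round]:
            labeled.append((x, y))
            pool.remove(x)
    return labeled, pool


# Made-up data: one labeled example per class and four unlabeled samples.
seed = [([0.0, 0.0], 0), ([1.0, 1.0], 1)]
candidates = [[0.1, 0.2], [0.9, 0.8], [0.5, 0.6], [0.2, 0.1]]
labeled_out, remaining = self_train(seed, candidates, rounds=4, per_round=1)
for features, label in labeled_out:
    print(features, label)
```

Adding only the most confident predictions in each round limits the risk that early mislabeled samples degrade later rounds, which is the main hazard of any self-training scheme.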

SUBMITTER: Xu B

PROVIDER: S-EPMC3521187 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

A semi-supervised boosting SVM for predicting hot spots at protein-protein interfaces.

Xu Bin, Wei Xiaoming, Deng Lei, Guan Jihong, Zhou Shuigeng

BMC Systems Biology, 2012-12-12

PMID: 23282146

Similar Datasets

Prediction of hot spots in protein-DNA binding interfaces based on supervised isometric feature mapping and extreme gradient boosting.
| S-EPMC7495874 | biostudies-literature

Enhanced Prediction of Hot Spots at Protein-Protein Interfaces Using Extreme Gradient Boosting.
| S-EPMC6155324 | biostudies-literature

XGBPRH: Prediction of Binding Hot Spots at Protein-RNA Interfaces Utilizing Extreme Gradient Boosting.
| S-EPMC6471955 | biostudies-literature

Structural conservation of druggable hot spots in protein-protein interfaces.
| S-EPMC3158149 | biostudies-literature

HotSprint: database of computational hot spots in protein interfaces.
| S-EPMC2238999 | biostudies-literature

Relationship between hot spot residues and ligand binding hot spots in protein-protein interfaces.
| S-EPMC3623692 | biostudies-literature

Predicting hot spots in protein interfaces based on protrusion index, pseudo hydrophobicity and electron-ion interaction pseudopotential features.
| S-EPMC4951271 | biostudies-literature

Boosting prediction performance of protein-protein interaction hot spots by using structural neighborhood properties.
| S-EPMC3822376 | biostudies-literature

PCRPi-DB: a database of computationally annotated hot spots in protein interfaces.
| S-EPMC3013674 | biostudies-literature

PCRPi: Presaging Critical Residues in Protein interfaces, a new computational tool to chart hot spots in protein interfaces.
| S-EPMC2847225 | biostudies-literature