
Dataset Information


XGBPRH: Prediction of Binding Hot Spots at Protein-RNA Interfaces Utilizing Extreme Gradient Boosting.


ABSTRACT: Hot spot residues in protein-RNA complexes are vitally important for investigating the underlying molecular recognition mechanism. Accurately identifying protein-RNA binding hot spots is critical for drug design and protein engineering. Although some progress has been made using various available features and a range of machine learning approaches, these methods are still in their infancy. In this paper, we present a new computational method named XGBPRH, which is based on the eXtreme Gradient Boosting (XGBoost) algorithm and can effectively predict hot spot residues at protein-RNA interfaces using an optimal set of properties. First, we download 47 protein-RNA complexes and calculate a total of 156 sequence, structure, exposure, and network features. Next, we adopt a two-step feature selection algorithm to select an optimal combination of 6 features from these 156 features. Compared with state-of-the-art approaches, XGBPRH achieves better performance, with an area under the ROC curve (AUC) of 0.817 and an F1-score of 0.802 on the independent test set. We also apply XGBPRH to two case studies. The results demonstrate that the method can effectively identify novel energy hot spots.
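The abstract reports two evaluation metrics, AUC and F1-score. The following is a minimal sketch of how these two quantities are commonly computed from classifier scores; it is not the authors' code. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and do not come from the XGBPRH dataset.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """F1 = harmonic mean of precision and recall at a fixed threshold.

    The 0.5 default is an assumption; the source does not state a threshold.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = hot spot, 0 = non-hot spot.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]

print("AUC:", roc_auc(labels, scores))
print("F1 :", f1_score(labels, scores))
```

AUC is threshold-free and depends only on how residues are ranked, whereas F1 depends on the chosen cutoff, which is why the two are usually reported together.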

SUBMITTER: Deng L 

PROVIDER: S-EPMC6471955 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature


Publications

XGBPRH: Prediction of Binding Hot Spots at Protein-RNA Interfaces Utilizing Extreme Gradient Boosting.

Deng Lei, Sui Yuanchao, Zhang Jingpu

Genes, 2019-03-21, issue 3



Similar Datasets

| S-EPMC6155324 | biostudies-literature
| S-EPMC7495874 | biostudies-literature
| S-EPMC3521187 | biostudies-literature
| S-EPMC8687429 | biostudies-literature
| S-EPMC5849212 | biostudies-literature
| S-EPMC3623692 | biostudies-literature
| S-EPMC8837382 | biostudies-literature
| S-EPMC10074722 | biostudies-literature
| S-EPMC7724862 | biostudies-literature
| S-EPMC3822376 | biostudies-literature