Dataset Information

Improving scoring-docking-screening powers of protein-ligand scoring functions using random forest.

ABSTRACT: The development of new protein-ligand scoring functions using machine learning algorithms, such as random forest, has been of significant interest. By efficiently utilizing expanded feature sets and a large set of experimental data, random forest based scoring functions (RFbScore) can achieve better correlations to experimental protein-ligand binding data with known crystal structures; however, more extensive tests indicate that such enhancement in scoring power comes with significant under-performance in docking and screening power tests compared to traditional scoring functions. In this work, to improve scoring-docking-screening powers of protein-ligand docking functions simultaneously, we have introduced a ΔvinaRF parameterization and feature selection framework based on random forest. Our developed scoring function ΔvinaRF20, which employs 20 descriptors in addition to the AutoDock Vina score, can achieve superior performance in all power tests of both CASF-2013 and CASF-2007 benchmarks compared to classical scoring functions. The ΔvinaRF20 scoring function and its code are freely available on the web at: https://www.nyu.edu/projects/yzhang/DeltaVina. © 2016 Wiley Periodicals, Inc.
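
The abstract describes a score built from the AutoDock Vina score plus descriptor-based information learned with random forest. One way to read this is as a Δ-learning setup: a model is trained on the residual between experiment and a baseline score, and its prediction is added back to the baseline. The sketch below illustrates only that general idea and is not the authors' implementation. It assumes the additive form, uses a bagged ensemble of one-split regression stumps as a simplified stand-in for a full random forest, and all names (`fit_bagged_stumps`, `delta_score`) and all numbers are hypothetical toy values.

```python
import random
from statistics import mean


def fit_stump(X, y):
    """Fit a one-split regression tree (stump) by minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = mean(left), mean(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:
        # No valid split (all sampled rows identical): predict a constant.
        m = mean(y)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    j, t, left_mean, right_mean = stump
    return left_mean if row[j] <= t else right_mean


def fit_bagged_stumps(X, y, n_trees=50, seed=0):
    """Train each stump on a bootstrap resample (simplified random-forest stand-in)."""
    rng = random.Random(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return trees


def predict_correction(trees, row):
    """Average the per-tree predictions of the residual."""
    return mean(predict_stump(t, row) for t in trees)


def delta_score(baseline, trees, row):
    """Assumed additive form: baseline score plus learned correction."""
    return baseline + predict_correction(trees, row)


# Toy data (hypothetical): two descriptors per complex, a baseline score,
# and an experimental value, all on the same made-up affinity scale.
descriptors = [[0.0, 5.0], [0.1, 3.0], [0.9, 4.0], [1.0, 2.0]]
baseline = [6.0, 6.5, 5.0, 5.5]
experiment = [5.0, 5.5, 6.0, 6.5]

# The ensemble is trained on the residual, not on the experimental value itself.
residuals = [e - b for e, b in zip(experiment, baseline)]
trees = fit_bagged_stumps(descriptors, residuals)

for row, b in zip(descriptors, baseline):
    print(row, round(delta_score(b, trees, row), 2))
```

Training on the residual keeps the baseline's behavior as the starting point, so the learned term only has to model where the baseline deviates from experiment.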

SUBMITTER: Wang C

PROVIDER: S-EPMC5140681 | biostudies-literature | 2017 Jan

REPOSITORIES: biostudies-literature

Publications

Improving scoring-docking-screening powers of protein-ligand scoring functions using random forest.

Wang Cheng, Zhang Yingkai

Journal of Computational Chemistry, 2016-11-17, issue 3

PMID: 27859414

Similar Datasets

A generalized protein-ligand scoring framework with balanced scoring, docking, ranking and screening powers. | S-EPMC10395315 | biostudies-literature

Improving protein-ligand docking and screening accuracies by incorporating a scoring function correction term. | S-EPMC9116214 | biostudies-literature

Improving Docking Power for Short Peptides Using Random Forest. | S-EPMC8543977 | biostudies-literature

Delta Machine Learning to Improve Scoring-Ranking-Screening Performances of Protein-Ligand Scoring Functions. | S-EPMC9197983 | biostudies-literature

A random forest classifier for protein-protein docking models. | S-EPMC9710594 | biostudies-literature

Aromatic interactions at the ligand-protein interface: Implications for the development of docking scoring functions. | S-EPMC5818208 | biostudies-literature

AGL-Score: Algebraic Graph Learning Score for Protein-Ligand Binding Scoring, Ranking, Docking, and Screening. | S-EPMC6664294 | biostudies-literature

Random forest classifier improving phenylketonuria screening performance in two Chinese populations. | S-EPMC9592754 | biostudies-literature

An interaction-motif-based scoring function for protein-ligand docking. | S-EPMC3098071 | biostudies-literature

Lin_F9: A Linear Empirical Scoring Function for Protein-Ligand Docking. | S-EPMC8478859 | biostudies-literature