Dataset Information

A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking.

ABSTRACT:

Motivation

Accurately predicting the binding affinities of large sets of diverse protein-ligand complexes is an extremely challenging task. The scoring functions that attempt such computational prediction are essential for analysing the outputs of molecular docking, which in turn is an important technique for drug discovery, chemical biology and structural biology. Each scoring function assumes a predetermined theory-inspired functional form for the relationship between the variables that characterize the complex (which also include parameters fitted to experimental or simulation data) and its predicted binding affinity. The inherent problem of this rigid approach is that it leads to poor predictivity for those complexes that do not conform to the modelling assumptions. Moreover, resampling strategies, such as cross-validation or bootstrapping, are still not systematically used to guard against the overfitting of calibration data in parameter estimation for scoring functions.

Results

We propose a novel scoring function (RF-Score) that circumvents the need for problematic modelling assumptions via non-parametric machine learning. In particular, Random Forest was used to implicitly capture binding effects that are hard to model explicitly. RF-Score is compared with the state of the art on the demanding PDBbind benchmark. Results show that RF-Score is a very competitive scoring function. Importantly, RF-Score's performance was shown to improve dramatically with training set size and hence the future availability of more high-quality structural and interaction data is expected to lead to improved versions of RF-Score.
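The idea of letting a Random Forest learn the feature-to-affinity relationship, rather than fixing a functional form in advance, can be illustrated with a small sketch. This is not the RF-Score implementation: the three integer features, the synthetic "affinity" target, the tree depth and every function name below are assumptions made for illustration, and no PDBbind data is involved. The sketch uses only the Python standard library.

```python
import random
from math import sqrt
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (the split cost)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, depth, n_try, rng):
    """Grow one regression tree; each split looks at n_try random features."""
    if depth == 0 or len(rows) < 4 or len(set(targets)) == 1:
        return mean(targets)  # leaf: mean target of the samples that reach it
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):
        # Skip the smallest value so both sides of every split are non-empty.
        for threshold in sorted({r[f] for r in rows})[1:]:
            left = [i for i, r in enumerate(rows) if r[f] < threshold]
            right = [i for i, r in enumerate(rows) if r[f] >= threshold]
            cost = sse([targets[i] for i in left]) + sse([targets[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, f, threshold, left, right)
    if best is None:  # every sampled feature was constant in this node
        return mean(targets)
    _, f, threshold, left, right = best
    children = [
        build_tree([rows[i] for i in side], [targets[i] for i in side],
                   depth - 1, n_try, rng)
        for side in (left, right)
    ]
    return (f, threshold, children[0], children[1])


def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        f, threshold, low, high = tree
        tree = low if row[f] < threshold else high
    return tree


def fit_forest(rows, targets, n_trees=25, depth=5, n_try=2, seed=0):
    """Train each tree on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], depth, n_try, rng))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean(predict_tree(t, row) for t in forest)


# Synthetic stand-in data (hypothetical): three integer descriptors per complex
# and a nonlinear target that a fixed linear form would not capture exactly.
data_rng = random.Random(1)
X = [[data_rng.randint(0, 20) for _ in range(3)] for _ in range(200)]
y = [0.3 * a + (2.0 if b > 10 else 0.0) + data_rng.gauss(0, 0.1) for a, b, _ in X]
train_X, train_y, test_X, test_y = X[:150], y[:150], X[150:], y[150:]

forest = fit_forest(train_X, train_y)
rmse = sqrt(mean((predict_forest(forest, r) - t) ** 2 for r, t in zip(test_X, test_y)))
baseline_rmse = sqrt(mean((mean(train_y) - t) ** 2 for t in test_y))
print(f"forest RMSE: {rmse:.3f}  mean-only baseline RMSE: {baseline_rmse:.3f}")
```

The forest is evaluated on held-out samples and compared with a baseline that always predicts the training mean. Each tree is fitted to a bootstrap resample and the predictions are averaged, which is the general Random Forest recipe; how RF-Score itself encodes a complex as features is described in the paper, not in this record.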

Contact

pedro.ballester@ebi.ac.uk; jbom@st-andrews.ac.uk

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Ballester PJ

PROVIDER: S-EPMC3524828 | biostudies-literature | 2010 May

REPOSITORIES: biostudies-literature

Publications

A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking.

Ballester PJ, Mitchell JBO

Bioinformatics (Oxford, England), 17 March 2010, issue 9

PMID: 20236947

Similar Datasets

A machine learning approach towards the prediction of protein-ligand binding affinity based on fundamental molecular properties. | S-EPMC9079328 | biostudies-literature

Predicting the impacts of mutations on protein-ligand binding affinity based on molecular dynamics simulations and machine learning methods. | S-EPMC7052406 | biostudies-literature

3D-RISM-AI: A Machine Learning Approach to Predict Protein-Ligand Binding Affinity Using 3D-RISM. | S-EPMC9421647 | biostudies-literature

Applied machine learning for predicting the lanthanide-ligand binding affinities. | S-EPMC7459320 | biostudies-literature

DEELIG: A Deep Learning Approach to Predict Protein-Ligand Binding Affinity. | S-EPMC8274096 | biostudies-literature

Predicting opioid receptor binding affinity of pharmacologically unclassified designer substances using molecular docking. | S-EPMC5967713 | biostudies-literature

Prediction of protein-ligand binding affinity from sequencing data with interpretable machine learning. | S-EPMC9546773 | biostudies-literature

Machine learning on ligand-residue interaction profiles to significantly improve binding affinity prediction. | S-EPMC8425425 | biostudies-literature

Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction. | S-EPMC8985993 | biostudies-literature

Evaluation of Docking Machine Learning and Molecular Dynamics Methodologies for DNA-Ligand Systems. | S-EPMC8874395 | biostudies-literature