Dataset Information

A machine learning-based method to improve docking scoring functions and its application to drug repurposing.

ABSTRACT: Docking scoring functions are notoriously weak predictors of binding affinity. They typically assign a common set of weights to the individual energy terms that contribute to the overall energy score; however, these weights should be gene family dependent. In addition, they incorrectly assume that individual interactions contribute toward the total binding affinity in an additive manner. In reality, noncovalent interactions often depend on one another in a nonlinear manner. In this paper, we show how the use of support vector machines (SVMs), trained by associating sets of individual energy terms retrieved from molecular docking with the known binding affinity of each compound from high-throughput screening experiments, can be used to improve the correlation between known binding affinities and those predicted by the docking program eHiTS. We construct two prediction models: a regression model trained using IC50 values from BindingDB, and a classification model trained using active and decoy compounds from the Directory of Useful Decoys (DUD). Moreover, to address the issue of overrepresentation of negative data in high-throughput screening data sets, we have designed a multiple-planar SVM training procedure for the classification model. The increased performance that both SVMs give when compared with the original eHiTS scoring function highlights the potential for using nonlinear methods when deriving overall energy scores from their individual components. We apply the above methodology to train a new scoring function for direct inhibitors of Mycobacterium tuberculosis (M.tb) InhA. By combining ligand binding site comparison with the new scoring function, we propose that phosphodiesterase inhibitors can potentially be repurposed to target M.tb InhA. Our methodology may be applied to other gene families for which target structures and activity data are available, as demonstrated in the work presented here.
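
ILLUSTRATION: The abstract does not spell out the multiple-planar training procedure, so the following is only a rough sketch of one plausible reading, not the authors' implementation. It assumes the over-represented decoys are split into subsets about the size of the active set, one separating plane is trained per subset against all actives, and the planes' decision values are averaged. A plain perceptron-with-margin (hinge-loss updates, no kernel, no regularisation) stands in for the SVM so that the example needs only the Python standard library; it does not capture the nonlinear behaviour the paper emphasises. All function names and the two-term energy vectors are hypothetical.

```python
# Illustrative sketch only; not the authors' implementation.
# Assumption: "multiple-planar" is read as one separating plane per decoy subset.

def train_margin_classifier(X, y, lr=1.0, max_epochs=100):
    """Perceptron-with-margin: update on every point whose margin is below 1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(max_epochs):
        updated = False
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
                updated = True
        if not updated:  # every training point now has margin >= 1
            break
    return w, b


def decision_value(model, x):
    """Signed score of one plane for feature vector x."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_balanced_ensemble(actives, decoys):
    """Train one plane per decoy chunk, each chunk about the size of the active set."""
    chunk = max(1, len(actives))
    models = []
    for start in range(0, len(decoys), chunk):
        subset = decoys[start:start + chunk]
        X = actives + subset
        y = [1] * len(actives) + [-1] * len(subset)
        models.append(train_margin_classifier(X, y))
    return models


def predict(models, x):
    """Average the decision values of all planes; a positive mean means 'active'."""
    mean = sum(decision_value(m, x) for m in models) / len(models)
    return 1 if mean > 0 else -1


# Hypothetical, pre-normalised two-term energy vectors (made-up numbers).
actives = [[1.0, 0.8], [0.9, 1.2], [1.1, 1.0]]
decoys = [[-1.0, -0.9], [-0.8, -1.1], [-1.2, -1.0],
          [-0.9, -0.8], [-1.1, -1.2], [-1.0, -1.0],
          [-0.7, -1.3], [-1.3, -0.7], [-0.9, -1.1]]

models = train_balanced_ensemble(actives, decoys)
print(len(models))
print(predict(models, [1.0, 1.0]), predict(models, [-1.0, -1.0]))
```

Each plane sees a balanced training set, so no single model is dominated by the decoys. In practice a kernel SVM from an established library would replace the stand-in learner, with the individual docking energy terms as the feature vector.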

SUBMITTER: Kinnings SL

PROVIDER: S-EPMC3076728 | biostudies-literature | 2011 Feb

REPOSITORIES: biostudies-literature

Publications

A machine learning-based method to improve docking scoring functions and its application to drug repurposing.

Sarah L. Kinnings, Nina Liu, Peter J. Tonge, Richard M. Jackson, Lei Xie, Philip E. Bourne

Journal of Chemical Information and Modeling, 2011-02-03, issue 2

PMID: 21291174

Similar Datasets

MetaScore: A Novel Machine-Learning-Based Approach to Improve Traditional Scoring Functions for Scoring Protein-Protein Docking Conformations.
S-EPMC9855734 | biostudies-literature

Delta Machine Learning to Improve Scoring-Ranking-Screening Performances of Protein-Ligand Scoring Functions.
S-EPMC9197983 | biostudies-literature

New machine learning and physics-based scoring functions for drug discovery.
S-EPMC7862620 | biostudies-literature

Machine Learning Identifies Candidates for Drug Repurposing in Alzheimer's Disease
2021-01-14 | GSE164788 | GEO

Using machine learning to improve ensemble docking for drug discovery.
S-EPMC7815257 | biostudies-literature

Machine Learning Scoring Functions for Drug Discovery from Experimental and Computer-Generated Protein-Ligand Structures: Towards Per-Target Scoring Functions.
S-EPMC9966217 | biostudies-literature

Using informative features in machine learning based method for COVID-19 drug repurposing.
S-EPMC8451172 | biostudies-literature

Drug Repurposing against KRAS Mutant G12C: A Machine Learning, Molecular Docking, and Molecular Dynamics Study.
S-EPMC9821013 | biostudies-literature

Application of the 4D fingerprint method with a robust scoring function for scaffold-hopping and drug repurposing strategies.
S-EPMC4210175 | biostudies-literature

Robustly interrogating machine learning-based scoring functions: what are they learning?
S-EPMC11821266 | biostudies-literature