Dataset Information

Machine Learning Consensus Scoring Improves Performance Across Targets in Structure-Based Virtual Screening.


ABSTRACT: In structure-based virtual screening, compound ranking through a consensus of scores from a variety of docking programs or scoring functions, rather than ranking by scores from a single program, provides better predictive performance and reduces target performance variability. Here we compare traditional consensus scoring methods with a novel, unsupervised gradient boosting approach. We also observed increased score variation among active ligands and developed a statistical mixture model consensus score based on combining score means and variances. To evaluate performance, we used the common performance metrics ROCAUC and EF1 on 21 benchmark targets from DUD-E. Traditional consensus methods, such as taking the mean of quantile normalized docking scores, outperformed individual docking methods and are more robust to target variation. The mixture model and gradient boosting provided further improvements over the traditional consensus methods. These methods are readily applicable to new targets in academic research and overcome the potentially poor performance of using a single docking method on a new target.
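As an illustration of the traditional consensus approach named in the abstract (the mean of quantile-normalized docking scores) and of the EF1 metric, here is a minimal sketch. It is not the authors' implementation. The function names, the convention that lower scores are better, and the tie handling are all assumptions made for this example.

```python
import numpy as np

def quantile_normalize(scores):
    """Quantile-normalize the columns of a (compounds x programs) score matrix.

    Each column is mapped onto a shared reference distribution: the mean of
    the sorted columns at each rank. Ties are broken by input order
    (an assumption made for simplicity).
    """
    scores = np.asarray(scores, dtype=float)
    # Rank of every compound within each column (0 = lowest score).
    ranks = np.argsort(np.argsort(scores, axis=0), axis=0)
    # Reference value for each rank: average of the sorted columns.
    reference = np.sort(scores, axis=0).mean(axis=1)
    return reference[ranks]

def mean_consensus(scores):
    """Consensus score per compound: mean of its quantile-normalized scores."""
    return quantile_normalize(scores).mean(axis=1)

def enrichment_factor(consensus, is_active, fraction=0.01):
    """Enrichment factor in the top `fraction` of the ranked list.

    Assumes lower consensus scores are better (docking-energy convention).
    fraction=0.01 corresponds to EF1.
    """
    consensus = np.asarray(consensus, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    n_top = max(1, int(round(fraction * len(consensus))))
    top = np.argsort(consensus)[:n_top]
    # Active rate in the top slice divided by the active rate overall.
    return is_active[top].mean() / is_active.mean()
```

For example, `mean_consensus(score_matrix)` returns one consensus value per compound, and `enrichment_factor(consensus, labels)` measures how strongly known actives are concentrated in the top 1% of the ranking relative to the library as a whole.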

SUBMITTER: Ericksen SS 

PROVIDER: S-EPMC5872818 | biostudies-literature | 2017 Jul

REPOSITORIES: biostudies-literature

Publications

Machine Learning Consensus Scoring Improves Performance Across Targets in Structure-Based Virtual Screening.

Spencer S. Ericksen, Haozhen Wu, Huikun Zhang, Lauren A. Michael, Michael A. Newton, F. Michael Hoffmann, Scott A. Wildman

Journal of Chemical Information and Modeling, 2017-07-12, issue 7


Similar Datasets

| S-EPMC5404222 | biostudies-literature
| S-EPMC9941253 | biostudies-literature
| S-EPMC9570399 | biostudies-literature
| S-EPMC10207375 | biostudies-literature
| S-EPMC10320911 | biostudies-literature
| S-EPMC10743990 | biostudies-literature
| S-EPMC5774846 | biostudies-literature
| S-EPMC7187304 | biostudies-literature
| S-EPMC10999096 | biostudies-literature
| S-EPMC8865842 | biostudies-literature