Dataset Information

Blind prediction of cyclohexane-water distribution coefficients from the SAMPL5 challenge.


ABSTRACT: In the recent SAMPL5 challenge, participants submitted predictions for cyclohexane/water distribution coefficients for a set of 53 small molecules. Distribution coefficients (log D) replace the hydration free energies that were a central part of the past five SAMPL challenges. A wide variety of computational methods were represented by the 76 submissions from 18 participating groups. Here, we analyze submissions by a variety of error metrics and provide details for a number of reference calculations we performed. As in the SAMPL4 challenge, we assessed the ability of participants to evaluate not just their statistical uncertainty but also their model uncertainty, that is, how well they can predict the magnitude of their model or force field error for specific predictions. Unfortunately, this remains an area where prediction and analysis need improvement. In SAMPL4, the top-performing submissions achieved a root-mean-squared error (RMSE) around 1.5 kcal/mol. If we anticipate accuracy in log D predictions to be similar to the hydration free energy predictions in SAMPL4, the expected error here would be around 1.54 log units. Only a few submissions had an RMSE below 2.5 log units in their predicted log D values. However, distribution coefficients introduced complexities not present in past SAMPL challenges, including tautomer enumeration, that are likely to be important in predicting biomolecular properties of interest to drug discovery; some decrease in accuracy would therefore be expected. Overall, the SAMPL5 distribution coefficient challenge provided great insight into the importance of modeling a variety of physical effects. We believe these types of measurements will be a promising source of data for future blind challenges, especially in view of the relatively straightforward nature of the experiments and the level of insight provided.
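
The step from an RMSE of about 1.5 kcal/mol to an expected error of about 1.54 log units can be illustrated with a short calculation. The sketch below is an assumption about how such an estimate can be formed, not a statement of the authors' procedure: it treats log D as the difference of two solvation free energies (water and cyclohexane) with independent errors of equal size, and converts the combined error to log units by dividing by RT ln 10. The temperature of 298.15 K and all function names are illustrative choices that do not come from the abstract.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def kcal_to_log_units(energy_kcal, temperature_k=298.15):
    """Convert a free energy in kcal/mol to log10 units via RT ln 10."""
    return energy_kcal / (R_KCAL * temperature_k * math.log(10))


def expected_log_d_error(solvation_rmse_kcal, temperature_k=298.15):
    """Estimate the log D error from a per-solvent solvation RMSE.

    Assumption: the water and cyclohexane errors are independent and
    equal in size, so the error of their difference grows by sqrt(2).
    """
    transfer_rmse_kcal = math.sqrt(2) * solvation_rmse_kcal
    return kcal_to_log_units(transfer_rmse_kcal, temperature_k)


def rmse(predicted, experimental):
    """Root-mean-squared error between two equal-length sequences."""
    if len(predicted) != len(experimental):
        raise ValueError("sequences must have the same length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # 1.5 kcal/mol is the SAMPL4 figure quoted in the abstract.
    print(f"{expected_log_d_error(1.5):.2f} log units")
```

Under these assumptions the result falls between 1.5 and 1.6 log units, the same magnitude as the figure quoted in the abstract; the exact value depends on the temperature and rounding used. The `rmse` helper shows the metric itself and would be applied to paired predicted and experimental log D values.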

SUBMITTER: Bannan CC 

PROVIDER: S-EPMC5209301 | biostudies-literature | 2016 Nov

REPOSITORIES: biostudies-literature

Publications

Blind prediction of cyclohexane-water distribution coefficients from the SAMPL5 challenge.

Caitlin C. Bannan, Kalistyn H. Burley, Michael Chiu, Michael R. Shirts, Michael K. Gilson, David L. Mobley

Journal of Computer-Aided Molecular Design, 2016-09-27, issue 11


Similar Datasets

| S-EPMC5209288 | biostudies-literature
| S-EPMC5206288 | biostudies-literature
| S-EPMC5206257 | biostudies-literature
| S-EPMC5053177 | biostudies-literature
| S-EPMC5206264 | biostudies-literature
| S-EPMC5261860 | biostudies-literature
| S-EPMC8690632 | biostudies-literature
| S-EPMC8295120 | biostudies-literature
| S-EPMC8273033 | biostudies-literature
| S-EPMC7301889 | biostudies-literature