Dataset Information

Quantum-mechanical property prediction of solvated drug molecules: what have we learned from a decade of SAMPL blind prediction challenges?

ABSTRACT: Joint academic-industrial projects supporting drug discovery are frequently pursued to deploy and benchmark cutting-edge methodological developments from academia in a real-world industrial environment at different scales. The tasks range from small-molecule physicochemical property assessment through protein-ligand interaction to statistical analyses of biological data. This way, method development and usability both benefit from insights gained at both ends, when the predictiveness and readiness of novel approaches are confirmed, while pharmaceutical drug makers get early access to novel tools for the quality of drug products and the benefit of patients. Quantum-mechanical and simulation methods in particular fall into this group: they require skill and expense in their development as well as significant resources in their application, and are therefore entering industrial use comparatively slowly. Nevertheless, these physics-based methods are becoming more and more useful. Starting with a general overview of these methods, and of quantum-mechanical methods for drug discovery in particular, we review a decade-long and ongoing collaboration between Sanofi and the Kast group focused on the application of the embedded cluster reference interaction site model (EC-RISM), a solvation model for quantum chemistry, to study small-molecule chemistry in the context of joint participation in several SAMPL (Statistical Assessment of Modeling of Proteins and Ligands) blind prediction challenges. Starting with an early application to tautomer equilibria in water (SAMPL2), the methodology was developed further over the years to allow for challenge contributions on predictions of distribution coefficients (SAMPL5) and acidity constants (SAMPL6).
Particular emphasis is put on a frequently overlooked aspect of measuring the quality of models, namely the retrospective analysis of earlier datasets and predictions in light of more recent and advanced developments. We therefore demonstrate the performance of the current methodological state of the art, as developed and optimized for the SAMPL6 pK_a and octanol-water log P challenges, when re-applied to the earlier SAMPL5 cyclohexane-water log D and SAMPL2 tautomer equilibria datasets. Systematic improvement is not found consistently, despite the similarity of the problem class, i.e. protonation reactions and phase distribution. Hence, it is possible to learn about hidden bias in model assessment, as results derived from more elaborate methods do not necessarily improve quantitative agreement. On the one hand, this indicates the role of chance or coincidence in model development, which allows for the identification of systematic error and opportunities for improvement; on the other, it reveals possible sources of experimental uncertainty. These insights are particularly useful for further academia-industry collaborations, as both partners are then enabled to optimize both the computational and experimental settings for data generation.

SUBMITTER: Tielker N

PROVIDER: S-EPMC8018924 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

Analyzing Learned Molecular Representations for Property Prediction.
| S-EPMC6727618 | biostudies-literature

Regio-selectivity prediction with a machine-learned reaction representation and on-the-fly quantum mechanical descriptors.
| S-EPMC8179287 | biostudies-literature

QMugs, quantum mechanical properties of drug-like molecules.
| S-EPMC9174255 | biostudies-literature

Site-Level Bioactivity of Small-Molecules from Deep-Learned Representations of Quantum Chemistry.
| S-EPMC8716316 | biostudies-literature

Mechanistic Insights into Enzyme Catalysis from Explaining Machine-Learned Quantum Mechanical and Molecular Mechanical Minimum Energy Pathways.
| S-EPMC9344433 | biostudies-literature

Evaluating the impact of prediction models: lessons learned, challenges, and recommendations.
| S-EPMC6460651 | biostudies-literature

Data-driven quantum chemical property prediction leveraging 3D conformations with Uni-Mol.
| S-EPMC11333583 | biostudies-literature

Interactions between large molecules pose a puzzle for reference quantum mechanical methods.
| S-EPMC8225865 | biostudies-literature

Simulating the ghost: quantum dynamics of the solvated electron.
| S-EPMC7859219 | biostudies-literature

A quantum chemical molecular dynamics repository of solvated ions.
| S-EPMC9304403 | biostudies-literature