Prediction of Aqueous pKa Values for Guanidine-Containing Compounds Using Ab Initio Gas-Phase Equilibrium Bond Lengths.
ABSTRACT: In this work, we demonstrate the existence of linear relationships between gas-phase equilibrium bond lengths of the guanidine skeleton of 2-(arylamino)imidazolines and their aqueous pKa values. For a training set of 22 compounds, in the most stable conformation of their lowest energy tautomeric form, three bonds were found to exhibit r² and q² values >0.95 and root-mean-squared-error of estimation values ≤0.25 when regressed individually against pKa. The equations describing these one-bond-length linear relationships, in addition to a multiple linear regression model using all three bond lengths, were then used to predict the experimental pKa values of an external test set of a further 27 derivatives. The optimal protocol we derive here shows an overall mean absolute error (MAE) of 0.20 and standard deviation of errors of 0.18 for the test set. Predictions for a second test set of diphenyl-based bis(2-iminoimidazolidines) yielded an MAE of 0.27 and a standard deviation of 0.10. The predictive power of the optimal model is further demonstrated by its ability to correct erroneously reported experimental values. Finally, a previously established guanidine model is recalibrated at a new level of theory, and predictions are made for novel phenylguanidine derivatives, showing an MAE of just 0.29. The protocols established and tested here pass both of Roy's modern, stringent MAE-based criteria for "good" predictivity of a quantitative structure-activity relationship/quantitative structure-property relationship model. Notably, the ab initio bond length high correlation subset protocol developed in this work demonstrates lower MAE values than the Marvin program by ChemAxon for all test sets.
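The statistics quoted in the abstract (r², q², MAE) can be illustrated with a minimal Python sketch of a one-bond-length linear regression. This is not the authors' code or data: the bond lengths and pKa values below are synthetic placeholders, the function names are hypothetical, and q² is assumed here to be the leave-one-out cross-validated coefficient computed as 1 - PRESS/SS_tot, which is one common convention.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

def r_squared(x, y):
    """Coefficient of determination of the fit on the training data."""
    slope, intercept = fit_line(x, y)
    residuals = y - (slope * x + intercept)
    return 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

def q_squared_loo(x, y):
    """Leave-one-out q2 (assumed convention: 1 - PRESS / SS_tot)."""
    press = 0.0
    indices = np.arange(len(x))
    for i in indices:
        keep = indices != i                      # drop compound i
        slope, intercept = fit_line(x[keep], y[keep])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted values."""
    return float(np.mean(np.abs(y_true - y_pred)))

# SYNTHETIC placeholder data, invented for illustration only:
# bond lengths in angstroms and made-up "pKa" values near a straight line.
bond_length = np.array([1.300, 1.305, 1.310, 1.315, 1.320, 1.325])
noise = np.array([0.05, -0.04, 0.03, -0.05, 0.04, -0.03])
pka = 100.0 * bond_length - 120.0 + noise

slope, intercept = fit_line(bond_length, pka)
predicted = slope * bond_length + intercept

print("r2 :", r_squared(bond_length, pka))
print("q2 :", q_squared_loo(bond_length, pka))
print("MAE:", mae(pka, predicted))
```

In a workflow like the one described, the fitted slope and intercept from the training set would be applied to the computed bond lengths of external test compounds, and MAE would be evaluated against their experimental pKa values rather than against the training data as done here.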
SUBMITTER: Caine BA
PROVIDER: S-EPMC6641350 | biostudies-literature | 2018 Apr
REPOSITORIES: biostudies-literature