Dataset Information

Machine learning meets pKa.


ABSTRACT: We present a small-molecule pKa prediction tool written entirely in Python. It predicts the macroscopic pKa value and is trained on a literature compilation of monoprotic compounds. Different machine learning models were tested, and random forest performed best in a five-fold cross-validation (mean absolute error = 0.682, root mean squared error = 1.032, correlation coefficient r² = 0.82). We test our model on two external validation sets, where it performs comparably to Marvin and better than a recently published open-source model. Our Python tool and all data are freely available at https://github.com/czodrowskilab/Machine-learning-meets-pKa.
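The three error measures quoted in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names are hypothetical, the fold split is a simple interleaved scheme chosen for illustration, and r² is taken here to be the squared Pearson correlation, which is an assumption about how the reported value was computed.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def squared_pearson_r(y_true, y_pred):
    """Squared Pearson correlation (assumed definition of r^2)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)


def k_fold_indices(n_samples, n_folds=5):
    """Yield (train, test) index lists; each sample is tested exactly once.

    Interleaved assignment is an illustrative choice, not the split
    used in the publication.
    """
    indices = list(range(n_samples))
    for k in range(n_folds):
        test = indices[k::n_folds]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


# Made-up pKa values, for illustration only (not taken from the dataset).
measured = [4.0, 7.0, 9.0]
predicted = [4.5, 6.0, 9.5]
print(mean_absolute_error(measured, predicted))
print(root_mean_squared_error(measured, predicted))
print(squared_pearson_r(measured, predicted))
```

In a cross-validation setting, a model would be fitted on each `train` index list and the three metrics computed on the matching `test` list, then averaged over the five folds.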

SUBMITTER: Baltruschat M 

PROVIDER: S-EPMC7096188 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature

Publications

Machine learning meets pK <sub>a</sub>.

Marcel Baltruschat, Paul Czodrowski

F1000Research, 2020-02-13


Similar Datasets

| S-EPMC6137445 | biostudies-other
| S-EPMC6466966 | biostudies-literature
| S-EPMC5694768 | biostudies-literature
| S-EPMC10787197 | biostudies-literature
| S-EPMC6428342 | biostudies-literature
2013-01-01 | E-GEOD-29210 | biostudies-arrayexpress
| S-EPMC8779018 | biostudies-literature
2022-10-01 | GSE200096 | GEO
2023-06-01 | GSE193400 | GEO
| S-EPMC6403242 | biostudies-other