Dataset Information

A confidence predictor for logD using conformal regression and a support-vector machine.


ABSTRACT: Lipophilicity is a major determinant of ADMET properties and of the overall suitability of drug candidates. We have developed large-scale models to predict the water-octanol distribution coefficient (logD) of chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated with a support-vector machine with a linear kernel and conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of [Formula: see text], and the best-performing nonconformity measure gives a median prediction interval of [Formula: see text] log units at 80% confidence and [Formula: see text] log units at 90% confidence. The model is available as an online service via an OpenAPI interface and a web page with a molecular editor. We also publish predicted values at the 90% confidence level for 91 M PubChem structures in RDF format, both for download and through a URI resolver service.
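
The abstract describes prediction intervals produced by conformal regression at a chosen confidence level. As a rough illustration of that idea only, the sketch below implements split (inductive) conformal regression with absolute residuals as the nonconformity measure. It is not the authors' pipeline: an ordinary least-squares line stands in for the linear-kernel support-vector machine, the data are synthetic, and the function names (fit_line, conformal_half_width, predict_interval) and the single descriptor x are assumptions made for the example.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit y ~ slope * x + intercept.

    Assumption: this simple model stands in for the linear-kernel SVM.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    slope, intercept = model
    return slope * x + intercept


def conformal_half_width(scores, confidence):
    """Interval half-width from calibration nonconformity scores.

    Uses the finite-sample rank ceil((n + 1) * confidence); if that rank
    exceeds n, the calibration set is too small and the interval is unbounded.
    """
    ranked = sorted(scores)
    n = len(ranked)
    k = math.ceil((n + 1) * confidence)
    return ranked[k - 1] if k <= n else math.inf


def predict_interval(model, half_width, x):
    centre = predict(model, x)
    return centre - half_width, centre + half_width


# Synthetic data: one hypothetical descriptor x and a noisy "logD" value y.
rng = random.Random(0)
xs = [rng.uniform(-3.0, 3.0) for _ in range(600)]
ys = [0.8 * x + 1.0 + rng.gauss(0.0, 0.5) for x in xs]
data = list(zip(xs, ys))
train, calib, test = data[:200], data[200:400], data[400:]

# Fit on the proper training set, then score the held-out calibration set.
model = fit_line([x for x, _ in train], [y for _, y in train])
scores = [abs(y - predict(model, x)) for x, y in calib]
half_width_80 = conformal_half_width(scores, 0.80)
half_width_90 = conformal_half_width(scores, 0.90)

# Empirical coverage of the 90% intervals on unseen test points.
covered = 0
for x, y in test:
    low, high = predict_interval(model, half_width_90, x)
    covered += low <= y <= high
coverage_90 = covered / len(test)

print("interval width at 80%:", 2 * half_width_80)
print("interval width at 90%:", 2 * half_width_90)
print("empirical coverage at 90%:", coverage_90)
```

With absolute residuals every compound receives the same interval width. Normalized nonconformity measures, which scale the residual by an estimate of prediction difficulty, give compound-specific widths; the abstract's reference to a best-performing nonconformity measure suggests that several such measures were compared.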

SUBMITTER: Lapins M 

PROVIDER: S-EPMC5882484 | biostudies-literature | 2018 Apr

REPOSITORIES: biostudies-literature

Publications

A confidence predictor for logD using conformal regression and a support-vector machine.

Lapins M, Arvidsson S, Lampa S, Berg A, Schaal W, Alvarsson J, Spjuth O

Journal of Cheminformatics, 2018-04-03, issue 1


Similar Datasets

| S-EPMC2367560 | biostudies-literature
| S-EPMC6062547 | biostudies-literature
| S-EPMC3264588 | biostudies-other
| S-EPMC7029895 | biostudies-literature
| S-EPMC6514805 | biostudies-literature
| S-EPMC4395415 | biostudies-other
| S-EPMC4308892 | biostudies-other
| S-EPMC4243330 | biostudies-literature
| S-EPMC6262410 | biostudies-other
| S-EPMC3087589 | biostudies-other