Dataset Information

Predicting drug-induced liver injury: The importance of data curation.


ABSTRACT: Drug-induced liver injury (DILI) is a major issue for both patients and the pharmaceutical industry due to insufficient means of prevention and prediction. In the current work we present a 2-class classification model for DILI, generated with Random Forest and 2D molecular descriptors on a dataset of 966 compounds. In addition, predicted transporter inhibition profiles were included in the models. The initially compiled dataset of 1773 compounds was reduced via a 2-step approach to 966 compounds, resulting in a significant increase (p-value < 0.05) in model performance. The models were validated via 10-fold cross-validation and against three external test sets of 921, 341 and 96 compounds, respectively. The final model showed an accuracy of 64% (AUC 68%) for 10-fold cross-validation (average of 50 iterations) and comparable values for two test sets (AUC 59%, 71% and 66%, respectively). We also examined whether the predictions of our in-house transporter inhibition models for BSEP, BCRP, P-glycoprotein, and OATP1B1 and 1B3 contributed to improving the DILI model. Finally, the model was implemented with open-source 2D RDKit descriptors so that it could be provided to the community as a Python script.
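
The validation protocol mentioned in the abstract (10-fold cross-validation, with accuracy and AUC averaged over repeated iterations) can be illustrated with a minimal, dependency-free sketch. This is not the authors' script: the data are synthetic, the single-feature nearest-centroid scorer is a toy stand-in for the Random Forest on RDKit descriptors, and every function name below is hypothetical. In practice one would more likely reach for scikit-learn and RDKit; they are left out here only so the sketch runs with the standard library alone.

```python
import random


def auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with similar class proportions (hypothetical helper)."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds


def centroid_scores(train_x, train_y, test_x):
    """Toy stand-in classifier (NOT Random Forest): higher score means closer to the positive mean."""
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == 0]
    mu_pos, mu_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    return [abs(x - mu_neg) - abs(x - mu_pos) for x in test_x]


def repeated_cv(x, y, k=10, repeats=50, seed=0):
    """Mean accuracy and mean AUC over `repeats` rounds of stratified k-fold cross-validation."""
    rng = random.Random(seed)
    accs, aucs = [], []
    for _ in range(repeats):
        for fold in stratified_folds(y, k, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            scores = centroid_scores([x[i] for i in train], [y[i] for i in train],
                                     [x[i] for i in fold])
            truth = [y[i] for i in fold]
            preds = [1 if s > 0 else 0 for s in scores]
            accs.append(sum(p == t for p, t in zip(preds, truth)) / len(truth))
            aucs.append(auc(truth, scores))
    return sum(accs) / len(accs), sum(aucs) / len(aucs)


# Synthetic, balanced two-class data with one feature (assumption: illustration only).
data_rng = random.Random(42)
y = [i % 2 for i in range(200)]
x = [data_rng.gauss(1.0 * label, 1.0) for label in y]

mean_acc, mean_auc = repeated_cv(x, y, k=10, repeats=5)
print(f"mean accuracy: {mean_acc:.2f}, mean AUC: {mean_auc:.2f}")
```

Averaging over several reshuffled rounds, as the abstract describes with 50 iterations, reduces the dependence of the reported accuracy and AUC on any single random fold assignment; `repeats=5` is used above only to keep the run short.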

SUBMITTER: Kotsampasakou E 

PROVIDER: S-EPMC6422282 | biostudies-literature | 2017 Aug

REPOSITORIES: biostudies-literature

Publications

Predicting drug-induced liver injury: The importance of data curation.

Kotsampasakou E, Montanari F, Ecker GF

Toxicology, 2017-06-23


Similar Datasets

| S-EPMC7702310 | biostudies-literature
2014-01-22 | E-GEOD-54257 | biostudies-arrayexpress
| S-EPMC10412691 | biostudies-literature
2022-07-01 | MSV000089782 | MassIVE
2014-01-22 | GSE54257 | GEO
| S-EPMC6706305 | biostudies-literature
| S-EPMC7542839 | biostudies-literature
| S-EPMC8416433 | biostudies-literature
| S-EPMC3681898 | biostudies-literature
| S-EPMC5500850 | biostudies-literature