
Dataset Information


Towards a generalized toxicity prediction model for oxide nanomaterials using integrated data from different sources.


ABSTRACT: This study presents a generalized toxicity classification model for 7 different oxide nanomaterials. A data set extracted from multiple literature sources and screened by quality scores based on physicochemical properties was used for model development. Preprocessing techniques such as the synthetic minority over-sampling technique (SMOTE) were applied to address the class imbalance in the data set. Classification models were then developed using four algorithms (generalized linear model, support vector machine, random forest, and neural network), and their performances were compared to find the best-performing preprocessing methods and algorithms. The neural network model built on the balanced data set showed the best predictive performance, and its applicability domain was defined using the k-nearest neighbours algorithm. An analysis of relative attribute importance for this neural network model identified dose, formation enthalpy, exposure time, and hydrodynamic size as the four most important attributes. Because the presented model predicts nanomaterial toxicity while taking various experimental conditions into account, it has a broader and more general applicability domain than existing quantitative structure-activity relationship (QSAR) models.
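
The two preprocessing and validation ideas named in the abstract (minority over-sampling and a k-nearest-neighbours applicability domain) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation: the function names, the two-feature toy samples, the mean-distance criterion, and the threshold value are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and one of their k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def in_domain(query, training, k=3, threshold=1.0):
    """Assumed criterion: a query is inside the applicability domain when its
    mean distance to its k nearest training points is at most `threshold`."""
    nearest = sorted(math.dist(query, p) for p in training)[:k]
    return sum(nearest) / len(nearest) <= threshold


# Hypothetical, already-scaled two-feature samples (e.g. dose, exposure time)
toxic = [(0.9, 0.8), (0.8, 0.9), (1.0, 1.0), (0.85, 0.95)]
nontoxic = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3), (0.3, 0.2),
            (0.25, 0.15), (0.05, 0.1), (0.2, 0.25), (0.1, 0.05)]

# Over-sample the minority class until both classes have the same size
balanced_toxic = toxic + smote(toxic, n_new=len(nontoxic) - len(toxic))
training = balanced_toxic + nontoxic
```

Calling `in_domain((0.9, 0.9), training)` checks a query close to the training data, while a far-away query such as `(5.0, 5.0)` falls outside the assumed threshold. In practice a library implementation of SMOTE and a tuned distance threshold would replace these toy functions.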

SUBMITTER: Choi JS 

PROVIDER: S-EPMC5904177 | biostudies-literature | 2018 Apr

REPOSITORIES: biostudies-literature


Publications

Towards a generalized toxicity prediction model for oxide nanomaterials using integrated data from different sources.

Choi Jang-Sik, Ha My Kieu, Trinh Tung Xuan, Yoon Tae Hyun, Byun Hyung-Gi

Scientific Reports, 2018-04-17, issue 1



Similar Datasets

| S-EPMC5816655 | biostudies-literature
| S-EPMC4440714 | biostudies-literature
| S-EPMC6639370 | biostudies-literature
| S-EPMC6379402 | biostudies-literature
| S-EPMC3238178 | biostudies-literature
| S-EPMC8305450 | biostudies-literature
| S-EPMC7435363 | biostudies-literature
| S-EPMC7504913 | biostudies-literature
| S-EPMC7466583 | biostudies-literature
| S-EPMC2292694 | biostudies-literature