Dataset Information

COVER: conformational oversampling as data augmentation for molecules.


ABSTRACT: Training neural networks on small, imbalanced datasets often leads to overfitting and to the minority class being disregarded. Predictive toxicology, however, needs models with a good balance between sensitivity and specificity. In this paper we introduce conformational oversampling as a means to balance and oversample datasets for toxicity prediction. Conformational oversampling enhances a dataset by generating multiple conformations of each molecule. These conformations can be used both to balance and to oversample a dataset, thereby increasing the dataset size without the need for artificial samples. We show that conformational oversampling facilitates the training of neural networks and provides state-of-the-art results on the Tox21 dataset.
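
The balancing idea in the abstract can be illustrated with a minimal sketch. It only computes how many conformations per molecule each class would need so that the classes end up roughly the same size; it does not generate any conformations. The function name `conformers_per_molecule`, the `base_conformers` parameter, and the class counts are assumptions made for illustration and are not taken from the paper, whose exact scheme may differ.

```python
import math


def conformers_per_molecule(class_counts, base_conformers=1):
    """Plan how many conformations to generate per molecule in each class.

    Hypothetical helper: the largest class gets `base_conformers` per
    molecule, and smaller classes get proportionally more, so that the
    number of samples per class is roughly equal afterwards.
    """
    if base_conformers < 1:
        raise ValueError("base_conformers must be at least 1")
    if any(n < 1 for n in class_counts.values()):
        raise ValueError("every class needs at least one molecule")

    # Target size per class: the majority class after oversampling.
    target = max(class_counts.values()) * base_conformers

    # Round up so that no class falls below the target.
    return {label: math.ceil(target / n) for label, n in class_counts.items()}


# Made-up class counts, for illustration only.
counts = {"toxic": 100, "nontoxic": 900}

# base_conformers=1 would only balance; a larger value also oversamples.
plan = conformers_per_molecule(counts, base_conformers=5)

# Resulting number of samples (molecule conformations) per class.
sizes = {label: counts[label] * plan[label] for label in counts}
```

With these made-up counts, each minority-class molecule would contribute 45 conformations and each majority-class molecule 5, giving 4500 samples per class. Every sample is still a conformation of a real molecule, which is the sense in which the abstract says no artificial samples are needed.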

SUBMITTER: Hemmerich J 

PROVIDER: S-EPMC7080709 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC7579987 | biostudies-literature
| S-EPMC8176531 | biostudies-literature
| S-EPMC6275108 | biostudies-literature
| S-EPMC6431826 | biostudies-literature
| S-EPMC5238442 | biostudies-literature
| S-EPMC7180474 | biostudies-literature
| S-EPMC4126733 | biostudies-literature
| S-EPMC8628853 | biostudies-literature
| S-EPMC8136837 | biostudies-literature
| S-EPMC7671995 | biostudies-literature