COVER: conformational oversampling as data augmentation for molecules.
ABSTRACT: Training neural networks on small, imbalanced datasets often leads to overfitting and to the minority class being disregarded. Predictive toxicology, however, requires models with a good balance between sensitivity and specificity. In this paper we introduce conformational oversampling as a means to balance and oversample datasets for toxicity prediction. Conformational oversampling enhances a dataset by generating multiple conformations of a molecule. These conformations can be used to balance as well as oversample a dataset, thereby increasing its size without the need for artificial samples. We show that conformational oversampling facilitates the training of neural networks and provides state-of-the-art results on the Tox21 dataset.
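The balancing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `base` oversampling factor, and the `make_conformers` callable are all hypothetical and introduced only for illustration. The sketch assumes that each class receives enough conformers per molecule to roughly match the size of the largest class, and that a real conformer generator (for example one built on a cheminformatics toolkit) would be supplied by the caller.

```python
from collections import Counter
from math import ceil


def conformers_per_class(labels, base=1):
    """Plan how many conformers to generate per molecule in each class.

    Assumption: the largest class gets `base` conformers per molecule, and
    every other class gets enough to reach at least the same total.
    """
    counts = Counter(labels)
    target = max(counts.values()) * base
    return {label: ceil(target / n) for label, n in counts.items()}


def oversample(molecules, labels, make_conformers, base=1):
    """Expand each molecule into (conformer, label) pairs according to the plan.

    `make_conformers(mol, n)` is a hypothetical callable that returns n
    conformations of `mol`; it stands in for a real conformer generator.
    """
    plan = conformers_per_class(labels, base)
    samples = []
    for mol, label in zip(molecules, labels):
        for conformer in make_conformers(mol, plan[label]):
            samples.append((conformer, label))
    return samples


def fake_conformers(mol, n):
    # Placeholder generator: tags the molecule name instead of embedding 3D coordinates.
    return [f"{mol}_conf{i}" for i in range(n)]


molecules = [f"m{i}" for i in range(12)]
labels = [0] * 9 + [1] * 3  # 9 inactive, 3 active molecules
balanced = oversample(molecules, labels, fake_conformers)
class_sizes = Counter(label for _, label in balanced)
```

With 9 inactive and 3 active molecules, the plan assigns one conformer per inactive molecule and three per active molecule, so both classes end up with nine samples. Raising `base` oversamples both classes while keeping them balanced. Every sample still corresponds to a real molecule, which is the sense in which no artificial samples are needed.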
SUBMITTER: Hemmerich J
PROVIDER: S-EPMC7080709 | biostudies-literature |
REPOSITORIES: biostudies-literature