Dataset Information

ANI-1, A data set of 20 million calculated off-equilibrium conformations for organic molecules.


ABSTRACT: One of the grand challenges in modern theoretical chemistry is designing and implementing approximations that expedite ab initio methods without loss of accuracy. Machine learning (ML) methods are emerging as a powerful approach to constructing various forms of transferable atomistic potentials. They have been successfully applied in a variety of applications in chemistry, biology, catalysis, and solid-state physics. However, these models are heavily dependent on the quality and quantity of data used in their fitting. Fitting highly flexible ML potentials, such as neural networks, comes at a cost: a vast amount of reference data is required to properly train these models. We address this need by providing access to a large computational DFT database, which consists of more than 20 M off-equilibrium conformations for 57,462 small organic molecules. We believe it will become a new standard benchmark for comparison of current and future methods in the ML potential community.

SUBMITTER: Smith JS

PROVIDER: S-EPMC5735918 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC7195467 | biostudies-literature
S-EPMC7313542 | biostudies-literature
S-EPMC8468352 | biostudies-literature
S-EPMC5671413 | biostudies-literature
S-EPMC6855195 | biostudies-literature
S-EPMC7895524 | biostudies-literature
S-EPMC4755128 | biostudies-literature
S-EPMC7239912 | biostudies-literature
S-EPMC6262907 | biostudies-literature
PRJNA921436 | ENA