Dataset Information

ANI-1, A data set of 20 million calculated off-equilibrium conformations for organic molecules.


ABSTRACT: One of the grand challenges in modern theoretical chemistry is designing and implementing approximations that expedite ab initio methods without loss of accuracy. Machine learning (ML) methods are emerging as a powerful approach to constructing various forms of transferable atomistic potentials. They have been successfully applied in a variety of applications in chemistry, biology, catalysis, and solid-state physics. However, these models are heavily dependent on the quality and quantity of data used in their fitting. Fitting highly flexible ML potentials, such as neural networks, comes at a cost: a vast amount of reference data is required to properly train these models. We address this need by providing access to a large computational DFT database, which consists of more than 20 million off-equilibrium conformations for 57,462 small organic molecules. We believe it will become a new standard benchmark for comparison of current and future methods in the ML potential community.

SUBMITTER: Smith JS 

PROVIDER: S-EPMC5735918 | biostudies-literature | 2017 Dec

REPOSITORIES: biostudies-literature

Publications

ANI-1, A data set of 20 million calculated off-equilibrium conformations for organic molecules.

Justin S. Smith, Olexandr Isayev, Adrian E. Roitberg

Scientific Data, 2017-12-19


Similar Datasets

| S-EPMC7195467 | biostudies-literature
| S-EPMC7313542 | biostudies-literature
| S-EPMC10688188 | biostudies-literature
| S-EPMC8468352 | biostudies-literature
| S-EPMC5671413 | biostudies-literature
| S-EPMC11577314 | biostudies-literature
| S-EPMC10647024 | biostudies-literature
| S-EPMC6855195 | biostudies-literature
| S-EPMC7895524 | biostudies-literature
| S-EPMC4755128 | biostudies-literature