
Dataset Information


Comprehensive exploration of graphically defined reaction spaces.


ABSTRACT: Existing reaction transition state (TS) databases are comparatively small and lack chemical diversity. Here, this data gap has been addressed using the concept of a graphically-defined model reaction to comprehensively characterize a reaction space associated with C, H, O, and N containing molecules with up to 10 heavy (non-hydrogen) atoms. The resulting dataset is composed of 176,992 organic reactions, each possessing at least one validated TS, activation energy, heat of reaction, reactant and product geometries, frequencies, and atom-mapping. For 33,032 reactions, more than one TS was discovered by conformational sampling, allowing conformational errors in TS prediction to be assessed. Data is supplied at the GFN2-xTB and B3LYP-D3/TZVP levels of theory. A subset of reactions was recalculated at the CCSD(T)-F12/cc-pVDZ-F12 and ωB97X-D2/def2-TZVP levels to establish relative errors. The resulting collection of reactions and properties is called the Reaction Graph Depth 1 (RGD1) dataset. RGD1 represents the largest and most chemically diverse TS dataset published to date and should find immediate use in developing novel machine learning models for predicting reaction properties.
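
As an illustration of how the quantities named in the abstract relate to one another, the following is a minimal sketch, not the authors' workflow. It assumes a hypothetical record layout (reaction identifier plus reactant, TS, and product energies in Hartree) that is not taken from the RGD1 files, uses a plain energy difference as a stand-in for the heat of reaction, and treats the spread of barriers across several TS conformers of one reaction as one possible measure of conformational error.

```python
from collections import defaultdict

HARTREE_TO_KCAL = 627.5095  # kcal/mol per Hartree

# Hypothetical records, NOT the RGD1 file format:
# (reaction_id, E_reactant, E_ts, E_product), all in Hartree.
records = [
    ("rxn_a", -100.000, -99.920, -100.010),
    ("rxn_a", -100.000, -99.915, -100.010),  # second TS conformer of rxn_a
    ("rxn_b", -150.000, -149.950, -149.980),
]


def activation_energy(e_reactant, e_ts):
    """Barrier height in kcal/mol: E(TS) - E(reactant)."""
    return (e_ts - e_reactant) * HARTREE_TO_KCAL


def reaction_energy(e_reactant, e_product):
    """Energy change in kcal/mol: E(product) - E(reactant).

    Assumption: used here as a simple stand-in for the heat of reaction.
    """
    return (e_product - e_reactant) * HARTREE_TO_KCAL


def conformational_spread(rows):
    """Max minus min barrier for each reaction that has more than one TS."""
    barriers = defaultdict(list)
    for rxn_id, e_r, e_ts, _e_p in rows:
        barriers[rxn_id].append(activation_energy(e_r, e_ts))
    return {rid: max(v) - min(v) for rid, v in barriers.items() if len(v) > 1}


spread = conformational_spread(records)
```

Here only `rxn_a` appears in `spread`, because it is the only hypothetical reaction with more than one TS entry.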

SUBMITTER: Zhao Q 

PROVIDER: S-EPMC10025260 | biostudies-literature | 2023 Mar

REPOSITORIES: biostudies-literature


Publications

Comprehensive exploration of graphically defined reaction spaces.

Qiyuan Zhao, Sai Mahit Vaddadi, Michael Woulfe, Lawal A. Ogunfowora, Sanjay S. Garimella, Olexandr Isayev, Brett M. Savoie

Scientific Data. 2023-03-20; 1



Similar Datasets

S-EPMC7835425 | biostudies-literature
S-EPMC2858107 | biostudies-literature
S-EPMC4162468 | biostudies-literature
S-EPMC9827825 | biostudies-literature
S-EPMC9986954 | biostudies-literature
S-EPMC4404516 | biostudies-literature
S-EPMC9881050 | biostudies-literature
S-EPMC2925302 | biostudies-literature
S-EPMC10450414 | biostudies-literature
S-EPMC3024942 | biostudies-literature