Project description:The majority of primary and secondary metabolites in nature have yet to be identified, representing a major challenge for metabolomics studies that currently require reference libraries from analyses of authentic compounds. Using currently available analytical methods, complete chemical characterization of metabolomes is infeasible for both technical and economic reasons. For example, unambiguous identification of metabolites is limited by the availability of authentic chemical standards, which, for the majority of molecules, do not exist. Computationally predicted or calculated data are a viable solution to expand the currently limited metabolite reference libraries, if such methods are shown to be sufficiently accurate. For example, determining nuclear magnetic resonance (NMR) spectra in silico has shown promise in the identification and delineation of metabolite structures. Many researchers have been taking advantage of density functional theory (DFT), a computationally inexpensive yet reputable method for the prediction of carbon and proton NMR spectra of metabolites. However, such methods are expected to have some error in predicted 13C and 1H NMR spectra with respect to experimentally measured values. This raises the question: what accuracy is required in predicted 13C and 1H NMR chemical shifts for confident metabolite identification? Using the set of 11,716 small molecules found in the Human Metabolome Database (HMDB), we simulated both experimental and theoretical NMR chemical shift databases. We investigated the level of accuracy required for identification of metabolites in simulated pure and impure samples by matching predicted chemical shifts to experimental data. We found that 90% or more of molecules in simulated pure samples can be successfully identified when errors of 1H and 13C chemical shifts are below 0.6 and 7.1 ppm, respectively, in water, and below 0.5 and 4.6 ppm, respectively, in chloroform.
In simulated complex mixtures, as the complexity of the mixture increased, greater accuracy of the calculated chemical shifts was required, as expected. However, if the number of molecules in the mixture is known, e.g., when NMR is combined with MS and sample complexity is low, the likelihood of confident molecular identification increased by 90%.
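The idea of matching predicted chemical shifts to experimental data can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the pairing of shifts after sorting, the RMSD score, and the toy shift values are all assumptions made for illustration.

```python
import math

def rmsd_sorted(predicted, observed):
    """RMSD between two shift lists, paired after sorting (an assumed pairing scheme)."""
    if len(predicted) != len(observed) or not observed:
        return math.inf  # different number of nuclei: treat as no match
    pairs = zip(sorted(predicted), sorted(observed))
    return math.sqrt(sum((p - o) ** 2 for p, o in pairs) / len(observed))

def identify(observed, library):
    """Return the library entry whose predicted shifts lie closest to the observed ones."""
    return min(library, key=lambda name: rmsd_sorted(library[name], observed))

# Toy 13C shifts in ppm, invented for illustration only.
library = {
    "candidate_A": [18.0, 58.0],
    "candidate_B": [21.0, 178.0],
    "candidate_C": [30.0, 30.5, 207.0],
}
observed = [20.2, 176.5]
best = identify(observed, library)
```

In this toy case the prediction errors for candidate_B are small relative to the spacing between library entries, so the match is unambiguous; larger errors or denser libraries make misassignment more likely, which is the trade-off the study quantifies.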
Project description:We present two open-source datasets that provide time-dependent density-functional tight-binding (TD-DFTB) electronic excitation spectra of organic molecules. These datasets represent predictions of UV-vis absorption spectra performed on optimized geometries of the molecules in their electronic ground state. The GDB-9-Ex dataset contains a subset of 96,766 organic molecules from the original open-source GDB-9 dataset. The ORNL_AISD-Ex dataset consists of 10,502,904 organic molecules that contain between 5 and 71 non-hydrogen atoms. The data quantitatively reveal a close correlation between the gap separating the highest occupied molecular orbital (HOMO) from the lowest unoccupied molecular orbital (LUMO) and the excitation energy of the lowest singlet excited state. The chemical variability of the large number of molecules was examined with a topological fingerprint estimation based on extended-connectivity fingerprints (ECFPs) followed by uniform manifold approximation and projection (UMAP) for dimension reduction. Both datasets were generated using the DFTB+ software on the "Andes" cluster of the Oak Ridge Leadership Computing Facility (OLCF).
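A correlation between HOMO-LUMO gaps and lowest excitation energies could be quantified, for instance, with a Pearson coefficient. The sketch below assumes that choice of statistic; the numerical values are invented and do not come from either dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented values in eV, constructed to be exactly linear (E = 0.9 * gap - 0.1).
gaps_ev = [3.0, 4.0, 5.0, 6.0]
excitation_ev = [2.6, 3.5, 4.4, 5.3]
r = pearson(gaps_ev, excitation_ev)
```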
Project description:The vibrational analysis of the gas-phase infrared spectra of chlorofluoromethane (CH2ClF, HCFC-31) was carried out in the range 200-6200 cm-1. The assignment of the absorption features in terms of fundamental, overtone, combination, and hot bands was performed on the medium-resolution (up to 0.2 cm-1) Fourier transform infrared spectra. From the absorption cross section spectra accurate values of the integrated band intensities were derived and the global warming potential of this compound was estimated, thus obtaining values of 323, 83, and 42 on a 20-, 100-, and 500-year horizon, respectively. The set of spectroscopic parameters presented here provides the basic data to model the atmospheric behavior of this greenhouse gas. In addition, the obtained vibrational properties were used to benchmark the predictions of state-of-the-art quantum-chemical computational strategies. Extrapolated complete basis set limit values for the equilibrium geometry and harmonic force field were obtained at the coupled-cluster singles and doubles level of theory augmented by a perturbative treatment of triple excitations, CCSD(T), in conjunction with a hierarchical series of correlation-consistent basis sets (cc-pVnZ, with n = T, Q, and 5), taking also into account the core-valence correlation effects and the corrections due to diffuse (aug) functions. To obtain the cubic and quartic semi-diagonal force constants, calculations employing second-order Møller-Plesset perturbation (MP2) theory, the double-hybrid density functional B2PLYP as well as CCSD(T) were performed. For all anharmonic force fields the performances of two different perturbative approaches in computing the vibrational energy levels (i.e., the generalized second order vibrational treatment, GVPT2, and the recently proposed hybrid degeneracy corrected model, HDCPT2) were evaluated and the obtained results allowed us to validate the spectroscopic predictions yielded by the HDCPT2 approach.
The predictions of the deperturbed second-order perturbation approach, DVPT2, applied to the computation of infrared intensities beyond the double-harmonic approximation were compared to the accurate experimental values here determined. Anharmonic DFT and MP2 corrections to CCSD(T) intensities led to a very good agreement with the absorption cross section measurements over the whole spectral range here analysed.
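An extrapolation to the complete basis set limit over a cc-pVnZ series can be sketched as follows. The two-point inverse-cubic formula used here is one widely used scheme for correlation energies and is an assumption; the abstract does not state which extrapolation formula was applied, and the energies are placeholders constructed for the check.

```python
def cbs_two_point(e_small, e_large, n_large):
    """Two-point X**-3 extrapolation of correlation energies.

    e_small: correlation energy with cardinal number n_large - 1
    e_large: correlation energy with cardinal number n_large
    Assumes the model E(X) = E_CBS + A / X**3.
    """
    x, y = n_large, n_large - 1
    return (x**3 * e_large - y**3 * e_small) / (x**3 - y**3)

# Placeholder correlation energies (hartree), generated from E_CBS = -0.5 and A = 0.1.
e_qz = -0.5 + 0.1 / 4**3   # cardinal number 4 (quadruple zeta)
e_5z = -0.5 + 0.1 / 5**3   # cardinal number 5 (quintuple zeta)
e_cbs = cbs_two_point(e_qz, e_5z, 5)
```

Because the placeholder energies follow the assumed model exactly, the extrapolation recovers the limit used to build them; real data deviate from the model, so the result is an estimate.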
Project description:We introduce a fully stand-alone version of the Quantum Chemistry Electron Ionization Mass Spectra (QCEIMS) program [S. Grimme, Angew. Chem. Int. Ed., 2013, 52, 6306] allowing efficient simulations for molecules composed of elements with atomic numbers up to Z = 86. The recently developed extended tight-binding semi-empirical method GFN-xTB has been combined with QCEIMS, thereby eliminating dependencies on third-party electronic structure software. Furthermore, for reasonable calculations of ionization potentials, as required by the method, a second tight-binding variant, IPEA-xTB, is introduced here. This novel combination of methods allows the automatic, fast and reasonably accurate computation of electron ionization mass spectra for structurally different molecules across the periodic table. In order to validate and inspect the transferability of the method, we perform large-scale simulations for some representative organic, organometallic, and main-group inorganic systems. Theoretical spectra for 23 molecules are compared directly to experimental data taken from standard databases. For the first time, realistic quantum-chemistry-based EI-MS for organometallic systems like ferrocene or copper(II) acetylacetonate are presented. Compared to previously used semiempirical methods, GFN-xTB is faster, more robust, and yields overall higher quality spectra. The partially analysed theoretical reaction and fragmentation mechanisms are chemically reasonable and reveal in unprecedented detail the extreme complexity of high energy gas phase ion chemistry including complicated rearrangement reactions prior to dissociation.
Project description:We have studied the Fourier transform infrared (FT-IR) and the Fourier transform Raman (FT-Raman) spectra of stanozolol and oxandrolone, and we have performed quantum chemical calculations based on density functional theory (DFT) at the B3LYP/6-31G(d,p) level of theory. The FT-IR and FT-Raman spectra were collected in the solid phase. The consistency between the calculated and experimental FT-IR and FT-Raman data indicates that B3LYP/6-31G(d,p) can generate reliable geometries and related properties of the title compounds. Selected experimental bands were assigned and characterized on the basis of the scaled theoretical wavenumbers by their total energy distribution. The good agreement between the experimental and theoretical spectra allowed positive assignment of the observed vibrational absorption bands. Finally, the calculation results were applied to simulate the Raman and IR spectra of the title compounds, which show agreement with the observed spectra.
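Scaling theoretical wavenumbers and simulating a spectrum from them can be sketched in a few lines. This is an illustration under stated assumptions: a single uniform scaling factor, Lorentzian band shapes, and a fixed band width are assumed, and the factor 0.96, the wavenumbers, and the intensities are placeholders rather than values from the study.

```python
def scale_wavenumbers(harmonic, factor):
    """Multiply harmonic wavenumbers (cm-1) by a uniform scaling factor."""
    return [factor * w for w in harmonic]

def lorentzian_spectrum(positions, intensities, grid, fwhm=10.0):
    """Sum of Lorentzian bands; each isolated band peaks at its stick intensity."""
    hwhm = fwhm / 2.0
    return [
        sum(i * hwhm**2 / ((x - p) ** 2 + hwhm**2)
            for p, i in zip(positions, intensities))
        for x in grid
    ]

# Hypothetical harmonic wavenumbers (cm-1) and relative intensities.
harmonic = [1000.0, 1700.0, 3100.0]
intensities = [1.0, 2.5, 0.6]
scaled = scale_wavenumbers(harmonic, 0.96)        # 0.96 is a placeholder factor
grid = [float(x) for x in range(400, 3601)]       # 1 cm-1 spacing
spectrum = lorentzian_spectrum(scaled, intensities, grid)
```

The appropriate scaling factor depends on the functional and basis set, and the band width is a presentation choice, so both are left as parameters.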
Project description:Anisotropic quantum nanostructures have attracted considerable attention due to their unique properties and range of potential applications. Magnetic circular dichroism (MCD) spectra of semiconductor CdSe/ZnS quantum rods and CdSe/CdS dot-in-rods have been studied. Positions of four electronic transitions were determined by data fitting. MCD spectra were analyzed in terms of the A and B terms, which characterize the splitting and mixing of states. Effective values of the A and B terms were determined for each transition. A relatively high value of the B term is noted, which is most likely associated with the anisotropy of the quantum rods.
Project description:Chemical derivatization, especially silylation, is widely used in gas chromatography coupled to mass spectrometry (GC-MS). By introducing the trimethylsilyl (TMS) group to substitute active hydrogens in the molecule, thermostable volatile compounds are created that can be easily analyzed. While large GC-MS libraries are available, the number of spectra for TMS-derivatized compounds is comparatively small. In addition, many metabolites cannot be purchased to produce authentic library spectra. Therefore, computationally generated in silico mass spectral databases need to take TMS derivatizations into account for metabolomics. The quantum chemistry method QCEIMS is an automatic method to generate electron ionization (EI) mass spectra directly from compound structures. To evaluate the performance of the QCEIMS method for TMS-derivatized compounds, we chose 816 trimethylsilyl derivatives of organic acids, alcohols, amides, amines, and thiols to compare in silico-generated spectra against the experimental EI mass spectra from the NIST17 library. Overall, in silico spectra showed a weighted dot score similarity (1000 is maximum) of 635 compared to the NIST17 experimental spectra. Aromatic compounds yielded a better prediction accuracy with an average similarity score of 808, while oxygen-containing molecules showed lower accuracy with only an average score of 609. Such similarity scores are useful for annotation of small molecules in untargeted GC-MS-based metabolomics, suggesting that QCEIMS methods can be extended to compounds that are not present in experimental databases. Despite this overall success, 37% of all experimentally observed ions were not found in QCEIMS predictions. We investigated QCEIMS trajectories in detail and found missed fragmentations in specific rearrangement reactions. Such findings open the way forward for future improvements to the QCEIMS software.
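A weighted dot-product similarity on a 0 to 1000 scale can be sketched as below. The exact weighting and normalization behind the scores quoted above are not given in the abstract; the cosine form, the exponents `mz_power` and `int_power`, and the toy spectra are assumptions chosen for illustration.

```python
import math

def weighted_dot_score(spec_a, spec_b, mz_power=1.0, int_power=0.5):
    """Cosine similarity of weighted peak intensities, scaled to 0-1000.

    spec_a, spec_b: dicts mapping integer m/z to intensity.
    Each peak is weighted as intensity**int_power * mz**mz_power; the
    exponents are assumed defaults, not taken from the source.
    """
    def weight(spec):
        return {mz: (inten ** int_power) * (mz ** mz_power)
                for mz, inten in spec.items()}

    wa, wb = weight(spec_a), weight(spec_b)
    dot = sum(wa[mz] * wb[mz] for mz in wa.keys() & wb.keys())
    norm = math.sqrt(sum(v * v for v in wa.values()) *
                     sum(v * v for v in wb.values()))
    return 1000.0 * dot / norm if norm else 0.0

# Toy spectra (m/z -> relative intensity), invented for illustration.
experimental = {73: 999.0, 147: 350.0, 205: 120.0}
predicted = {73: 800.0, 147: 100.0, 190: 60.0}
score = weighted_dot_score(experimental, predicted)
```

Peaks present in only one spectrum, such as the fragments at m/z 205 and 190 here, lower the score through the normalization, which is how missed fragmentations reduce the similarity.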
Project description:The origin of the high-frequency shoulder (HFS) observed above the longitudinal optical (LO) peak around 230 cm-1 in the Raman spectra of CdSe quantum dots (QDs) has been the subject of intense debate. We use state-of-the-art ab initio density functional theory applied to small CdSe QDs with various realistic surface passivations and find an intense Raman signal around 230 cm-1, which corresponds to a stretching vibration of a defective 2-fold coordinated Se atom. We interpret this signal as being the origin of the HFS. Since the signal disappears in fully passivated and defect-free (magic size cluster) structures, it can be used as a fingerprint to distinguish defective from nondefective structures.
Project description:Raman spectrometers will form a key component of the analytical suite of future planetary rovers intended to investigate geological processes on Mars. In order to expand the applicability of these spectrometers and use them as analytical tools for the investigation of silicate glasses, a database correlating Raman spectra to glass composition is crucial. Here we investigate the effect of the chemical composition of reduced silicate glasses on their Raman spectra. A range of compositions was generated in a diffusion experiment between two distinct, iron-rich end-members (a basalt and a peralkaline rhyolite), which are representative of the anticipated compositions of Martian rocks. Our results show that for silica-poor (depolymerized) compositions the band intensity increases dramatically in the regions 550-780 cm-1 and 820-980 cm-1. On the other hand, the Raman spectral regions 250-550 cm-1 and 1000-1250 cm-1 are well developed in silica-rich (highly polymerized) systems. Further, the spectral intensity increases at ~965 cm-1, which is related to the high iron content of these glasses (~7-17 wt % of FeOtot). Based on the acquired Raman spectra and an ideal mixing equation between the two end-members, we present an empirical parameterization that enables the estimation of the chemical compositions of silicate glasses within this range. The model is validated using external samples for which chemical composition and Raman spectra were characterized independently. Applications of this model range from microanalysis of dry and hydrous silicate glasses (e.g., melt inclusions) to in situ field investigations and studies under extreme conditions such as extraterrestrial (i.e., Mars) and submarine volcanic environments.
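One way to read an ideal mixing equation between two end-members is as a linear combination of their spectra, with the mixing fraction found by least squares. The sketch below assumes that linear form and that fitting method; the function name and the toy intensity vectors are hypothetical and do not reproduce the published parameterization.

```python
def mixing_fraction(sample, end_a, end_b):
    """Least-squares x such that sample ~ x * end_a + (1 - x) * end_b.

    All three arguments are intensity lists sampled on the same wavenumber grid.
    """
    diff = [a - b for a, b in zip(end_a, end_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("end-member spectra are identical")
    num = sum((s - b) * d for s, b, d in zip(sample, end_b, diff))
    return num / denom

# Toy normalized intensities on a shared grid, invented for illustration.
end_a = [1.0, 0.2, 0.5]
end_b = [0.1, 0.9, 0.4]
sample = [0.3 * a + 0.7 * b for a, b in zip(end_a, end_b)]
x = mixing_fraction(sample, end_a, end_b)
```

Given the fitted fraction, a composition estimate would follow by applying the same fraction to the end-member compositions, provided the ideal mixing assumption holds for the oxide of interest.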
Project description:The energies of the 133,000 molecules in the GDB-9 database have been calculated at the G4MP2 level of theory and then used to calculate their enthalpies of formation. This database contains organic molecules having nine or fewer atoms of carbon, nitrogen, oxygen, and fluorine, as well as hydrogen atoms. The accuracy of the G4MP2 energies was investigated on a subset of 459 of the molecules having experimental enthalpies of formation with small uncertainties. On this subset the G4MP2 enthalpies of formation have an accuracy of 0.79 kcal mol-1, which is similar to the accuracy previously reported for the smaller G3/05 test set. An error analysis of the theoretical enthalpies of formation of the 459 molecules is presented in terms of the size and type of the molecules. Three different density functionals (B3LYP, ωB97X-D, M06-2X) were also assessed on the 459 molecules with accurate enthalpy data for comparison with the G4MP2 results. The G4MP2 energies for the 133,000 molecules provide a database that can be used to calculate accurate reaction energies as well as to assess new or existing experimental enthalpies of formation. Several examples are given of types of reactions that can be predicted using the G4MP2 database of energies. The G4MP2 energies of the GDB-9 molecules will also be useful in future investigations of applications of machine learning to quantum chemical data.
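Computing a reaction energy from a table of molecular energies amounts to subtracting the stoichiometrically weighted reactant energies from those of the products. The sketch below shows that arithmetic only; the species labels and energies are placeholders, not entries from the G4MP2 database, and the dictionary layout is an assumed representation.

```python
HARTREE_TO_KCAL_MOL = 627.5095  # conversion factor, hartree to kcal/mol

def reaction_energy(energies, reactants, products):
    """Energy of products minus reactants, weighted by stoichiometric coefficients.

    energies: dict mapping species label to energy (hartree)
    reactants, products: dicts mapping species label to coefficient
    """
    def side_total(side):
        return sum(coef * energies[name] for name, coef in side.items())

    return side_total(products) - side_total(reactants)

# Placeholder energies in hartree for a hypothetical reaction A + B -> C.
energies = {"A": -100.000, "B": -50.000, "C": -150.010}
delta = reaction_energy(energies, {"A": 1, "B": 1}, {"C": 1})
delta_kcal = delta * HARTREE_TO_KCAL_MOL
```

A negative value indicates that the products lie lower in energy than the reactants under the chosen energies; converting to enthalpies would require the corresponding thermal corrections.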