Project description:The significant biological role of RNA has further highlighted the need for improving the accuracy, efficiency and the reach of methods for investigating RNA structure and function. Nuclear magnetic resonance (NMR) spectroscopy is vital to furthering the goals of RNA structural biology because of its distinctive capabilities. However, the dispersion pattern in the NMR spectra of RNA makes automated resonance assignment, a key step in NMR investigation of biomolecules, remarkably challenging. Herein we present RNA Probabilistic Assignment of Imino Resonance Shifts (RNA-PAIRS), a method for the automated assignment of RNA imino resonances with synchronized verification and correction of predicted secondary structure. RNA-PAIRS represents an advance in modeling the assignment paradigm because it seeds the probabilistic network for assignment with experimental NMR data and predicted RNA secondary structure, simultaneously and from the start. Subsequently, RNA-PAIRS sets in motion a dynamic network that reverberates between predictions and experimental evidence in order to reconcile and rectify resonance assignments and secondary structure information. The procedure is halted when assignments and base pairings are deemed to be most consistent with observed crosspeaks. The current implementation of RNA-PAIRS uses an initial peak list derived from proton-nitrogen heteronuclear multiple quantum correlation (1H-15N 2D HMQC) and proton-proton nuclear Overhauser enhancement spectroscopy (1H-1H 2D NOESY) experiments. We have evaluated the performance of RNA-PAIRS by using it to analyze NMR datasets from 26 previously studied RNAs, including a 111-nucleotide complex. For moderately sized RNA molecules, and over a range of comparatively complex structural motifs, the average assignment accuracy exceeds 90%, while the average base pair prediction accuracy exceeds 93%.
RNA-PAIRS yielded accurate assignments and base pairings consistent with imino resonances for a majority of the NMR resonances, even when the initial predictions are only modestly accurate. RNA-PAIRS is available as a public web-server at http://pine.nmrfam.wisc.edu/RNA/.
Project description:Due to their strong dependence on local atomic environments, NMR chemical shifts are among the most powerful tools for structure elucidation of powdered solids or amorphous materials. Unfortunately, using them for structure determination depends on the ability to calculate them, which comes at the cost of high-accuracy first-principles calculations. Machine learning has recently emerged as a way to overcome the need for quantum chemical calculations, but for chemical shifts in solids it is hindered by the chemical and combinatorial space spanned by molecular solids, the strong dependency of chemical shifts on their environment, and the lack of an experimental database of shifts. We propose a machine learning method based on local environments to accurately predict chemical shifts of molecular solids and their polymorphs to within DFT accuracy. We also demonstrate that the trained model is able to determine, based on the match between experimentally measured and ML-predicted shifts, the structures of cocaine and the drug 4-[4-(2-adamantylcarbamoyl)-5-tert-butylpyrazol-1-yl]benzoic acid.
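The structure-selection step mentioned above, picking the candidate whose predicted shifts best match the measured ones, can be sketched as a root-mean-square error ranking. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names, candidate labels, and shift values are hypothetical, the predicted shifts are assumed to come from some external model, and the two shift lists are assumed to be already assigned to the same atoms in the same order.

```python
import math

def shift_rmse(experimental, predicted):
    """RMSE (ppm) between two equal-length, atom-matched lists of shifts."""
    if len(experimental) != len(predicted):
        raise ValueError("shift lists must have the same length")
    squared = sum((e - p) ** 2 for e, p in zip(experimental, predicted))
    return math.sqrt(squared / len(experimental))

def rank_candidates(experimental, candidates):
    """Return (name, rmse) pairs sorted from best to worst agreement."""
    scored = [(name, shift_rmse(experimental, shifts))
              for name, shifts in candidates.items()]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical 1H shifts in ppm; none of these numbers come from the study.
experimental = [1.2, 3.4, 7.1, 7.9]
candidates = {
    "candidate_A": [1.3, 3.3, 7.0, 8.0],
    "candidate_B": [2.0, 4.1, 6.2, 8.8],
}
ranking = rank_candidates(experimental, candidates)
best_name, best_rmse = ranking[0]
```

In practice the lowest RMSE would also be compared against the expected prediction error before a candidate is accepted; that check is left out of this sketch.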
Project description:The Biological Magnetic Resonance Data Bank contains NMR chemical shift depositions for 132 RNAs and RNA-containing complexes. We have analyzed the 1H NMR chemical shifts reported for non-exchangeable protons of residues that reside within A-form helical regions of these RNAs. The analysis focused on the central base pair within a stretch of three adjacent base pairs (BP triplets), and included both Watson-Crick (WC; G:C, A:U) and G:U wobble pairs. Chemical shift values were included for all 4^3 possible WC-BP triplets, as well as 137 additional triplets that contain one or more G:U wobbles. Sequence-dependent chemical shift correlations were identified, including correlations involving terminating base pairs within the triplets and canonical and non-canonical structures adjacent to the BP triplets (i.e. bulges, loops, WC and non-WC BPs), despite the fact that the NMR data were obtained under different conditions of pH, buffer, ionic strength, and temperature. A computer program (RNAShifts) was developed that enables convenient comparison of RNA 1H NMR assignments with database predictions, which should facilitate future signal assignment/validation efforts and enable rapid identification of non-canonical RNA structures and RNA-ligand/protein interaction sites.
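The count of Watson-Crick triplets quoted above (four to the third power, i.e. 64) follows if each of the three positions can hold one of four oriented WC pairs. The sketch below enumerates them under that assumption; the pair labels, the "/" separator, and the function name are illustrative choices and are not taken from RNAShifts.

```python
from itertools import product

# Assumption: the four oriented Watson-Crick pairs, written strand1:strand2.
WC_PAIRS = ["G:C", "C:G", "A:U", "U:A"]

def wc_triplets():
    """Enumerate every ordered stack of three Watson-Crick base pairs."""
    return ["/".join(stack) for stack in product(WC_PAIRS, repeat=3)]

triplets = wc_triplets()
print(len(triplets))  # 64
```

Adding the two G:U wobble orientations to the pair list would extend the same enumeration to wobble-containing triplets, although not every such combination necessarily occurs in the deposited data.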
Project description:NMR-based crystallography approaches involving the combination of crystal structure prediction methods, ab initio calculated chemical shifts and solid-state NMR experiments are powerful methods for crystal structure determination of microcrystalline powders. However, currently structural information obtained from solid-state NMR is usually included only after a set of candidate crystal structures has already been independently generated, starting from a set of single-molecule conformations. Here, we show with the case of ampicillin that this can lead to failure of structure determination. We propose a crystal structure determination method that includes experimental constraints during conformer selection. In order to overcome the problem that experimental measurements on the crystalline samples are not obviously translatable to restrict the single-molecule conformational space, we propose constraints based on the analysis of absent cross-peaks in solid-state NMR correlation experiments. We show that these absences provide unambiguous structural constraints on both the crystal structure and the gas-phase conformations, and therefore can be used for unambiguous selection. The approach is parametrized on the crystal structure determination of flutamide, flufenamic acid, and cocaine, where we reduce the computational cost by around 50%. Most importantly, the method is then shown to correctly determine the crystal structure of ampicillin, which would have failed using current methods because it adopts a high-energy conformer in its crystal structure. The average positional RMSE on the NMR powder structure is ⟨r_av⟩ = 0.176 Å, which corresponds to an average equivalent displacement parameter Ueq = 0.0103 Å².
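The two figures quoted at the end of this abstract are numerically consistent with the usual isotropic relation between a three-dimensional root-mean-square displacement r and an equivalent displacement parameter, Ueq = r^2 / 3. The sketch below checks that arithmetic; the assumption that this is the conversion the authors used is ours, and the function names are hypothetical.

```python
import math

def ueq_from_rmse(r_av):
    """Equivalent isotropic displacement (Å^2) from a positional RMSE (Å),
    assuming the isotropic relation Ueq = r^2 / 3."""
    return r_av ** 2 / 3.0

def rmse_from_ueq(ueq):
    """Inverse conversion: positional RMSE (Å) from Ueq (Å^2)."""
    return math.sqrt(3.0 * ueq)

ueq = ueq_from_rmse(0.176)
print(round(ueq, 4))  # 0.0103
```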
Project description:Bayesian and Maximum Entropy approaches allow for a statistically sound and systematic fitting of experimental and computational data. Unfortunately, assessing the relative confidence in these two types of data remains difficult as several steps add unknown error. Here we propose the use of a validation-set method to determine the balance, and thus the amount of fitting. We apply the method to synthetic NMR chemical shift data of an intrinsically disordered protein. We show that the method gives consistent results even when other methods to assess the amount of fitting cannot be applied. Finally, we also describe how the errors in the chemical shift predictor can lead to an incorrect fitting and how using secondary chemical shifts could alleviate this problem.
Project description:The heightened dipolar interactions in solids render solid-state NMR (ssNMR) spectra more difficult to interpret than solution NMR spectra. On the other hand, ssNMR does not suffer from the severe molecular weight limitations of solution NMR. In recent years, ssNMR has undergone rapid technological developments that have enabled structure-function studies of increasingly large biomolecules, including membrane proteins. Current methodology includes stable isotope labeling schemes, non-uniform sampling with spectral reconstruction, faster magic angle spinning, and innovative pulse sequences that capture different types of interactions among spins. However, computational tools for the analysis of complex ssNMR data from membrane proteins and other challenging protein systems have lagged behind those for solution NMR. Before a structure can be determined, thousands of signals from individual types of multidimensional ssNMR spectra of samples, which may have differing isotopic composition, must be recognized, correlated, categorized, and eventually assigned to atoms in the chemical structure. To address these tedious steps, we have developed an automated algorithm for ssNMR spectra called "ssPINE". The ssPINE software accepts the sequence of the protein plus peak lists from a variety of ssNMR experiments as inputs and offers automated backbone and side-chain assignments. The alpha version of ssPINE, which we describe here, is freely available through a web submission form.
Project description:In regions without complete-coverage civil registration and vital statistics systems there is uncertainty about even the most basic demographic indicators. In such regions the majority of deaths occur outside hospitals and are not recorded. Worldwide, fewer than one-third of deaths are assigned a cause, with the least information available from the most impoverished nations. In populations like this, verbal autopsy (VA) is a commonly used tool to assess cause of death and estimate cause-specific mortality rates and the distribution of deaths by cause. VA uses an interview with caregivers of the decedent to elicit data describing the signs and symptoms leading up to the death. This paper develops a new statistical tool known as InSilicoVA to classify cause of death using information acquired through VA. InSilicoVA shares uncertainty between cause of death assignments for specific individuals and the distribution of deaths by cause across the population. Using side-by-side comparisons with both observed and simulated data, we demonstrate that InSilicoVA has distinct advantages compared to currently available methods.
Project description:The magnetoelectric (ME) effect in solids is a prominent cross-correlation phenomenon, in which the electric field (E) controls the magnetization (M) and the magnetic field (H) controls the electric polarization (P). A rich variety of ME effects and their potential in practical applications have been investigated so far within the transition-metal compounds. Here, we report a possible way to realize the ME effect in organic molecular solids, in which two molecules build a dimer unit aligned on a lattice site. The linear ME effect is predicted in a long-range ordered state of spins and electric dipoles, as well as in a disordered state. One key to the ME effect is a hidden ferroic order of the spin-charge composite object. We provide a new guiding principle for the ME effect in materials without transition-metal elements, which may lead to flexible and lightweight multifunctional materials.
Project description:Alternative ('repeat') determinations of organic crystal structures deposited in the Cambridge Structural Database are analysed to characterise the nature and magnitude of the differences between structure solutions obtained by diffraction methods. Of the 3132 structure pairs considered, over 20% exhibited local structural differences exceeding 0.25 Å. In most cases (about 83%), structural optimisation using density functional theory (DFT) resolved the differences. Many of the cases where distinct and chemically significant structural differences remained after optimisation involved differently positioned hydroxyl groups, with obvious implications for the correct description of hydrogen bonding. 1H and 13C chemical shifts from solid-state NMR experiments are proposed as an independent methodology in cases where DFT optimisation fails to resolve discrepancies.
Project description:Case-control studies are particularly susceptible to differential exposure misclassification when exposure status is determined following incident case status. Probabilistic bias analysis methods have been developed as ways to adjust standard effect estimates based on the sensitivity and specificity of exposure misclassification. The iterative sampling method advocated in probabilistic bias analysis bears a distinct resemblance to a Bayesian adjustment; however, it is not identical. Furthermore, without a formal theoretical framework (Bayesian or frequentist), the results of a probabilistic bias analysis remain somewhat difficult to interpret. We describe, both theoretically and empirically, the extent to which probabilistic bias analysis can be viewed as approximately Bayesian. Although the differences between probabilistic bias analysis and Bayesian approaches to misclassification can be substantial, these situations often involve unrealistic prior specifications and are relatively easy to detect. Outside of these special cases, probabilistic bias analysis and Bayesian approaches to exposure misclassification in case-control studies appear to perform equally well.
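The iterative sampling idea behind probabilistic bias analysis can be sketched as follows: draw sensitivity and specificity values from assumed distributions, back-correct the observed 2x2 table, and collect the resulting odds ratios. This is a simplified illustration rather than the procedure evaluated in the paper. The uniform ranges, counts, and function names are hypothetical, and the additional random-error step used in full analyses is omitted. The correction uses the standard identity observed exposed = Se * A + (1 - Sp) * (N - A), solved for the true exposed count A.

```python
import random
import statistics

def corrected_counts(exposed, unexposed, se, sp):
    """Back-calculate true exposed/unexposed counts in one group."""
    n = exposed + unexposed
    true_exposed = (exposed - (1.0 - sp) * n) / (se + sp - 1.0)
    return true_exposed, n - true_exposed

def corrected_odds_ratio(a, b, c, d, se_case, sp_case, se_ctrl, sp_ctrl):
    """a, b: exposed/unexposed cases; c, d: exposed/unexposed controls.
    Returns None when the correction yields an impossible table."""
    A, B = corrected_counts(a, b, se_case, sp_case)
    C, D = corrected_counts(c, d, se_ctrl, sp_ctrl)
    if min(A, B, C, D) <= 0:
        return None
    return (A * D) / (B * C)

def probabilistic_bias_analysis(a, b, c, d, n_draws=1000, seed=0):
    """Sample differential Se/Sp (hypothetical uniform ranges) and correct."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        or_adj = corrected_odds_ratio(
            a, b, c, d,
            se_case=rng.uniform(0.80, 0.95), sp_case=rng.uniform(0.85, 0.98),
            se_ctrl=rng.uniform(0.80, 0.95), sp_ctrl=rng.uniform(0.85, 0.98),
        )
        if or_adj is not None:
            results.append(or_adj)
    return results

# Hypothetical observed table: 40/60 cases and 20/80 controls exposed/unexposed.
draws = probabilistic_bias_analysis(40, 60, 20, 80)
median_or = statistics.median(draws)
```

A Bayesian treatment would instead place priors on the same sensitivity and specificity parameters and condition on the observed table, which is the comparison the abstract draws.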