Project description:Residual dipolar coupling (RDC) represents one of the most exciting emerging NMR techniques for protein structure studies. However, solving a protein structure using RDC data alone is still a highly challenging problem. We report here a computer program, RDC-PROSPECT, for protein structure prediction based on a structural homolog or analog of the target protein in the Protein Data Bank (PDB), which best aligns with the (15)N-(1)H RDC data of the protein recorded in a single ordering medium. Since RDC-PROSPECT uses only RDC data and predicted secondary structure information, its performance is virtually independent of sequence similarity between a target protein and its structural homolog/analog, making it applicable to protein targets beyond the scope of current protein threading techniques. We have tested RDC-PROSPECT on all (15)N-(1)H RDC data (representing 43 proteins) deposited in the BioMagResBank (BMRB) database. The program correctly identified structural folds for 83.7% of the target proteins, and achieved an average alignment accuracy of 98.1% of residues within a four-residue shift.
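One way to read the alignment accuracy quoted above is as the fraction of target residues whose template position falls within a given shift of a reference alignment. The sketch below illustrates that reading only; the function name `alignment_accuracy`, the dictionary representation of an alignment, and the toy numbers are assumptions for illustration and are not taken from RDC-PROSPECT.

```python
def alignment_accuracy(predicted, reference, max_shift=4):
    """Fraction of reference-aligned residues placed within max_shift positions.

    Both arguments map a target residue index to a template residue index.
    Residues missing from `predicted` are counted as misaligned (an assumption).
    """
    if not reference:
        raise ValueError("reference alignment is empty")
    hits = sum(
        1
        for residue, ref_pos in reference.items()
        if residue in predicted and abs(predicted[residue] - ref_pos) <= max_shift
    )
    return hits / len(reference)


# Hypothetical toy alignment: shifts of 0, 2, 8 and 4 residues.
reference = {1: 10, 2: 11, 3: 12, 4: 13}
predicted = {1: 10, 2: 13, 3: 20, 4: 9}
accuracy = alignment_accuracy(predicted, reference)  # 3 of 4 residues within 4
```

Setting `max_shift=0` turns the same function into a strict per-residue accuracy.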
Project description:While 3D structure determination of small (<15 kDa) proteins by solution NMR is largely automated and routine, structural analysis of larger proteins is more challenging. An emerging hybrid strategy for modeling protein structures combines sparse NMR data that can be obtained for larger proteins with sequence co-variation data, called evolutionary couplings (ECs), obtained from multiple sequence alignments of protein families. This hybrid "EC-NMR" method can be used to accurately model larger (15-60 kDa) proteins, and more rapidly determine structures of smaller (5-15 kDa) proteins using only backbone NMR data. The resulting structures have accuracies relative to reference structures comparable to those obtained with full backbone and sidechain NMR resonance assignments. The requirement that evolutionary couplings (ECs) are consistent with NMR data recorded on a specific member of a protein family, under specific conditions, potentially also allows identification of ECs that reflect alternative allosteric or excited states of the protein structure.
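To make "sequence co-variation" concrete, the sketch below scores a pair of alignment columns by their mutual information. This is a deliberately simple pairwise stand-in, not the statistical model used to infer ECs in EC-NMR; the function name `mutual_information` and the toy alignment are assumptions for illustration.

```python
import math
from collections import Counter


def mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    msa is a list of equal-length aligned sequences (strings).
    """
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)
    col_j = Counter(seq[j] for seq in msa)
    pairs = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = count * n / (col_i[a] * col_j[b])
        mi += p_ab * math.log2(ratio)
    return mi


# Hypothetical toy alignment: columns 0 and 1 vary together, column 2 is constant.
msa = ["AKLE", "AKLE", "GRLD", "GRLD"]
covarying = mutual_information(msa, 0, 1)
conserved = mutual_information(msa, 0, 2)
```

Columns that always change together score high, while a fully conserved column carries no covariation signal, which is why pairwise scores like this one are only a rough proxy for direct couplings.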
Project description:RNAs fold into distinct molecular conformations that are often essential for their functions. Accurate structure modeling of complex RNA motifs, including ubiquitous non-canonical base pairs and pseudoknots, remains a challenge. Here, we present an NMR-guided all-atom discrete molecular dynamics (DMD) platform, iFoldNMR, for rapid and accurate structure modeling of complex RNAs. We show that sparse distance constraints from imino resonances, which can be readily obtained from routine NMR experiments and are easier to compile than laborious assignments of non-solvent-exchangeable protons, are sufficient to direct a DMD search for low-energy RNA conformers. Benchmarking on a set of RNAs with complex folds spanning up to 56 nucleotides in length yields structural models that recapitulate experimentally determined structures with all-heavy-atom RMSDs ranging from 2.4 to 6.5 Å. This platform represents an efficient approach for high-throughput RNA structure modeling and will facilitate analysis of diverse, newly discovered functional RNAs.
Project description:The quality of protein structures determined by nuclear magnetic resonance (NMR) spectroscopy is contingent on the number and quality of experimentally derived resonance assignments, distance restraints, and angular restraints. Two key features of protein NMR data have posed challenges for the routine and automated structure determination of small to medium-sized proteins: (1) spectral resolution, especially of crowded nuclear Overhauser effect spectroscopy (NOESY) spectra, and (2) the reliance on a continuous network of weak scalar couplings as part of most common assignment protocols. In order to facilitate NMR structure determination, we developed a semi-automated strategy that uses non-uniform sampling (NUS) and multidimensional decomposition (MDD) for optimal data collection and processing of selected high-resolution multidimensional NMR experiments, combines these with an ABACUS protocol for sequential and side chain resonance assignments, and streamlines structure calculation in CYANA and refinement in CNS. Two graphical user interfaces (GUIs) were developed to facilitate efficient analysis and compilation of the data and to guide automated structure determination. This integrated method was implemented and refined on over 30 high-quality structures of proteins ranging from 5.5 to 16.5 kDa in size.
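Non-uniform sampling records only a subset of the increments in the indirect dimensions. The sketch below draws one such subset at random, with a sampling density that decays exponentially so that early, high-signal increments are favoured. It is a generic illustration and not the schedule generator used in the work described above; the function name `nus_schedule` and the parameters `decay` and `seed` are assumptions.

```python
import math
import random


def nus_schedule(n_total, n_sampled, decay=2.0, seed=0):
    """Pick n_sampled of n_total increments, favouring early increments.

    Increment 0 is always kept; the rest are drawn without replacement
    with probability proportional to exp(-decay * k / n_total).
    """
    if not 0 < n_sampled <= n_total:
        raise ValueError("need 0 < n_sampled <= n_total")
    rng = random.Random(seed)  # fixed seed makes the schedule reproducible
    weights = {k: math.exp(-decay * k / n_total) for k in range(1, n_total)}
    chosen = {0}
    while len(chosen) < n_sampled:
        candidates = list(weights)
        pick = rng.choices(candidates, weights=[weights[k] for k in candidates])[0]
        chosen.add(pick)
        del weights[pick]  # sample without replacement
    return sorted(chosen)


# Hypothetical example: keep 25% of 128 increments.
schedule = nus_schedule(128, 32)
```

A larger `decay` concentrates the sampled points near the start of the acquisition window; `decay=0` gives a uniform random schedule.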
Project description:Accurate determination of protein structure by NMR spectroscopy is challenging for larger proteins, for which experimental data are often incomplete and ambiguous. Evolutionary sequence information together with advances in maximum entropy statistical methods provide a rich complementary source of structural constraints. We have developed a hybrid approach (evolutionary coupling-NMR spectroscopy; EC-NMR) combining sparse NMR data with evolutionary residue-residue couplings and demonstrate accurate structure determination for several proteins 6-41 kDa in size.
Project description:Conventional NMR structure determination requires nearly complete assignment of the cross peaks of a refined NOESY peak list. Depending on the size of the protein and quality of the spectral data, this can be a time-consuming manual process requiring several rounds of peak list refinement and structure determination. Programs such as Aria, CYANA, and AutoStructure can generate models using unassigned NOESY data but are very sensitive to the quality of the input peak lists and can converge to inaccurate structures if the signal-to-noise ratio of the peak lists is low. Here, we show that models with high accuracy and reliability can be produced by combining the strengths of the high-resolution structure prediction program Rosetta with global measures of the agreement between structure models and experimental data. First-round models generated using CS-Rosetta (Rosetta supplemented with backbone chemical shift information) are filtered on the basis of their goodness-of-fit with unassigned NOESY peak lists using the DP-score, and the best-fitting models are subjected to high-resolution refinement with the Rosetta rebuild-and-refine protocol. This hybrid approach uses both local backbone chemical shift and the unassigned NOESY data to direct Rosetta trajectories toward the native structure and produces more accurate models than AutoStructure/CYANA or CS-Rosetta alone, particularly when using raw unedited NOESY peak lists. We also show that when accurate manually refined NOESY peak lists are available, Rosetta refinement can consistently increase the accuracy of models generated using CYANA and AutoStructure.
Project description:A solid-state NMR approach for simultaneous resonance assignment and three-dimensional structure determination of a membrane protein in lipid bilayers is described. The approach is based on the scattering, hence the descriptor "shotgun," of (15)N-labeled amino acids throughout the protein sequence (and the resulting NMR spectra). The samples are obtained by protein expression in bacteria grown on media in which one type of amino acid is labeled and the others are not. Shotgun NMR short-circuits the laborious and time-consuming process of obtaining complete sequential assignments prior to the calculation of a protein structure from the NMR data by taking advantage of the orientational information inherent to the spectra of aligned proteins. As a result, it is possible to simultaneously assign resonances and measure orientational restraints for structure determination. A total of five two-dimensional (1)H/(15)N PISEMA (polarization inversion spin exchange at the magic angle) spectra, from one uniformly and four selectively (15)N-labeled samples, were sufficient to determine the structure of the membrane-bound form of the 50-residue major pVIII coat protein of fd filamentous bacteriophage. Pisa (polarity index slant angle) wheels are an essential element in the process, which starts with the simultaneous assignment of resonances and the assembly of isolated polypeptide segments, and culminates in the complete three-dimensional structure of the protein with atomic resolution. The principles are also applicable to weakly aligned proteins studied by solution NMR spectroscopy. [The structure we determined for the membrane-bound form of the fd bacteriophage pVIII coat protein has been deposited in the Protein Data Bank as PDB file 1MZT.]
Project description:The J-UNIO (JCSG protocol using the software UNIO) procedure for automated protein structure determination by NMR in solution is introduced. In the present implementation, J-UNIO makes use of APSY-NMR spectroscopy, 3D heteronuclear-resolved [(1)H,(1)H]-NOESY experiments, and the software UNIO. Applications with proteins from the JCSG target list with sizes up to 150 residues showed that the procedure is highly robust and efficient. In all instances the correct polypeptide fold was obtained in the first round of automated data analysis and structure calculation. After interactive validation of the data obtained from the automated routine, the quality of the final structures was comparable to results from interactive structure determination. Special advantages are that the NMR data have been recorded with 6-10 days of instrument time per protein, that there is only a single step of chemical shift adjustments to relate the backbone signals in the APSY-NMR spectra with the corresponding backbone signals in the NOESY spectra, and that the NOE-based amino acid side chain chemical shift assignments are automatically focused on those residues that are heavily weighted in the structure calculation. The individual working steps of J-UNIO are illustrated with the structure determination of the protein YP_926445.1 from Shewanella amazonensis, and the results obtained with 17 JCSG targets are critically evaluated.
Project description:We introduce AUDANA (Automated Database-Assisted NOE Assignment), an algorithm for determining three-dimensional structures of proteins from NMR data that automates the assignment of 3D-NOE spectra, generates distance constraints, and conducts iterative high-temperature molecular dynamics and simulated annealing. The protein sequence, chemical shift assignments, and NOE spectra are the only required inputs. Distance constraints generated automatically from ambiguously assigned NOE peaks are validated during the structure calculation against information from an enlarged version of the freely available PACSY database that incorporates information on protein structures deposited in the Protein Data Bank (PDB). This approach yields robust sets of distance constraints and 3D structures. We evaluated the performance of AUDANA with input data for 14 proteins ranging in size from 6 to 25 kDa that had 27-98% sequence identity to proteins in the database. In all cases, the automatically calculated 3D structures passed stringent validation tests. Structures were determined with and without database support. In 9/14 cases, database support improved the agreement with manually determined structures in the PDB and in 11/14 cases, database support lowered the r.m.s.d. of the family of 20 structural models.
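The r.m.s.d. of a family of models is commonly reported as the average deviation of each model from the mean coordinates. The sketch below computes that quantity under two stated assumptions: the models have already been superimposed, and each model lists the same atoms in the same order. The function name `ensemble_rmsd` and the toy coordinates are illustrative and do not come from AUDANA, whose exact r.m.s.d. definition is not given in the abstract.

```python
import math


def ensemble_rmsd(models):
    """Mean RMSD of each model to the average coordinates of the family.

    models is a list of models; each model is a list of (x, y, z) tuples.
    No superposition is performed here (assumed done beforehand).
    """
    if not models or not models[0]:
        raise ValueError("need at least one model with at least one atom")
    n_models, n_atoms = len(models), len(models[0])
    if any(len(m) != n_atoms for m in models):
        raise ValueError("all models must contain the same atoms")
    # Average position of every atom across the family.
    mean = [
        tuple(sum(m[a][d] for m in models) / n_models for d in range(3))
        for a in range(n_atoms)
    ]
    rmsds = []
    for m in models:
        sq = sum((m[a][d] - mean[a][d]) ** 2 for a in range(n_atoms) for d in range(3))
        rmsds.append(math.sqrt(sq / n_atoms))
    return sum(rmsds) / n_models


# Hypothetical two-model, two-atom family.
family = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
]
spread = ensemble_rmsd(family)
```

A tighter family gives a lower value, but a low spread reflects precision only; agreement with a reference structure has to be checked separately.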