Project description:Rosetta is one of the prime tools for high resolution protein structure refinement. While its scoring function can distinguish native-like from non-native-like conformations in many cases, the method is limited by conformational sampling for larger proteins: the search algorithm may get stuck in a local energy minimum that it cannot leave. Here, we test the hypothesis that iteration of Rosetta with an orthogonal sampling and scoring strategy might facilitate exploration of conformational space. Specifically, we run short molecular dynamics (MD) simulations on models created by de novo folding of large proteins into cryoEM density maps to enable sampling of conformational space not directly accessible to Rosetta and thus provide an escape route from the conformational traps. We present a combined MD-Rosetta protein structure refinement protocol that can overcome some of these sampling limitations. Two of four benchmark proteins showed incremental improvement through all three rounds of the iterative refinement protocol. Molecular dynamics is most efficient in applying subtle but important rearrangements within secondary structure elements and is thus highly complementary to the Rosetta refinement, which focuses on side chains and loop regions.
Project description:Nearly all the macromolecular three-dimensional structures deposited in the Protein Data Bank were determined by either crystallographic (X-ray) or nuclear magnetic resonance (NMR) spectroscopic methods. This paper reports a systematic comparison of the crystallographic and NMR results deposited in the Protein Data Bank, in order to find out to what extent this information can be aggregated in bioinformatics. A non-redundant data set containing 109 NMR - X-ray structure pairs of nearly identical proteins was derived from the Protein Data Bank. A series of comparisons was performed, focusing on both global features and local details. It was observed that: (1) the RMSD values between NMR and crystal structures range from about 1.5 Å to about 2.5 Å; (2) the correlation between conformational deviations and residue type reveals that hydrophobic amino acids are more similar in crystal and NMR structures than hydrophilic amino acids; (3) the correlation between solvent accessibility of the residues and their conformational variability in solid state and in solution is relatively modest (correlation coefficient = 0.462); (4) beta strands on average match better between NMR and crystal structures than helices and loops; (5) conformational differences between loops are independent of crystal packing interactions in the solid state; (6) side chains buried in the protein interior are only very seldom observed to adopt different orientations in the solid state and in solution.
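The RMSD values quoted above measure the average displacement between paired atoms of two structures. A minimal sketch of that calculation follows. It assumes the two coordinate sets are already superimposed and paired by index; the function name `rmsd` and the sample coordinates are illustrative and do not come from the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumption: both sets are already superimposed and atoms are paired
    by index. No optimal alignment is performed here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Accumulate the squared distance between each pair of atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates in Å, for illustration only.
xray_atoms = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
nmr_atoms = [(0.0, 2.0, 0.0), (3.8, 0.0, 2.0)]
print(rmsd(xray_atoms, nmr_atoms))
```

In practice the two structures would first be optimally superimposed, and for an NMR ensemble one would have to choose which model, or which average, to compare against the crystal structure.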
Project description:About 5% of the disulfide bonds (DBs) observed in the Protein Data Bank bridge two protein chains. Several of their features were comprehensively analyzed, resulting in a structural atlas of the intermolecular DBs. The analysis was performed on a very large set of data extracted from the Protein Data Bank, according to the RaSPDB procedure. It was observed that the two chains tend to have different sequences and belong to the same structural class. Intermolecular DBs tend to be more solvent accessible and less distorted from the most stable conformation than intramolecular DBs, while showing similar B-factors. They tend to occur in beta strands and in mainly-beta structures. These and other data should prove useful in protein modelling and design.
Project description:Crystals of many important biological macromolecules diffract to limited resolution, rendering accurate model building and refinement difficult and time-consuming. We present a torsional optimization protocol that is applicable to many such situations and combines Protein Data Bank-based torsional optimization with real-space refinement against the electron density derived from crystallography or cryo-electron microscopy. Our method converts moderate- to low-resolution structures at initial (e.g., backbone trace only) or late stages of refinement to structures with increased numbers of hydrogen bonds, improved crystallographic R-factors, and superior backbone geometry. This automated method is applicable to DNA-binding and membrane proteins of any size and will aid studies of structural biology by improving model quality and saving considerable effort. The method can be extended to improve NMR and other structures. Our backbone score and its sequence profile provide an additional standard tool for evaluating structural quality.
Project description:Membrane proteins are challenging to study and restraints for structure determination are typically sparse or of low resolution because the membrane environment that surrounds them leads to a variety of experimental challenges. When membrane protein structures are determined by different techniques in different environments, a natural question is "which structure is most biologically relevant?" Towards answering this question, we compiled a dataset of membrane proteins with known structures determined by both solution NMR and X-ray crystallography. By investigating differences between the structures, we found that RMSDs between crystal and NMR structures are below 5 Å in the membrane region, NMR ensembles have a higher convergence in the membrane region, crystal structures typically have a straighter transmembrane region, have higher stereo-chemical correctness, and are more tightly packed. After quantifying these differences, we used high-resolution refinement of the NMR structures to mitigate them, which paves the way for identifying and improving the structural quality of membrane proteins.
Project description:Since the ratio of the number of observations to adjustable parameters is small at low resolution, it is necessary to use complementary information for the analysis of such data. ProSMART is a program that can generate restraints for macromolecules using homologous structures, as well as generic restraints for the stabilization of secondary structures. These restraints are used by REFMAC5 to stabilize the refinement of an atomic model. However, the optimal refinement protocol varies from case to case, and it is not always obvious how to select appropriate homologous structure(s), or other sources of prior information, for restraint generation. After running extensive tests on a large data set of low-resolution models, the best-performing refinement protocols and strategies for the selection of homologous structures have been identified. These strategies and protocols have been implemented in the Low-Resolution Structure Refinement (LORESTR) pipeline. The pipeline performs auto-detection of twinning and selects the optimal scaling method and solvent parameters. LORESTR can either use user-supplied homologous structures, or run an automated BLAST search and download homologues from the PDB. The pipeline executes multiple model-refinement instances using different parameters in order to find the best protocol. Tests show that the automated pipeline improves R factors, geometry and Ramachandran statistics for 94% of the low-resolution cases from the PDB included in the test set.
Project description:The quality of X-ray crystallographic models for biomacromolecules refined from data obtained at high-resolution is assured by the data itself. However, at low-resolution, >3.0 Å, additional information is supplied by a forcefield coupled with an associated refinement protocol. These resulting structures are often of lower quality and thus unsuitable for downstream activities like structure-based drug discovery. An X-ray crystallography refinement protocol that enhances standard methodology by incorporating energy terms from the HINT (Hydropathic INTeractions) empirical forcefield is described. This protocol was tested by refining synthetic low-resolution structural data derived from 25 diverse high-resolution structures, and referencing the resulting models to these structures. The models were also evaluated with global structural quality metrics, e.g., Ramachandran score and MolProbity clashscore. Three additional structures, for which only low-resolution data are available, were also re-refined with this methodology. The enhanced refinement protocol is most beneficial for reflection data at resolutions of 3.0 Å or worse. At the low-resolution limit, ≥4.0 Å, the new protocol generated models with Cα positions that have RMSDs that are 0.18 Å more similar to the reference high-resolution structure, Ramachandran scores improved by 13%, and clashscores improved by 51%, all in comparison to models generated with the standard refinement protocol. The hydropathic forcefield terms are at least as effective as Coulombic electrostatic terms in maintaining polar interaction networks, and significantly more effective in maintaining hydrophobic networks, as synthetic resolution is decremented. Even at resolutions ≥4.0 Å, these latter networks are generally native-like, as measured with a hydropathic interactions scoring tool.
Project description:All-atom models are essential for many applications in molecular modeling and computational chemistry. Nonbonded atomic contacts much closer than the sum of the van der Waals radii of the two atoms (clashes) are commonly observed in such models derived from protein crystal structures. A set of 94 recently deposited protein structures in the resolution range 1.5-2.8 Å were analyzed for clashes by the addition of all H atoms to the models followed by optimization and energy minimization of the positions of just these H atoms. The results were compared with the same set of structures after automated all-atom refinement with PrimeX and with nonbonded contacts in protein crystal structures at a resolution equal to or better than 0.9 Å. The additional PrimeX refinement produced structures with reasonable summary geometric statistics and similar R(free) values to the original structures. The frequency of clashes at less than 0.8 times the sum of van der Waals radii was reduced over fourfold compared with that found in the original structures, to a level approaching that found in the ultrahigh-resolution structures. Moreover, severe clashes at less than or equal to 0.7 times the sum of atomic radii were reduced 15-fold. All-atom refinement with PrimeX produced improved crystal structure models with respect to nonbonded contacts and yielded changes in structural details that dramatically impacted on the interpretation of some protein-ligand interactions.
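The clash thresholds described above (less than 0.8 times, and less than or equal to 0.7 times, the sum of the van der Waals radii) can be sketched as a simple classifier. This is an illustration under stated assumptions, not the PrimeX procedure. The radii are commonly used Bondi-style values rather than values taken from the study, the name `classify_contact` is hypothetical, and the caller is assumed to pass only nonbonded atom pairs.

```python
import math

# Illustrative van der Waals radii in Å (Bondi-style values).
# Assumption: these are not the radii used in the study.
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def classify_contact(elem_a, xyz_a, elem_b, xyz_b):
    """Label a nonbonded atom pair by distance relative to the radii sum.

    Returns "severe clash" at <= 0.7 times the sum of the radii,
    "clash" at < 0.8 times the sum, and "ok" otherwise.
    """
    distance = math.dist(xyz_a, xyz_b)
    ratio = distance / (VDW_RADII[elem_a] + VDW_RADII[elem_b])
    if ratio <= 0.7:
        return "severe clash"
    if ratio < 0.8:
        return "clash"
    return "ok"


# Made-up carbon-oxygen pairs at 2.0, 2.5 and 3.0 Å separation.
for separation in (2.0, 2.5, 3.0):
    label = classify_contact("C", (0.0, 0.0, 0.0), "O", (separation, 0.0, 0.0))
    print(separation, label)
```

A real analysis would also need to exclude covalently bonded and hydrogen-bonded pairs before applying these thresholds.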
Project description:The Worldwide PDB recently launched a deposition, biocuration, and validation tool: OneDep. At various stages of OneDep data processing, validation reports for three-dimensional structures of biological macromolecules are produced. These reports are based on recommendations of expert task forces representing crystallography, nuclear magnetic resonance, and cryoelectron microscopy communities. The reports provide useful metrics with which depositors can evaluate the quality of the experimental data, the structural model, and the fit between them. The validation module is also available as a stand-alone web server and as a programmatically accessible web service. A growing number of journals require the official wwPDB validation reports (produced at biocuration) to accompany manuscripts describing macromolecular structures. Upon public release of the structure, the validation report becomes part of the public PDB archive. Geometric quality scores for proteins in the PDB archive have improved over the past decade.
Project description:The structural study of icosahedral viruses has a long and impactful history in both crystallographic methodology and molecular biology. The evolution of the Protein Data Bank has paralleled and supported these studies providing readily accessible formats dealing with novel features associated with viral particle symmetries and subunit interactions. This overview describes the growth in size and complexity of icosahedral viruses from the first early studies of small RNA plant viruses and human picornaviruses up to the larger and more complex bacterial phage, insect, and human disease viruses such as Zika, hepatitis B, Adeno and Polyoma virus. The analysis of icosahedral viral capsid protein domain folds has shown striking similarities, with the beta jelly roll motif observed across multiple evolutionarily divergent species. The icosahedral symmetry of viruses drove the development of noncrystallographic symmetry averaging as a powerful phasing method, and the constraints of maintaining this symmetry resulted in the concept of quasi-equivalence in viral structures. Symmetry also played an important early role in demonstrating the power of cryo-electron microscopy as an alternative to crystallography in generating atomic resolution structures of these viruses. The Protein Data Bank has been a critical resource for assembling and disseminating these structures to a wide community, and the virus particle explorer (VIPER) was developed to enable users to easily generate and view complete viral capsid structures from their asymmetric building blocks. Finally, we share a personal perspective on the early use of computer graphics to communicate the intricacies, interactions, and beauty of these virus structures.