Dataset Information

(φ,ψ)₂ motifs: a purely conformation-based fine-grained enumeration of protein parts at the two-residue level.

ABSTRACT: A deep understanding of protein structure benefits from the use of a variety of classification strategies that enhance our ability to effectively describe local patterns of conformation. Here, we use a clustering algorithm to analyze 76,533 all-trans segments from protein structures solved at 1.2 Å resolution or better to create a purely φ,ψ-based comprehensive empirical categorization of common conformations adopted by two adjacent φ,ψ pairs (i.e., (φ,ψ)₂ motifs). The clustering algorithm works in an origin-shifted four-dimensional space based on the two φ,ψ pairs to yield a parameter-dependent list of (φ,ψ)₂ motifs, in order of their prominence. The results are remarkably distinct from and complementary to the standard hydrogen-bond-centered view of secondary structure. New insights include an unprecedented level of precision in describing the φ,ψ angles of both previously known and novel motifs, ordering of these motifs by their population density, a data-driven recommendation that the standard Cα(i)…Cα(i+3) < 7 Å criterion for defining turns be changed to 6.5 Å, identification of β-strand and turn capping motifs, and identification of conformational capping by residues in polypeptide II conformation. We further document that the conformational preferences of a residue are substantially influenced by the conformation of its neighbors, and we suggest that accounting for these dependencies will improve protein modeling accuracy. Although the CUEVAS-4D(r₁₀ε₁₄) 'parts list' presented here is only an initial exploration of the complex (φ,ψ)₂ landscape of proteins, it shows that there is value to be had from this approach, and it opens the door to more in-depth characterizations at the (φ,ψ)₂ level and at higher dimensions.
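As a rough illustration of two ideas in the abstract, the sketch below measures a distance between two (φ,ψ)₂ points in four-dimensional dihedral space and applies a Cα(i)…Cα(i+3) distance cutoff for turns. It is not the CUEVAS algorithm. The function names are hypothetical, and handling the ±180° boundary by periodic wrapping is an assumption of this sketch: the abstract says only that the space is "origin-shifted".

```python
import math


def wrap_deg(delta):
    """Wrap an angular difference in degrees into the interval [-180, 180)."""
    return (delta + 180.0) % 360.0 - 180.0


def motif_distance(a, b):
    """Euclidean distance between two (phi1, psi1, phi2, psi2) points.

    Each angle is treated as periodic (an assumption of this sketch),
    so 175 and -175 degrees are 10 degrees apart, not 350.
    """
    return math.sqrt(sum(wrap_deg(x - y) ** 2 for x, y in zip(a, b)))


def is_turn(ca_i, ca_i3, cutoff=6.5):
    """True if the C-alpha(i)...C-alpha(i+3) distance is below `cutoff` (in Å).

    The default of 6.5 Å is the value recommended in the abstract;
    pass cutoff=7.0 for the older criterion it mentions.
    """
    return math.dist(ca_i, ca_i3) < cutoff
```

With these definitions, a pair of Cα atoms 6.8 Å apart would count as a turn under the 7 Å cutoff but not under the recommended 6.5 Å cutoff.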

SUBMITTER: Hollingsworth SA

PROVIDER: S-EPMC3268948 | biostudies-literature | 2012 Feb

REPOSITORIES: biostudies-literature

Publications

(φ,ψ)₂ motifs: a purely conformation-based fine-grained enumeration of protein parts at the two-residue level.

Hollingsworth SA, Lewis MC, Berkholz DS, Wong WK, Karplus PA

Journal of Molecular Biology, 2011 Dec 16, issue 1

PMID: 22198294

Similar Datasets

Project description: We present an automated method for modeling backbones of protein loops. The method samples a database of φ(i+1) and ψ(i) angles constructed from a nonredundant version of the Protein Data Bank (PDB). The dihedral angles φ(i+1) and ψ(i) completely define the backbone conformation of a dimer when standard bond lengths, bond angles, and a trans planar peptide configuration are used. For the 400 possible dimers resulting from 20 natural amino acids, a list of allowed φ(i+1), ψ(i) pairs for each dimer is created by pooling all such pairs from the loop segments of each protein in the nonredundant version of the PDB. Starting from the N-terminus of the loop sequence, conformations are generated by assigning randomly selected pairs of φ(i+1), ψ(i) for each dimer from the respective pool using standard bond lengths, bond angles, and a trans peptide configuration. We use this database to simulate protein loops of lengths varying from 5 to 11 amino acids in five proteins of known three-dimensional structures. Typically, 10,000-50,000 models are simulated for each protein loop and are evaluated for stereochemical consistency. Depending on the length and sequence of a given loop, 50-80% of the models generated have no stereochemical strain in the backbone atoms. We demonstrate that, when simulated loops are extended to include flanking residues from homologous segments, only very few loops from an ensemble of sterically allowed conformations orient the flanking segments consistent with the protein topology. The presence of near-native backbone conformations for loops from five different proteins suggests the completeness of the dimeric database for use in modeling loops of homologous proteins. Here, we take advantage of this observation to design a method that filters near-native loop conformations from an ensemble of sterically allowed conformations. We demonstrate that our method eliminates the need for a loop-closure algorithm and hence allows for the use of topological constraints of the homologous proteins or disulfide constraints to filter near-native loop conformations.
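The sampling step described above can be sketched as follows, under stated assumptions: the function name, the pool layout (a dictionary keyed by two-letter dimer), and the angle values are all hypothetical placeholders, not data from the described database. Building coordinates from the sampled angles and checking stereochemistry are not shown.

```python
import random


def sample_loop_dihedrals(sequence, pools, rng):
    """Draw one (phi_next, psi_current) pair per consecutive dimer.

    Walks the loop sequence from the N-terminus and, for each pair of
    adjacent residues, picks a random entry from that dimer's pool.
    """
    angles = []
    for i in range(len(sequence) - 1):
        dimer = sequence[i:i + 2]
        angles.append(rng.choice(pools[dimer]))
    return angles


# Placeholder pools with made-up angle values, for illustration only.
toy_pools = {
    "GA": [(-60.0, -45.0), (-120.0, 130.0)],
    "AS": [(-65.0, -40.0), (-75.0, 150.0)],
}

# One candidate loop model for the three-residue sequence G-A-S.
model = sample_loop_dihedrals("GAS", toy_pools, random.Random(0))
```

Repeating the call many times with the same pools would give an ensemble of candidate models, analogous to the 10,000-50,000 models per loop mentioned in the description.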
