
Dataset Information

Constructing sequence-dependent protein models using coevolutionary information.


ABSTRACT: Recent developments in global statistical methodologies have advanced the analysis of large collections of protein sequences for coevolutionary information. Coevolution between amino acids in a protein arises from compensatory mutations that are needed to maintain the stability or function of a protein over the course of evolution. This gives rise to quantifiable correlations between amino acid sites within the multiple sequence alignment of a protein family. Here, we use the maximum entropy-based approach called mean field Direct Coupling Analysis (mfDCA) to infer a Potts model Hamiltonian governing the correlated mutations in a protein family. We use the inferred pairwise statistical couplings to generate the sequence-dependent heterogeneous interaction energies of a structure-based model (SBM) where only native contacts are considered. Considering the ribosomal S6 protein and its circular permutants as well as the SH3 protein, we demonstrate that these models quantitatively agree with experimental data on folding mechanisms. This work serves as a new framework for generating coevolutionary data-enriched models that can potentially be used to engineer key functional motions and novel interactions in protein systems.
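The mean-field step named in the abstract can be illustrated with a toy sketch. In the standard mean-field approximation, the Potts couplings are estimated as the negative inverse of the connected correlation matrix built from single-site and pairwise amino acid frequencies. The code below is an assumption-laden illustration, not the authors' implementation: the function names `invert` and `mean_field_couplings`, the pseudocount value, and the two-letter alphabet are all hypothetical, sequence reweighting is omitted, and pure Python is used so the example has no dependencies.

```python
def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (adequate for small toy matrices)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; try a larger pseudocount")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def mean_field_couplings(msa, alphabet, pseudocount=0.5):
    """Return {(i, a, j, b): J} for site pairs i < j (hypothetical helper).

    Assumes all sequences have equal length. The last alphabet symbol is
    dropped as a gauge choice so the correlation matrix is invertible.
    """
    n_seq, length, q = len(msa), len(msa[0]), len(alphabet)
    index = [(i, a) for i in range(length) for a in alphabet[:-1]]

    def f_single(i, a):
        # Single-site frequency, regularized toward a uniform distribution.
        count = sum(1 for s in msa if s[i] == a)
        return (1 - pseudocount) * count / n_seq + pseudocount / q

    def f_pair(i, a, j, b):
        # Pairwise frequency; the same-site case reduces to the single-site one.
        if i == j:
            return f_single(i, a) if a == b else 0.0
        count = sum(1 for s in msa if s[i] == a and s[j] == b)
        return (1 - pseudocount) * count / n_seq + pseudocount / q ** 2

    # Connected correlations: C_ij(a, b) = f_ij(a, b) - f_i(a) * f_j(b).
    cov = [[f_pair(i, a, j, b) - f_single(i, a) * f_single(j, b)
            for (j, b) in index] for (i, a) in index]
    inv = invert(cov)

    # Mean-field estimate: J_ij(a, b) = -(C^-1)_ij(a, b).
    return {(i, a, j, b): -inv[r][c]
            for r, (i, a) in enumerate(index)
            for c, (j, b) in enumerate(index) if i < j}


# Toy alignment: sites 0 and 1 always mutate together, site 2 is independent.
toy_msa = ["AAA", "AAB", "BBA", "BBB"]
couplings = mean_field_couplings(toy_msa, "AB")
```

In this toy alignment the coupling between the covarying sites 0 and 1 comes out positive, while the coupling between sites 0 and 2 vanishes. A real analysis would use the full amino acid alphabet plus a gap symbol, reweight similar sequences, and rely on a numerical library for the matrix inversion.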

SUBMITTER: Cheng RR 

PROVIDER: S-EPMC4815312 | biostudies-literature | 2016 Jan

REPOSITORIES: biostudies-literature

Publications

Constructing sequence-dependent protein models using coevolutionary information.

Cheng RR, Raghunathan M, Noel JK, Onuchic JN

Protein Science: a publication of the Protein Society, 2015-08-10, issue 1


Similar Datasets

S-EPMC4119721 | biostudies-other
S-EPMC6992422 | biostudies-literature
S-EPMC3014950 | biostudies-literature
S-EPMC3918776 | biostudies-literature
S-EPMC4151759 | biostudies-literature
S-EPMC3322362 | biostudies-literature
S-EPMC3695837 | biostudies-other
S-EPMC2868838 | biostudies-literature
S-EPMC1636350 | biostudies-literature
S-EPMC5100047 | biostudies-literature