Dataset Information

Correlated rigid modes in protein families.


ABSTRACT: A great deal of evolutionarily conserved information is contained in genomes and proteins. Enormous effort has been put into understanding protein structure and developing computational tools for protein folding, and many sophisticated approaches take structure and sequence homology into account. Several groups have applied statistical physics approaches to extracting information about proteins from sequences alone. Here, we develop a new method for sequence analysis based on first principles, in information theory, in statistical physics and in Bayesian analysis. We provide a complete derivation of our approach and we apply it to a variety of systems, to demonstrate its utility and its limitations. We show in some examples that phylogenetic alignments of amino-acid sequences of families of proteins imply the existence of a small number of modes that appear to be associated with correlated global variation. These modes are uncovered efficiently in our approach by computing a non-perturbative effective potential directly from the alignment. We show that this effective potential approaches a limiting form inversely with the logarithm of the number of sequences. Mapping symbol entropy flows along modes to underlying physical structures shows that these modes arise due to correlated compensatory adjustments. In the protein examples, these occur around functional binding pockets.

SUBMITTER: Striegel DA 

PROVIDER: S-EPMC6278828 | biostudies-literature | 2016 Apr

REPOSITORIES: biostudies-literature

Publications

Correlated rigid modes in protein families.

Striegel DA, Wojtowicz D, Przytycka TM, Periwal V

Physical Biology, 2016 Apr 11, issue 2


Similar Datasets

| S-EPMC5013651 | biostudies-literature
| S-EPMC7484347 | biostudies-literature
| S-EPMC3415594 | biostudies-literature
| S-EPMC7183188 | biostudies-literature
| S-EPMC6774630 | biostudies-literature
| S-EPMC5359973 | biostudies-literature
| S-EPMC2327282 | biostudies-literature
| S-EPMC3849209 | biostudies-literature
| S-EPMC2817486 | biostudies-literature
| S-EPMC10214187 | biostudies-literature