
Dataset Information


On the origin and highly likely completeness of single-domain protein structures.


ABSTRACT: The size and origin of the protein fold universe are of fundamental and practical importance. Randomly generated, compact sticky homopolypeptide conformations constructed in generic simplified and all-atom protein models all have similar folds in the library of solved structures, the Protein Data Bank, and, conversely, all compact, single-domain protein structures in the Protein Data Bank have structural analogues in the compact model set. Thus, both sets are highly likely complete, with the protein fold universe arising from compact conformations of hydrogen-bonded secondary structures. Because side chains are represented by their Cβ atoms, these results also suggest that the observed protein folds are insensitive to the details of side-chain packing. Sequence specificity enters both in fine-tuning the structure and in thermodynamically stabilizing a given fold with respect to the set of alternatives. When the models are scanned against a three-dimensional active-site library, close geometric matches are frequently found. Thus, the presence of active-site-like geometries also seems to be a consequence of the packing of compact secondary structural elements. These results have significant implications for the evolution of protein structure and function.

SUBMITTER: Zhang Y 

PROVIDER: S-EPMC1413790 | biostudies-literature | 2006 Feb

REPOSITORIES: biostudies-literature


Publications

On the origin and highly likely completeness of single-domain protein structures.

Yang Zhang, Isaac A. Hubner, Adrian K. Arakaki, Eugene Shakhnovich, Jeffrey Skolnick

Proceedings of the National Academy of Sciences of the United States of America, 2006 Feb 14, issue 8



Similar Datasets

| S-EPMC3351587 | biostudies-other
| S-EPMC1538855 | biostudies-literature
| S-EPMC3610613 | biostudies-literature
| S-EPMC8223206 | biostudies-literature
| S-EPMC4699654 | biostudies-literature
| S-EPMC6802138 | biostudies-literature
| S-EPMC3107222 | biostudies-literature
| S-EPMC1635331 | biostudies-literature
| S-EPMC8831596 | biostudies-literature
| S-EPMC4367253 | biostudies-literature