
Dataset Information


Extension of a local backbone description using a structural alphabet: a new approach to the sequence-structure relationship.


ABSTRACT: Protein Blocks (PBs) comprise a structural alphabet of 16 protein fragments, each 5 Cα long. They make it possible to approximate and correctly predict local protein three-dimensional (3D) structures. We have selected the 72 most frequent sequences of five PBs, which we call Structural Words (SWs). Analysis of four different protein data banks shows that SWs cover 92% of the amino acids in them and provide a good structural approximation for residues (i.e., sequences) 9 Cα long. We present most of them in a simple network that describes 90% of the overall residues and, interestingly, includes more than 80% of the amino acids present in coils. Analysis of the network shows the specificity and quality of the 3D descriptions as well as a new type of relation between local folds and amino acid distribution. The results show that the 3D structure of these protein data banks can be easily described by a combination of subgraphs included in the network. Finally, a Bayesian probabilistic approach improved the prediction rate by 4%.
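
The selection step described in the abstract (picking the most frequent sequences of five PBs and measuring how much of the data they cover) can be illustrated with a minimal sketch. It assumes each protein is already encoded as a string of PB letters, one letter per PB. The function names, the toy strings, and the coverage definition (a position counts as covered if it lies inside at least one selected word) are illustrative assumptions, not the authors' published procedure. Five overlapping PBs of 5 Cα each span 5 + 4 = 9 Cα, which is why a Structural Word approximates a 9-residue fragment.

```python
from collections import Counter

WORD_LEN = 5  # a Structural Word is a sequence of five PBs


def count_words(pb_sequences, word_len=WORD_LEN):
    """Count every overlapping window of `word_len` PBs across all sequences."""
    counts = Counter()
    for seq in pb_sequences:
        for i in range(len(seq) - word_len + 1):
            counts[seq[i:i + word_len]] += 1
    return counts


def most_frequent_words(pb_sequences, n_words, word_len=WORD_LEN):
    """Return the set of the `n_words` most frequent words."""
    counts = count_words(pb_sequences, word_len)
    return {word for word, _ in counts.most_common(n_words)}


def coverage(pb_sequences, words, word_len=WORD_LEN):
    """Fraction of PB positions inside at least one occurrence of a selected word.

    This coverage definition is an assumption made for illustration.
    """
    covered = 0
    total = 0
    for seq in pb_sequences:
        mask = [False] * len(seq)
        for i in range(len(seq) - word_len + 1):
            if seq[i:i + word_len] in words:
                for j in range(i, i + word_len):
                    mask[j] = True
        covered += sum(mask)
        total += len(seq)
    return covered / total if total else 0.0


if __name__ == "__main__":
    # Toy, made-up PB strings over the letters a-p (not data from the paper).
    toy = ["mmmmmmmnopac", "ddddfklmmmmm", "mmmmmnopacdd"]
    selected = most_frequent_words(toy, n_words=3)
    print(sorted(selected), round(coverage(toy, selected), 2))
```

On real data the same functions would be called with `n_words=72` to mirror the number of Structural Words reported in the abstract; the resulting coverage depends entirely on the data bank used.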

SUBMITTER: de Brevern AG 

PROVIDER: S-EPMC2373739 | biostudies-literature | 2002 Dec

REPOSITORIES: biostudies-literature


Publications

Extension of a local backbone description using a structural alphabet: a new approach to the sequence-structure relationship.

de Brevern AG, Valadié H, Hazout S, Etchebest C

Protein Science: a publication of the Protein Society. 2002 Dec 1; issue 12



Similar Datasets

| S-EPMC5697859 | biostudies-literature
| S-EPMC1538914 | biostudies-literature
| S-EPMC6033379 | biostudies-literature
| S-EPMC2315654 | biostudies-literature
| S-EPMC3722520 | biostudies-literature
| S-EPMC3841190 | biostudies-literature
| S-EPMC3144464 | biostudies-literature
| S-EPMC5768731 | biostudies-literature
| S-EPMC5953838 | biostudies-literature
| S-EPMC8136669 | biostudies-literature