Dataset Information

Protein sequence randomness and sequence/structure correlations.


ABSTRACT: We investigated protein sequence/structure correlation by constructing a space of protein sequences, based on methods developed previously for constructing a space of protein structures. The space is constructed by using a representation of the amino acids as vectors of 10 property factors that encode almost all of their physical properties. Each sequence is represented by a distribution of overlapping sequence fragments. A distance between any two sequences can be calculated. By attaching a weight to each factor, intersequence distances can be varied. We optimize the correlation between corresponding distances in the sequence and structure spaces. The optimal correlation between the sequence and structure spaces is significantly better than that which results from correlating randomly generated sequences, having the overall composition of the data base, with the structure space. However, sets of randomly generated sequences, each of which approximates the composition of the real sequence it replaces, produce correlations with the structure space that are as good as that observed for the actual protein sequences. A connection is proposed with previous studies of the protein folding code. It is shown that the most important property factors for the correlation of the sequence and structure spaces are related to helix/bend preference, side chain bulk, and beta-structure preference.
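The pipeline described in the abstract (property-factor vectors, overlapping fragments, weighted intersequence distances, correlation with structure-space distances) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the factor values are seeded random placeholders rather than the published property factors, the fragment length `k=4` is arbitrary, each fragment distribution is collapsed to its mean vector, the distance is a weighted Euclidean distance, and the correlation is a plain Pearson coefficient. All function names are hypothetical.

```python
import math
import random
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_FACTORS = 10

# PLACEHOLDER values: seeded random numbers, NOT the published property factors.
_rng = random.Random(0)
FACTORS = {aa: [_rng.gauss(0.0, 1.0) for _ in range(N_FACTORS)]
           for aa in AMINO_ACIDS}


def fragment_vectors(seq, k=4):
    """Return one averaged factor vector per overlapping length-k fragment."""
    frags = []
    for i in range(len(seq) - k + 1):
        window = [FACTORS[aa] for aa in seq[i:i + k]]
        frags.append([sum(col) / k for col in zip(*window)])
    return frags


def sequence_summary(seq, k=4):
    """Collapse the fragment distribution to its mean vector (a simplification)."""
    frags = fragment_vectors(seq, k)
    return [sum(col) / len(frags) for col in zip(*frags)]


def weighted_distance(seq_a, seq_b, weights, k=4):
    """Weighted Euclidean distance between two sequence summaries."""
    a = sequence_summary(seq_a, k)
    b = sequence_summary(seq_b, k)
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def sequence_structure_correlation(seqs, structure_distances, weights, k=4):
    """Correlate pairwise sequence distances with given structure distances.

    structure_distances maps index pairs (i, j), i < j, to a distance
    computed elsewhere; it is a hypothetical input for this sketch.
    """
    pairs = list(combinations(range(len(seqs)), 2))
    seq_d = [weighted_distance(seqs[i], seqs[j], weights, k) for i, j in pairs]
    struct_d = [structure_distances[p] for p in pairs]
    return pearson(seq_d, struct_d)
```

In this sketch, optimizing the correlation would amount to searching over `weights` for the vector that maximizes the value returned by `sequence_structure_correlation`. Shuffling each sequence with `random.sample(seq, len(seq))` preserves its composition exactly, so it could serve as a rough stand-in for the composition-matched random control mentioned in the abstract.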

SUBMITTER: Rahman RS 

PROVIDER: S-EPMC1282047 | biostudies-other | 1995 Apr

REPOSITORIES: biostudies-other

Publications

Protein sequence randomness and sequence/structure correlations.

Rahman RS, Rackovsky S

Biophysical Journal, 1995-04-01, issue 4


Similar Datasets

| S-EPMC4880282 | biostudies-literature
| S-EPMC2527968 | biostudies-literature
| S-EPMC6863600 | biostudies-literature
| S-EPMC8348890 | biostudies-literature
| S-EPMC7986665 | biostudies-literature
| S-EPMC1995226 | biostudies-literature
| S-EPMC4918415 | biostudies-literature
| S-EPMC4319528 | biostudies-literature
| S-EPMC3795352 | biostudies-literature
| S-EPMC4570531 | biostudies-literature