
Dataset Information


Intermediate divergence levels maximize the strength of structure-sequence correlations in enzymes and viral proteins.


ABSTRACT: Structural properties such as solvent accessibility and contact number predict site-specific sequence variability in many proteins. However, the strength and significance of these structure-sequence relationships vary widely among different proteins, with absolute correlation strengths ranging from 0 to 0.8. In particular, two recent works have made contradictory observations. Yeh et al. (Mol. Biol. Evol. 31:135-139, 2014) found that both relative solvent accessibility (RSA) and weighted contact number (WCN) are good predictors of sitewise evolutionary rate in enzymes, with WCN clearly out-performing RSA. Shahmoradi et al. (J. Mol. Evol. 79:130-142, 2014) considered these same predictors (as well as others) in viral proteins and found much weaker correlations and no clear advantage of WCN over RSA. Because these two studies had substantial methodological differences, however, a direct comparison of their results is not possible. Here, we reanalyze the datasets of the two studies with one uniform analysis pipeline, and we find that many apparent discrepancies between the two analyses can be attributed to the extent of sequence divergence in individual alignments. Specifically, the alignments of the enzyme dataset are much more diverged than those of the virus dataset, and proteins with higher divergence exhibit, on average, stronger structure-sequence correlations. However, the highest structure-sequence correlations are observed at intermediate divergence levels, where both highly conserved and highly variable sites are present in the same alignment.
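As a rough illustration of the kind of structure-sequence correlation the abstract describes, the sketch below computes a weighted contact number per site and rank-correlates it with per-site rates. It is not the authors' pipeline. The WCN formula used here (the sum of inverse squared distances to all other residues), the choice of Spearman rank correlation, and the toy coordinates and rates are all assumptions made for illustration; the record above does not specify them.

```python
import math

def weighted_contact_number(coords):
    """Assumed definition: WCN_i = sum over j != i of 1 / r_ij^2.

    coords is a list of (x, y, z) tuples, one per residue.
    """
    wcn = []
    for i, a in enumerate(coords):
        total = 0.0
        for j, b in enumerate(coords):
            if i != j:
                # Squared Euclidean distance between residues i and j.
                total += 1.0 / sum((p - q) ** 2 for p, q in zip(a, b))
        wcn.append(total)
    return wcn

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data: five residues on a line and made-up site rates.
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
              (3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
toy_rates = [2.0, 1.0, 0.5, 1.1, 1.9]

toy_wcn = weighted_contact_number(toy_coords)
rho = spearman(toy_wcn, toy_rates)
print("WCN per site:", toy_wcn)
print("Spearman rho (WCN vs. rate):", rho)
```

In this toy case the most densely packed central site has the lowest rate, so the correlation is negative. The same two functions could be applied to RSA values in place of WCN to compare the two predictors on one alignment.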

SUBMITTER: Jackson EL 

PROVIDER: S-EPMC4918415 | biostudies-literature | 2016 Jul

REPOSITORIES: biostudies-literature


Publications

Intermediate divergence levels maximize the strength of structure-sequence correlations in enzymes and viral proteins.

Eleisha L. Jackson, Amir Shahmoradi, Stephanie J. Spielman, Benjamin R. Jack, Claus O. Wilke

Protein Science: a publication of the Protein Society, 2016-03-24, issue 7



Similar Datasets

| S-EPMC5812974 | biostudies-literature
2017-08-15 | GSE101952 | GEO
| S-EPMC7668092 | biostudies-literature
| S-EPMC311106 | biostudies-literature
| S-EPMC6220371 | biostudies-literature
2021-11-25 | GSE189312 | GEO
| S-EPMC7127120 | biostudies-literature
| S-EPMC4764941 | biostudies-literature
| S-EPMC4578012 | biostudies-literature
| S-EPMC6326188 | biostudies-literature