Dataset Information

Sequence comparison by sequence harmony identifies subtype-specific functional sites.


ABSTRACT: Multiple sequence alignments are often used to reveal functionally important residues within a protein family. They can be particularly useful for identifying key residues that determine functional differences between protein subfamilies. We present a new entropy-based method, Sequence Harmony (SH), that accurately detects subfamily-specific positions from a multiple sequence alignment. The SH algorithm implements a novel formula that scores compositional differences between subfamilies in a simple manner and on an intuitive scale, without imposing conservation. We compare our method with the most important published methods, i.e. AMAS, TreeDet and SDP-pred, using three well-studied protein families: the receptor-binding domain (MH2) of the Smad family of transcription factors, the Ras superfamily of small GTPases and the MIP family of integral membrane transporters. We demonstrate that SH accurately selects known functional sites with higher coverage than the other methods for these test cases. This shows that compositional differences between protein subfamilies provide a sufficient basis for the identification of functional sites. In addition, SH selects a number of sites of unknown function that could be interesting candidates for further experimental investigation.
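The abstract does not state the SH formula, so the sketch below is only an illustration of the general idea of scoring compositional differences between two subfamilies at one alignment column with an entropy-based measure. It uses one minus the Jensen-Shannon divergence as a stand-in; that choice, the function names (`column_frequencies`, `shannon_entropy`, `overlap_score`) and the gap handling are assumptions for illustration, not the authors' published method.

```python
import math
from collections import Counter


def column_frequencies(residues):
    """Relative residue frequencies for one alignment column; gaps ('-') are skipped."""
    counts = Counter(r for r in residues if r != "-")
    total = sum(counts.values())
    return {r: n / total for r, n in counts.items()} if total else {}


def shannon_entropy(freqs):
    """Shannon entropy in bits of a frequency dictionary."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def overlap_score(col_a, col_b):
    """Illustrative stand-in score (assumption, not the published SH formula).

    Returns 1 - Jensen-Shannon divergence (base 2) between the residue
    compositions of two subfamilies at one column:
    0 means no residue types are shared, 1 means identical compositions.
    Returns None if either subfamily has only gaps at this column.
    """
    pa = column_frequencies(col_a)
    pb = column_frequencies(col_b)
    if not pa or not pb:
        return None
    # Equal-weight mixture of the two compositions.
    mix = {r: 0.5 * (pa.get(r, 0.0) + pb.get(r, 0.0)) for r in set(pa) | set(pb)}
    jsd = shannon_entropy(mix) - 0.5 * (shannon_entropy(pa) + shannon_entropy(pb))
    return 1.0 - jsd


# Toy columns (made-up residues, one character per sequence in each subfamily).
disjoint_variable = overlap_score("ACAC", "GTGT")  # variable within groups, nothing shared
identical = overlap_score("ACAC", "ACAC")          # same composition in both groups
```

In this toy setup a column scores low whenever the two groups share no residue types, even if neither group is conserved internally, which mirrors the abstract's point that compositional difference is scored without imposing conservation.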

SUBMITTER: Pirovano W 

PROVIDER: S-EPMC1702503 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC1933219 | biostudies-literature
| S-EPMC2896201 | biostudies-literature
| S-EPMC5559341 | biostudies-literature
| S-EPMC1280257 | biostudies-literature
| S-EPMC6952364 | biostudies-literature
| S-EPMC5494361 | biostudies-literature
| S-EPMC4104576 | biostudies-literature
| S-EPMC10891092 | biostudies-literature
| S-EPMC4917084 | biostudies-literature
| S-EPMC9851297 | biostudies-literature