Dataset Information

Modeling sequence and function similarity between proteins for protein functional annotation.


ABSTRACT: A common task in biological research is to predict function for proteins by comparing sequences between proteins of known and unknown function. This is often done using pair-wise sequence alignment algorithms (e.g. BLAST). A problem with this approach is the assumption of a simple equivalence between a minimum sequence similarity threshold and the function similarity between proteins. This assumption is based on the binary concept of homology, in which proteins either are or are not homologous. The relationship between sequence and function, however, is more complex, and it is pertinent to predicting protein function, e.g. when evaluating BLAST alignments or developing training sets for profile models based on functional rather than homologous groupings. Our motivation for this study was to model sequence and function similarity between proteins to gain insights into the sequence-function similarity relationship between proteins for predicting function. Using our model, we found that function similarity generally increases with sequence similarity but with a high degree of variability. This result has implications for pair-wise approaches in that it appears sequence similarity must be very high to ensure high function similarity. Profile models, which enable higher sensitivity, are a potential solution. However, multiple sequence alignments (a necessary prerequisite) are a problem in that current algorithms have difficulty aligning sequences with very low sequence similarity, which is common in our data set, or are intractable for high numbers of sequences. Given the importance of predicting protein function and the need for multiple sequence alignments, algorithms for accomplishing this task should be further refined and developed.

SUBMITTER: Higdon R 

PROVIDER: S-EPMC4120521 | biostudies-literature | 2010

REPOSITORIES: biostudies-literature

Publications

Modeling sequence and function similarity between proteins for protein functional annotation.

Roger Higdon, Brenton Louie, Eugene Kolker

Proceedings of the ... International Symposium on High Performance Distributed Computing, 2010-01-01


Similar Datasets

| S-EPMC3949165 | biostudies-literature
| S-EPMC1949826 | biostudies-literature
| S-EPMC10759460 | biostudies-literature
| S-EPMC4653902 | biostudies-literature
| S-EPMC8855713 | biostudies-literature
| S-EPMC2760442 | biostudies-literature
| S-EPMC4061105 | biostudies-literature
| S-EPMC6361244 | biostudies-literature
| S-EPMC2242527 | biostudies-literature
| S-EPMC2931517 | biostudies-literature