Dataset Information

Annotating proteins with generalized functional linkages.


ABSTRACT: As genome sequencing outstrips the rate of high-quality, low-throughput biochemical and genetic experimentation, accurate annotation of protein function becomes a bottleneck in the progress of the biomolecular sciences. Most gene products are now annotated by homology, in which an experimentally determined function is applied to a similar sequence. This procedure becomes error-prone between more divergent sequences and can contaminate biomolecular databases. Here, we propose a computational method of assignment of function, termed Generalized Functional Linkages (GFL), that combines nonhomology-based methods with other types of data. Functional linkages describe pairwise relationships between proteins that work together to perform a biological task. GFL provides a Bayesian framework that improves annotation by arbitrating a competition among biological process annotations to best describe the target protein. GFL addresses the unequal strengths of functional linkages among proteins, the quality of existing annotations, and the similarity among them while incorporating available knowledge about the cellular location or individual molecular function of the target protein. We demonstrate GFL with functional linkages defined by an algorithm known as zorch that quantifies connectivity in protein-protein interaction networks. Even when using proteins linked only by indirect or high-throughput interactions, GFL predicts the biological processes of many proteins in Saccharomyces cerevisiae, improving the accuracy of annotation by 20% over majority voting.
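
The contrast the abstract draws between majority voting and strength-aware annotation can be illustrated with a minimal sketch. This is not the GFL algorithm or the zorch algorithm, neither of which is specified on this page; it is a simplified weighted vote, assuming each linked protein contributes one candidate biological process and a linkage strength. The function names, the optional `prior` argument, and the example annotations and strengths are all hypothetical.

```python
from collections import defaultdict


def majority_vote(neighbors):
    """Baseline: pick the annotation held by the most linked proteins.

    neighbors: list of (annotation, linkage_strength) pairs.
    Linkage strength is deliberately ignored here.
    """
    counts = defaultdict(int)
    for annotation, _strength in neighbors:
        counts[annotation] += 1
    return max(counts, key=counts.get)


def weighted_vote(neighbors, prior=None):
    """Illustrative strength-aware vote (an assumption, not GFL itself).

    Each candidate annotation accumulates the strengths of the linkages
    that support it; an optional prior (hypothetical, e.g. derived from
    cellular location) rescales the scores before normalisation.
    Returns the winning annotation and the normalised scores.
    """
    scores = defaultdict(float)
    for annotation, strength in neighbors:
        scores[annotation] += strength
    if prior is not None:
        for annotation in scores:
            scores[annotation] *= prior.get(annotation, 1.0)
    total = sum(scores.values())
    normalised = {a: s / total for a, s in scores.items()}
    return max(normalised, key=normalised.get), normalised


# Hypothetical linked proteins: two weak links and one strong link.
example_neighbors = [
    ("ribosome biogenesis", 0.2),
    ("ribosome biogenesis", 0.1),
    ("DNA repair", 0.9),
]

baseline = majority_vote(example_neighbors)
winner, scores = weighted_vote(example_neighbors)
print(baseline, winner, scores)
```

In this toy input the two methods disagree: counting alone favours the annotation carried by two weak links, while summing strengths favours the single strong link. The Bayesian framework described in the abstract additionally accounts for annotation quality and similarity among annotations, which this sketch does not attempt.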

SUBMITTER: Llewellyn R 

PROVIDER: S-EPMC2584710 | biostudies-literature | 2008 Nov

REPOSITORIES: biostudies-literature

Publications

Annotating proteins with generalized functional linkages.

Richard Llewellyn, David S. Eisenberg

Proceedings of the National Academy of Sciences of the United States of America, 2008 Nov 12, issue 46


Similar Datasets

S-EPMC4338801 | biostudies-literature
S-EPMC3040500 | biostudies-literature
S-EPMC8280847 | biostudies-literature
S-EPMC3982924 | biostudies-literature
S-EPMC3692504 | biostudies-literature
S-EPMC8648855 | biostudies-literature
S-EPMC5471146 | biostudies-literature
S-EPMC1869010 | biostudies-literature
S-EPMC5771479 | biostudies-literature
PRJEB42571 | ENA