
Dataset Information


Protein domain recurrence and order can enhance prediction of protein functions.


ABSTRACT: Burgeoning sequencing technologies have generated massive amounts of genomic and proteomic data. Annotating the functions of proteins identified in this data has become a big and crucial problem. Various computational methods have been developed to infer the protein functions based on either the sequences or domains of proteins. The existing methods, however, ignore the recurrence and the order of the protein domains in this function inference. We developed two new methods to infer protein functions based on protein domain recurrence and domain order. Our first method, DRDO, calculates the posterior probability of the Gene Ontology terms based on domain recurrence and domain order information, whereas our second method, DRDO-NB, relies on the naïve Bayes methodology using the same domain architecture information. Our large-scale benchmark comparisons show strong improvements in the accuracy of the protein function inference achieved by our new methods, demonstrating that domain recurrence and order can provide important information for inference of protein functions. The new models are provided as open source programs at http://sfb.kaust.edu.sa/Pages/Software.aspx. Contact: dkihara@cs.purdue.edu, xin.gao@kaust.edu.sa. Supplementary data are available at Bioinformatics Online.

SUBMITTER: Messih MA 

PROVIDER: S-EPMC3436825 | biostudies-literature | 2012 Sep

REPOSITORIES: biostudies-literature


Publications

Protein domain recurrence and order can enhance prediction of protein functions.

Mario Abdel Messih, Meghana Chitale, Vladimir B Bajic, Daisuke Kihara, Xin Gao

Bioinformatics (Oxford, England), 2012-09-01, issue 18



Similar Datasets

| S-EPMC2657131 | biostudies-literature
2002-12-08 | GSE88 | GEO
| S-EPMC7479119 | biostudies-literature
| S-EPMC4481839 | biostudies-literature
| S-EPMC3584934 | biostudies-literature
| S-EPMC6720845 | biostudies-literature
| S-EPMC3057503 | biostudies-literature
| S-EPMC3532072 | biostudies-literature
| S-EPMC10917077 | biostudies-literature
| S-EPMC7665627 | biostudies-literature