
Dataset Information

Integrated protein function prediction by mining function associations, sequences, and protein-protein and gene-gene interaction networks.


ABSTRACT:

MOTIVATIONS: Protein function prediction is an important and challenging problem in bioinformatics and computational biology. Functionally relevant biological information such as protein sequences, gene expression, and protein-protein interactions has mostly been used separately for protein function prediction. One of the major challenges is how to effectively integrate multiple sources of both traditional and new information, such as spatial gene-gene interaction networks generated from chromosomal conformation data, to improve protein function prediction.

RESULTS: In this work, we developed three different probabilistic scores (the MIS, SEQ, and NET scores) to combine protein sequence, function associations, and protein-protein interaction and spatial gene-gene interaction networks for protein function prediction. The MIS score is generated mainly from homologous proteins found by PSI-BLAST search and from association rules between Gene Ontology terms, which are learned by mining the Swiss-Prot database. The SEQ score is generated from protein sequences. The NET score is generated from protein-protein interaction and spatial gene-gene interaction networks. These three scores were combined in a new Statistical Multiple Integrative Scoring System (SMISS) to predict protein function. We tested SMISS on the data set of the 2011 Critical Assessment of Function Annotation (CAFA). According to the maximum F-measure, the method performed substantially better than three baseline methods and an advanced method based on protein profile-sequence comparison, profile-profile comparison, and domain co-occurrence networks.
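To make the evaluation idea concrete, here is a minimal Python sketch of combining three per-term scores and computing a protein-centric maximum F-measure in the style commonly used for CAFA. The abstract does not say how SMISS combines the MIS, SEQ, and NET scores, so the weighted average in `combine_scores` is purely an illustrative assumption. The function names, the equal default weights, and the threshold grid are also hypothetical and do not come from the paper.

```python
from typing import Dict, Set, Tuple

TermScores = Dict[str, float]          # GO term -> score in [0, 1]
Predictions = Dict[str, TermScores]    # protein id -> term scores


def combine_scores(
    mis: TermScores,
    seq: TermScores,
    net: TermScores,
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> TermScores:
    """Weighted average of three scores per GO term.

    ASSUMPTION: this combination rule is illustrative only; it is not
    the rule used by SMISS. Terms missing from a source score as 0.
    """
    w_mis, w_seq, w_net = weights
    terms = set(mis) | set(seq) | set(net)
    return {
        t: w_mis * mis.get(t, 0.0) + w_seq * seq.get(t, 0.0) + w_net * net.get(t, 0.0)
        for t in terms
    }


def f_max(predictions: Predictions, truth: Dict[str, Set[str]], steps: int = 100) -> float:
    """Protein-centric maximum F-measure over a grid of score thresholds.

    Precision is averaged over proteins with at least one prediction at
    the threshold; recall is averaged over all proteins in `truth`.
    Every protein in `truth` is assumed to have at least one true term.
    """
    best = 0.0
    for i in range(1, steps + 1):
        threshold = i / steps
        precisions = []
        recall_sum = 0.0
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, s in scored.items() if s >= threshold}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue  # no protein has any prediction at this threshold
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

A toy call such as `f_max({"P1": combine_scores(mis, seq, net)}, {"P1": {"GO:0000001"}})` returns a value between 0 and 1, with higher values meaning a better trade-off between precision and recall at the best threshold.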

SUBMITTER: Cao R 

PROVIDER: S-EPMC4894840 | biostudies-literature | 2016 Jan

REPOSITORIES: biostudies-literature

Publications

Integrated protein function prediction by mining function associations, sequences, and protein-protein and gene-gene interaction networks.

Cao Renzhi, Cheng Jianlin

Methods (San Diego, Calif.), 2015-09-11

Similar Datasets

| S-EPMC4392065 | biostudies-literature
| S-EPMC4074043 | biostudies-literature
| S-EPMC5753374 | biostudies-literature
| S-EPMC2311305 | biostudies-literature
| S-EPMC3524085 | biostudies-other
| S-EPMC4391834 | biostudies-literature
| S-EPMC3086830 | biostudies-literature
| S-EPMC4489281 | biostudies-literature
| S-EPMC3966033 | biostudies-literature
| S-EPMC6650051 | biostudies-literature