
Dataset Information

Predicting Protein Functions Based on Differential Co-expression and Neighborhood Analysis.


ABSTRACT: Proteins are polypeptides essential to biological processes. Physical protein interactions are complemented by other types of functional relationship data, including genetic interactions, co-expression, and evolutionary pathways. Existing algorithms integrate protein interaction and gene expression data to retrieve context-specific subnetworks composed of genes/proteins with known and unknown functions. However, most protein function prediction algorithms fail to exploit the diverse intrinsic information in the feature and label spaces. We develop a novel integrative method for protein function prediction, named CNPFP, based on differential Co-expression analysis and a Neighbor-voting algorithm. The method integrates heterogeneous data and exploits intrinsic and latent linkages via a global iterative approach and genomic features. CNPFP performs three tasks: clustering, differential co-expression analysis, and protein function prediction. Our aim is to identify yeast cell cycle-specific proteins linked to differentially expressed proteins in the protein-protein interaction network. To capture intrinsic information, CNPFP selects the most relevant feature subset using a global iterative neighbor-voting algorithm. We identify eight condition-specific modules. The most relevant subnetwork has 87 genes and is highly enriched with cyclin-dependent kinases, a family of protein kinases involved in cell cycle regulation. We present comprehensive annotations for 3538 Saccharomyces cerevisiae proteins. Our method achieves an AUROC of 0.9862, an accuracy of 0.9710, and an F-score of 0.9691. These results indicate that exploiting the intrinsic nature of protein relationships improves the quality of function prediction. Thus, the proposed method is useful in functional genomics studies.
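
The neighbor-voting idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the CNPFP implementation: it shows only the basic, single-pass form of neighbor voting, in which an unannotated protein is scored by the fraction of its annotated interaction partners that carry each function. The function name `neighbor_vote`, the toy network, and the protein and function labels are all hypothetical, and the global iterative step and feature selection described in the abstract are not modeled.

```python
from collections import Counter


def neighbor_vote(ppi, annotations, protein, top_k=1):
    """Rank candidate functions for `protein` by neighbor voting.

    ppi:         dict mapping a protein to a list of its interaction partners
    annotations: dict mapping a protein to a set of known function labels
    Returns up to `top_k` (function, score) pairs, where score is the
    fraction of annotated neighbors that carry the function.
    """
    votes = Counter()
    annotated = 0
    for neighbor in ppi.get(protein, ()):
        functions = annotations.get(neighbor)
        if not functions:
            continue  # unannotated neighbors cast no vote
        annotated += 1
        votes.update(set(functions))
    if annotated == 0:
        return []  # no evidence available for this protein
    # Sort by vote count (descending), then by label for a stable order.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(function, count / annotated) for function, count in ranked[:top_k]]


# Hypothetical toy network: P1 is unannotated, its three partners are annotated.
ppi = {"P1": ["P2", "P3", "P4"], "P2": ["P1"], "P3": ["P1"], "P4": ["P1"]}
annotations = {
    "P2": {"cell cycle"},
    "P3": {"cell cycle", "kinase activity"},
    "P4": {"transport"},
}

prediction = neighbor_vote(ppi, annotations, "P1", top_k=2)
print(prediction)
```

In this toy case, "cell cycle" ranks first because two of the three annotated neighbors carry it. An iterative variant would feed confident predictions back into `annotations` and repeat the vote, which is one plausible reading of the "global iterative" wording but is an assumption here.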

SUBMITTER: Wekesa JS 

PROVIDER: S-EPMC8030663 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC4887825 | biostudies-literature
| S-EPMC4449974 | biostudies-literature
| 2431957 | ecrin-mdr-crc
| S-EPMC5751780 | biostudies-literature
| S-EPMC5342040 | biostudies-literature
| S-EPMC2212665 | biostudies-literature
| S-EPMC7561188 | biostudies-literature
| S-EPMC6375203 | biostudies-literature
| S-EPMC3695838 | biostudies-literature
| S-EPMC5482204 | biostudies-literature