Dataset Information

Large-scale protein function prediction using heterogeneous ensembles.


ABSTRACT: Heterogeneous ensembles are an effective approach in scenarios where the ideal data type and/or individual predictor are unclear for a given problem. These ensembles have shown promise for protein function prediction (PFP), but their ability to improve PFP at a large scale is unclear. The overall goal of this study is to critically assess this ability of a variety of heterogeneous ensemble methods across a multitude of functional terms, proteins and organisms. Our results show that these methods, especially Stacking using Logistic Regression, indeed produce more accurate predictions for a variety of Gene Ontology terms differing in size and specificity. To enable the application of these methods to other related problems, we have publicly shared the HPC-enabled code underlying this work as LargeGOPred (https://github.com/GauravPandeyLab/LargeGOPred).

SUBMITTER: Wang L 

PROVIDER: S-EPMC6221071 | biostudies-literature | 2018

REPOSITORIES: biostudies-literature

Publications

Large-scale protein function prediction using heterogeneous ensembles.

Linhua Wang, Jeffrey Law, Shiv D. Kale, T. M. Murali, Gaurav Pandey

F1000Research, 2018-09-28



Similar Datasets

S-EPMC3584181 | biostudies-literature
S-EPMC4718788 | biostudies-literature
S-EPMC6602452 | biostudies-literature
S-EPMC8294856 | biostudies-literature
S-EPMC2373757 | biostudies-literature
S-EPMC2845582 | biostudies-literature
S-EPMC1828618 | biostudies-literature
S-EPMC10680702 | biostudies-literature
S-EPMC6605767 | biostudies-literature
S-EPMC5111120 | biostudies-literature