
Dataset Information


ProLoc-GO: utilizing informative Gene Ontology terms for sequence-based prediction of protein subcellular localization.


ABSTRACT:

BACKGROUND: Gene Ontology (GO) annotation, which describes the function of genes and gene products across species, has recently been used to predict protein subcellular and subnuclear localization. Existing GO-based prediction methods for protein subcellular localization use the known accession numbers of query proteins to obtain their annotated GO terms. An accurate prediction method for predicting subcellular localization of novel proteins without known accession numbers, using only the input sequence, is worth developing.

RESULTS: This study proposes an efficient sequence-based method (named ProLoc-GO) by mining informative GO terms for predicting protein subcellular localization. For each protein, BLAST is used to obtain a homology with a known accession number to the protein for retrieving the GO annotation. A large number n of all annotated GO terms that have ever appeared are then obtained from a large set of training proteins. A novel genetic algorithm based method (named GOmining) combined with a classifier of support vector machine (SVM) is proposed to simultaneously identify a small number m out of the n GO terms as input features to SVM, where m <
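The feature representation described in the abstract (n GO terms collected from the training proteins, of which m are kept as SVM inputs) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the protein names, the helper functions `build_vocabulary` and `encode`, and the hand-picked subset of terms are hypothetical. In the paper the subset is chosen by the GOmining genetic algorithm together with an SVM, which is not reproduced here.

```python
from typing import Dict, List, Set


def build_vocabulary(annotations: Dict[str, Set[str]]) -> List[str]:
    """Collect every GO term seen in the training set (the n terms), sorted."""
    return sorted(set().union(*annotations.values()))


def encode(go_terms: Set[str], selected: List[str]) -> List[int]:
    """Binary vector: 1 if the protein carries the selected GO term, else 0."""
    return [1 if term in go_terms else 0 for term in selected]


# Hypothetical training annotations (protein name -> GO terms of its homolog).
train = {
    "protA": {"GO:0005634", "GO:0003677"},
    "protB": {"GO:0005739"},
    "protC": {"GO:0005737", "GO:0005634"},
}

vocabulary = build_vocabulary(train)  # all n terms seen in training

# Stand-in for the feature-selection step: a hand-picked subset of m terms.
# This is NOT the GOmining algorithm, only a placeholder for its output.
selected = ["GO:0005634", "GO:0005739"]

features = {name: encode(terms, selected) for name, terms in train.items()}
```

Each resulting vector has length m and could be passed to any classifier; the choice of a binary encoding is itself an assumption of this sketch.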

SUBMITTER: Huang WL 

PROVIDER: S-EPMC2262056 | biostudies-literature | 2008

REPOSITORIES: biostudies-literature


Publications

ProLoc-GO: utilizing informative Gene Ontology terms for sequence-based prediction of protein subcellular localization.

Huang Wen-Lin, Tung Chun-Wei, Ho Shih-Wen, Hwang Shiow-Fen, Ho Shinn-Ying

BMC Bioinformatics, 2008-02-01



Similar Datasets

| S-EPMC2745392 | biostudies-literature
| S-EPMC2652875 | biostudies-literature
| S-EPMC4049835 | biostudies-literature
| S-EPMC3852282 | biostudies-literature
| S-EPMC2882390 | biostudies-literature
| S-EPMC2719631 | biostudies-literature
| S-EPMC3052763 | biostudies-literature
| S-EPMC6219567 | biostudies-literature
| S-EPMC3661659 | biostudies-literature
| S-EPMC9300714 | biostudies-literature