
Dataset Information


Integrating phenotype and gene expression data for predicting gene function.


ABSTRACT:

BACKGROUND: This paper presents a framework for integrating disparate data sets to predict gene function. The algorithm constructs a graph, called an integrated similarity graph, by computing similarities based upon both gene expression and textual phenotype data. This integrated graph is then used to make predictions about whether individual genes should be assigned a particular annotation from the Gene Ontology.

RESULTS: A combined graph was generated from publicly-available gene expression data and phenotypic information from Saccharomyces cerevisiae. This graph was used to assign annotations to genes, as were graphs constructed from gene expression data and textual phenotype information alone. While the F-measure appeared similar for all three methods, annotations based upon the integrated similarity graph exhibited a better overall precision than gene expression or phenotype information alone can generate. The integrated approach was also able to assign almost as many annotations as the gene expression method alone, and generated significantly more total and correct assignments than the phenotype information could provide.

CONCLUSION: These results suggest that augmenting standard gene expression data sets with publicly-available textual phenotype data can help generate more precise functional annotation predictions while mitigating the weaknesses of a standard textual phenotype approach.
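The idea of an integrated similarity graph can be illustrated with a minimal sketch. The abstract does not state which similarity measures, combination rule, or prediction rule the authors used, so everything below is an assumption made for illustration: Pearson correlation for expression profiles, bag-of-words cosine similarity for phenotype text, a weighted sum controlled by a hypothetical `alpha`, a hypothetical edge `threshold`, and a weighted neighbor vote with a hypothetical `cutoff`. The gene names, profiles, and phenotype strings are toy values, not data from the paper.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def bag_of_words(text):
    """Lowercased whitespace tokens mapped to their counts."""
    counts = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0) + 1
    return counts


def cosine(a, b):
    """Cosine similarity of two term-count dictionaries."""
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def integrated_graph(expression, phenotypes, alpha=0.5, threshold=0.5):
    """Build {gene: {neighbor: weight}} from both data sources.

    Assumed combination rule (not from the paper):
    weight = alpha * expression similarity + (1 - alpha) * phenotype similarity,
    with negative correlations clipped to 0 and missing data counted as 0.
    """
    genes = sorted(set(expression) | set(phenotypes))
    bags = {g: bag_of_words(t) for g, t in phenotypes.items()}
    graph = {g: {} for g in genes}
    for g, h in combinations(genes, 2):
        expr_sim = 0.0
        if g in expression and h in expression:
            expr_sim = max(0.0, pearson(expression[g], expression[h]))
        pheno_sim = 0.0
        if g in bags and h in bags:
            pheno_sim = cosine(bags[g], bags[h])
        weight = alpha * expr_sim + (1 - alpha) * pheno_sim
        if weight >= threshold:
            graph[g][h] = weight
            graph[h][g] = weight
    return graph


def predict_terms(graph, annotations, gene, cutoff=0.5):
    """Assign a term when the weighted share of annotated neighbors >= cutoff."""
    neighbors = graph.get(gene, {})
    total = sum(neighbors.values())
    if total == 0:
        return set()
    scores = {}
    for neighbor, weight in neighbors.items():
        for term in annotations.get(neighbor, ()):
            scores[term] = scores.get(term, 0.0) + weight
    return {t for t, s in scores.items() if s / total >= cutoff}


# Toy inputs (hypothetical genes and values, for illustration only).
expression = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]}
phenotypes = {
    "g1": "slow growth heat sensitive",
    "g2": "slow growth heat sensitive",
    "g3": "abnormal vacuole morphology",
}
known = {"g1": {"GO:0006950"}}  # toy annotation label

graph = integrated_graph(expression, phenotypes)
predicted_g2 = predict_terms(graph, known, "g2")
predicted_g3 = predict_terms(graph, known, "g3")
```

In this toy setup, raising `alpha` leans the graph toward expression similarity and lowering it leans toward phenotype text, which is one simple way to explore the trade-off between coverage and precision that the results describe.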

SUBMITTER: Malone BM 

PROVIDER: S-EPMC3226192 | biostudies-literature | 2009

REPOSITORIES: biostudies-literature


Publications

Integrating phenotype and gene expression data for predicting gene function.

Malone, Brandon M; Perkins, Andy D; Bridges, Susan M

BMC Bioinformatics, 2009-10-08



Similar Datasets

| S-EPMC5192994 | biostudies-literature
| S-EPMC3338016 | biostudies-literature
| S-EPMC9280023 | biostudies-literature
| S-EPMC6397893 | biostudies-literature
| S-EPMC6361788 | biostudies-literature
| S-EPMC4827848 | biostudies-literature
| S-EPMC2768986 | biostudies-literature
| S-EPMC6138643 | biostudies-literature
| S-EPMC9938619 | biostudies-literature
| S-EPMC6445151 | biostudies-literature