
Dataset Information


Cross-species queries of large gene expression databases.


ABSTRACT:

Motivation

Expression databases, including the Gene Expression Omnibus and ArrayExpress, have grown significantly over the past decade and now hold hundreds of thousands of arrays from multiple species. Since most drugs are initially tested on model organisms, the ability to compare expression experiments across species may help identify pathways that are activated in a similar way in humans and other organisms. However, while several methods exist for finding co-expressed genes in the same species as a query gene, looking at co-expression of homologs or arbitrary genes in other species is challenging. Unlike sequence, which is static, expression is dynamic and changes across tissues, conditions and time. Thus, to carry out cross-species analysis using these databases, we need methods that can match experiments in one species with experiments in another species.

Results

To facilitate queries in large databases, we developed a new method for comparing expression experiments from different species. We define a distance metric between the rankings of orthologous genes in the two species. We show how to solve an optimization problem for learning the parameters of this function using a training dataset of pairs of expression experiments known to be similar. The function we learn outperforms previous methods and simpler rank comparison methods that have been used in the past for single-species analysis. We used our method to compare millions of array pairs from mouse and human expression experiments. The resulting matches can be used to find functionally related genes, to hypothesize about biological response mechanisms and to highlight conditions and diseases that activate similar pathways in both species.
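
As a rough illustration of what comparing "the ranking of orthologous genes in the two species" can look like, the sketch below computes a plain Spearman rank distance between one human array and one mouse array. This is not the learned distance function described above; it is closer to the simpler rank comparison baselines the abstract mentions. The function names, the gene symbols and the assumption of a one-to-one ortholog list are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple


def rank_genes(expression: Dict[str, float]) -> Dict[str, int]:
    """Rank genes by expression value, 1 = most highly expressed.

    Ties are broken by gene name so the ranking is deterministic.
    """
    ordered = sorted(expression, key=lambda g: (-expression[g], g))
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def spearman_distance(
    human: Dict[str, float],
    mouse: Dict[str, float],
    orthologs: List[Tuple[str, str]],
) -> float:
    """Return 1 - Spearman's rho over orthologous genes (0 = same ranking).

    Assumes `orthologs` is a one-to-one list of (human_gene, mouse_gene)
    pairs; pairs missing from either array are skipped.
    """
    pairs = [(h, m) for h, m in orthologs if h in human and m in mouse]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two orthologous genes measured in both arrays")

    # Rank only the genes that have a measured ortholog in the other species.
    human_rank = rank_genes({h: human[h] for h, _ in pairs})
    mouse_rank = rank_genes({m: mouse[m] for _, m in pairs})

    squared_diff = sum((human_rank[h] - mouse_rank[m]) ** 2 for h, m in pairs)
    rho = 1.0 - 6.0 * squared_diff / (n * (n * n - 1))
    return 1.0 - rho


# Toy data: made-up expression values for four ortholog pairs.
orthologs = [("TP53", "Trp53"), ("MYC", "Myc"), ("GAPDH", "Gapdh"), ("ACTB", "Actb")]
human_array = {"TP53": 9.0, "MYC": 7.0, "GAPDH": 5.0, "ACTB": 1.0}
mouse_same_order = {"Trp53": 8.0, "Myc": 6.0, "Gapdh": 4.0, "Actb": 2.0}
mouse_reversed = {"Trp53": 1.0, "Myc": 2.0, "Gapdh": 3.0, "Actb": 4.0}

same = spearman_distance(human_array, mouse_same_order, orthologs)
opposite = spearman_distance(human_array, mouse_reversed, orthologs)
```

A fixed measure like this weights every rank position equally. The approach described in the abstract instead learns the parameters of the distance function from training pairs of experiments known to be similar.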

Availability

Supporting methods, results and a Matlab implementation are available from http://sb.cs.cmu.edu/ExpQ/.

SUBMITTER: Le HS 

PROVIDER: S-EPMC2944203 | biostudies-literature | 2010 Oct

REPOSITORIES: biostudies-literature


Publications

Cross-species queries of large gene expression databases.

Le Hai-Son, Oltvai Zoltán N, Bar-Joseph Ziv

Bioinformatics (Oxford, England), 2010-08-11, issue 19



Similar Datasets

| S-EPMC7375414 | biostudies-literature
| S-EPMC6836710 | biostudies-literature
| PRJNA146911 | ENA
| S-EPMC6153358 | biostudies-literature
| S-EPMC29756 | biostudies-literature
| S-EPMC2719781 | biostudies-literature
2011-12-31 | GSE32679 | GEO
| S-EPMC10643690 | biostudies-literature
| S-EPMC8628444 | biostudies-literature
2011-12-31 | E-GEOD-32679 | biostudies-arrayexpress