Dataset Information

Data-driven information retrieval in heterogeneous collections of transcriptomics data links SIM2s to malignant pleural mesothelioma.


ABSTRACT:

Motivation

Genome-wide measurement of transcript levels is a ubiquitous tool in biomedical research. As experimental data continue to be deposited in public databases, it is becoming important to develop search engines that retrieve relevant studies given a query study. While retrieval systems based on metadata already exist, data-driven approaches that retrieve studies based on similarities in the expression data itself have greater potential to uncover novel biological insights.

Results

We propose an information retrieval method based on differential expression. Our method deals with arbitrary experimental designs and performs competitively with alternative approaches, while making the search results interpretable in terms of differential expression patterns. We show that our model yields meaningful connections between biological conditions from different studies. Finally, we validate a previously unknown connection between malignant pleural mesothelioma and SIM2s suggested by our method, via real-time polymerase chain reaction in an independent set of mesothelioma samples.

Availability

Supplementary data and source code are available from http://www.ebi.ac.uk/fg/research/rex.

SUBMITTER: Caldas J 

PROVIDER: S-EPMC3259436 | biostudies-literature | 2012 Jan

REPOSITORIES: biostudies-literature

Publications

Data-driven information retrieval in heterogeneous collections of transcriptomics data links SIM2s to malignant pleural mesothelioma.

José Caldas, Nils Gehlenborg, Eeva Kettunen, Ali Faisal, Mikko Rönty, Andrew G. Nicholson, Sakari Knuutila, Alvis Brazma, Samuel Kaski

Bioinformatics (Oxford, England), 2011-11-20, issue 2


Similar Datasets

2005-06-01 | GSE2549 | GEO
| S-EPMC7047444 | biostudies-literature
| S-EPMC8836658 | biostudies-literature
| S-EPMC6032160 | biostudies-literature
| S-EPMC6198256 | biostudies-literature
| S-EPMC8200040 | biostudies-literature
| S-EPMC5661117 | biostudies-literature
| S-EPMC5394041 | biostudies-literature
| S-EPMC8360519 | biostudies-literature
| S-EPMC5504117 | biostudies-literature