Dataset Information

Combining evidence of preferential gene-tissue relationships from multiple sources.

ABSTRACT: An important challenge in drug discovery and disease prognosis is to predict genes that are preferentially expressed in one or a few tissues, i.e., genes showing considerably higher expression in those tissues than in the others. Although several data sources and methods have been published explicitly for this purpose, they often disagree, and it is not evident how to retrieve these genes or how to distinguish true biological findings from those that are due to the choice of method and/or experimental settings. In this work we have developed a computational approach that combines results from multiple methods and datasets with the aim of eliminating method- and study-specific biases and improving the prediction of preferentially expressed human genes. A rule-based score is used to merge the results and assign support to them. Five sets of genes with known tissue specificity were used for parameter pruning and cross-validation. In total, we identify 3434 tissue-specific genes. We compare the highest-scoring genes with three public databases: PaGenBase (microarray), TiGER (EST), and HPA (protein expression data). The results show 85% overlap with PaGenBase, 71% with TiGER, and only 28% with HPA. Of our predictions, 99% have support from at least one of these databases. Our approach also performs better than any of the databases at identifying drug targets and biomarkers with known tissue specificity.
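The abstract does not spell out the scoring rules, so the following is only a minimal sketch of the general idea, assuming the simplest possible rule: a gene-tissue pair is kept when at least `min_support` sources report it. The function names, the `min_support` threshold, and the toy gene and method names are all hypothetical and are not taken from the paper. The second function shows one plausible way to compute an overlap percentage of the kind quoted against PaGenBase, TiGER, and HPA.

```python
from collections import defaultdict


def combine_evidence(sources, min_support=2):
    """Count how many sources report each (gene, tissue) pair and keep
    the pairs reported by at least `min_support` sources.

    This vote count is an assumed stand-in for the paper's rule-based score.
    """
    support = defaultdict(set)
    for name, pairs in sources.items():
        for pair in pairs:
            support[pair].add(name)
    return {
        pair: len(names)
        for pair, names in support.items()
        if len(names) >= min_support
    }


def overlap_fraction(predicted, reference):
    """Fraction of predicted pairs that also appear in the reference set."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(reference)) / len(predicted)


# Toy, made-up inputs for illustration; not data from the study.
sources = {
    "method_a": {("GENE1", "liver"), ("GENE2", "heart")},
    "method_b": {("GENE1", "liver"), ("GENE3", "brain")},
    "method_c": {("GENE1", "liver"), ("GENE2", "heart")},
}
scores = combine_evidence(sources, min_support=2)
reference = {("GENE1", "liver")}  # stands in for an external database

print(scores)
print(overlap_fraction(scores, reference))
```

A real rule-based score would presumably weight sources and methods differently; the plain vote count here only illustrates how agreement across sources can be turned into a support value.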

SUBMITTER: Guo J

PROVIDER: S-EPMC3741196 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature

Publications

Combining evidence of preferential gene-tissue relationships from multiple sources.

Guo J, Hammar M, Oberg L, Padmanabhuni SS, Bjäreland M, Dalevi D

PLoS One, 2013-08-12, vol. 8

PMID: 23950964

Similar Datasets

Combining Multiple Observational Data Sources to Estimate Causal Effects. | S-EPMC7571608 | biostudies-literature

A Simple Scalable Association Hypothesis Test Combining Gene-wide Evidence From Multiple Polymorphisms. | S-EPMC3969754 | biostudies-literature

A new estimation approach for combining epidemiological data from multiple sources. | S-EPMC3964681 | biostudies-literature

A BAYESIAN HIERARCHICAL MODEL FOR COMBINING MULTIPLE DATA SOURCES IN POPULATION SIZE ESTIMATION. | S-EPMC10150643 | biostudies-literature

Simultaneous comparison of multiple treatments: combining direct and indirect evidence. | S-EPMC1255806 | biostudies-literature

A Bayesian approach to combining multiple information sources: Estimating and forecasting childhood obesity in Thailand. | S-EPMC8782526 | biostudies-literature

A hands-on introduction to querying evolutionary relationships across multiple data sources using SPARQL. | S-EPMC7324951 | biostudies-literature

Integrative genomics: quantifying significance of phenotype-genotype relationships from multiple sources of high-throughput data. | S-EPMC3668276 | biostudies-literature

In silico gene prioritization by integrating multiple data sources. | S-EPMC3123338 | biostudies-literature

Protein complex identification by integrating protein-protein interaction evidence from multiple sources. | S-EPMC3873956 | biostudies-literature