Dataset Information

A new symbolic representation for the identification of informative genes in replicated microarray experiments.


ABSTRACT: Microarray experiments generate massive amounts of data, necessitating innovative algorithms to distinguish biologically relevant information from noise. Because the variability of gene expression data is an important factor in determining which genes are differentially expressed, analysis techniques that take into account repeated measurements are critically important. Additionally, the selection of informative genes is typically done by searching for the individual genes that vary the most across conditions. Yet because genes tend to act in groups rather than individually, it may be possible to glean more information from the data by searching specifically for concerted behavior in a set of genes. Applying a symbolic transformation to the gene expression data allows the detection of overrepresented patterns in the data, in contrast to looking only for genes that exhibit maximal differential expression. These challenges are approached by introducing an algorithm based on a new symbolic representation that searches for concerted gene expression patterns; furthermore, the symbolic representation takes into account the variance in multiple replicates and can be applied to long time series data. The proposed algorithm's ability to discover biologically relevant signals in gene expression data is exhibited by applying it to three datasets that measure gene expression in the rat liver.
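
The abstract does not spell out the symbolic representation itself, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical three-letter alphabet ("U" for up, "D" for down, "F" for flat) assigned to each pair of consecutive time points, with replicate variance entering through a standard-error threshold. The function names, the `threshold` and `min_count` parameters, and the use of a plain count in place of a proper overrepresentation test are all illustrative assumptions.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def symbolize(time_points, threshold=2.0):
    """Turn one gene's replicated time series into a string of symbols.

    time_points: list of time points, each a list of >= 2 replicate values.
    A step is 'U' or 'D' only if the change in mean exceeds `threshold`
    standard errors; otherwise it is 'F'. (Assumed scheme, not the paper's.)
    """
    symbols = []
    for prev, curr in zip(time_points, time_points[1:]):
        diff = mean(curr) - mean(prev)
        # Standard error of the difference of two means (unequal variances).
        se = sqrt(variance(prev) / len(prev) + variance(curr) / len(curr))
        if abs(diff) <= threshold * se:
            symbols.append("F")
        elif diff > 0:
            symbols.append("U")
        else:
            symbols.append("D")
    return "".join(symbols)


def shared_patterns(genes, min_count=2, threshold=2.0):
    """Group genes by symbolic word and keep words shared by >= min_count genes.

    A raw count is a crude stand-in for a statistical overrepresentation test.
    """
    words = {name: symbolize(tp, threshold) for name, tp in genes.items()}
    counts = Counter(words.values())
    return {
        word: sorted(name for name, w in words.items() if w == word)
        for word, count in counts.items()
        if count >= min_count
    }


if __name__ == "__main__":
    # Made-up toy data: three genes, three time points, three replicates each.
    toy = {
        "geneA": [[1.0, 1.1, 0.9], [2.0, 2.1, 1.9], [2.0, 2.05, 1.95]],
        "geneB": [[5.0, 5.2, 4.8], [7.0, 7.1, 6.9], [7.0, 6.9, 7.1]],
        "geneC": [[3.0, 3.1, 2.9], [3.0, 2.9, 3.1], [1.0, 1.1, 0.9]],
    }
    print(shared_patterns(toy))
```

In this sketch, genes with very different absolute expression levels can still map to the same word, which is the sense in which a symbolic view looks for concerted behavior rather than maximal differential expression.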

SUBMITTER: Scheff JD 

PROVIDER: S-EPMC3133780 | biostudies-literature | 2010 Jun

REPOSITORIES: biostudies-literature

Publications

A new symbolic representation for the identification of informative genes in replicated microarray experiments.

Jeremy D. Scheff, Richard R. Almon, Debra C. DuBois, William J. Jusko, Ioannis P. Androulakis

OMICS: A Journal of Integrative Biology, 2010-06-01, issue 3



Similar Datasets

S-EPMC1397872 | biostudies-literature
S-EPMC3527185 | biostudies-literature
2048176 | ecrin-mdr-crc
S-EPMC5860058 | biostudies-literature
S-ECPF-GEOD-16984 | biostudies-other
S-EPMC4690035 | biostudies-literature
S-EPMC533874 | biostudies-literature
S-EPMC2409071 | biostudies-literature
S-EPMC2099446 | biostudies-literature
S-EPMC3607310 | biostudies-literature