Dataset Information

A computationally efficient modular optimal discovery procedure.

ABSTRACT: It is well known that patterns of differential gene expression across biological conditions are often shared by many genes, particularly those within functional groups. Taking advantage of these patterns can lead to increased statistical power and biological clarity when testing for differential expression in a microarray experiment. The optimal discovery procedure (ODP), which maximizes the expected number of true positives for each fixed number of expected false positives, is a framework aimed at this goal. Storey et al. introduced an estimator of the ODP for identifying differentially expressed genes. However, the computational time of their ODP estimator grows quadratically with the number of genes. Reducing this computational burden is a key step in making the ODP practical for use in a variety of high-throughput problems. Here, we propose a new estimate of the ODP called the modular ODP (mODP). The existing 'full ODP' requires that the likelihood function for each gene be evaluated according to the parameter estimates for all genes. The mODP assigns genes to modules according to a Kullback-Leibler distance, and then evaluates the statistic only at the module-averaged parameter estimates. We show that the mODP is relatively insensitive to the choice of the number of modules, but dramatically reduces the computational complexity from quadratic to linear in the number of genes. We compare the full ODP algorithm and mODP on simulated data and gene expression data from a recent study of Moroccan Amazighs. The mODP and full ODP algorithm perform very similarly across a range of comparisons. The mODP methodology has been implemented in EDGE, a comprehensive gene expression analysis software package in R, available at http://genomine.org/edge/.
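The difference in cost between the full and modular approaches described in the abstract can be illustrated with a minimal Python sketch. This is not the EDGE implementation or the authors' exact estimator. It assumes a toy normal model with one mean and one standard deviation per gene, it computes only a single summed-likelihood term rather than the full ODP ratio, and the function names (`log_lik`, `kl_normal`, `build_modules`, `log_weighted_sum`) and the simple iterative module assignment are hypothetical choices made for illustration.

```python
import numpy as np

def log_lik(x, mu, sigma):
    """Normal log-likelihood of every gene's data under every parameter pair.

    x is (genes, samples); mu and sigma are (params,). Returns (genes, params).
    """
    n = x.shape[1]
    sx = x.sum(axis=1)[:, None]
    sxx = (x ** 2).sum(axis=1)[:, None]
    # sum over samples of (x - mu)^2, expanded so no 3-D array is needed
    ss = sxx - 2.0 * mu[None, :] * sx + n * mu[None, :] ** 2
    return (-n * np.log(sigma)[None, :]
            - 0.5 * n * np.log(2.0 * np.pi)
            - ss / (2.0 * sigma[None, :] ** 2))

def log_weighted_sum(ll, weights):
    """log(sum_k weights[k] * exp(ll[:, k])), computed stably per gene."""
    a = ll + np.log(weights)[None, :]
    top = a.max(axis=1, keepdims=True)
    return top[:, 0] + np.log(np.exp(a - top).sum(axis=1))

def kl_normal(mu_a, sd_a, mu_b, sd_b):
    """Kullback-Leibler divergence KL(N(mu_a, sd_a^2) || N(mu_b, sd_b^2))."""
    return (np.log(sd_b / sd_a)
            + (sd_a ** 2 + (mu_a - mu_b) ** 2) / (2.0 * sd_b ** 2) - 0.5)

def build_modules(mu, sd, k, n_iter=10):
    """Assumed stand-in for module assignment: assign each gene to the
    nearest module centre in KL divergence, then average parameters."""
    order = np.argsort(mu)
    seeds = order[np.round(np.linspace(0, len(mu) - 1, k)).astype(int)]
    c_mu, c_sd = mu[seeds].copy(), sd[seeds].copy()
    for _ in range(n_iter):
        d = kl_normal(mu[:, None], sd[:, None], c_mu[None, :], c_sd[None, :])
        label = d.argmin(axis=1)
        for j in range(k):
            members = label == j
            if members.any():
                c_mu[j], c_sd[j] = mu[members].mean(), sd[members].mean()
    size = np.bincount(label, minlength=k)
    keep = size > 0  # drop modules that ended up empty
    return c_mu[keep], c_sd[keep], size[keep]

# Simulated data (illustrative only): 200 genes, 8 samples each.
rng = np.random.default_rng(0)
m, n = 200, 8
true_mu = rng.choice([-2.0, 0.0, 2.0], size=m)
x = true_mu[:, None] + rng.normal(size=(m, n))
mu_hat = x.mean(axis=1)
sd_hat = x.std(axis=1, ddof=1)

# Full version: every gene evaluated at every gene's estimates, O(m^2).
full = log_weighted_sum(log_lik(x, mu_hat, sd_hat), np.ones(m))

# Modular version: every gene evaluated at K module averages, O(m * K).
c_mu, c_sd, size = build_modules(mu_hat, sd_hat, k=10)
modular = log_weighted_sum(log_lik(x, c_mu, c_sd), size)

print("correlation between full and modular scores:",
      np.corrcoef(full, modular)[0, 1])
```

Weighting each module's likelihood by its size keeps the modular sum on the same scale as the full sum, so when every gene is placed in its own module the two computations coincide.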

SUBMITTER: Woo S

PROVIDER: S-EPMC3105483 | biostudies-other | 2011 Feb

REPOSITORIES: biostudies-other

Publications

A computationally efficient modular optimal discovery procedure.

Woo S, Leek JT, Storey JD

Bioinformatics (Oxford, England), 2010-12-24, issue 4

PMID: 21186247
