Dataset Information

Effects of dependence in high-dimensional multiple testing problems.

ABSTRACT:

Background

We consider the effects of dependence among variables in high-dimensional data on multiple hypothesis testing, in particular on False Discovery Rate (FDR) control procedures. Recent simulation studies consider only simple correlation structures among variables, which hardly reflect features of real data. Our aim is to systematically study the effects of several network features, such as sparsity and correlation strength, by imposing dependence structures among variables using random correlation matrices.

Results

We study the robustness against dependence of several FDR procedures that are popular in microarray studies, such as Benjamini-Hochberg FDR, Storey's q-value, SAM and resampling-based FDR procedures. False Non-discovery Rates and estimates of the number of null hypotheses are computed from those methods and compared. Our simulation study shows that methods such as SAM and the q-value do not adequately control the FDR at the claimed level under dependence. On the other hand, the adaptive Benjamini-Hochberg procedure seems to be the most robust while remaining conservative. Finally, the estimates of the number of true null hypotheses under various dependence conditions are variable.
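
For orientation, the following is a minimal sketch of the standard (non-adaptive) Benjamini-Hochberg step-up procedure, together with the realized false discovery and false non-discovery proportions that a simulation with known truth would record. It is not the authors' code: the function names, the `alpha` default and the example inputs are illustrative assumptions, and the adaptive variant, SAM and the q-value are not shown.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard BH step-up procedure.

    Returns a list of booleans, True where the hypothesis is rejected.
    Names and defaults here are illustrative, not taken from the paper.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def false_discovery_proportion(rejected, is_null):
    """Fraction of rejections that are true nulls (0.0 if nothing is rejected)."""
    n_rejected = sum(rejected)
    if n_rejected == 0:
        return 0.0
    false_rejections = sum(1 for r, n in zip(rejected, is_null) if r and n)
    return false_rejections / n_rejected


def false_nondiscovery_proportion(rejected, is_null):
    """Fraction of non-rejections that are true alternatives (0.0 if all are rejected)."""
    n_accepted = sum(1 for r in rejected if not r)
    if n_accepted == 0:
        return 0.0
    missed = sum(1 for r, n in zip(rejected, is_null) if not r and not n)
    return missed / n_accepted
```

Averaging these two proportions over many simulated data sets gives Monte Carlo estimates of the FDR and the False Non-discovery Rate for a given dependence setting.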

Conclusion

We discuss a new method for efficient guided simulation of dependent data that satisfy imposed network constraints in the form of conditional independence structures. Our simulation set-up allows for a structural study of the effect of dependencies on multiple testing criteria and is useful for testing a potentially new method for pi0 or FDR estimation in a dependency context.
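
To make the idea of a conditional independence constraint concrete, here is a toy sketch that simulates Gaussian variables along a chain network, where each variable is conditionally independent of all earlier ones given its immediate predecessor. This is deliberately the simplest possible case and is not the guided simulation method the abstract refers to; the function names and the parameter `rho` are assumptions made for illustration.

```python
import math
import random


def simulate_chain_dependent(n_vars, rho, rng):
    """Draw one sample of n_vars standard-normal variables on a chain graph.

    Each variable depends on its predecessor with correlation rho, so
    variables i and j have correlation rho ** abs(i - j) and the implied
    precision matrix is tridiagonal (a sparse conditional independence
    structure). Illustrative only; not the method of the paper.
    """
    scale = math.sqrt(1.0 - rho * rho)  # keeps every marginal variance at 1
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n_vars - 1):
        x.append(rho * x[-1] + scale * rng.gauss(0.0, 1.0))
    return x


def sample_correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)
```

Drawing many such samples with a fixed `random.Random(seed)` and shifting the means of a chosen subset of variables would give a small dependent test bed on which an FDR or pi0 estimator can be checked against known truth.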

SUBMITTER: Kim KI

PROVIDER: S-EPMC2375137 | biostudies-literature | 2008 Feb

REPOSITORIES: biostudies-literature

Publications

Effects of dependence in high-dimensional multiple testing problems.

Kim Kyung In, van de Wiel Mark A

BMC Bioinformatics, 2008 Feb 25

PMID: 18298808

Similar Datasets

A multiple-testing procedure for high-dimensional mediation hypotheses.
| S-EPMC8991388 | biostudies-literature

Testing Mediation Effects in High-Dimensional Epigenetic Studies.
| S-EPMC6883258 | biostudies-literature

Integrative High Dimensional Multiple Testing with Heterogeneity under Data Sharing Constraints.
| S-EPMC10327421 | biostudies-literature

A general framework for multiple testing dependence.
| S-EPMC2586646 | biostudies-literature

Multiple Testing under Dependence via Semiparametric Graphical Models.
| S-EPMC4190841 | biostudies-literature

Post hoc power estimation in large-scale multiple testing problems.
| S-EPMC3500624 | biostudies-literature

Rank Conditional Coverage and Confidence Intervals in High-Dimensional Problems.
| S-EPMC6364309 | biostudies-literature

ASYMPTOTICALLY INDEPENDENT U-STATISTICS IN HIGH-DIMENSIONAL TESTING.
| S-EPMC8634550 | biostudies-literature

HYPOTHESIS TESTING FOR HIGH-DIMENSIONAL SPARSE BINARY REGRESSION.
| S-EPMC4522432 | biostudies-literature

Designing penalty functions in high dimensional problems: The role of tuning parameters.
| S-EPMC5628772 | biostudies-literature