The Joint Null Criterion for Multiple Hypothesis Tests

ABSTRACT: Simultaneously performing many hypothesis tests is a problem commonly encountered in high-dimensional biology. In this setting, a large set of p-values is calculated from many related features measured simultaneously. Classical statistics provides a criterion for defining what a “correct” p-value is when performing a single hypothesis test. We show here that even when each p-value is marginally correct under this single hypothesis criterion, the joint behavior of the entire set of p-values may be problematic. On the other hand, there are cases where each p-value is marginally incorrect, yet the joint distribution of the set of p-values is satisfactory. Here, we propose a criterion defining a well-behaved set of simultaneously calculated p-values that provides precise control of common error rates, and we introduce diagnostic procedures for assessing with simulations whether the criterion is satisfied. Multiple testing p-values that satisfy our new criterion avoid potentially large study-specific errors and also satisfy the usual assumptions for strong control of false discovery rates and family-wise error rates. We utilize the new criterion and proposed diagnostics to investigate two common issues in high-dimensional multiple testing for genomics: dependent multiple hypothesis tests and pooled versus test-specific null distributions.

SUBMITTER: Leek J

PROVIDER: S-EPMC3135422 | biostudies-literature | 2011 Jan

REPOSITORIES: biostudies-literature
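
The abstract's point that marginally correct p-values can still behave badly as a set can be illustrated with a small simulation. The sketch below is not the authors' diagnostic procedure; it is a minimal illustration under assumed conditions: all tests are null one-sided z-tests, dependence comes from a single shared latent factor with weight `sqrt(rho)`, and the per-study Kolmogorov-Smirnov distance from Uniform(0, 1) is used as a rough summary of joint behavior. All function names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean

_STD = NormalDist()  # standard normal, used for p-value computation


def ks_statistic_uniform(pvalues):
    """Kolmogorov-Smirnov distance between the empirical CDF and Uniform(0, 1)."""
    ordered = sorted(pvalues)
    m = len(ordered)
    return max(max((i + 1) / m - p, p - i / m) for i, p in enumerate(ordered))


def simulate_null_study(m, rho, rng):
    """Simulate one study of m null z-tests that share one latent factor.

    Each z-score is marginally N(0, 1) for any rho in [0, 1], so each
    p-value is marginally Uniform(0, 1); rho only changes the joint behavior.
    """
    shared = rng.gauss(0.0, 1.0)
    pvals = []
    for _ in range(m):
        z = rho ** 0.5 * shared + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
        pvals.append(1.0 - _STD.cdf(z))
    return pvals


def mean_ks_over_studies(m, rho, n_studies, seed):
    """Average per-study KS distance across repeated simulated null studies."""
    rng = random.Random(seed)
    return mean(
        ks_statistic_uniform(simulate_null_study(m, rho, rng))
        for _ in range(n_studies)
    )


if __name__ == "__main__":
    independent = mean_ks_over_studies(m=200, rho=0.0, n_studies=200, seed=1)
    dependent = mean_ks_over_studies(m=200, rho=0.5, n_studies=200, seed=1)
    print(f"mean KS distance, independent tests: {independent:.3f}")
    print(f"mean KS distance, dependent tests:   {dependent:.3f}")
```

Under these assumptions, the dependent case typically shows a much larger within-study distance from uniformity, even though every individual p-value has the correct marginal distribution.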


Similar Datasets

Setting an optimal α that minimizes errors in null hypothesis significance tests.

Project description: Null hypothesis significance testing has been under attack in recent years, partly owing to the arbitrary nature of setting α (the decision-making threshold and probability of Type I error) at a constant value, usually 0.05. If the goal of null hypothesis testing is to present conclusions in which we have the highest possible confidence, then the only logical decision-making threshold is the value that minimizes the probability (or occasionally, cost) of making errors. Setting α to minimize the combination of Type I and Type II error at a critical effect size can easily be accomplished for traditional statistical tests by calculating the α associated with the minimum average of α and β at the critical effect size. This technique also has the flexibility to incorporate prior probabilities of null and alternate hypotheses and/or relative costs of Type I and Type II errors, if known. Using an optimal α results in stronger scientific inferences because it estimates and minimizes both Type I errors and relevant Type II errors for a test. It also results in greater transparency concerning assumptions about relevant effect size(s) and the relative costs of Type I and II errors. By contrast, the use of α = 0.05 results in arbitrary decisions about what effect sizes will likely be considered significant, if real, and results in arbitrary amounts of Type II error for meaningful potential effect sizes. We cannot identify a rationale for continuing to arbitrarily use α = 0.05 for null hypothesis significance tests in any field, when it is possible to determine an optimal α.

| S-EPMC3289673 | biostudies-literature
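
The idea of choosing the α that minimizes the average of α and β at a critical effect size can be sketched for one simple case. The example below assumes a one-sided, one-sample z-test with known unit variance, equal weighting of the two error types, and a plain grid search; the function names and the example effect size and sample size are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def beta_one_sided_z(alpha, effect_size, n):
    """Type II error rate of a one-sided one-sample z-test at level alpha.

    effect_size is the standardized critical effect size (mean shift / sd).
    """
    z_crit = _STD.inv_cdf(1.0 - alpha)
    return _STD.cdf(z_crit - effect_size * n ** 0.5)


def optimal_alpha(effect_size, n, grid=10_000):
    """Grid-search the alpha in (0, 1) minimizing the average of alpha and beta."""
    best_alpha, best_avg = None, float("inf")
    for i in range(1, grid):
        alpha = i / grid
        avg = 0.5 * (alpha + beta_one_sided_z(alpha, effect_size, n))
        if avg < best_avg:
            best_alpha, best_avg = alpha, avg
    return best_alpha, best_avg


if __name__ == "__main__":
    alpha_opt, avg_opt = optimal_alpha(effect_size=0.5, n=16)
    avg_conventional = 0.5 * (0.05 + beta_one_sided_z(0.05, 0.5, 16))
    print(f"optimal alpha: {alpha_opt:.4f}, average error: {avg_opt:.4f}")
    print(f"alpha = 0.05,  average error: {avg_conventional:.4f}")
```

For this particular test with equal weights, the minimum sits where α equals β, that is, at α = 1 − Φ(d√n / 2), which gives a quick sanity check on the grid search. Unequal costs or prior probabilities would change the objective to a weighted average.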

Hypothesis tests.

Project description: Not available

| S-EPMC7807926 | biostudies-literature

Multiplicity-calibrated Bayesian hypothesis tests.

Project description: When testing multiple hypotheses simultaneously, there is a need to adjust the levels of the individual tests to effect control of the family-wise error rate (FWER). Standard frequentist adjustments control the error rate but are typically both conservative and oblivious to prior information. We propose a Bayesian testing approach, multiplicity-calibrated Bayesian hypothesis testing, that sets individual critical values to reflect prior information while controlling the FWER via the Bonferroni inequality. If the prior information is specified correctly, in the sense that those null hypotheses considered most likely to be false in fact are false, the power of our method is substantially greater than that of standard frequentist approaches. We illustrate our method using data from a pharmacogenetic trial and a preclinical cancer study. We demonstrate its error rate control and power advantage by simulation.

| S-EPMC2912702 | biostudies-literature
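
The way the Bonferroni inequality lets individual test levels differ while still bounding the FWER can be shown with a generic weighted Bonferroni rule. This is not the multiplicity-calibrated Bayesian procedure described above; it is a minimal sketch in which hypothetical non-negative weights stand in for prior information, and each test receives a share of α proportional to its weight so that the levels sum to α.

```python
def weighted_bonferroni(pvalues, prior_weights, alpha=0.05):
    """Allocate alpha across tests in proportion to non-negative weights.

    Because the per-test levels sum to alpha, the Bonferroni inequality
    bounds the family-wise error rate by alpha.
    Returns (per-test levels, rejection decisions).
    """
    if len(pvalues) != len(prior_weights):
        raise ValueError("pvalues and prior_weights must have equal length")
    if any(w < 0 for w in prior_weights):
        raise ValueError("weights must be non-negative")
    total = sum(prior_weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    levels = [alpha * w / total for w in prior_weights]
    rejected = [p <= level for p, level in zip(pvalues, levels)]
    return levels, rejected


if __name__ == "__main__":
    # Hypothetical example: the first hypothesis is judged more likely false.
    levels, rejected = weighted_bonferroni([0.02, 0.02], [3, 1], alpha=0.05)
    print(levels, rejected)
```

With equal weights this reduces to the ordinary Bonferroni correction; the weights must be fixed before looking at the p-values for the bound to hold.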

An omnibus test for the global null hypothesis.

Project description: Global hypothesis tests are a useful tool in the context of clinical trials, genetic studies, or meta-analyses, when researchers are not interested in testing individual hypotheses, but in testing whether none of the hypotheses is false. There are several ways to test the global null hypothesis when the individual null hypotheses are independent. If it is assumed that many of the individual null hypotheses are false, combination tests have been recommended to maximize power. If, however, it is assumed that only one or a few null hypotheses are false, global tests based on individual test statistics are more powerful (e.g. Bonferroni or Simes test). However, there is usually no a priori knowledge about the number of false individual null hypotheses. We therefore propose an omnibus test based on cumulative sums of the transformed p-values. We show that this test yields an impressive overall performance. The proposed method is implemented in an R package called omnibus.

| S-EPMC6676337 | biostudies-literature

Empirical Bayes factors for common hypothesis tests.

Project description: Bayes factors for composite hypotheses have difficulty in encoding vague prior knowledge, as improper priors cannot be used and objective priors may be subjectively unreasonable. To address these issues I revisit the posterior Bayes factor, in which the posterior distribution from the data at hand is re-used in the Bayes factor for the same data. I argue that this is biased when calibrated against proper Bayes factors, but propose adjustments to allow interpretation on the same scale. In the important case of a regular normal model, the bias in log scale is half the number of parameters. The resulting empirical Bayes factor is closely related to the widely applicable information criterion. I develop test-based empirical Bayes factors for several standard tests and propose an extension to multiple testing closely related to the optimal discovery procedure. When only a P-value is available, an approximate empirical Bayes factor is 10p. I propose interpreting the strength of Bayes factors on a logarithmic scale with base 3.73, reflecting the sharpest distinction between weaker and stronger belief. This provides an objective framework for interpreting statistical evidence, and realises a Bayesian/frequentist compromise.

| S-EPMC10883543 | biostudies-literature

Testing a global null hypothesis using ensemble machine learning methods.

Project description: Testing a global null hypothesis that there are no significant predictors for a binary outcome of interest among a large set of biomarker measurements is an important task in biomedical studies. We seek to improve the power of such testing methods by leveraging ensemble machine learning methods. Ensemble machine learning methods such as random forest, bagging, and adaptive boosting model the relationship between the outcome and the predictor nonparametrically, while stacking combines the strength of multiple learners. We demonstrate the power of the proposed testing methods through Monte Carlo studies and show the use of the methods by applying them to the immunologic biomarkers dataset from the RV144 HIV vaccine efficacy trial.

| S-EPMC9035066 | biostudies-literature

matchRanges: generating null hypothesis genomic ranges via covariate-matched sampling.

Project description: Motivation: Deriving biological insights from genomic data commonly requires comparing attributes of selected genomic loci to a null set of loci. The selection of this null set is non-trivial, as it requires careful consideration of potential covariates, a problem that is exacerbated by the non-uniform distribution of genomic features including genes, enhancers, and transcription factor binding sites. Propensity score-based covariate matching methods allow the selection of null sets from a pool of possible items while controlling for multiple covariates; however, existing packages do not operate on genomic data classes and can be slow for large data sets, making them difficult to integrate into genomic workflows. Results: To address this, we developed matchRanges, a propensity score-based covariate matching method for the efficient and convenient generation of matched null ranges from a set of background ranges within the Bioconductor framework. Availability and implementation: Package: https://bioconductor.org/packages/nullranges, Code: https://github.com/nullranges, Documentation: https://nullranges.github.io/nullranges.

| S-EPMC10168584 | biostudies-literature

Generating realistic null hypothesis of cancer mutational landscapes using SigProfilerSimulator.

Project description: Background: Performing a statistical test requires a null hypothesis. In cancer genomics, a key challenge is the fast generation of accurate somatic mutational landscapes that can be used as a realistic null hypothesis for making biological discoveries. Results: Here we present SigProfilerSimulator, a powerful tool that is capable of simulating the mutational landscapes of thousands of cancer genomes at different resolutions within seconds. Applying SigProfilerSimulator to 2144 whole-genome sequenced cancers reveals: (i) that most doublet base substitutions are not due to two adjacent single base substitutions but likely occur as single genomic events; (ii) that an extended sequencing context of ± 2 bp is required to more completely capture the patterns of substitution mutational signatures in human cancer; (iii) information on false-positive discovery rate of commonly used bioinformatics tools for detecting driver genes. Conclusions: SigProfilerSimulator's breadth of features allows one to construct a tailored null hypothesis and use it for evaluating the accuracy of other bioinformatics tools or for downstream statistical analysis for biological discoveries. SigProfilerSimulator is freely available at https://github.com/AlexandrovLab/SigProfilerSimulator with extensive documentation at https://osf.io/usxjz/wiki/home/.

| S-EPMC7539472 | biostudies-literature

Revisiting a Null Hypothesis: Exploring the Parameters of Oligometastasis Treatment.

Project description: Purpose: In the treatment of patients with metastatic cancer, the current paradigm states that metastasis-directed therapy does not prolong life. This paradigm forms the basis of clinical trial null hypotheses, where trials are built to test the null hypothesis that patients garner no overall survival benefit from targeting metastatic lesions. However, with advancing imaging technology and increasingly precise techniques for targeting lesions, a much larger proportion of metastatic disease can be treated. As a result, the life-extending benefit of targeting metastatic disease is becoming increasingly clear. Methods and materials: In this work, we suggest shifting this qualitative null hypothesis and describe a mathematical model that can be used to frame a new, quantitative null. We begin with a very simple formulation of tumor growth, an exponential function, and illustrate how the same intervention (removing a given number of cells from the tumor) at different times affects survival. Additionally, we postulate where recent clinical trials fit into this parameter space and discuss the implications of clinical trial design in changing these quantitative parameters. Results: Our model shows that although any amount of cell kill will extend survival, in many cases the extent is so small as to be unnoticeable in a clinical context or is outweighed by factors related to toxicity and treatment time. Conclusions: Recasting the null in these quantitative terms will allow trialists to design trials specifically to increase understanding of the circumstances (patient selection, disease burden, tumor growth kinetics) that can lead to improved overall survival when targeting metastatic lesions, rather than whether targeting metastases extends survival for patients with (oligo-) metastatic disease.

| S-EPMC8122026 | biostudies-literature

Efficient alternatives for Bayesian hypothesis tests in psychology.

Project description: Bayesian hypothesis testing procedures have gained increased acceptance in recent years. A key advantage that Bayesian tests have over classical testing procedures is their potential to quantify information in support of true null hypotheses. Ironically, default implementations of Bayesian tests prevent the accumulation of strong evidence in favor of true null hypotheses because associated default alternative hypotheses assign a high probability to data that are most consistent with a null effect. We propose the use of "nonlocal" alternative hypotheses to resolve this paradox. The resulting class of Bayesian hypothesis tests permits more rapid accumulation of evidence in favor of both true null hypotheses and alternative hypotheses that are compatible with standardized effect sizes of most interest in psychology. (PsycInfo Database Record (c) 2022 APA, all rights reserved).

| S-EPMC9561355 | biostudies-literature
