Dataset Information

The cost of large numbers of hypothesis tests on power, effect size and sample size.


ABSTRACT: Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this increase imposes a statistical cost owing to the multiple testing corrections needed to avoid large numbers of false-positive results. To safeguard against the resulting loss of power, some have suggested sample sizes on the order of tens of thousands, which can be impractical for many diseases or may lower the quality of phenotypic measurements. This study examines the relationship between the number of tests on the one hand and power, detectable effect size or required sample size on the other. We show that once the number of tests is large, power can be maintained at a constant level with comparatively small increases in the effect size or sample size. For example, at the 0.05 significance level, a 13% increase in sample size is needed to maintain 80% power for ten million tests compared with one million tests, whereas a 70% increase in sample size is needed for 10 tests compared with a single test. Relative costs are smaller when measured by increases in the detectable effect size. We provide an interactive Excel calculator to compute power, effect size or sample size when comparing study designs or genome platforms involving different numbers of hypothesis tests. The results are reassuring in an era of extreme multiple testing.
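
The sample-size figures quoted in the abstract can be approximated with a short calculation. The sketch below is not the authors' Excel calculator; it assumes a Bonferroni correction and a two-sided z-test, for which the required sample size is proportional to (z_alpha + z_power)^2. Neither assumption is stated in the abstract, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def required_n_factor(num_tests, alpha=0.05, power=0.80):
    """Return (z_alpha + z_power)^2, which is proportional to the sample size.

    Assumption (not from the source): Bonferroni correction, so each of the
    num_tests two-sided z-tests is run at level alpha / num_tests.
    """
    per_test_alpha = alpha / num_tests
    # Use the lower tail and negate: more accurate than inv_cdf(1 - tiny p).
    z_alpha = -_STD_NORMAL.inv_cdf(per_test_alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_power) ** 2


def sample_size_ratio(m_new, m_old, alpha=0.05, power=0.80):
    """Factor by which the sample size must grow when going from m_old to m_new tests."""
    return required_n_factor(m_new, alpha, power) / required_n_factor(m_old, alpha, power)


def effect_size_ratio(m_new, m_old, alpha=0.05, power=0.80):
    """Factor by which the detectable effect size grows at a fixed sample size."""
    # For a z-test, the detectable effect scales with the square root of the factor.
    return sqrt(sample_size_ratio(m_new, m_old, alpha, power))


if __name__ == "__main__":
    for m_old, m_new in [(1, 10), (10**6, 10**7)]:
        n_pct = 100 * (sample_size_ratio(m_new, m_old) - 1)
        e_pct = 100 * (effect_size_ratio(m_new, m_old) - 1)
        print(f"{m_old} -> {m_new} tests: sample size +{n_pct:.0f}%, effect size +{e_pct:.0f}%")
```

Under these assumptions the sample-size increases come out at roughly 70% for 10 tests versus one and roughly 13% for ten million versus one million, in line with the abstract. The corresponding increases in detectable effect size are smaller (about 30% and 6%), since the effect size scales with the square root of the sample-size factor.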

SUBMITTER: Lazzeroni LC 

PROVIDER: S-EPMC3252610 | biostudies-literature | 2012 Jan

REPOSITORIES: biostudies-literature

Publications

The cost of large numbers of hypothesis tests on power, effect size and sample size.

Lazzeroni LC, Ray AA

Molecular Psychiatry, 2010-11-09, issue 1



Similar Datasets

| S-EPMC6736231 | biostudies-literature
| S-EPMC5012670 | biostudies-literature
| S-EPMC5435249 | biostudies-literature
| S-EPMC10134629 | biostudies-literature
| S-EPMC8441096 | biostudies-literature
| S-EPMC9325423 | biostudies-literature
| S-EPMC4069038 | biostudies-literature
| S-EPMC2837028 | biostudies-literature
| S-EPMC4249707 | biostudies-literature
| S-EPMC7904226 | biostudies-literature