Dataset Information

Combining controls can improve power in two-stage association studies.

ABSTRACT:

Background

High-dimensional case-control studies are ubiquitous in the biological sciences, particularly genomics. To maximise power while constraining cost and to minimise type-1 error rates, researchers typically seek to replicate findings in a second experiment on independent cohorts before proceeding with further analyses. This can be an expensive procedure, particularly when control samples are difficult to recruit or ascertain, for example in inter-disease comparisons or studies on degenerative diseases.

Results

This paper presents a method in which control (or case) samples from the discovery cohort are re-used in a replication study. The theoretical implications of this method are discussed and simulated genome-wide association study (GWAS) tests are used to compare performance against the standard approach in a range of circumstances. Using similar methods, a procedure is proposed for 'partial replication' using a new independent cohort consisting of only controls. This method can be used to provide some validation of findings when a full replication procedure is not possible. The new method has differing sensitivity to confounding in study cohorts compared to the standard procedure, which must be considered in its application. Type-1 error rates in these scenarios are analytically and empirically derived, and an online tool for comparing power and error rates is provided.

Conclusions

In several common study designs, a shared-control method allows a substantial improvement in power while retaining type-1 error rate control. Although careful consideration must be made of all necessary assumptions, this method can enable more efficient use of data in GWAS and other applications.
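
The abstract notes that re-using controls changes the statistical behaviour of the replication step. As a rough illustration only (this is not the paper's procedure or its online tool), the sketch below shows why: when the discovery and replication stages compare their own cases against the same controls, the two null test statistics become correlated. It assumes a simple difference-in-means z-test with equal per-sample variance and a normal approximation to the group means; the function names `shared_control_correlation` and `simulate_correlation` are hypothetical.

```python
import math
import random


def shared_control_correlation(n_case1, n_case2, n_control):
    """Analytic null correlation between the discovery and replication
    z-scores when both stages use the same control group.
    Assumes unit per-sample variance in every group (an illustrative assumption)."""
    var1 = 1.0 / n_case1 + 1.0 / n_control  # variance of (case1 mean - control mean)
    var2 = 1.0 / n_case2 + 1.0 / n_control  # variance of (case2 mean - control mean)
    cov = 1.0 / n_control  # the shared control mean enters both differences
    return cov / math.sqrt(var1 * var2)


def simulate_correlation(n_case1, n_case2, n_control, n_sims=20000, seed=1):
    """Monte Carlo check of the formula above, drawing group means directly
    from their normal approximations under the null hypothesis."""
    rng = random.Random(seed)
    sd1 = math.sqrt(1.0 / n_case1 + 1.0 / n_control)
    sd2 = math.sqrt(1.0 / n_case2 + 1.0 / n_control)
    z1s, z2s = [], []
    for _ in range(n_sims):
        ctrl = rng.gauss(0.0, math.sqrt(1.0 / n_control))   # shared controls
        case1 = rng.gauss(0.0, math.sqrt(1.0 / n_case1))    # discovery cases
        case2 = rng.gauss(0.0, math.sqrt(1.0 / n_case2))    # replication cases
        z1s.append((case1 - ctrl) / sd1)
        z2s.append((case2 - ctrl) / sd2)
    m1 = sum(z1s) / n_sims
    m2 = sum(z2s) / n_sims
    cov = sum((a - m1) * (b - m2) for a, b in zip(z1s, z2s)) / n_sims
    v1 = sum((a - m1) ** 2 for a in z1s) / n_sims
    v2 = sum((b - m2) ** 2 for b in z2s) / n_sims
    return cov / math.sqrt(v1 * v2)


if __name__ == "__main__":
    print("analytic :", shared_control_correlation(1000, 1000, 1000))
    print("simulated:", simulate_correlation(1000, 1000, 1000))
```

Under these assumptions, equal-sized case and control groups give a null correlation of about 0.5 between the two stages, so treating the stages as independent would misstate the joint type-1 error rate. The correlation shrinks toward zero as the shared control group grows relative to the case groups.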

SUBMITTER: Liley J

PROVIDER: S-EPMC6171163 | biostudies-literature | 2018 Oct

REPOSITORIES: biostudies-literature

Publications

Combining controls can improve power in two-stage association studies.

Liley James J

BMC genetics 20181003 1

PMID: 30285617

Similar Datasets

Loss of power in two-stage residual-outcome regression analysis in genetic association studies.
| S-EPMC4350584 | biostudies-literature

Multiethnic genetic association studies improve power for locus discovery.
| S-EPMC2935880 | biostudies-literature

Optimal 2-stage design with given power in association studies.
| S-EPMC2648901 | biostudies-literature

PheProb: probabilistic phenotyping using diagnosis codes to improve power for genetic association studies.
| S-EPMC6915826 | biostudies-literature

Evaluating the Potential of Younger Cases and Older Controls Cohorts to Improve Discovery Power in Genome-Wide Association Studies of Late-Onset Diseases.
| S-EPMC6789773 | biostudies-literature

Improved minimum cost and maximum power two stage genome-wide association study designs.
| S-EPMC3435377 | biostudies-literature

Admixed Populations Improve Power for Variant Discovery and Portability in Genome-Wide Association Studies.
| S-EPMC8181458 | biostudies-literature

Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-wide Association Studies.
| S-EPMC5125008 | biostudies-literature

Imputation-Aware Tag SNP Selection To Improve Power for Large-Scale, Multi-ethnic Association Studies.
| S-EPMC6169386 | biostudies-other

Optimal DNA pooling-based two-stage designs in case-control association studies.
| S-EPMC2868915 | biostudies-literature