
Dataset Information


A Unifying Framework for Imputing Summary Statistics in Genome-Wide Association Studies.


ABSTRACT: Methods to impute missing data are routinely used to increase power in genome-wide association studies. There are two broad classes of imputation methods. The first class imputes genotypes at the untyped variants, given those at the typed variants, and then performs a statistical test of association at the imputed variants. The second class, summary statistic imputation (SSI), directly imputes association statistics at the untyped variants, given the association statistics observed at the typed variants. The second class is appealing because it tends to be computationally efficient and requires only the summary statistics from a study, whereas the first class requires access to individual-level data that can be difficult to obtain. The statistical properties of these two classes of imputation methods have not been fully understood. In this study, we show that the two classes of imputation methods yield association statistics with similar distributions for sufficiently large sample sizes. Using this relationship, we can understand the effect of the imputation method on power. We show that a commonly used approach to SSI, which we term SSI with variance reweighting, generally leads to a loss in power. In contrast, our proposed method for SSI, which does not perform variance reweighting, fully accounts for imputation uncertainty while achieving better power.
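
To make the idea of summary statistic imputation concrete, here is a minimal sketch. It is not taken from this record: it assumes the widely used multivariate-normal formulation, in which the z-score at an untyped variant is the conditional mean given the typed z-scores and a reference LD (correlation) matrix. The function names `solve`, `impute_z`, and `reweight`, and the toy LD values, are hypothetical, and the exact estimators studied in the paper may differ.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("LD matrix among typed variants is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def impute_z(ld_tt, ld_ut, z_typed):
    """Impute the z-score at one untyped variant (assumed Gaussian model).

    ld_tt   : LD matrix among the typed variants
    ld_ut   : LD between the untyped variant and each typed variant
    z_typed : observed z-scores at the typed variants
    Returns (z_imputed, r2), where r2 = ld_ut' ld_tt^-1 ld_ut is the
    variance of the imputed statistic under the null in this model.
    """
    w = solve(ld_tt, ld_ut)  # imputation weights ld_tt^-1 ld_ut
    z_imputed = sum(wi * zi for wi, zi in zip(w, z_typed))
    r2 = sum(wi * ri for wi, ri in zip(w, ld_ut))
    return z_imputed, r2

def reweight(z_imputed, r2):
    """One common form of variance reweighting: rescale to unit null variance."""
    return z_imputed / math.sqrt(r2)

# Toy example with two typed variants and one untyped variant (made-up LD values).
ld_tt = [[1.0, 0.5], [0.5, 1.0]]
ld_ut = [0.6, 0.3]
z_typed = [3.0, 1.0]

z_imp, r2 = impute_z(ld_tt, ld_ut, z_typed)
print(z_imp, r2, reweight(z_imp, r2))
```

In this sketch the unreweighted statistic is shrunk toward zero when the untyped variant is poorly tagged (small `r2`), while `reweight` rescales it back to unit variance under the null. The abstract's comparison of power concerns statistics of these two kinds, under whatever precise definitions the paper adopts.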

SUBMITTER: Wu Y 

PROVIDER: S-EPMC7081249 | biostudies-literature | 2020 Mar

REPOSITORIES: biostudies-literature


Publications

A Unifying Framework for Imputing Summary Statistics in Genome-Wide Association Studies.

Yue Wu, Eleazar Eskin, Sriram Sankararaman

Journal of computational biology : a journal of computational molecular cell biology, 2020-02-13, issue 3



Similar Datasets

| S-EPMC5836736 | biostudies-literature
| S-EPMC5743780 | biostudies-literature
| S-EPMC6239891 | biostudies-literature
| S-EPMC8247874 | biostudies-literature
| S-EPMC8237646 | biostudies-literature
| S-EPMC5005435 | biostudies-literature
| S-EPMC5796536 | biostudies-literature
| S-EPMC6343668 | biostudies-literature
| S-EPMC5345724 | biostudies-literature
| S-EPMC8414872 | biostudies-literature