Dataset Information

Control of population stratification by correlation-selected principal components.


ABSTRACT: In genome-wide association studies, population stratification is recognized as producing inflated type I error due to the inflation of test statistics. Principal component-based methods applied to genotypes provide information about population structure, and have been widely used to control for stratification. Here we explore the precise relationship between genotype principal components and inflation of association test statistics, thereby drawing a connection between principal component-based stratification control and the alternative approach of genomic control. Our results provide an inherent justification for the use of principal components, but call into question the popular practice of selecting principal components based on significance of eigenvalues alone. We propose a new approach, called EigenCorr, which selects principal components based on both their eigenvalues and their correlation with the (disease) phenotype. Our approach tends to select fewer principal components for stratification control than does testing of eigenvalues alone, providing substantial computational savings and improvements in power. Analyses of simulated and real data demonstrate the usefulness of the proposed approach.
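The selection idea described in the abstract (rank principal components by both eigenvalue and correlation with the phenotype) can be illustrated with a minimal sketch. This is not the published EigenCorr statistic: the score used here, eigenvalue times squared phenotype correlation, is an assumption made for illustration, and the function names score_pcs and select_pcs, the n_pcs and n_keep parameters, and the simulated two-population data are all hypothetical.

```python
import numpy as np

def score_pcs(genotypes, phenotype, n_pcs=10):
    """Score the top genotype PCs by eigenvalue and phenotype correlation.

    Illustrative only: score = eigenvalue * corr^2 is an assumed stand-in,
    not the statistic defined in the paper.
    """
    G = np.asarray(genotypes, dtype=float)       # samples x SNPs
    G = G[:, G.std(axis=0) > 0]                  # drop monomorphic SNPs
    Z = (G - G.mean(axis=0)) / G.std(axis=0)     # standardize each SNP
    K = Z @ Z.T / Z.shape[1]                     # sample-by-sample covariance
    eigvals, eigvecs = np.linalg.eigh(K)         # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_pcs]      # indices of the largest eigenvalues
    eigvals, pcs = eigvals[top], eigvecs[:, top]
    y = np.asarray(phenotype, dtype=float)
    corr = np.array([np.corrcoef(pcs[:, k], y)[0, 1]
                     for k in range(pcs.shape[1])])
    return eigvals, corr, eigvals * corr ** 2

def select_pcs(scores, n_keep=2):
    """Return indices of the n_keep highest-scoring PCs (0 = largest eigenvalue)."""
    return np.argsort(scores)[::-1][:n_keep]

# Hypothetical data: two subpopulations with shifted allele frequencies,
# and a binary phenotype that is more common in the second subpopulation.
rng = np.random.default_rng(0)
n, m = 200, 300
pop = np.repeat([0, 1], n // 2)
p0 = rng.uniform(0.1, 0.9, size=m)
p1 = np.clip(p0 + rng.normal(0.0, 0.15, size=m), 0.05, 0.95)
freqs = np.where(pop[:, None] == 0, p0, p1)      # per-sample allele frequencies
genotypes = rng.binomial(2, freqs)               # 0/1/2 minor-allele counts
phenotype = (rng.random(n) < 0.2 + 0.6 * pop).astype(float)

eigvals, corr, scores = score_pcs(genotypes, phenotype)
selected = select_pcs(scores, n_keep=2)
print("selected PCs:", selected)
```

In this simulated setting the leading component tracks subpopulation membership and is also correlated with the phenotype, so it receives the highest score. Components with large eigenvalues but little phenotype correlation rank lower, which is the behaviour the abstract describes as selecting fewer components than eigenvalue testing alone.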

SUBMITTER: Lee S 

PROVIDER: S-EPMC3117098 | biostudies-literature | 2011 Sep

REPOSITORIES: biostudies-literature

Publications

Control of population stratification by correlation-selected principal components.

Seunggeun Lee, Fred A. Wright, Fei Zou

Biometrics, 2010-12-06, issue 3


Similar Datasets

| S-EPMC2941459 | biostudies-literature
| S-EPMC3864649 | biostudies-literature
| S-EPMC2764806 | biostudies-literature
| S-EPMC3392282 | biostudies-literature
| S-EPMC3150322 | biostudies-literature
| S-EPMC6475581 | biostudies-literature
| S-EPMC7077175 | biostudies-literature
| S-EPMC7186421 | biostudies-literature
| S-EPMC3032075 | biostudies-literature
| S-EPMC2912642 | biostudies-literature