Dataset Information

Tracing sub-structure in the European American population with PCA-informative markers.

ABSTRACT: Genetic structure in the European American population reflects waves of migration and recent gene flow among different populations. This complex structure can introduce bias in genetic association studies. Using Principal Components Analysis (PCA), we analyze the structure of two independent European American datasets (1,521 individuals, 307,315 autosomal SNPs). Individual variation lies across a continuum with some individuals showing high degrees of admixture with non-European populations, as demonstrated through joint analysis with HapMap data. The CEPH Europeans only represent a small fraction of the variation encountered in the larger European American datasets we studied. We interpret the first eigenvector of these data as correlated with ancestry, and we apply an algorithm that we have previously described to select PCA-informative markers (PCAIMs) that can reproduce this structure. Importantly, we develop a novel method that can remove redundancy from the selected SNP panels and show that we can effectively remove correlated markers, thus increasing genotyping savings. Only 150-200 PCAIMs suffice to accurately predict fine structure in European American datasets, as identified by PCA. Simulating association studies, we couple our method with a PCA-based stratification correction tool and demonstrate that a small number of PCAIMs can efficiently remove false correlations with almost no loss in power. The structure informative SNPs that we propose are an important resource for genetic association studies of European Americans. Furthermore, our redundancy removal algorithm can be applied on sets of ancestry informative markers selected with any method in order to select the most uncorrelated SNPs, and significantly decreases genotyping costs.
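
ILLUSTRATION: The workflow outlined in the abstract (run PCA on a genotype matrix, rank SNPs by how much they contribute to the leading eigenvector, then discard redundant markers) can be sketched on synthetic data as follows. This is a minimal sketch and not the authors' implementation. The function names, the simulated allele frequencies, the loading-based score, and the greedy r-squared filter are all assumptions made for illustration; in particular, the greedy filter is a simple stand-in for the redundancy removal method of the paper, which the abstract does not specify.

```python
import numpy as np

def pca_marker_scores(genotypes, k=1):
    """Score each SNP by its squared weight in the top-k principal axes.

    genotypes: (individuals x SNPs) array of 0/1/2 allele counts.
    Assumption: columns are mean-centered only; no variance scaling.
    """
    centered = genotypes - genotypes.mean(axis=0)
    # Thin SVD: rows of vt are principal axes expressed over SNPs.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = (vt[:k] ** 2).sum(axis=0)
    projections = u[:, :k] * s[:k]  # coordinates of individuals on the top-k PCs
    return scores, projections

def drop_correlated(genotypes, ranked, max_r2=0.95):
    """Greedy filter (hypothetical stand-in): walk SNPs in ranked order and
    keep one only if its squared correlation with every kept SNP is <= max_r2."""
    kept = []
    for j in ranked:
        redundant = False
        for i in kept:
            r = np.corrcoef(genotypes[:, i], genotypes[:, j])[0, 1]
            if r * r > max_r2:
                redundant = True
                break
        if not redundant:
            kept.append(j)
    return kept

# Synthetic data: two subpopulations of 100 individuals, 50 SNPs.
rng = np.random.default_rng(0)
n, m = 200, 50
labels = np.repeat([0, 1], n // 2)
freqs = np.full((2, m), 0.5)
freqs[0, :5] = 0.1   # SNPs 0-4 differ strongly between the two groups
freqs[1, :5] = 0.9
genotypes = rng.binomial(2, freqs[labels]).astype(float)
genotypes[:, 5] = genotypes[:, 0]  # SNP 5 is an exact copy of SNP 0 (redundant)

scores, projections = pca_marker_scores(genotypes, k=1)
ranked = np.argsort(scores)[::-1].tolist()      # most informative first
selected = drop_correlated(genotypes, ranked[:6])
print("top-ranked SNPs:", sorted(ranked[:6]))
print("after redundancy filter:", sorted(selected))
```

In this toy setting the six structure-carrying columns rank highest, and the filter drops one of the two identical columns. Real panels would need a threshold and a scoring rule chosen against the actual data.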

SUBMITTER: Paschou P

PROVIDER: S-EPMC2537989 | biostudies-literature | 2008 Jul

REPOSITORIES: biostudies-literature

Publications

Tracing sub-structure in the European American population with PCA-informative markers.

Paschou P, Drineas P, Lewis J, Nievergelt CM, Nickerson DA, Smith JD, Ridker PM, Chasman DI, Krauss RM, Ziv E

PLoS Genetics, 2008-07-04, issue 7

PMID: 18797516

Similar Datasets

European population genetic substructure: further definition of ancestry informative markers for distinguishing among diverse European ethnic groups.
| S-EPMC2730349 | biostudies-literature

Using ancestry-informative markers to identify fine structure across 15 populations of European origin.
| S-EPMC4169539 | biostudies-literature

Ancient Ancestry Informative Markers for Identifying Fine-Scale Ancient Population Structure in Eurasians.
| S-EPMC6316245 | biostudies-literature

Consequences of PCA graphs, SNP codings, and PCA variants for elucidating population structure.
| S-EPMC6581268 | biostudies-literature

A minimum set of ancestry informative markers for determining admixture proportions in a mixed American population: the Brazilian set.
| S-EPMC4930091 | biostudies-literature

Peptide Ancestry Informative Markers in Uterine Neoplasms from Women of European, African and Asian Ancestry
2022-02-16 | PXD029323 | Pride

Genome-Wide Informative Microsatellite Markers and Population Structure of Fusarium virguliforme from Argentina and the USA.
| S-EPMC10672573 | biostudies-literature

Detecting inversions with PCA in the presence of population structure.
| S-EPMC7595445 | biostudies-literature

Peptide ancestry informative markers in uterine neoplasms from women of European, African, and Asian ancestry.
| S-EPMC8753123 | biostudies-literature

Gallbladder Cancer Risk and Indigenous South American Mapuche Ancestry: Instrumental Variable Analysis Using Ancestry-Informative Markers.
| S-EPMC10452561 | biostudies-literature