Population genetic properties of differentiated human copy number polymorphisms
ABSTRACT: Copy number variants (CNVs) can reach appreciable frequencies in the human population, and several of these copy number polymorphisms (CNPs) have recently been associated with human diseases, including lupus, psoriasis, Crohn disease, and obesity. Despite recent advances, significant biases remain in CNP discovery and genotyping. We developed a novel method based on single-channel intensity data and benchmarked it against copy numbers determined from sequencing read depth, obtaining CNP genotypes for 1489 CNPs in 487 human DNA samples from diverse ethnic backgrounds. The customized microarray used for genotyping was enriched for segmental duplication-rich regions and for novel insertions of sequence not represented in the reference genome assembly or on standard single nucleotide polymorphism (SNP) microarray platforms. We observe that CNPs in segmental duplications are more likely to be population differentiated than CNPs in unique regions (p = 0.015) and that bi-allelic CNPs show greater stratification than frequency-matched SNPs (p = 0.0026). Although bi-allelic CNPs show a strong correlation of copy number with flanking SNP genotypes, the majority of multi-copy CNPs do not (40% with r > 0.8). We selected a subset of CNPs for further characterization in 1873 additional samples from 62 populations (947 samples analyzed by microarray; 926 samples analyzed with PCR-based assays); this revealed striking population-differentiated structural variants in genes of clinical significance, such as OCLN, which encodes a tight junction protein involved in hepatitis C virus entry. Our new microarray design allows these variants to be rapidly tested for disease association, and our results suggest that CNPs (especially those that are not in linkage disequilibrium with SNPs) may have contributed disproportionately to human diversity and selection.
ORGANISM(S): Homo sapiens
PROVIDER: GSE26450 | GEO | 2011/02/01
SECONDARY ACCESSION(S): PRJNA136553
REPOSITORIES: GEO
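The abstract reports how often CNP copy number correlates with flanking SNP genotypes at r > 0.8. The sketch below is one plausible way to run such a check, not the authors' pipeline. The function names (`pearson_r`, `is_tagged`), the 0/1/2 SNP genotype coding, the use of the absolute value of r (the sign depends on which allele is counted), and the toy genotype vectors are all assumptions made for illustration and are not data from GSE26450.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # A monomorphic site carries no information; treat as uncorrelated.
        return 0.0
    return sxy / sqrt(sxx * syy)


def is_tagged(copy_numbers, snp_genotypes, threshold=0.8):
    """True if |r| between copy number and SNP genotype exceeds threshold.

    Assumption: the absolute value is used because the sign of r depends
    only on which SNP allele is counted.
    """
    return abs(pearson_r(copy_numbers, snp_genotypes)) > threshold


# Hypothetical toy data, one entry per sample.
# Bi-allelic deletion CNP: copy number 0, 1, or 2, tracked by a SNP.
biallelic_cn = [2, 2, 1, 1, 0, 2, 1, 0]
biallelic_snp = [0, 0, 1, 1, 2, 0, 1, 2]

# Multi-copy CNP: wider copy-number range, poorly tracked by the SNP.
multicopy_cn = [2, 3, 4, 5, 3, 4, 6, 2]
multicopy_snp = [0, 1, 0, 2, 1, 2, 1, 0]

if __name__ == "__main__":
    print(is_tagged(biallelic_cn, biallelic_snp))
    print(is_tagged(multicopy_cn, multicopy_snp))
```

In this toy input the bi-allelic CNP is perfectly tracked by its flanking SNP while the multi-copy CNP is not. This mirrors the pattern the abstract describes but demonstrates nothing about the real data.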