Population-specific long-range linkage disequilibrium in the human genome and its influence on identifying common disease variants.
ABSTRACT: Despite the availability of large-scale sequencing data, long-range linkage disequilibrium (LRLD) has not been extensively studied. The theoretical aspects of LRLD estimates were studied to determine the best estimation method for the sequencing data of three populations of African (AFR), European (EUR), and East-Asian (EAS) descent from the 1000 Genomes Project. Genome-wide LRLDs excluding centromeric regions revealed clear population specificity, with substantially more population-specific LRLDs than coincident LRLDs. Clear relationships between the functions of the regions in LRLD indicated long-range interactions in the genome. The proportion of gene regions was higher among LRLD variants, and the coding sequence (CDS)-CDS LRLDs showed obvious functional similarities between the genes involved. Application to theoretical case-control associations confirmed that LRLDs in genome-wide association studies (GWASs) could contribute to false signals, although the impact might not be severe in most cases. LRLDs between functionally similar variants exist in the human genome, indicating possible gene-gene interactions, and they differ between populations. Based on the current study, LRLDs should be examined in GWASs to identify true signals. More importantly, population specificity in LRLDs should be examined in relevant studies.
SUBMITTER: Park L
PROVIDER: S-EPMC6684625 | biostudies-literature | 2019 Aug
REPOSITORIES: biostudies-literature