Dataset Information

Estimating linkage disequilibrium from genotypes under Hardy-Weinberg equilibrium.

ABSTRACT:

Background

Measures of linkage disequilibrium (LD) play a key role in a wide range of applications, from disease association to demographic history estimation. The true population LD cannot be measured directly and can only be inferred from genetic samples, which are unavoidably subject to measurement error. Previous results on r² (a measure of LD), such as its bias due to finite sample size and its variance, were derived for the special case in which the true population-wise LD is zero. These results generally do not hold for non-zero r² values, which are more common in real genetic data.

Results

This work generalises the estimation of r² to all levels of LD, for both phased and unphased data. First, we provide new formulae for the effect of finite sample size on the observed r² values. Second, we find a new empirical formula for the variance of the observed r², equal to 2E[r²](1 − E[r²])/n, where n is the diploid sample size. Third, we propose a new routine, Constrained ML, a likelihood-based method to directly estimate haplotype frequencies and r² from diploid genotypes under Hardy-Weinberg equilibrium. While serving the same purpose as the pre-existing Expectation-Maximisation algorithm, the new routine can have better convergence and is simpler to use. A new likelihood-ratio test is also introduced to test for the absence of a particular haplotype. Extensive simulations are run to support these findings.
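The quantities named in the Results can be illustrated with a short Python sketch. It is not the authors' Constrained ML routine, whose details are not given here. It assumes two biallelic loci with alleles A/a and B/b, fixes the allele frequencies at their sample values (an assumption made for this illustration), and finds the AB haplotype frequency by a brute-force grid search over the genotype likelihood under Hardy-Weinberg equilibrium. All function names are hypothetical.

```python
import math
from itertools import product


def r_squared(p_ab, p_a, p_b):
    """r^2 from the AB haplotype frequency and the A and B allele frequencies."""
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def r2_variance(expected_r2, n):
    """Empirical variance quoted in the abstract: 2 E[r^2] (1 - E[r^2]) / n."""
    return 2.0 * expected_r2 * (1.0 - expected_r2) / n


def genotype_probs(p_ab, p_a, p_b):
    """3x3 genotype probabilities under HWE (random pairing of haplotypes).

    Entry [i][j] is the probability of carrying i copies of A and j copies of B.
    """
    hap = {
        (1, 1): p_ab,
        (1, 0): p_a - p_ab,
        (0, 1): p_b - p_ab,
        (0, 0): 1.0 - p_a - p_b + p_ab,
    }
    probs = [[0.0] * 3 for _ in range(3)]
    for (h1, f1), (h2, f2) in product(hap.items(), repeat=2):
        # Clamp tiny negative frequencies caused by floating-point rounding.
        probs[h1[0] + h2[0]][h1[1] + h2[1]] += max(f1, 0.0) * max(f2, 0.0)
    return probs


def log_likelihood(counts, p_ab, p_a, p_b):
    """Multinomial log-likelihood of genotype counts (constant term omitted)."""
    probs = genotype_probs(p_ab, p_a, p_b)
    total = 0.0
    for i in range(3):
        for j in range(3):
            if counts[i][j] > 0:
                if probs[i][j] <= 0.0:
                    return -math.inf
                total += counts[i][j] * math.log(probs[i][j])
    return total


def grid_mle(counts, steps=1000):
    """Grid-search estimate of the AB haplotype frequency (illustrative only)."""
    n = sum(sum(row) for row in counts)  # diploid sample size
    p_a = sum(i * counts[i][j] for i in range(3) for j in range(3)) / (2.0 * n)
    p_b = sum(j * counts[i][j] for i in range(3) for j in range(3)) / (2.0 * n)
    lo, hi = max(0.0, p_a + p_b - 1.0), min(p_a, p_b)  # feasible range of p_AB
    grid = (lo + (hi - lo) * k / steps for k in range(steps + 1))
    best = max(grid, key=lambda x: log_likelihood(counts, x, p_a, p_b))
    return best, p_a, p_b
```

Given a 3×3 table of genotype counts, `grid_mle(counts)` returns the estimated AB haplotype frequency together with the two allele frequencies. Passing these to `r_squared` gives a point estimate of r², and `r2_variance` applies the variance formula above with n as the diploid sample size.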

Conclusion

Most inferences on LD will benefit from our new findings, from point and interval estimation to hypothesis testing. Genetic analyses utilising r² information will become more accurate as a result.

SUBMITTER: Hui TJ

PROVIDER: S-EPMC7045472 | biostudies-literature | 2020 Feb

REPOSITORIES: biostudies-literature


Publications

Estimating linkage disequilibrium from genotypes under Hardy-Weinberg equilibrium.

Hui Tin-Yu J, Burt Austin

BMC Genetics, 2020-02-26, issue 1


PMID: 32102657

Similar Datasets

Project description: Objective: Departure from Hardy-Weinberg Equilibrium (HWE) may occur due to a variety of causes, including purifying selection, inbreeding, population substructure, copy number variation or genotyping error. We searched for specific characteristics of HWE-departure due to genotyping error. Methods: Genotypes of a random set of genetic variants were obtained from the Exome Aggregation Consortium (ExAC) database. Variants with <80% successful genotypes or with minor allele frequency (MAF) <1% were excluded. HWE-departure (d-HWE) was considered significant at p < 10E-05 and classified as d-HWE with loss of heterozygosity (LoH d-HWE) or d-HWE with excess heterozygosity (gain of heterozygosity: GoH d-HWE). Missing genotypes, variant type (single nucleotide polymorphism (SNP) vs. insertion/deletion), MAF, standard deviation (SD) of MAF across populations (MAF-SD) and copy number variation were evaluated for association with HWE-departure. Results: The study sample comprised 3,204 genotype distributions. HWE-departure was observed in 134 variants: LoH d-HWE in 41 (1.3%) and GoH d-HWE in 93 (2.9%) variants. LoH d-HWE was more likely in variants located within deletion polymorphisms (p < 0.001) and in variants with higher MAF-SD (p = 0.0077). GoH d-HWE was associated with low genotyping rate, with variants of insertion/deletion type and with high MAF (all at p < 0.001). In a sub-sample of 2,196 variants with genotyping rate >98%, LoH d-HWE was found in 29 (1.3%) variants, but no GoH d-HWE was detected. The non-random distribution of HWE-violating SNPs along the chromosome, the association with common deletion polymorphisms and indel-variant type, and the excess of heterozygotes in genomic regions that are prone to cross-hybridization were confirmed in a large sample of short variants from the 1000 Genomes Project. Conclusions: We differentiated between two types of HWE-departure. GoH d-HWE was suggestive of genotyping error. LoH d-HWE, on the contrary, pointed to natural variability such as population substructure or common deletion polymorphisms.
