Project description:The growing availability of high-quality genomic annotation has increased the potential for mechanistic insights when the specific variants driving common genome-wide association signals are accurately localized. A range of fine-mapping strategies have been advocated, and specific successes reported, but the overall performance of such approaches, in the face of the extensive linkage disequilibrium that characterizes the human genome, is not well understood. Using simulations based on sequence data from the 1000 Genomes Project, we quantify the extent to which fine-mapping, here conducted using an approximate Bayesian approach, can be expected to lead to useful improvements in causal variant localization. We show that resolution is highly variable between loci, and that performance is severely degraded as the statistical power to detect association is reduced. We confirm that, where causal variants are shared between ancestry groups, further improvements in performance can be obtained in a trans-ethnic fine-mapping design. Finally, using empirical data from a recently published genome-wide association study for ankylosing spondylitis, we provide empirical confirmation of the behaviour of the approximate Bayesian approach and demonstrate that seven of twenty-six loci can be fine-mapped to fewer than ten variants.
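As an illustration of the kind of approximate Bayesian fine-mapping this abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes Wakefield-style approximate Bayes factors computed from per-variant effect estimates and standard errors, exactly one causal variant per locus, and equal prior probability for every variant. The function names and the prior variance of 0.04 are illustrative assumptions, not taken from the abstract.

```python
import math


def log_abf(beta, se, prior_var=0.04):
    """Log approximate Bayes factor (association vs. null) for one variant.

    Assumes the effect estimate is normal with variance se**2 and that the
    true effect has a N(0, prior_var) prior (prior_var is an assumed value).
    """
    v = se ** 2
    shrink = prior_var / (v + prior_var)
    z = beta / se
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z * z * shrink


def posterior_probs(betas, ses, prior_var=0.04):
    """Posterior probability that each variant is the single causal one."""
    logs = [log_abf(b, s, prior_var) for b, s in zip(betas, ses)]
    top = max(logs)  # subtract the maximum before exponentiating for stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]


def credible_set(probs, coverage=0.95):
    """Smallest set of variant indices whose posteriors sum to >= coverage."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += probs[i]
        if cumulative >= coverage:
            break
    return chosen
```

Under these assumptions, a locus counts as finely mapped when `credible_set` returns only a handful of indices. Extensive linkage disequilibrium spreads the posterior probability across many correlated variants and enlarges the set.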
Project description:Inflammatory bowel diseases are chronic gastrointestinal inflammatory disorders that affect millions of people worldwide. Genome-wide association studies have identified 200 inflammatory bowel disease-associated loci, but few have been conclusively resolved to specific functional variants. Here we report fine-mapping of 94 inflammatory bowel disease loci using high-density genotyping in 67,852 individuals. We pinpoint 18 associations to a single causal variant with greater than 95% certainty, and an additional 27 associations to a single variant with greater than 50% certainty. These 45 variants are significantly enriched for protein-coding changes (n = 13), direct disruption of transcription-factor binding sites (n = 3), and tissue-specific epigenetic marks (n = 10), with the last category showing enrichment in specific immune cells among associations stronger in Crohn's disease and in gut mucosa among associations stronger in ulcerative colitis. The results of this study suggest that high-resolution fine-mapping in large samples can convert many discoveries from genome-wide association studies into statistically convincing causal variants, providing a powerful substrate for experimental elucidation of disease mechanisms.
Project description:Aims/hypothesis: Type 2 diabetes is a growing global public health challenge. Investigating quantitative traits, including fasting glucose, fasting insulin and HbA1c, that serve as early markers of type 2 diabetes progression may lead to a deeper understanding of the genetic aetiology of type 2 diabetes development. Previous genome-wide association studies (GWAS) have identified over 500 loci associated with type 2 diabetes, glycaemic traits and insulin-related traits. However, most of these findings were based only on populations of European ancestry. To address this research gap, we examined the genetic basis of fasting glucose, fasting insulin and HbA1c in participants of the diverse Population Architecture using Genomics and Epidemiology (PAGE) Study. Methods: We conducted a GWAS of fasting glucose (n = 52,267), fasting insulin (n = 48,395) and HbA1c (n = 23,357) in participants without diabetes from the diverse PAGE Study (23% self-reported African American, 46% Hispanic/Latino, 40% European, 4% Asian, 3% Native Hawaiian, 0.8% Native American), performing transethnic and population-specific GWAS meta-analyses, followed by fine-mapping to identify and characterise novel loci and independent secondary signals in known loci. Results: Four novel associations were identified (p < 5 × 10^-9), including three loci associated with fasting insulin, and a novel, low-frequency African American-specific locus associated with fasting glucose. Additionally, seven secondary signals were identified, including novel independent secondary signals for fasting glucose at the known GCK locus and for fasting insulin at the known PPP1R3B locus in transethnic meta-analysis. Conclusions/interpretation: Our findings provide new insights into the genetic architecture of glycaemic traits and highlight the continued importance of conducting genetic studies in diverse populations. Data availability: Full summary statistics from each of the population-specific and transethnic results are available at NHGRI-EBI GWAS catalog ( https://www.ebi.ac.uk/gwas/downloads/summary-statistics ).
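The abstract above does not say which meta-analysis model was used. As a hedged illustration of how population-specific GWAS estimates for one variant can be combined, the sketch below assumes an inverse-variance-weighted fixed-effects model, which is one common choice and not necessarily the one used in the PAGE analysis. The function name and the example inputs are hypothetical.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted fixed-effects meta-analysis for one variant.

    betas, ses: per-population effect estimates and standard errors.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / (s * s) for s in ses]
    weight_sum = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical estimates from two population-specific analyses.
beta, se, z, p = fixed_effects_meta([0.10, 0.10], [0.02, 0.02])
```

A fixed-effects model assumes the same true effect in every population. When effects differ by ancestry, a random-effects or dedicated transethnic method may be more appropriate.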
Project description:Dynamically regulated changes in chromatin states are vital for normal development and can produce disease when they go awry. Accordingly, much effort has been devoted to characterizing these states under normal and pathological conditions. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is the most widely used method to characterize where in the genome transcription factors, modified histones, modified nucleotides and chromatin binding proteins are found; bisulfite sequencing (BS-seq) and its variants are commonly used to characterize the locations of DNA modifications. Though very powerful, these methods are not without limitations. Notably, they are best at characterizing one chromatin feature at a time, yet chromatin features arise and function in combination. Investigators commonly superimpose separate ChIP-seq or BS-seq datasets, and then infer where chromatin features are found together. While these inferences might be correct, they can be misleading when the chromatin source has distinct cell types, or when a given cell type exhibits any cell to cell variation in chromatin state. These ambiguities can be eliminated by robust methods that directly characterize the existence and genomic locations of combinations of chromatin features in very small inputs of cells or ideally, single cells. Here we review single molecule epigenomic methods under development to overcome these limitations, the technical challenges associated with single molecule methods and their potential application to single cells.
Project description:A recent genome-wide association study (GWAS) of central obesity identified 27 loci, from sex-combined analysis, associated with waist-to-hip ratio adjusted for body-mass index (WHRadjBMI) in European-ancestry individuals. Nevertheless, the identified variants may not be the biological causal ones due to the presence of linkage disequilibrium (LD). To better understand the mechanisms underlying the identified loci from the GWAS meta-analysis, we first imputed summary statistics at GWAS loci to increase genetic resolution, and then we applied a Bayesian statistical fine-mapping method through PAINTOR, incorporating LD structure and functional annotations to select and prioritize the most plausible causal variants across WHRadjBMI-associated regions. Using adipose tissue- and cell-specific annotations that showed significant associations with WHRadjBMI, we identified 33 single-nucleotide polymorphisms (SNPs) from 27 sex-combined fine-mapping loci with posterior probability of causality greater than 0.9. Six of the selected 33 SNPs belong to at least one of the top five identified annotations. SNPs rs1440372 (SMAD6) and rs12608504 (JUND) are particularly important since they not only have associated functional annotations but are also GWA hits in the original study. Incorporation of functional annotations helps identify additional plausible causal variants, such as rs2213731 (DNM3-PIGC) and rs4531856 (JUND), that did not reach genome-wide significance in GWAS. Our results provide promising candidates for future functional validation experiments.
Project description:Genome-wide association studies (GWASs) are instrumental in identifying loci harboring common single-nucleotide variants (SNVs) that affect human traits and diseases. GWAS hits emerge in clusters, but the focus is often on the most significant hit in each trait- or disease-associated locus. The remaining hits represent SNVs in linkage disequilibrium (LD) and are considered redundant, so they are often only marginally reported or exploited. Here, we interrogate the value of integrating the full set of GWAS hits in a locus repeatedly associated with cardiac conduction traits and arrhythmia, SCN5A-SCN10A. Our analysis reveals 5 common 7-SNV haplotypes (Hap1-5), with 2 combinations associated with a life-threatening arrhythmia, Brugada syndrome (the risk Hap1/1 and protective Hap2/3 genotypes). Hap1 and Hap2 share 3 SNVs; thus, this analysis suggests that assuming redundancy among clustered GWAS hits can lead to confounding disease-risk associations and supports the need to deconstruct GWAS data in the context of haplotype composition.
Project description:Previous studies have identified 41 independent genome-wide significant psoriasis susceptibility loci. After our first psoriasis genome-wide association study, we designed a custom genotyping array to fine-map eight genome-wide significant susceptibility loci known at that time (IL23R, IL13, IL12B, TNIP1, MHC, TNFAIP3, IL23A and RNF114), enabling genotyping of 2269 single-nucleotide polymorphisms (SNPs) in the eight loci for 2699 psoriasis cases and 2107 unaffected controls of European ancestry. We imputed these data using the latest 1000 Genomes reference haplotypes, which included both indels and SNPs, to increase the marker density of the eight loci to 49,239 genetic variants. Using stepwise conditional association analysis, we identified nine independent signals distributed across six of the eight loci. In the major histocompatibility complex (MHC) region, we detected three independent signals at rs114255771 (P = 2.94 × 10^-74), rs6924962 (P = 3.21 × 10^-19) and rs892666 (P = 1.11 × 10^-10). Near IL12B we detected two independent signals at rs62377586 (P = 7.42 × 10^-16) and rs918518 (P = 3.22 × 10^-11). Only one signal was observed in each of the TNIP1 (rs17728338; P = 4.15 × 10^-13), IL13 (rs1295685; P = 1.65 × 10^-7), IL23A (rs61937678; P = 1.82 × 10^-7) and TNFAIP3 (rs642627; P = 5.90 × 10^-7) regions. We also imputed variants for eight HLA genes and found that SNP rs114255771 yielded a more significant association than any HLA allele or amino-acid residue. Further analysis revealed that the HLA-C*06-B*57 haplotype tagged by this SNP had a significantly higher odds ratio than other HLA-C*06-bearing haplotypes. The results demonstrate allelic heterogeneity at IL12B and identify a high-risk MHC class I haplotype, consistent with the existence of multiple psoriasis effectors in the MHC.
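The stepwise conditional association analysis mentioned above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the study's pipeline. It uses a linear model on a quantitative phenotype rather than the logistic model a case-control analysis would normally use, a normal approximation for the test statistic, and a hypothetical function name, threshold default, and input layout (one list of dosages per variant).

```python
import math


def _residualize(vec, basis):
    """Remove from vec its projection onto each mutually orthogonal basis vector."""
    out = list(vec)
    for q in basis:
        coef = sum(a * b for a, b in zip(out, q)) / sum(b * b for b in q)
        out = [a - coef * b for a, b in zip(out, q)]
    return out


def stepwise_conditional(genotypes, phenotype, p_threshold=5e-8, max_signals=10):
    """Forward stepwise selection of independent signals in one locus.

    At each step every remaining variant is tested conditional on the variants
    already selected; the most significant one is added if it passes
    p_threshold. Returns the selected variant indices in order of selection.
    """
    n = len(phenotype)
    basis = [[1.0] * n]                 # intercept, then selected variants
    y = _residualize(phenotype, basis)  # phenotype adjusted for the basis
    selected = []
    while len(selected) < max_signals:
        dof = n - len(basis) - 1
        best = None
        for j, dosages in enumerate(genotypes):
            if j in selected:
                continue
            g = _residualize(dosages, basis)
            gg = sum(a * a for a in g)
            if gg < 1e-9:               # collinear with conditioned variants
                continue
            beta = sum(a * b for a, b in zip(g, y)) / gg
            rss = sum((b - beta * a) ** 2 for a, b in zip(g, y))
            if rss == 0.0:              # degenerate perfect fit; skip
                continue
            se = math.sqrt(rss / dof / gg)
            p = math.erfc(abs(beta / se) / math.sqrt(2.0))
            if best is None or p < best[1]:
                best = (j, p, g)
        if best is None or best[1] >= p_threshold:
            break
        selected.append(best[0])
        basis.append(best[2])           # stays orthogonal to earlier vectors
        y = _residualize(y, [best[2]])
    return selected
```

Residualizing each candidate against the already-selected variants is equivalent to fitting them jointly in one regression. A variant that is significant only because it is in linkage disequilibrium with an earlier signal therefore loses its association once that signal is conditioned on.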
Project description:Genome-wide association studies (GWASs) have enabled unbiased identification of genetic loci contributing to common complex diseases. Because GWAS loci often harbor many variants and genes, it remains a major challenge to move from GWASs' statistical associations to the identification of causal variants and genes that underlie these association signals. Researchers have applied many statistical and functional fine-mapping strategies to prioritize genetic variants and genes as potential candidates. There is no gold standard in fine-mapping approaches, but consistent results across different approaches can improve confidence in the fine-mapping findings. Here, we combined text mining with a systematic review and formed a catalog of 85 studies with evidence of fine mapping for at least one autoimmune GWAS locus. Across all fine-mapping studies, we compiled 230 GWAS loci with allelic heterogeneity estimates and predictions of causal variants and trait-relevant genes. These 230 loci included 455 combinations of locus-by-disease association signals with 15 autoimmune diseases. Using these estimates, we assessed the probability of mediating disease risk associations across genes in GWAS loci and identified robust signals of causal disease biology. We predict that this comprehensive catalog of GWAS fine-mapping efforts in autoimmune disease will greatly help distill the plethora of information in the field and inform therapeutic strategies.
Project description:The germinal center (GC) response is critical for both effective adaptive immunity and establishing peripheral tolerance by limiting autoreactive B cells. Dysfunction in these processes can lead to defective immune responses to infection or contribute to autoimmune disease. To understand the gene regulatory principles underlying the GC response, we generated a single-cell transcriptomic and epigenomic atlas of the human tonsil, a widely studied and representative lymphoid tissue. We characterize diverse immune cell subsets and build a trajectory of dynamic gene expression and transcription factor activity during B cell activation, GC formation, and plasma cell differentiation. We subsequently leverage cell type–specific transcriptomic and epigenomic maps to interpret potential regulatory impact of genetic variants implicated in autoimmunity, revealing that many exhibit their greatest regulatory potential in GC-associated cellular populations. These included gene loci linked with known roles in GC biology (IL21, IL21R, IL4R, and BCL6) and transcription factors regulating B cell differentiation (POU2AF1 and HHEX). Together, these analyses provide a powerful new cell type–resolved resource for the interpretation of cellular and genetic causes underpinning autoimmune disease.