Project description:Fourteen years after the first genome-wide association study (GWAS) of lung cancer was published, approximately 45 genomic loci have now been significantly associated with lung cancer risk. While functional characterization has been performed for several of these loci, a comprehensive summary of the current molecular understanding of lung cancer risk has been lacking. Further, many novel computational and experimental tools have now become available to accelerate the functional assessment of disease-associated variants, moving beyond locus-by-locus approaches. In this review, we first highlight the heterogeneity of lung cancer GWAS findings across histological subtypes, ancestries and smoking status, which poses unique challenges to follow-up studies. We then summarize the published lung cancer post-GWAS studies for each risk-associated locus to assess the current understanding of biological mechanisms beyond the initial statistical association. We further summarize strategies for GWAS functional follow-up studies, considering cutting-edge functional genomics tools and providing a catalog of available resources relevant to lung cancer. Overall, we aim to highlight the importance of integrating computational and experimental approaches to draw biological insights from the lung cancer GWAS results beyond association.
Project description:Genome-wide association studies (GWAS) have successfully identified thousands of genetic variants contributing to disease and other phenotypes. However, significant obstacles hamper our ability to elucidate causal variants, identify genes affected by causal variants, and characterize the mechanisms by which genotypes influence phenotypes. The increasing availability of genome-wide functional annotation data is providing unique opportunities to incorporate prior information into the analysis of GWAS to better understand the impact of variants on disease etiology. Although there have been many advances in incorporating prior information into prioritization of trait-associated variants in GWAS, functional annotation data have played a secondary role in the joint analysis of GWAS and molecular (i.e., expression) quantitative trait loci (eQTL) data in assessing evidence for association. To address this, we develop a novel mediation framework, iFunMed, to integrate GWAS and eQTL data with the utilization of publicly available functional annotation data. iFunMed extends the scope of standard mediation analysis by incorporating information from multiple genetic variants at a time and leveraging variant-level summary statistics. Data-driven computational experiments convey how informative annotations improve single-nucleotide polymorphism (SNP) selection performance while emphasizing robustness of iFunMed to noninformative annotations. Application to Framingham Heart Study data indicates that iFunMed is able to boost detection of SNPs with mediation effects that can be attributed to regulatory mechanisms.
Project description:Recent genome-wide association studies (GWAS) have identified several gene variants associated with sporadic chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL). Many of these CLL/SLL susceptibility loci are located in non-coding or intergenic regions, posing a significant challenge to determine their potential functional relevance. Here, we review the literature of all CLL/SLL GWAS and validation studies, and apply eQTL analysis to identify putatively functional SNPs that affect gene expression that may be causal in the pathogenesis of CLL/SLL. We tested 12 independent risk loci for their potential to alter gene expression through cis-acting mechanisms, using publicly available gene expression profiles with matching genotype information. Sixteen SNPs were identified that are linked to differential expression of SP140, a putative tumor suppressor gene previously associated with CLL/SLL. Three additional SNPs were associated with differential expression of DACT3 and GNG8, which are involved in the WNT/β-catenin- and G protein-coupled receptor signaling pathways, respectively, that have been previously implicated in CLL/SLL pathogenesis. Using in silico functional prediction tools, we found that 14 of the 19 significant eQTL SNPs lie in multiple putative regulatory elements, several of which have prior implications in CLL/SLL or other hematological malignancies. Although experimental validation is needed, our study shows that the use of existing GWAS data in combination with eQTL analysis and in silico methods represents a useful starting point to screen for putatively causal SNPs that may be involved in the etiology of CLL/SLL.
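The cis-eQTL test described above can be illustrated with a minimal sketch. It assumes an additive model in which expression is regressed on allele dosage (0, 1 or 2 copies of the alternate allele) by ordinary least squares. The function name and the toy values are hypothetical and are not taken from the study, which may have used a different model or software.

```python
import math

def eqtl_slope(dosages, expression):
    """Fit expression ~ dosage by least squares.

    Returns (slope, intercept, t_statistic). Illustrative only.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("genotype is monomorphic in this sample")
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    # Residual sum of squares gives the standard error of the slope.
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(dosages, expression))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else math.inf
    return slope, intercept, t_stat

# Hypothetical toy data: six samples, two per genotype class.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, intercept, t_stat = eqtl_slope(toy_dosages, toy_expression)
print(f"slope={slope:.3f} intercept={intercept:.3f} t={t_stat:.2f}")
```

A large absolute t-statistic suggests that expression changes with each additional copy of the allele. A real analysis would also adjust for covariates and correct for multiple testing.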
Project description:Genome-wide association studies have confirmed the involvement of non-coding angiopoietin-like 3 (ANGPTL3) gene variants with coronary artery disease, levels of low-density lipoprotein cholesterol (LDL-C), triglycerides and ANGPTL3 mRNA transcript. Extensive linkage disequilibrium at the locus, however, has hindered efforts to identify the potential functional variants. Using regulatory annotations from ENCODE, combined with functional in vivo assays such as allele-specific formaldehyde-assisted isolation of regulatory elements, statistical approaches including eQTL/lipid colocalisation, and traditional in vitro methodologies including electrophoretic mobility shift assay and luciferase reporter assays, variants affecting the ANGPTL3 regulome were examined. From 253 variants associated with ANGPTL3 mRNA expression, and/or lipid traits, 46 were located within liver regulatory elements and potentially functional. One variant, rs10889352, demonstrated allele-specific effects on DNA-protein interactions, reporter gene expression and chromatin accessibility, in line with effects on LDL-C levels and expression of ANGPTL3 mRNA. The ANGPTL3 gene lies within DOCK7; the variant, however, is located in a non-coding region of DOCK7 outside of ANGPTL3, suggesting complex long-range regulatory effects on gene expression. This study illustrates the power of combining multiple genome-wide datasets with laboratory data to localise functional non-coding variation and provides a model for analysis of regulatory variants from GWAS.
Project description:Over the past decade, hundreds of genome-wide association studies (GWAS) have implicated genetic variants in various diseases, including cancer. However, only a few of these variants have been functionally characterized to date, mainly because the majority of the variants reside in non-coding regions of the human genome with unknown function. A comprehensive functional annotation of the candidate variants is thus necessary to fill the gap between the correlative findings of GWAS and the development of therapeutic strategies. By integrating large-scale multi-omics datasets such as the Cancer Genome Atlas (TCGA) and the Encyclopedia of DNA Elements (ENCODE), we performed multivariate linear regression analysis of expression quantitative trait loci, sequence permutation test of transcription factor binding perturbation, and modeling of three-dimensional chromatin interactions to analyze the potential molecular functions of 2,813 single nucleotide variants in 93 genomic loci associated with estrogen receptor-positive breast cancer. To facilitate rapid progress in functional genomics of breast cancer, we have created "Analysis of Breast Cancer GWAS" (ABC-GWAS), an interactive database of functional annotation of estrogen receptor-positive breast cancer GWAS variants. Our resource includes expression quantitative trait loci, long-range chromatin interaction predictions, and transcription factor binding motif analyses to prioritize putative target genes, causal variants, and transcription factors. An embedded genome browser also facilitates convenient visualization of the GWAS loci in genomic and epigenomic context. ABC-GWAS provides an interactive visual summary of comprehensive functional characterization of estrogen receptor-positive breast cancer variants. 
The web resource will be useful to both computational and experimental biologists who wish to generate and test their hypotheses regarding the genetic susceptibility, etiology, and carcinogenesis of breast cancer. ABC-GWAS can also be used as a user-friendly educational resource for teaching functional genomics. ABC-GWAS is available at http://education.knoweng.org/abc-gwas/.
Project description:The reconciliation between Mendelian inheritance of discrete traits and the genetically based correlation between relatives for quantitative traits was Fisher's infinitesimal model of a large number of genetic variants, each with very small effects, whose causal effects could not be individually identified. The development of genome-wide genetic association studies (GWAS) raised the hope that it would be possible to identify single polymorphic variants with identifiable functional effects on complex traits. It soon became clear that, with larger and larger GWAS on more and more complex traits, most of the significant associations had such small effects, that identifying their individual functional effects was essentially hopeless. Polygenic risk scores that provide an overall estimate of the genetic propensity to a trait at the individual level have been developed using GWAS data. These provide useful identification of groups of individuals with substantially increased risks, which can lead to recommendations of medical treatments or behavioral modifications to reduce risks. However, each such claim will require extensive investigation to justify its practical application. The challenge now is to use limited genetic association studies to find individually identifiable variants of significant functional effect that can help to understand the molecular basis of complex diseases and traits, and so lead to improved disease prevention and treatment. This can best be achieved by 1) the study of rare variants, often chosen by careful candidate assessment, and 2) the careful choice of phenotypes, often extremes of a quantitative variable, or traits with relatively high heritability.
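As a hedged illustration of the polygenic risk scores mentioned above, the sketch below computes the simplest form of such a score: a weighted sum of risk-allele dosages, with per-variant weights taken from GWAS effect sizes. The variant names, weights and dosages are invented for the example. Practical scores typically add steps such as linkage-disequilibrium pruning or shrinkage, which are omitted here.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of allele dosages over the variants in `weights`.

    dosages: dict mapping variant id -> allele count (0, 1 or 2)
    weights: dict mapping variant id -> per-allele effect size
    """
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"no genotype for variants: {sorted(missing)}")
    return sum(weights[v] * dosages[v] for v in weights)

# Hypothetical effect sizes and one hypothetical individual.
toy_weights = {"snpA": 0.10, "snpB": -0.05, "snpC": 0.20}
toy_individual = {"snpA": 2, "snpB": 1, "snpC": 0}
score = polygenic_risk_score(toy_individual, toy_weights)
print(f"polygenic risk score: {score:.2f}")
```

Because every variant contributes only a small term, the score summarizes overall genetic propensity without identifying the functional effect of any single variant, which is the limitation the abstract points out.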
Project description:Tuberculosis (TB) is a serious health issue in the developing world. Lack of knowledge of the etiological mechanisms of TB hinders the development of effective strategies for the treatment or prevention of TB disease. Human genetic study is an indispensable approach to understanding the molecular basis of common diseases. Numerous efforts have been made to screen the human genome for TB susceptibility by linkage mapping. A large number of candidate-gene association studies of TB have been conducted to examine the association of predicted functional DNA variants in candidate genes. Recently, the first genome-wide association study (GWAS) of TB was reported. This GWAS provides proof-of-principle evidence that justifies the genetic approach to understanding TB. Further hypothesis-free efforts in TB research may revise the traditional view of TB genetic susceptibility, as none of the candidate genes with important roles in containing Mycobacterium tuberculosis (MTB) infection was found to be associated with active TB, whereas the TB-associated loci in the GWAS harbor no gene with a function in MTB infection.
Project description:Background: Genome-wide association studies (GWASs) have identified multiple risk loci for bipolar disorder (BD). However, pinpointing functional (or causal) variants in the reported risk loci and elucidating their regulatory mechanisms remain challenging. Methods: We first integrated chromatin immunoprecipitation sequencing (ChIP-Seq) data from human brain tissues (or neuronal cell lines) and position weight matrix (PWM) data to identify functional single-nucleotide polymorphisms (SNPs). Then, we verified the regulatory effects of these transcription factor (TF) binding-disrupting SNPs (hereafter referred to as "functional SNPs") through a series of experiments, including reporter gene assays, allele-specific expression (ASE) analysis, TF knockdown, CRISPR/Cas9-mediated genome editing, and expression quantitative trait loci (eQTL) analysis. Finally, we overexpressed PACS1 (whose expression was most significantly associated with the identified functional SNPs rs10896081 and rs3862386) in mouse primary cortical neurons to investigate if PACS1 affects dendritic spine density. Results: We identified 16 functional SNPs (in 9 risk loci); these functional SNPs disrupted the binding of 7 TFs, for example, CTCF and REST binding was frequently disrupted. We then identified the potential target genes whose expression in the human brain was regulated by these functional SNPs through eQTL analysis. Of note, we showed dysregulation of some target genes of the identified TF binding-disrupting SNPs in BD patients compared with controls, and overexpression of PACS1 reduced the density of dendritic spines, revealing the possible biological mechanisms of these functional SNPs in BD. Conclusions: Our study identifies functional SNPs in some reported risk loci and sheds light on the regulatory mechanisms of BD risk variants. 
Further functional characterization and mechanistic studies of these functional SNPs and candidate genes will help to elucidate BD pathogenesis and develop new therapeutic approaches and drugs.
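The PWM step in the Methods above can be sketched as follows. The example assumes a common scoring scheme: each position contributes the log2 ratio of the motif probability to a uniform background of 0.25, and a SNP's effect is the score difference between the alternate and reference sequences. The three-position matrix and both sequences are hypothetical and do not represent a real CTCF or REST motif or the study's actual pipeline.

```python
import math

BACKGROUND = 0.25  # assumed uniform base frequency

def pwm_log_odds(seq, pwm):
    """Sum of per-position log2(probability / background) for seq."""
    if len(seq) != len(pwm):
        raise ValueError("sequence and PWM must have the same length")
    return sum(math.log2(col[base] / BACKGROUND)
               for base, col in zip(seq, pwm))

def allele_score_change(ref_seq, alt_seq, pwm):
    """Alternate-minus-reference score; negative means weaker predicted binding."""
    return pwm_log_odds(alt_seq, pwm) - pwm_log_odds(ref_seq, pwm)

# Hypothetical 3-bp motif that prefers A, then C, then G.
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
# The SNP changes the middle base from C (reference) to T (alternate).
delta = allele_score_change("ACG", "ATG", toy_pwm)
print(f"score change: {delta:.3f}")
```

A strongly negative change flags a variant as a candidate binding-disrupting SNP. Such a prediction would still need the kind of experimental verification the abstract describes.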
Project description:Introduction: Stroke is a multifactorial, heterogeneous and heritable disorder and is considered one of the major diseases. Prior reports have used a variety of study designs, such as genome-wide association studies (GWAS), replication, case-control, cross-sectional and meta-analysis studies, yet a diagnostic marker is still lacking worldwide. Only a limited number of studies have been carried out in the Saudi population, and we aimed to investigate the molecular association of single nucleotide polymorphisms (SNPs) identified through GWAS and meta-analysis studies in stroke patients in the Saudi population. Methods: In this case-control study, we selected 207 cases and 207 controls, balanced by sex, at King Saud University Hospital in the capital city of Saudi Arabia. Peripheral blood (5 mL) was collected in two different vacutainers; 3 mL of coagulated blood was used for lipid analysis (biochemical tests) and 2 mL was used for DNA analysis (molecular tests). Genomic DNA was extracted from the collected blood samples, specific primers were designed for the selected SNPs (SORT1 rs646218 and OLR1 rs11053646), PCR-RFLP was performed, and randomly selected samples were DNA sequenced to cross-check the results. Results: The rs646218 and rs11053646 polymorphisms showed significant associations under allele, genotype and dominant models, with and without crude odds ratios (ORs), and in multiple logistic regression analysis (p < 0.05). Correlation analysis between lipid profile and genotypes confirmed a significant relation between triglycerides and the rs646218 and rs11053646 polymorphisms. However, the rs11053646 polymorphism was correlated with HDL-C (p = 0.04). Genotypes were also compared between cases and controls separately in males and in females; for rs11053646, the comparison of male cases with male controls showed an association for heterozygote genotypes under the dominant model (p < 0.05). Conclusion: The results of the current study confirmed that the SORT1 and OLR1 SNPs were associated with stroke in the Saudi population. The current results agree with prior results documented through GWAS and meta-analysis. However, studies in other ethnic populations should be performed to confirm or rule out these associations in human hereditary diseases.
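To make the dominant-model comparison above concrete, the sketch below computes a crude odds ratio with a Woolf (log-based) 95% confidence interval from genotype counts. Under a dominant model, carriers of at least one variant allele are compared with reference homozygotes. The genotype counts are invented for illustration and are not the study's data. The function name is hypothetical, and the abstract does not state which interval method the authors used.

```python
import math

def dominant_model_or(case_counts, control_counts):
    """Crude odds ratio for carriers vs reference homozygotes.

    Each argument is (ref_homozygote, heterozygote, alt_homozygote).
    Returns (odds_ratio, ci_low, ci_high) using Woolf's method.
    """
    a = case_counts[1] + case_counts[2]        # carrier cases
    b = case_counts[0]                         # non-carrier cases
    c = control_counts[1] + control_counts[2]  # carrier controls
    d = control_counts[0]                      # non-carrier controls
    if min(a, b, c, d) == 0:
        raise ValueError("a zero cell makes the odds ratio undefined")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, ci_low, ci_high

# Hypothetical genotype counts (NOT the study's data).
toy_cases = (100, 80, 20)
toy_controls = (140, 50, 10)
odds_ratio, ci_low, ci_high = dominant_model_or(toy_cases, toy_controls)
print(f"OR={odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

An interval that excludes 1 corresponds to a nominally significant crude association. The adjusted estimates reported in the abstract would instead come from multiple logistic regression.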