MethReg: estimating the regulatory potential of DNA methylation in gene transcription.
ABSTRACT: Epigenome-wide association studies often detect many differentially methylated sites, and many are located in distal regulatory regions. To further prioritize these significant sites, there is a critical need to better understand the functional impact of CpG methylation. Recent studies demonstrated that CpG methylation-dependent transcriptional regulation is a widespread phenomenon. Here, we present MethReg, an R/Bioconductor package that analyzes matched DNA methylation and gene expression data, along with external transcription factor (TF) binding information, to evaluate, prioritize and annotate CpG sites with high regulatory potential. At these CpG sites, TF-target gene associations are often only present in a subset of samples with high (or low) methylation levels, so they can be missed by analyses that use all samples. Using colorectal cancer and Alzheimer's disease datasets, we show MethReg significantly enhances our understanding of the regulatory roles of DNA methylation in complex diseases.
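The point that a TF-target association can exist only in samples with low (or high) methylation can be illustrated with a small sketch. This is not MethReg's implementation or API: the function names, the use of Pearson correlation, the 25% tail fraction, and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def stratified_tf_target_correlation(meth, tf, target, frac=0.25):
    """Correlate TF and target expression within methylation strata.

    Hypothetical helper: samples are ranked by methylation at one CpG, and the
    correlation is computed in the lowest and highest `frac` of samples as
    well as in all samples.
    """
    order = sorted(range(len(meth)), key=lambda i: meth[i])
    k = max(2, int(len(meth) * frac))
    groups = {"low": order[:k], "high": order[-k:], "all": order}
    return {
        name: pearson([tf[i] for i in idx], [target[i] for i in idx])
        for name, idx in groups.items()
    }


# Toy data (assumed): the target tracks the TF only when methylation is low.
n = 40
meth = [i / (n - 1) for i in range(n)]
tf = [(7 * i) % 10 for i in range(n)]
target = [2 * tf[i] + 1 if i < n // 2 else 5 + i % 2 for i in range(n)]

result = stratified_tf_target_correlation(meth, tf, target)
print(result)
```

In this toy example the correlation is strong in the low-methylation group and weak in the high-methylation group, while the all-sample correlation falls in between, which is why an analysis that pools all samples can understate or miss such an association.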
Project description:High-throughput third-generation nanopore sequencing devices have enormous potential for simultaneously observing epigenetic modifications in human cells over large regions of the genome. However, signals generated by these devices are subject to considerable noise that can lead to unsatisfactory detection performance and hamper downstream analysis. Here we develop a statistical method, CpelNano, for the quantification and analysis of 5mC methylation landscapes using nanopore data. CpelNano takes into account nanopore noise by means of a hidden Markov model (HMM) in which the true but unknown ("hidden") methylation state is modeled through an Ising probability distribution that is consistent with methylation means and pairwise correlations, whereas nanopore current signals constitute the observed state. It then estimates the associated methylation potential energy function by employing the expectation-maximization (EM) algorithm and performs differential methylation analysis via permutation-based hypothesis testing. Using simulations and analysis of published data obtained from three human cell lines (GM12878, MCF-10A, and MDA-MB-231), we show that CpelNano can faithfully estimate DNA methylation potential energy landscapes, substantially improving current methods and leading to a powerful tool for the modeling and analysis of epigenetic landscapes using nanopore sequencing data.
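As a rough illustration of an Ising-type distribution over methylation states, the sketch below uses a generic one-dimensional Ising model with per-site and nearest-neighbour parameters. The parametrization, function names, and brute-force enumeration are assumptions for illustration only; they are not CpelNano's actual model or estimation procedure, and the nanopore observation layer and EM step are omitted.

```python
from itertools import product
from math import exp


def ising_energy(x, a, c):
    """Potential energy U(x) = -(sum_n a[n]*x[n] + sum_n c[n]*x[n]*x[n+1]).

    x holds states in {-1, +1} (unmethylated / methylated), a holds per-site
    parameters, and c holds couplings between adjacent sites.
    """
    field = sum(a_n * x_n for a_n, x_n in zip(a, x))
    coupling = sum(c_n * x[n] * x[n + 1] for n, c_n in enumerate(c))
    return -(field + coupling)


def ising_distribution(a, c):
    """Enumerate all states and return p(x) = exp(-U(x)) / Z (small N only)."""
    states = list(product((-1, 1), repeat=len(a)))
    weights = [exp(-ising_energy(s, a, c)) for s in states]
    z = sum(weights)
    return {s: w / z for s, w in zip(states, weights)}


def mean_methylation(dist):
    """Per-site probability of the methylated state (+1)."""
    n = len(next(iter(dist)))
    return [sum(p * (s[i] + 1) / 2 for s, p in dist.items()) for i in range(n)]


# With positive couplings, concordant neighbouring states become more likely.
independent = ising_distribution([0.0, 0.0, 0.0], [0.0, 0.0])
coupled = ising_distribution([0.0, 0.0, 0.0], [1.0, 1.0])
print(mean_methylation(independent))
print(coupled[(1, 1, 1)], coupled[(1, -1, 1)])
```

Enumeration is exponential in the number of sites, so this only serves to show how the energy function shapes means and pairwise correlations on a handful of CpGs.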
Project description:DNA methylation generally functions as a repressive transcriptional signal, but it is also known to activate gene expression. In either case, the downstream factors remain largely unknown. By using comparative interactomics, we isolated proteins in Arabidopsis thaliana that associate with methylated DNA. Two SU(VAR)3-9 homologs, the transcriptional antisilencing factor SUVH1, and SUVH3, were among the methyl reader candidates. SUVH1 and SUVH3 bound methylated DNA in vitro, were associated with euchromatic methylation in vivo, and formed a complex with two DNAJ domain-containing homologs, DNAJ1 and DNAJ2. Ectopic recruitment of DNAJ1 enhanced gene transcription in plants, yeast, and mammals. Thus, the SUVH proteins bind to methylated DNA and recruit the DNAJ proteins to enhance proximal gene expression, thereby counteracting the repressive effects of transposon insertion near genes.
Project description:Background: In a heterogeneous population of cells, individual cells can behave differently and respond variably to the environment. This cellular diversity can be assessed by measuring DNA methylation patterns. The loci with variable methylation patterns are informative of cellular heterogeneity and may serve as biomarkers of diseases and developmental progression. Cell-to-cell methylation heterogeneity can be evaluated through single-cell methylomes or computational techniques for pooled cells. However, the feasibility and performance of these approaches to precisely estimate methylation heterogeneity require further assessment. Results: Here, we proposed model-based methods, adapted from a mathematical framework originally developed for biodiversity, to estimate genome-wide DNA methylation heterogeneity. We evaluated the performance of our models and the existing methods with feature comparison, and tested them on both synthetic datasets and real data. Overall, our methods have demonstrated advantages over others because of their better correlation with the actual heterogeneity. We also demonstrated that methylation heterogeneity offers an additional layer of biological information distinct from the conventional methylation level. In the case studies, we showed that distinct profiles of methylation heterogeneity in CG and non-CG methylation can predict the regulatory roles between genomic elements in Arabidopsis. This opens up a new direction for plant epigenomics. Finally, we demonstrated that our score might be able to identify loci in human cancer samples as putative biomarkers for early cancer detection. Conclusions: We adapted the mathematical framework from biodiversity into three model-based methods for analyzing genome-wide DNA methylation heterogeneity to monitor cellular heterogeneity. Our methods, collectively named MeH, have been implemented, evaluated against existing methods, and are open to the research community.
Project description:Distal regulatory elements, including enhancers, play a critical role in regulating gene activity. Transcription factor binding to these elements correlates with Low Methylated Regions (LMRs) in a process that is poorly understood. Here we ask whether and how actual occupancy of DNA-binding factors is linked to DNA methylation at the level of individual molecules. Using CTCF as an example, we observe that frequency of binding correlates with the likelihood of a demethylated state and sites of low occupancy display heterogeneous DNA methylation within the CTCF motif. In line with a dynamic model of binding and DNA methylation turnover, we find that 5-hydroxymethylcytosine (5hmC), formed as an intermediate state of active demethylation, is enriched at LMRs in stem and somatic cells. Moreover, a significant fraction of changes in 5hmC during differentiation occurs at these regions, suggesting that transcription factor activity could be a key driver for active demethylation. Since deletion of CTCF is lethal for embryonic stem cells, we used genetic deletion of REST as another DNA-binding factor implicated in LMR formation to test this hypothesis. The absence of REST leads to a decrease of hydroxymethylation and a concomitant increase of DNA methylation at its binding sites. These data support a model where DNA-binding factors can mediate turnover of DNA methylation as an integral part of maintenance and reprogramming of regulatory regions.
Project description:Despite numerous studies done on understanding the role of DNA methylation, limited work has focused on systems integration of cell type-specific interplay between DNA methylation and gene transcription. Through a genome-wide analysis of DNA methylation across 19 cell types with T-47D as reference, we identified 106,252 cell type-specific differentially-methylated CpGs categorized into 7,537 differentially (46.6% hyper- and 53.4% hypo-) methylated regions. We found 44% promoter regions and 75% CpG islands were T-47D cell type-specific methylated. Pyrosequencing experiments validated the cell type-specific methylation across three benchmark cell lines. Interestingly, these DMRs overlapped with 1,145 known tumor suppressor genes. We then developed a Bayesian Gaussian Regression model to measure the relationship among DNA methylation, genomic segment distribution, differential gene expression and tumor suppressor gene status. The model uncovered that 3'UTR methylation has much less impact on transcriptional activity than other regions. Integration of DNA methylation and 82 transcription factor binding information across the 19 cell types suggested diverse interplay patterns between the two regulators. Our integrative analysis reveals cell type-specific and genomic region-dependent regulatory patterns and provides a perspective for integrating hundreds of various omics-seq data together.
Project description:DNA methylation plays a critical role in tumorigenesis by regulating oncogene activation and tumor suppressor gene silencing. Although DNA methylation has been extensively analyzed, its role in gene regulatory networks is less well characterized. To address this issue, we performed an integrative analysis of the alteration of DNA methylation patterns and the dynamics of gene regulatory network topology across distinct stages of stomach cancer. We found that the global DNA methylation patterns in different stages are generally conserved, whereas some significantly differentially methylated genes were observed exclusively in the early stage of stomach cancer. Integrative analysis of DNA methylation and network topology alteration yielded several genes that have been reported to be involved in the progression of stomach cancer, such as IGF2, ERBB2, GSTP1, MYH11, TMEM59, and SST. Finally, we demonstrated that inhibition of SST promotes cell proliferation, suggesting that DNA methylation-associated SST suppression possibly contributes to gastric cancer progression. Taken together, our study suggests that DNA methylation-associated regulatory network analysis could be used to identify cancer-related genes. This strategy can facilitate the understanding of gene regulatory networks in cancer biology and provide new insight into the study of DNA methylation at the system level.
Project description:We extended the mathematical models used to measure biodiversity to estimate DNA methylation heterogeneity in a cell population. We propose model-based approaches (abundance-based, phylogeny-based and pairwise similarity-based heterogeneity) that consider the similarity of DNA methylation patterns from individual cells, so that heterogeneity can be evaluated while overcoming biases due to missing data. We also applied a commonly used non-model-based method (methylation entropy) and other reported methods of estimating methylation heterogeneity, such as a single-cell-based approach, to evaluate methylation heterogeneity.
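For orientation, methylation entropy can be sketched as the Shannon entropy of the methylation patterns observed across reads in a window, normalized by the number of CpG sites. This is one common formulation and not necessarily the exact definition used in the project; the function name and toy reads are assumptions.

```python
from collections import Counter
from math import log2


def methylation_entropy(reads):
    """Shannon entropy of read-level methylation patterns, per CpG site.

    `reads` is a list of equal-length strings, one per read, where each
    character is '1' (methylated) or '0' (unmethylated) at a CpG site.
    The result ranges from 0 (one pattern only) to 1 (all 2**L patterns
    equally frequent, for L sites).
    """
    n_sites = len(reads[0])
    total = len(reads)
    counts = Counter(reads)
    entropy = sum(-(c / total) * log2(c / total) for c in counts.values())
    return entropy / n_sites


homogeneous = ["1111"] * 8           # every read shows the same pattern
mixed = ["1111"] * 4 + ["0000"] * 4  # two patterns at equal frequency
print(methylation_entropy(homogeneous), methylation_entropy(mixed))
```

Because the score depends only on pattern frequencies, two populations with the same average methylation level can still differ in entropy, which is the sense in which heterogeneity carries information beyond the methylation level.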
Project description:Expression quantitative trait methylation (eQTM) analysis identifies DNA CpG sites at which methylation is associated with gene expression. The present study describes an eQTM resource of CpG-transcript pairs derived from whole blood DNA methylation and RNA sequencing gene expression data in 2115 Framingham Heart Study participants. We identified 70,047 significant cis CpG-transcript pairs at p < 1E-7 where the top most significant eGenes (i.e., gene transcripts associated with a CpG) were enriched in biological pathways related to cell signaling, and for 1208 clinical traits (enrichment false discovery rate [FDR] ≤ 0.05). We also identified 246,667 significant trans CpG-transcript pairs at p < 1E-14 where the top most significant eGenes were enriched in biological pathways related to activation of the immune response, and for 1191 clinical traits (enrichment FDR ≤ 0.05). Independent and external replication of the top 1000 significant cis and trans CpG-transcript pairs was completed in the Women's Health Initiative and Jackson Heart Study cohorts. Using significant cis CpG-transcript pairs, we identified significant mediation of the association between CpG sites and cardiometabolic traits through gene expression and identified shared genetic regulation between CpGs and transcripts associated with cardiometabolic traits. In conclusion, we developed a robust and powerful resource of whole blood eQTM CpG-transcript pairs that can help inform future functional studies that seek to understand the molecular basis of disease.
Project description:OBJECTIVES: Systemic lupus erythematosus (SLE) is a chronic autoimmune condition with heterogeneous presentation and complex aetiology where DNA methylation changes are emerging as a contributing factor. In order to discover novel epigenetic associations and investigate their relationship to genetic risk for SLE, we analysed DNA methylation profiles in a large collection of patients with SLE and healthy individuals. METHODS: DNA extracted from blood from 548 patients with SLE and 587 healthy controls was analysed on the Illumina HumanMethylation 450k BeadChip, which targets 485 000 CpG sites across the genome. Single nucleotide polymorphism (SNP) genotype data for 196 524 SNPs on the Illumina ImmunoChip from the same individuals were utilised for methylation quantitative trait loci (cis-meQTLs) analyses. RESULTS: We identified and replicated differentially methylated CpGs (DMCs) in SLE at 7245 CpG sites in the genome. The largest methylation differences were observed at type I interferon-regulated genes which exhibited decreased methylation in SLE. We mapped cis-meQTLs and identified genetic regulation of methylation levels at 466 of the DMCs in SLE. The meQTLs for DMCs in SLE were enriched for genetic association to SLE, and included seven SLE genome-wide association study (GWAS) loci: PTPRC (CD45), MHC-class III, UHRF1BP1, IRF5, IRF7, IKZF3 and UBE2L3. In addition, we observed association between genotype and variance of methylation at 20 DMCs in SLE, including at the HLA-DQB2 locus. CONCLUSIONS: Our results suggest that several of the genetic risk variants for SLE may exert their influence on the phenotype through alteration of DNA methylation levels at regulatory regions of target genes.
Project description:Background: Alcohol is a well-known risk factor for hepatocellular carcinoma (HCC), but the mechanisms underlying alcohol-related hepatocarcinogenesis are still poorly understood. Alcohol alters the provision of methyl groups within the hepatic one-carbon metabolism, possibly inducing aberrant DNA methylation. Whether specific pathways are epigenetically regulated in alcohol-associated HCC is, however, unknown. The aim of the present study was to investigate the genome-wide promoter DNA methylation and gene expression profiles in non-viral, alcohol-associated HCC. From eight HCC patients undergoing curative surgery, array-based DNA methylation and gene expression data of all annotated genes were analyzed by comparing HCC tissue and homologous cancer-free liver tissue. Results: After merging the DNA methylation with gene expression data, we identified 159 hypermethylated-repressed, 30 hypomethylated-induced, 49 hypermethylated-induced, and 56 hypomethylated-repressed genes. Notably, promoter DNA methylation emerged as a novel regulatory mechanism for the transcriptional repression of genes controlling the retinol metabolism (ADH1A, ADH1B, ADH6, CYP3A43, CYP4A22, RDH16), iron homeostasis (HAMP), one-carbon metabolism (SHMT1), and genes with a putative, newly identified function as tumor suppressors (FAM107A, IGFALS, MT1G, MT1H, RNF180). Conclusions: A genome-wide DNA methylation approach merged with array-based gene expression profiles allowed the identification of a number of novel, epigenetically regulated candidate tumor-suppressor genes in alcohol-associated hepatocarcinogenesis. Retinol metabolism genes and SHMT1 are also epigenetically regulated through promoter DNA methylation in alcohol-associated HCC. Due to the reversibility of epigenetic mechanisms by environmental/nutritional factors, these findings may open the way to novel interventional strategies for hepatocarcinogenesis prevention in HCC related to alcohol, a modifiable dietary component.