Project description:Background: There has been a steady increase in the number of studies aiming to identify DNA methylation differences associated with complex phenotypes. Many of the challenges of epigenetic epidemiology regarding study design and interpretation have been discussed in detail; however, there are outstanding analytical concerns that require further exploration. In this study we seek to address three analytical issues. First, we quantify the multiple testing burden and propose a standard statistical significance threshold for identifying DNA methylation sites that are associated with an outcome. Second, we establish whether linear regression, the chosen statistical tool for the majority of studies, is appropriate and whether it is biased by the underlying distribution of DNA methylation data. Finally, we assess the sample size required for adequately powered DNA methylation association studies. Results: We quantified DNA methylation in the Understanding Society cohort (n = 1175), a large population-based study, using the Illumina EPIC array to assess the statistical properties of DNA methylation association analyses. By simulating null DNA methylation studies, we generated the distribution of p-values expected by chance and calculated the 5% family-wise error for EPIC array studies to be 9 × 10⁻⁸. Next, we tested whether the assumptions of linear regression are violated by DNA methylation data and found that the majority of sites do not satisfy the assumption of normal residuals. Nevertheless, we found no evidence that this bias influences analyses by increasing the likelihood of affected sites to be false positives.
Finally, we performed power calculations for EPIC-based DNA methylation studies, demonstrating that existing studies with data on ~1000 samples are adequately powered to detect small differences at the majority of sites. Conclusion: We propose that a significance threshold of P < 9 × 10⁻⁸ adequately controls the false positive rate for EPIC array DNA methylation studies. Moreover, our results indicate that linear regression is a valid statistical methodology for DNA methylation studies, despite the fact that the data do not always satisfy the assumptions of this test. These findings have implications for epidemiological-based studies of DNA methylation and provide a framework for the interpretation of findings from current and future studies.
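The null-simulation idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy methylation matrix, the random phenotype, the function names, and the Fisher z approximation used in place of linear-regression p-values are all assumptions made for illustration. The principle is that each simulated null study contributes its smallest p-value, and the 5% quantile of those minima is the threshold that keeps the family-wise error rate at 5%.

```python
import math
import random

def pearson_pvalue(x, y):
    """Two-sided p-value for a Pearson correlation (Fisher z approximation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def simulate_null_min_pvalues(methylation, n_sims, rng):
    """For each null study, test every site against a random phenotype
    and keep the smallest p-value.

    methylation: list of sites, each a list of per-sample values.
    """
    n_samples = len(methylation[0])
    minima = []
    for _ in range(n_sims):
        phenotype = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        minima.append(min(pearson_pvalue(site, phenotype) for site in methylation))
    return minima

def fwer_threshold(min_pvalues, alpha=0.05):
    """Return the alpha quantile of the per-simulation minimum p-values."""
    ordered = sorted(min_pvalues)
    index = max(int(math.floor(alpha * len(ordered))) - 1, 0)
    return ordered[index]

# Toy data (assumption): 100 sites x 40 samples of uniform random values.
rng = random.Random(0)
toy_methylation = [[rng.random() for _ in range(40)] for _ in range(100)]
minima = simulate_null_min_pvalues(toy_methylation, n_sims=100, rng=rng)
threshold = fwer_threshold(minima)
print(f"Empirical 5% family-wise threshold for the toy data: {threshold:.2e}")
```

Because real methylation sites are correlated, a threshold derived this way from real data is generally less strict than a Bonferroni correction over the same number of sites. The toy matrix here has independent sites, so it does not show that effect.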
Project description:DNA methylation analysis in oropharyngeal squamous carcinoma (OPSCC) samples and oropharyngeal non-cancerous mucosa samples. Infinium MethylationEPIC BeadChip Kit was used to obtain DNA methylation profiles across more than 850,000 CpG sites. Total samples included 89 OPSCC samples and 5 non-cancerous mucosa samples.
Project description:The level of DNA methylation in BRE80-BRE80-T5 and T47D cells expressing active and inactive DNMT3A was quantified using the EPIC array across more than 850,000 CpGs.
Project description:Comprehensive DNA methylation analysis in malignant melanoma clinical samples and primary melanocyte culture. Infinium MethylationEPIC BeadChip was used to obtain DNA methylation profiles. Samples included 8 malignant melanoma cases and primary melanocyte culture.
Project description:In human, the 39 coding HOX genes and 18 referenced non-coding antisense transcripts are arranged in four genomic clusters named HOXA, B, C, and D. This highly conserved family belongs to the homeobox class of genes that encode transcription factors required for normal development. Therefore, HOX gene deregulation might contribute to the development of many cancer types. Here, we study HOX gene deregulation in adult glioma, a common type of primary brain tumor. We performed extensive molecular analysis of tumor samples, classified according to their isocitrate dehydrogenase (IDH1) gene mutation status, and of glioma stem cells. We found widespread expression of sense and antisense HOX transcripts only in aggressive (IDHwt) glioma samples, although the four HOX clusters displayed DNA hypermethylation. Integrative analysis of expression-, DNA methylation- and histone modification signatures along the clusters revealed that HOX gene upregulation relies on canonical and alternative bivalent CpG island promoters that escape hypermethylation. H3K27me3 loss at these promoters emerges as the main cause of widespread HOX gene upregulation in IDHwt glioma cell lines and tumors. Our study provides the first comprehensive description of the epigenetic changes at HOX clusters and their contribution to the transcriptional changes observed in adult glioma. It also identified putative "master" HOX proteins that might contribute to the tumorigenic potential of glioma stem cells.
Project description:DNA methylation is one of the major epigenetic modifications and has frequently demonstrated its suitability as diagnostic and prognostic biomarker. In addition to chip and sequencing based epigenome wide methylation profiling methods, targeted bisulfite sequencing (TBS) has been established as a cost-effective approach for routine diagnostics and target validation applications. Yet, an easy-to-use tool for the analysis of TBS data in combination with array-based methylation results has been missing. Consequently, we have developed EPIC-TABSAT, a user-friendly web-based application for the analysis of targeted sequencing data that additionally allows the integration of array-based methylation results. The tool can handle multiple targets as well as multiple sequencing files in parallel and covers the complete data analysis workflow from calculation of quality metrics to methylation calling and interactive result presentation. The graphical user interface offers an unprecedented way to interpret TBS data alone or in combination with array-based methylation studies. Together with the computation of target-specific epialleles it is useful in validation, research, and routine diagnostic environments. EPIC-TABSAT is freely accessible to all users at https://tabsat.ait.ac.at/.
Project description:To measure methylation, we employed Methyl-CpG-immunoprecipitation (MCIP), a technique that relies on a fusion protein consisting of the methyl-binding domain (MBD) of MBD2 and the Fc portion of IgG1 to detect methylated regions, exploiting the natural preference of MBD for 5-methylcytosine (5-mC). MCIP-seq was performed using the EpiMark® Methylated DNA Enrichment Kit. The dataset comprises MCIP-seq of 77 primary samples obtained from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). Moreover, MCIP-seq of CD34+ HSPCs from 3 healthy donors is included. Due to patient confidentiality considerations, the raw data files for this dataset have been deposited to the EGA controlled-access archive under the accession numbers EGAS00001007094 (study); EGAD00001011052 (dataset).
Project description:Background: Epigenome-wide association studies (EWAS) have been widely applied to identify methylation CpG sites associated with human disease. To date, the Infinium Methylation EPIC array (EPIC) is commonly used for high-throughput DNA methylation profiling. However, the EPIC array covers only 30% of the human methylome. Methylation Capture bisulfite sequencing (MC-seq) captures target regions of the methylome and offers extensive coverage of the methylome at an affordable price. Methods: Epigenome-wide DNA methylation in four peripheral blood mononuclear cell samples was profiled by using SureSelectXT Methyl-Seq for MC-seq and EPIC platforms separately. CpG site-based reproducibility of MC-seq was assessed with DNA sample inputs of high (> 1000 ng), medium (300-1000 ng), and low (150-300 ng) quantity. To compare the performance of MC-seq and the EPIC array, we calculated the Pearson correlation and the methylation value difference at each CpG site that was detected by both MC-seq and EPIC. We compared the percentage and counts in each CpG island and gene annotation between MC-seq and the EPIC array. Results: After quality control, an average of 3,708,550 CpG sites per sample was detected by MC-seq with DNA quantity > 1000 ng. Reproducibility of MC-seq detected CpG sites was high with strong correlation estimates for CpG methylation among samples with high, medium, and low DNA inputs (r > 0.96). The EPIC array captured an average of 846,464 CpG sites per sample. Compared with the EPIC array, MC-seq detected more CpGs in coding regions and CpG islands. Among the 472,540 CpG sites captured by both platforms, methylation of a majority of CpG sites was highly correlated in the same sample (r = 0.98-0.99). However, methylation for a small proportion of CpGs (N = 235) differed significantly between the two platforms, with differences in beta values of greater than 0.5.
Conclusions: Our results show that MC-seq is an efficient and reliable platform for methylome profiling with a broader coverage of the methylome than the array-based platform. Although methylation measurements in the majority of CpGs are highly correlated, a number of CpG sites show large discrepancies between the two platforms, which warrant further investigation and need cautious interpretation.
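The platform comparison described in this abstract (Pearson correlation across shared CpG sites within a sample, plus a flag for sites whose beta values differ by more than 0.5) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the dictionary layout, the CpG identifiers, and the beta values are hypothetical and are not taken from the study.

```python
import math

def compare_platforms(mcseq, epic, max_diff=0.5):
    """Compare beta values at CpG sites measured by both platforms.

    mcseq, epic: dicts mapping CpG identifier -> beta value (0 to 1).
    Returns (number of shared sites, Pearson r, list of discordant sites).
    """
    shared = sorted(set(mcseq) & set(epic))
    x = [mcseq[site] for site in shared]
    y = [epic[site] for site in shared]
    n = len(shared)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # Sites whose absolute beta-value difference exceeds the cutoff.
    discordant = [s for s in shared if abs(mcseq[s] - epic[s]) > max_diff]
    return n, r, discordant

# Hypothetical beta values for one sample on each platform.
mcseq_betas = {"cg0001": 0.10, "cg0002": 0.85, "cg0003": 0.50,
               "cg0004": 0.90, "cg0005": 0.30}
epic_betas = {"cg0001": 0.12, "cg0002": 0.80, "cg0003": 0.55,
              "cg0004": 0.20, "cg0006": 0.70}

n_shared, r, discordant = compare_platforms(mcseq_betas, epic_betas)
print(n_shared, round(r, 3), discordant)
```

Only sites present on both platforms enter the comparison, so platform-specific sites (here `cg0005` and `cg0006`) are ignored.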