Unknown,Transcriptomics,Genomics,Proteomics

Dataset Information

ENCODE HudsonAlpha Methyl-seq


ABSTRACT: This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Florencia Pauli, fpauli@hudsonalpha.org). If you have questions about the Genome Browser track associated with this data, contact ENCODE (genome@soe.ucsc.edu). This track shows average methylation status in CpG islands. In general, methylation of CpG sites within a promoter causes silencing of the gene associated with that promoter. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf

CpG regions were assayed via Methyl-seq, a method developed in the Myers laboratory to measure the methylation status at CpGs throughout the genome. It combines DNA digestion by the methyl-sensitive enzyme HpaII and its methyl-insensitive isoschizomer MspI with the Illumina DNA sequencing platform. The method was first applied in a collaboration with the laboratory of Dr. Julie Baker at Stanford University to study methylation and gene expression changes that occur in human embryonic stem cells before and after differentiation to definitive endoderm. A paper describing the results as well as the method has been submitted for publication [1].

This study profiled genomic DNA and mRNA samples derived from two human embryonic stem cell lines: H9 and BG02. These cells were differentiated into definitive endoderm, embryoid bodies, embryoid body-derived cells, and AFP+ (alpha-fetoprotein positive) hepatocytes. These in vitro samples were profiled with Methyl-seq and compared with normal tissue samples from 11-week and 24-week fetal liver and adult liver. Methyl-seq assays more than 250,000 methyl-sensitive restriction enzyme cleavage sites, representing more than 90,000 genomic regions.
These regions include 35,528 annotated CpG islands, while the remaining 55,084 non-CpG island regions are distributed across the genome in promoters, genes, and intergenic regions. Sequence tags present in MspI libraries but not in HpaII libraries are derived from methylated regions. Conversely, sequence tags that occur in HpaII libraries come from at least partially unmethylated regions.

In vitro differentiation: Definitive endoderm precursor cells were generated from H9 hES cells by treating them with activin A. Embryoid bodies (EBs) were generated by growing undifferentiated H9 and BG02 hESCs in suspension. EB-derived cells were obtained by plating clumps of the cells from the EBs. AFP+ fetal hepatocytes were derived from EBs by plating EB cells with FGF, followed by fluorescence-activated cell sorting (FACS) to isolate cells expressing the green fluorescent protein (GFP) reporter gene driven by the AFP promoter.

Isolation of genomic DNA: Genomic DNA is isolated from biological replicates of each cell line by using the QIAGEN DNeasy Blood & Tissue Kit according to the instructions provided by the manufacturer. The DNA concentration and quality of each preparation are determined by UV absorbance.

HpaII and MspI digestions: Cleavage of DNA by restriction endonuclease HpaII is prevented by the presence of a 5-methyl group at the internal C residue of its recognition sequence CCGG. MspI, an isoschizomer of HpaII, cleaves DNA irrespective of the presence of a methyl group at this position. For the MspI library, 5 µg genomic DNA was digested in a 100 µl reaction with 1X NEB Buffer2 and 20 units MspI restriction enzyme and incubated for 18 hr at 37°C. For the HpaII library, 5 µg genomic DNA was digested in a 100 µl reaction with 1X NEB Buffer1 and 20 units HpaII restriction enzyme and incubated for 18 hr at 37°C.
Note that in subsequent versions of the Methyl-seq protocol, which will be described later, much lower amounts of genomic DNA were used (1 µg and potentially lower).

DNA library construction and sequencing: High-throughput sequencing libraries were generated from DNA fragments of the HpaII- or MspI-digested genomic DNA according to the protocol posted at http://myers.hudsonalpha.org/content/protocols.html. This approach was recently modified by removing the first PCR amplification step, which came just before the gel electrophoresis size-selection step; the removal was found to reduce a fragment-size bias in the sequencing libraries. These libraries were sequenced with an Illumina Genome Analyzer (GA2) according to the manufacturer's recommendations.

Data analysis: For this analysis, reads that align to human genome sequence version hg19 and contain the 5'-CGG-3' HpaII-cut signature on their 5' end were used. These aligned sequence reads were mapped to CCGG sites predicted in silico on hg19. Sites with four or more MspI tags occurring in either the forward or reverse direction were retained for analysis. These "assayable" sites were then grouped with neighboring sites that are within 35-75 bp of each other. Thus, a "region" can comprise between 2 and 18 digestion sites that are each within 35-75 bp of another site. Methylated and non-methylated calls were made by using HpaII tag data from all assayable cut sites. For each site across each region, the larger of either the forward read count or reverse read count was used. Regions that have an average of 0 or 1 read per cut site are called methylated, and regions with more than one sequence read per site are called unmethylated.
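
The data analysis steps above can be sketched in Python. This is an illustrative sketch only, not the submitting laboratory's pipeline. The names `Site`, `find_ccgg_sites`, `is_assayable`, `group_regions`, and `call_region` are hypothetical. It also rests on several assumptions that the abstract does not spell out: a site is treated as assayable when its forward or its reverse MspI count is at least 4, the 35-75 bp window is taken as inclusive and measured between consecutive assayable sites, single ungrouped sites are dropped, and "an average of 0 or 1 read per cut site" is read as a mean of at most 1.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    """One CCGG site on a single chromosome (hypothetical record layout)."""
    pos: int       # coordinate of the CCGG site
    msp_fwd: int   # MspI tags on the forward strand
    msp_rev: int   # MspI tags on the reverse strand
    hpa_fwd: int   # HpaII tags on the forward strand
    hpa_rev: int   # HpaII tags on the reverse strand


def find_ccgg_sites(seq: str) -> List[int]:
    """Return 0-based start positions of every CCGG in a sequence."""
    seq = seq.upper()
    positions = []
    i = seq.find("CCGG")
    while i != -1:
        positions.append(i)
        i = seq.find("CCGG", i + 1)
    return positions


def is_assayable(site: Site, min_tags: int = 4) -> bool:
    """Assumption: four or more MspI tags in either direction."""
    return max(site.msp_fwd, site.msp_rev) >= min_tags


def group_regions(sites: List[Site], min_gap: int = 35,
                  max_gap: int = 75) -> List[List[Site]]:
    """Chain consecutive assayable sites spaced min_gap..max_gap bp apart."""
    assayable = sorted((s for s in sites if is_assayable(s)),
                       key=lambda s: s.pos)
    regions: List[List[Site]] = []
    current: List[Site] = []
    for s in assayable:
        if current and min_gap <= s.pos - current[-1].pos <= max_gap:
            current.append(s)
        else:
            # Assumption: a region needs at least two sites.
            if len(current) >= 2:
                regions.append(current)
            current = [s]
    if len(current) >= 2:
        regions.append(current)
    return regions


def call_region(region: List[Site]) -> str:
    """Call a region from the mean of per-site max(HpaII fwd, HpaII rev)."""
    mean_reads = sum(max(s.hpa_fwd, s.hpa_rev) for s in region) / len(region)
    # Assumption: a mean of at most 1 read per site means methylated.
    return "methylated" if mean_reads <= 1 else "unmethylated"


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    sites = [
        Site(100, 5, 0, 0, 1),
        Site(150, 0, 6, 1, 0),
        Site(200, 4, 4, 9, 2),
        Site(400, 10, 0, 7, 7),   # too far from its neighbours to be grouped
        Site(450, 3, 3, 0, 0),    # fewer than 4 MspI tags: not assayable
    ]
    for region in group_regions(sites):
        print([s.pos for s in region], call_region(region))
```

Taking the larger of the forward and reverse counts per site, rather than their sum, follows the wording of the abstract. The thresholds are exposed as parameters so the inclusive-window and minimum-tag assumptions can be changed if the original pipeline differs.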

ORGANISM(S): Homo sapiens

SUBMITTER: UCSC ENCODE DCC 

PROVIDER: E-GEOD-41304 | biostudies-arrayexpress

REPOSITORIES: biostudies-arrayexpress

Similar Datasets

2009-12-08 | GSE19363 | GEO
2010-05-18 | E-GEOD-19363 | biostudies-arrayexpress
2013-07-30 | GSE42723 | GEO
2016-06-22 | E-GEOD-83595 | biostudies-arrayexpress
2015-09-10 | GSE51680 | GEO
2013-07-30 | E-GEOD-42723 | biostudies-arrayexpress
2008-04-11 | GSE8890 | GEO
2016-06-22 | GSE83595 | GEO
2012-12-20 | E-GEOD-41069 | biostudies-arrayexpress
2012-08-24 | E-GEOD-40158 | biostudies-arrayexpress