Project description: This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Florencia Pauli, fpauli@hudsonalpha.org). If you have questions about the Genome Browser track associated with this data, contact ENCODE (genome@soe.ucsc.edu). This track is produced as part of the ENCODE project. The track displays copy number variation (CNV) as determined by the Illumina Human 1M-Duo Infinium HD BeadChip assay and circular binary segmentation (CBS). The Human 1M-Duo contains more than 1,100,000 tagSNP markers and a set of ~60,000 additional CNV-targeted markers. The median spacing between markers is 1.5 kb and the mean spacing is 2.4 kb. The B-allele frequency and genotyping single nucleotide polymorphism (SNP) data generated by the experiment are not displayed, but are available for download from the Downloads page. Where applicable, biological replicates of each cell line are reported separately. Possible uses of the data include correction of copy number in peak-calling for ChIP-seq, transcriptome, DNase hypersensitivity, and methylation determinations. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf
Isolation of genomic DNA and hybridization: Cells were grown according to the approved ENCODE cell culture protocols by the Myers lab and by other ENCODE production groups. The production group is reported in the metadata. Genomic DNA was isolated using the DNeasy Blood and Tissue Kit (Qiagen). DNA concentration and quality were determined by fluorescence (Invitrogen Quant-iT dsDNA High Sensitivity Kit and Qubit Fluorometer), and 400 nanograms of each sample were hybridized to Illumina 1M-Duo DNA Analysis BeadChips.
Processing and Analysis: The genotypes from the 1M-Duo arrays were ascertained with BeadStudio using default settings and formatted with the A/B genotype designation for each SNP. Primary QC for each sample was a call-rate cut-off of 0.95. Copy number variation (CNV) analysis was performed with circular binary segmentation (DNAcopy) of the log R ratio values at each probe (Olshen et al., 2004). The parameters used were alpha=0.001, nperm=5000, sd.undo=1. The copy number segments are reported with the mean log R ratio for each chromosomal segment called by CBS. Log R ratios of approximately -0.2 to -1.5 can be considered heterozygous deletions, < -1.5 homozygous deletions, and > 0.2 amplifications. An additional QC criterion for each sample was an SD of < 0.6.
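The log R ratio cut-offs quoted above can be applied to the reported segment means with a few lines of Python. This is a minimal sketch, not the pipeline used to produce the track: the function name and the "copy-neutral" label are hypothetical, the text gives the cut-offs only approximately, and the handling of values exactly at -1.5, -0.2, and 0.2 is an assumption.

```python
def classify_segment(mean_log_r: float) -> str:
    """Label a CBS segment from its mean log R ratio.

    Cut-offs follow the approximate ranges given in the track description.
    Boundary handling (<= vs <) is an assumption, as is the label used for
    values between -0.2 and 0.2, which the description does not name.
    """
    if mean_log_r < -1.5:
        return "homozygous deletion"
    if mean_log_r <= -0.2:
        return "heterozygous deletion"
    if mean_log_r > 0.2:
        return "amplification"
    return "copy-neutral"


# Hypothetical segment means, for illustration only.
for value in (-2.0, -0.5, 0.0, 0.5):
    print(value, classify_segment(value))
```

Because the description marks the cut-offs as approximate, segments whose means lie close to a boundary are best treated as uncertain rather than relabelled automatically.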

Project description: This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Florencia Pauli, fpauli@hudsonalpha.org). If you have questions about the Genome Browser track associated with this data, contact ENCODE (genome@soe.ucsc.edu). This track is produced as part of the ENCODE project. The track displays the methylation status of specific CpG dinucleotides in the given cell types as identified by the Illumina Infinium HumanMethylation27 BeadArray platform (http://www.illumina.com/pages.ilmn?ID=243). In general, methylation of CpG sites within a promoter causes silencing of the gene associated with that promoter. Detailed information for the CpG targets is in an XLS-formatted spreadsheet on the Myers lab protocols website (http://hudsonalpha.org/myers-lab/protocols). For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf
Cells were grown according to the approved ENCODE cell culture protocols (http://hgwdev.cse.ucsc.edu/ENCODE/protocols/cell). Genomic DNA was isolated from each cell line with the QIAGEN DNeasy Blood & Tissue Kit according to the instructions provided by the manufacturer. The DNA concentration and quality of each preparation were determined by fluorescence with the Qubit Fluorometer (Invitrogen). The Methyl27K platform uses bisulfite-treated genomic DNA to assay the methylation status of 27,578 CpG sites within more than 14,000 genes. When genomic DNA is treated with sodium bisulfite, unmethylated cytosines of CpG dinucleotides are converted to uracils; methylated cytosines are not converted. After bisulfite treatment, the methylation status of a site is assayed by single base-pair extension with a Cy3- or Cy5-labeled nucleotide on oligo-beads specific for the methylated or unmethylated state.
A beta value is calculated by Illumina's BeadStudio software for each CpG target. This value is the intensity from the methylated bead type divided by the sum of the intensities from the methylated and unmethylated bead types for that CpG target. The bisulfite conversion reaction was performed using the Zymo Research EZ-96 DNA Methylation Kit (http://www.zymoresearch.com/epigenetics/dna-methylation/ez-96-dna-methylation-kit), with one modification to the protocol: during the incubation, a 30 sec denaturing step at 95 °C was included every hour to increase reaction efficiency, as recommended by the Illumina Infinium Human Methylation27 protocol. The bead arrays were run according to the protocol provided by Illumina (http://www.illumina.com/pagesnrn.ilmn?ID=275). The intensity data from the BeadArray was processed using Illumina's BeadStudio software with the Methylation Module v3.2. The data was then quality-filtered using p-values. Any beta value equal to or greater than 0.6 is considered fully methylated. Any beta value equal to or less than 0.2 is considered fully unmethylated. Beta values between 0.2 and 0.6 are considered partially methylated. Beta values are quality-filtered, and spots that fall below the minimum intensity threshold are displayed as "NA". The score in the BED files is the beta value × 1000.
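The beta value, the three methylation classes, and the BED score described above can be illustrated with a short Python sketch. It follows the ratio exactly as worded in this description and is not BeadStudio's own implementation. The function names are hypothetical, and rounding the score to an integer is an assumption.

```python
def beta_value(methylated: float, unmethylated: float) -> float:
    """Methylated intensity divided by the sum of both bead-type intensities."""
    total = methylated + unmethylated
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return methylated / total


def methylation_state(beta: float) -> str:
    """Apply the 0.6 and 0.2 cut-offs given in the description."""
    if beta >= 0.6:
        return "fully methylated"
    if beta <= 0.2:
        return "fully unmethylated"
    return "partially methylated"


def bed_score(beta: float) -> int:
    """BED score = beta value x 1000 (rounding to an integer is an assumption)."""
    return round(beta * 1000)


# Hypothetical intensities, for illustration only.
beta = beta_value(methylated=3000.0, unmethylated=1000.0)
print(beta, methylation_state(beta), bed_score(beta))
```

Targets displayed as "NA" have no usable beta value and would need to be skipped before calling these functions.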

Project description: This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Florencia Pauli, fpauli@hudsonalpha.org). If you have questions about the Genome Browser track associated with this data, contact ENCODE (genome@soe.ucsc.edu). This track is produced as part of the ENCODE project. The track reports the percentage of DNA molecules that exhibit cytosine methylation at specific CpG dinucleotides. In general, DNA methylation within a gene's promoter is associated with gene silencing, and DNA methylation within the exons and introns of a gene is associated with gene expression. Proper regulation of DNA methylation is essential during development, and aberrant DNA methylation is a hallmark of cancer. DNA methylation status is assayed at more than 500,000 CpG dinucleotides in the genome using Reduced Representation Bisulfite Sequencing (RRBS). Genomic DNA is digested with the methyl-insensitive restriction enzyme MspI, and small genomic DNA fragments are purified by gel electrophoresis and then used to construct an Illumina sequencing library. The library fragments are treated with sodium bisulfite and amplified by PCR to convert every unmethylated cytosine to a thymine while leaving methylated cytosines intact. The sequenced fragments are aligned to a customized reference genome sequence, and for each assayed CpG we report the number of sequencing reads covering that CpG and the percentage of those reads that are methylated. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf
DNA methylation at CpG sites was assayed with a modified version of Reduced Representation Bisulfite Sequencing (RRBS; Meissner et al., 2008). RRBS was performed on cell lines grown by many ENCODE production groups.
The production group that grew the cells and isolated genomic DNA is indicated in the "obtainedBy" field of the metadata. When a cell type was provided by more than one lab, the data for the cells from only one lab are displayed in the table above. However, the data for every cell type from every lab is available from the Downloads page. RRBS was carried out by the Myers production group at the HudsonAlpha Institute for Biotechnology.
Isolation of genomic DNA
Genomic DNA is isolated from biological replicates of each cell line using the QIAGEN DNeasy Blood & Tissue Kit according to the instructions provided by the manufacturer. DNA concentrations for each genomic DNA preparation are determined using fluorescent DNA binding dye and a fluorometer (Invitrogen Quant-iT dsDNA High Sensitivity Kit and Qubit Fluorometer). Typically, 1 µg of DNA is used to make an RRBS library; however, we have also had success in making libraries with 200 ng genomic DNA from rare or precious samples.
RRBS library construction and sequencing
RRBS library construction starts with MspI digestion of genomic DNA, which cuts at every CCGG regardless of methylation status. Klenow exo- DNA Polymerase is then used to fill in the recessed end of the genomic DNA and add an adenosine as a 3' overhang. Next, a methylated version of the Illumina paired-end adapters is ligated onto the DNA. Adapter-ligated genomic DNA fragments between 105 and 185 base pairs are selected using agarose gel electrophoresis and the Qiagen Qiaquick Gel Extraction Kit. The selected adapter-ligated fragments are treated with sodium bisulfite using the Zymo Research EZ DNA Methylation Gold Kit, which converts unmethylated cytosines to uracils and leaves methylated cytosines unchanged. Bisulfite-treated DNA is amplified in a final PCR reaction which has been optimized to uniformly amplify diverse fragment sizes and sequence contexts in the same reaction.
During this final PCR reaction, uracils are copied as thymines, resulting in a thymine in the PCR products wherever an unmethylated cytosine existed in the genomic DNA. The sample is now ready for sequencing on the Illumina sequencing platform. These libraries were sequenced with an Illumina Genome Analyzer IIx according to the manufacturer's recommendations.
Data analysis
To analyze the sequence data, a reference genome is created that contains only the 36 base pairs adjacent to every MspI site, and every C in those sequences is changed to T. A converted sequence read file is then created by changing each C in the original sequence reads to a T. The converted sequence reads are aligned to the converted reference genome, and only reads that map uniquely to the reference genome are kept. Once reads are aligned, the percent methylation is calculated for each CpG using the original sequence reads. The percent methylation and number of reads are reported for each CpG.
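The data analysis steps above (a C-to-T converted reference built around MspI sites, C-to-T converted reads, unique mapping, then methylation calls from the original reads) can be sketched on a toy sequence in Python. This is an illustration under stated assumptions, not the production pipeline: all function names are hypothetical, exact string matching stands in for a real aligner, only the forward strand is handled, and the 36-base window is assumed to start at the MspI cut position (C^CGG).

```python
from collections import defaultdict

READ_LEN = 36  # read length and reference window size given in the description


def c_to_t(seq: str) -> str:
    """In silico bisulfite conversion: change every C to T."""
    return seq.upper().replace("C", "T")


def build_converted_reference(genome: str, read_len: int = READ_LEN) -> dict[str, list[int]]:
    """Index the converted read_len-mer that begins at each MspI cut site.

    Assumption: the window starts one base into CCGG (MspI cuts C^CGG),
    and only the forward strand is indexed.
    """
    index: dict[str, list[int]] = defaultdict(list)
    pos = genome.find("CCGG")
    while pos != -1:
        start = pos + 1
        window = genome[start:start + read_len]
        if len(window) == read_len:
            index[c_to_t(window)].append(start)
        pos = genome.find("CCGG", pos + 1)
    return index


def percent_methylation(genome: str, reads: list[str], read_len: int = READ_LEN) -> dict[int, tuple[float, int]]:
    """Return {CpG position: (percent methylated, number of reads)}."""
    genome = genome.upper()
    index = build_converted_reference(genome, read_len)
    counts: dict[int, list[int]] = defaultdict(lambda: [0, 0])  # [methylated, total]
    for read in reads:
        read = read.upper()[:read_len]
        hits = index.get(c_to_t(read), [])
        if len(hits) != 1:  # keep only reads that map uniquely
            continue
        start = hits[0]
        for i in range(read_len - 1):
            if genome[start + i:start + i + 2] == "CG" and read[i] in "CT":
                counts[start + i][1] += 1
                counts[start + i][0] += int(read[i] == "C")  # C kept = methylated
    return {pos: (100.0 * m / n, n) for pos, (m, n) in sorted(counts.items())}


# Toy example: one MspI site followed by three CpGs (positions 3, 8, 13).
toy_genome = "AA" + "CCGG" + "ATCGTTACGA" + "A" * 30
methylated_read = "CGGATCGTTACGA" + "A" * 23    # every CpG cytosine retained
unmethylated_read = "TGGATTGTTATGA" + "A" * 23  # every CpG cytosine read as T
print(percent_methylation(toy_genome, [methylated_read, unmethylated_read]))
```

Calling methylation from the original reads rather than the converted ones is what makes the approach work: conversion is needed only to place a read, while the C or T observed at each CpG in the unconverted read carries the methylation signal.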

Project description: This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Florencia Pauli, fpauli@hudsonalpha.org). If you have questions about the Genome Browser track associated with this data, contact ENCODE (genome@soe.ucsc.edu). This track is produced as part of the ENCODE Project. RNA-seq is a method for mapping and quantifying the transcriptome of any organism that has a genomic DNA sequence assembly (Mortazavi et al., 2008). Biological replicates of ENCODE cell lines were grown on separate culture plates; total RNA was purified and polyA-selected twice. mRNA was then fragmented by magnesium-catalyzed hydrolysis, reverse transcribed to cDNA by random priming, and amplified. The cDNA was sequenced on an Illumina Genome Analyzer (GAI or GAIIx). The DNA sequences were aligned to the NCBI Build37 (hg19) version of the human genome using the sequence alignment programs ELAND (Illumina) or Bowtie (Langmead et al., 2009). The first 10 residues of each sequencing read have a weak characteristic nucleotide bias of unknown origin. This RNA-seq protocol does not specify the coding strand. As a result, there will be ambiguity at loci where both strands are transcribed. This is the first NCBI Build37 (hg19) release of this track (Jan 2012). This release includes the 3 datasets (Jurkat, A549/DEX100nm, and A549/EtOH2pct) previously released on NCBI Build36 (hg18) and adds data for several more cell types and growth conditions in replicate. Four types of download files are available for each replicate: Raw Data (fastq), Transcripts GencodeV7 (gtf), Raw Signal (bigwig), and Alignments (bam).
For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf
Experimental Procedures
Cells were grown according to the approved ENCODE cell culture protocols (http://hgwdev.cse.ucsc.edu/ENCODE/protocols/cell), except for H1-hESC, for which frozen cell pellets were purchased from Cellular Dynamics. Cells were lysed in RLT buffer (Qiagen RNeasy kit) and processed on RNeasy midi columns according to the manufacturer's protocol, with the inclusion of the "on-column" DNase digestion step to remove residual genomic DNA. mRNA was isolated from at least 10 µg of total RNA with two rounds of oligo(dT) selection (Dynabeads mRNA Purification Kit, Invitrogen). Alternatively, cells were lysed and mRNA was purified directly with two rounds of oligo(dT) selection (Dynabeads mRNA DIRECT Kit, Invitrogen). 100 ng of mRNA was fragmented by magnesium-catalyzed hydrolysis and reverse transcribed to cDNA by random priming according to the protocol in Mortazavi et al. (2008). cDNA was prepared for sequencing on the Genome Analyzer flowcell according to the protocol for the ChIPSeq DNA genomic DNA kit (Illumina). The sequencing libraries were size-selected around 225 bp and amplified with 15 rounds of PCR. Libraries were sequenced with an Illumina Genome Analyzer I or an Illumina Genome Analyzer IIx according to the manufacturer's recommendations. Single-end reads of 36 nt in length were obtained.
Data Processing and Analysis
Fastq files were made from qseq files generated by the Illumina pipeline (Casava 1.7). The Raw Signal files (bigWig) were generated from bedgraph files, and the score was calculated as the number of reads at that position divided by the total number of reads in millions. Casava export files were aligned to the NCBI Build37 (hg19) version of the human genome with ELAND (Illumina), generating SAM files.
Fastq files of experiments that were previously aligned to NCBI Build36 (hg18) were aligned to NCBI Build37 (hg19) using Bowtie (Langmead et al., 2009; parameters: -S -n 2 -k 11 -m 10 --best), also generating SAM files. SAM files were converted to BAM with SAMtools (Li et al., 2009). Gene expression within Gencode.v7 (Harrow et al., 2006) gene models was estimated using Cufflinks v0.9.3 (Roberts et al., 2011). Estimates of transcript abundance were reported in Fragments Per Kilobase of exon per Million fragments mapped (FPKM). FPKM is calculated by dividing the total number of fragments that align to the gene model by the size of the spliced transcript (exons) in kilobases. This number is then divided by the total number of reads in millions for the experiment. FPKM is reported in the last column of the gtf (TranscriptGencV7) files. Raw Data (fastq), Raw Signal (bigWig), Alignments (bam), and Transcript Gencode V7 (gtf) files are available from the Downloads page (http://hgwdev.cse.ucsc.edu/cgi-bin/hgFileUi?g=wgEncodeHaibRnaSeq).
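The two normalizations described in this section, the Raw Signal score and FPKM, reduce to simple arithmetic, sketched here in Python. The function names and the example numbers are hypothetical. The FPKM function reproduces only the formula as worded above; Cufflinks itself estimates abundances with a statistical model, so its reported values will generally not match this calculation exactly.

```python
def reads_per_million(reads_at_position: int, total_reads: int) -> float:
    """Raw Signal score: read count at a position per million total reads."""
    return reads_at_position / (total_reads / 1_000_000)


def fpkm(fragments: int, exon_length_bp: int, total_reads: int) -> float:
    """Fragments per kilobase of exon per million reads, as worded in the text."""
    per_kilobase = fragments / (exon_length_bp / 1_000)
    return per_kilobase / (total_reads / 1_000_000)


# Hypothetical numbers, for illustration only.
print(reads_per_million(reads_at_position=50, total_reads=20_000_000))
print(fpkm(fragments=500, exon_length_bp=2_000, total_reads=20_000_000))
```

Dividing by transcript length and by sequencing depth makes values comparable across genes of different sizes and across replicates sequenced to different depths.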

Dataset Information

ENCODE HudsonAlpha Methyl-seq
