Unknown,Transcriptomics,Genomics,Proteomics

Dataset Information

DNaseI Digital Genomic Footprinting from ENCODE/University of Washington [Mouse]

ABSTRACT: This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Richard Sandstrom mailto:sull@u.washington.edu). If you have questions about the Genome Browser track associated with this data, contact ENCODE (mailto:genome@soe.ucsc.edu). This track, produced as part of the mouse ENCODE Project, contains deep sequencing DNase data that will be used to identify sites where regulatory factors bind to the genome (footprints). Footprinting is a technique used to define the DNA sequences that interact with and bind DNA-binding proteins, such as transcription factors, zinc-finger proteins, hormone-receptor complexes, and other chromatin-modulating factors like CTCF. The technique depends upon the strength and tight nature of protein-DNA interactions. In their native chromatin state, DNA sequences that interact directly with DNA-binding proteins are relatively protected from DNA degrading endonucleases, while the exposed/unbound portions are readily degraded by such endonucleases. A massively parallel next-generation sequencing technique to define the DNase hypersensitive sites in the genome was adopted. The DNase samples were sequenced using next-generation sequencing machines to significantly higher depths of 300-fold or greater. This produces a base-pair level resolution of the DNase susceptibility maps of the native chromatin state. These base-pair resolution maps represent and are dependent upon the nature and the specificity of interaction of the DNA with the regulatory/modulatory proteins binding at specific loci in the genome; thus they represent the native chromatin state of the genome under investigation. The deep sequencing approach has been used to define the footprint landscape of the genome by identifying DNA motifs that interact with known or novel DNA binding proteins. Cells were grown according to the approved ENCODE cell culture protocols (http://hgwdev.cse.ucsc.edu/ENCODE/protocols/cell/mouse). Digital DNaseI was performed by DNaseI digestion of intact nuclei, followed by isolating DNaseI "double-hit" fragments (Sabo et al., 2006), and direct sequencing of fragment ends (which correspond to in vivo DNaseI cleavage sites) using the Solexa platform (27 bp reads). High-quality reads were mapped to the NCBI37/mm9 mouse genome using Bowtie 0.12.5; only unique mappings were kept. DNaseI sensitivity is directly reflected in raw tag density (Raw Signal), which is shown in the track as density of tags mapping within a 150 bp sliding window (at a 20 bp step across the genome). DNaseI hypersensitive zones (HotSpots) were identified using the HotSpot algorithm (Sabo et al., 2004). False discovery rate thresholds of 1.0% (FDR 0.01) were computed for each cell type by applying the HotSpot algorithm to an equivalent number of random uniquely mapping 36-mers. DNaseI hypersensitive sites (DHSs or Peaks) were identified as signal peaks within 1.0% (FDR 0.01) hypersensitive zones using a peak-finding algorithm. Only DNase Solexa libraries from unique cell types producing the highest quality data, as defined by Percent Tags in Hotspots (PTIH ~40%), were designated for deep sequencing to a depth of over 200 million tags. Results were validated by conventional DNaseI hypersensitivity assays using end-labeling/Southern blotting methods.

ORGANISM(S): Mus musculus

SUBMITTER: UCSC ENCODE DCC

PROVIDER: E-GEOD-40869 | biostudies-arrayexpress |

REPOSITORIES: biostudies-arrayexpress

ACCESS DATA

Similar Datasets

Project description:This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Richard Sandstrom mailto:sull@u.washington.edu). If you have questions about the Genome Browser track associated with this data, contact ENCODE (mailto:genome@soe.ucsc.edu). This track, produced as part of the ENCODE Project, contains deep sequencing DNase data that will be used to identify sites where regulatory factors bind to the genome (footprints). Footprinting is a technique used to define the DNA sequences that interact with and bind DNA-binding proteins, such as transcription factors, zinc-finger proteins, hormone-receptor complexes, and other chromatin-modulating factors like CTCF. The technique depends upon the strength and tight nature of protein-DNA interactions. In their native chromatin state, DNA sequences that interact directly with DNA-binding proteins are relatively protected from DNA degrading endonucleases, while the exposed/unbound portions are readily degraded by such endonucleases. A massively parallel next-generation sequencing technique to define the DNase hypersensitive sites in the genome was adopted. Sequencing these next-generation-sequencing DNase samples to significantly higher depths of 300-fold or greater produces a base-pair level resolution of the DNase susceptibility maps of the native chromatin state. These base-pair resolution maps represent and are dependent upon the nature and the specificity of interaction of the DNA with the regulatory/modulatory proteins binding at specific loci in the genome; thus they represent the native chromatin state of the genome under investigation. The deep sequencing approach has been used to define the footprint landscape of the genome by identifying DNA motifs that interact with known or novel DNA binding proteins. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf Cells were grown according to the approved ENCODE cell culture protocols. Digital DNaseI was performed by DNaseI digestion of intact nuclei, followed by isolating DNaseI 'double-hit' fragments as described in Sabo et al. (2006), and direct sequencing of fragment ends (which correspond to in vivo DNaseI cleavage sites) using the Solexa platform (27 bp reads). High-quality reads were mapped to the GRCh37/hg19 human genome using Bowtie 0.12.5 (Eland was used to map to NCBI36/hg18); only unique mappings were kept. DNaseI sensitivity is directly reflected in raw tag density (Signal), which is shown in the track as density of tags mapping within a 150 bp sliding window (at a 20 bp step across the genome). DNaseI hypersensitive zones (HotSpots) were identified using the HotSpot algorithm described in Sabo et al. (2004). False discovery rate thresholds of 1.0% (FDR 0.01) were computed for each cell type by applying the HotSpot algorithm to an equivalent number of random uniquely mapping 36-mers. DNaseI hypersensitive sites (DHSs or Peaks) were identified as signal peaks within 1.0% (FDR 0.01) hypersensitive zones using a peak-finding algorithm. Only DNase Solexa libraries from unique cell types producing the highest quality data, as defined by Percent Tags in Hotspots (PTIH ~40%) were designated for deep sequencing to a depth of over 200 million tags.

Project description:This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Yijun Ruan mailto:ruanyj@gis.a-star.edu.sg). If you have questions about the Genome Browser track associated with this data, contact ENCODE (mailto:genome@soe.ucsc.edu). This track, produced as part of the ENCODE Project, contains deep sequencing DNase data that will be used to identify sites where regulatory factors bind to the genome (footprints). Footprinting is a technique used to define the DNA sequences that interact with and bind DNA-binding proteins, such as transcription factors, zinc-finger proteins, hormone-receptor complexes, and other chromatin-modulating factors like CTCF. The technique depends upon the strength and tight nature of protein-DNA interactions. In their native chromatin state, DNA sequences that interact directly with DNA-binding proteins are relatively protected from DNA degrading endonucleases, while the exposed/unbound portions are readily degraded by such endonucleases. A massively parallel next-generation sequencing technique to define the DNase hypersensitive sites in the genome was adopted. Sequencing these next-generation-sequencing DNase samples to significantly higher depths of 300-fold or greater produces a base-pair level resolution of the DNase susceptibility maps of the native chromatin state. These base-pair resolution maps represent and are dependent upon the nature and the specificity of interaction of the DNA with the regulatory/modulatory proteins binding at specific loci in the genome; thus they represent the native chromatin state of the genome under investigation. The deep sequencing approach has been used to define the footprint landscape of the genome by identifying DNA motifs that interact with known or novel DNA binding proteins. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf Cells were grown according to the approved ENCODE cell culture protocols. Digital DNaseI was performed by DNaseI digestion of intact nuclei, followed by isolating DNaseI 'double-hit' fragments as described in Sabo et al. (2006), and direct sequencing of fragment ends (which correspond to in vivo DNaseI cleavage sites) using the Solexa platform (27 bp reads). High-quality reads were mapped to the GRCh37/hg19 human genome using Bowtie 0.12.5 (Eland was used to map to NCBI36/hg18); only unique mappings were kept. DNaseI sensitivity is directly reflected in raw tag density (Signal), which is shown in the track as density of tags mapping within a 150 bp sliding window (at a 20 bp step across the genome). DNaseI hypersensitive zones (HotSpots) were identified using the HotSpot algorithm described in Sabo et al. (2004). False discovery rate thresholds of 1.0% (FDR 0.01) were computed for each cell type by applying the HotSpot algorithm to an equivalent number of random uniquely mapping 36-mers. DNaseI hypersensitive sites (DHSs or Peaks) were identified as signal peaks within 1.0% (FDR 0.01) hypersensitive zones using a peak-finding algorithm. Only DNase Solexa libraries from unique cell types producing the highest quality data, as defined by Percent Tags in Hotspots (PTIH ~40%) were designated for deep sequencing to a depth of over 200 million tags.

Project description:This track is produced as part of the mouse ENCODE Project. This track shows DNaseI sensitivity measured genome-wide in mouse tissues and cell lines using the Digital DNaseI methodology (see below), and DNaseI hypersensitive sites. DNaseI has long been used to map general chromatin accessibility and DNaseI hypersensitivity is a universal feature of active cis-regulatory sequences. The use of this method has led to the discovery of functional regulatory elements that include enhancers, insulators, promotors, locus control regions and novel elements. For each experiment (tissue/cell type) this track shows DNaseI sensitivity as a continuous function using sequencing tag density (Signal), and discrete loci of DNaseI sensitive zones (HotSpots) and hypersensitive sites (Peaks). For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf Cells were grown according to the approved ENCODE cell culture protocols (http://hgwdev.cse.ucsc.edu/ENCODE/protocols/cell/mouse). Fresh tissues were harvested from mice and the nuclei prepared according to the tissue appropriate protocol (http://hgwdev.cse.ucsc.edu/ENCODE/protocols/cell/mouse). Digital DNaseI was performed by DNaseI digestion of intact nuclei, isolating DNaseI 'double-hit' fragments as described in Sabo et al. (2006), and direct sequencing of fragment ends (which correspond to in vivo DNaseI cleavage sites) using the Illumina IIx (and Illumina HiSeq by early 2011) platform (36 bp reads). Uniquely mapping high-quality reads were mapped to the genome using the bowtie aligner. DNaseI sensitivity is directly reflected in raw tag density, which is shown in the track as density of tags mapping within a 150 bp sliding window (at a 20 bp step across the genome). DNaseI sensitive zones (HotSpots) were identified using the HotSpot algorithm described in Sabo et al. (2004). 1.0% false discovery rate thresholds (FDR 0.01) were computed for each cell type by applying the HotSpot algorithm to an equivalent number of random uniquely mapping 36mers. DNaseI hypersensitive sites (DHSs or Peaks) were identified as signal peaks within FDR 1.0% hypersensitive zones using a peak-finding algorithm (I-max).

Dataset Information

DNaseI Digital Genomic Footprinting from ENCODE/University of Washington [Mouse]

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets