Project description: This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Terry Furey mailto:tsfurey@duke.edu). If you have questions about the Genome Browser track associated with this data, contact ENCODE (mailto:genome@soe.ucsc.edu).
These tracks display Formaldehyde-Assisted Isolation of Regulatory Elements (FAIRE) evidence as part of the four Open Chromatin track sets. FAIRE is a method to isolate and identify nucleosome-depleted regions of the genome. FAIRE was initially discovered in yeast and subsequently shown to identify active regulatory elements in human cells (Giresi et al., 2007). Similar to DNaseI HS, FAIRE appears to identify functional regulatory elements that include promoters, enhancers, silencers, insulators, locus control regions and novel elements.
Together with DNaseI HS and ChIP-seq experiments, these tracks display the locations of active regulatory elements identified as open chromatin in multiple cell types (http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?type=cellType) from the Duke, UNC-Chapel Hill, UT-Austin, and EBI ENCODE group. Within this project, open chromatin was identified using two independent and complementary methods: DNaseI hypersensitivity (HS) and these FAIRE assays, combined with chromatin immunoprecipitation (ChIP) for select regulatory factors. DNaseI HS and FAIRE provide assay cross-validation, with commonly identified regions delineating the highest confidence areas of open chromatin. ChIP assays provide functional validation and preliminary annotation of a subset of open chromatin sites. Each method employed Illumina (formerly Solexa) sequencing by synthesis as the detection platform. The Tier 1 and Tier 2 cell types were additionally verified by a second platform, high-resolution 1% ENCODE tiled microarrays supplied by NimbleGen.
Release 1 (March 2011) of this track consists of a remapping of all previously released experiments to the human reference genome GRCh37/hg19. These data were previously mapped to NCBI36/hg18; please see the Release Notes section of the hg18 Open Chromatin track (http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&g=wgEncodeChromatinMap) for information on the NCBI36/hg18 releases of the data.
- There are 12 new FAIRE experiments in this release, on 10 new cell lines.
- New to this release is a reconfiguration of how this track is displayed in relation to other tracks from the Duke/UNC/UT-Austin/EBI group.
- A synthesis of open chromatin evidence from the three assay types, compiled for Tier 1 and 2 cell lines plus NHEK, will also be added in this release and can be previewed in: Open Chromatin Synthesis (http://genome-preview.cse.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeOpenChromSynth).
- Enhancer and Insulator Functional assays: A subset of DNase and FAIRE regions were cloned into functional tissue culture reporter assays to test for enhancer and insulator activity. Coordinates and results from these experiments can be found at http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeOpenChromFaire/supplemental/.
For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf
Cells were grown according to the approved ENCODE cell culture protocols (http://hgwdev.cse.ucsc.edu/ENCODE/protocols/cell). FAIRE was performed (Giresi et al., 2007) by cross-linking proteins to DNA using a 1% formaldehyde solution and shearing the resulting complex by sonication. Phenol/chloroform extractions were performed to remove DNA fragments cross-linked to protein. The DNA recovered in the aqueous phase was sequenced using an Illumina (Solexa) sequencing system.
FAIRE-seq data for Tier 1 and Tier 2 cell lines were verified by comparing multiple independent growths (replicates) and determining the reproducibility of the data. For some cell types, additional verification was performed using the same material hybridized to NimbleGen Human ENCODE tiling arrays (1% of the genome), with the input DNA as reference (FAIRE-chip). A more detailed protocol is available at http://hgwdev.cse.ucsc.edu/ENCODE/protocols/general/FAIRE_UNC_procedure.pdf. Also see Giresi et al., 2009.
DNA fragments isolated by FAIRE are 100-200 bp in length, with an average length of 134 bp. Sequences from each experiment were aligned to the GRCh37 (hg19) assembly using BWA (Li et al., 2010). The command used for these alignments was:
    bwa aln -t 8 genome.fa s_1.sequence.txt.bfq > s_1.sequence.txt.sai
where genome.fa is the whole genome sequence and s_1.sequence.txt.bfq is one lane of sequences converted into the required bfq format. Sequences from multiple lanes were combined for a single replicate using the bwa samse command and converted to SAM/BAM format using samtools. Only sequences that aligned to 4 or fewer locations were retained. Other sequences were also filtered based on their alignment to problematic regions (such as satellites and rRNA genes; see supplemental materials http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeOpenChromFaire/supplemental/). The mappings of these short reads to the genome are available for download at http://hgwdev.cse.ucsc.edu/cgi-bin/hgFileUi?g=wgEncodeOpenChromFaire.
The resulting digital signal was converted to a continuous wiggle track using F-Seq, which employs Parzen kernel density estimation to create base pair scores (Boyle et al., 2008b). Input data has been generated for several cell lines. These are used directly to create a control/background model used for F-Seq when generating signal annotations for these cell lines.
These models are meant to correct for sequencing biases, alignment artifacts, and copy number changes in these cell lines. Input data is not being generated directly for other cell lines. Instead, a general background model was derived from the available Input data sets. This should provide corrections for sequencing biases and alignment artifacts, but will not correct for cell type specific copy number changes. The exact command used for this step is:
    fseq -l 800 -v -b <bff files> -p <iff files> alignments.bed
where the <bff files> are the background files based on alignability, the <iff files> are the background files based on the Input experiments, and alignments.bed is a BED file of filtered sequence alignments.
Discrete FAIRE sites (peaks) were identified from the FAIRE-seq F-Seq density signal. Significant regions were determined by fitting the data to a gamma distribution to calculate p-values. Contiguous regions where p-values were below a 0.05/0.01 threshold were considered significant.
Data from the high-resolution 1% ENCODE tiled microarrays supplied by NimbleGen were normalized using the Tukey biweight normalization, and peaks were called using ChIPOTle (Buck et al., 2005) at multiple levels of significance. Regions matched on size to these peaks that were devoid of any significant signal were also created as a null model. These data were used for additional verification of Tier 1 and Tier 2 cell lines by ROC analysis. Files containing this data can be found in the Downloads directory labeled Validation view (http://hgwdev.cse.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeOpenChromFaire).
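The peak-calling step described above (fit a gamma distribution to the density signal, convert each base's score to a p-value, and report contiguous runs below the threshold) can be illustrated with a minimal sketch. This is not the pipeline's actual code: the text does not say how the gamma distribution was fitted, so the method-of-moments fit here is an assumption, as are the function names fit_gamma_moments, gamma_sf and call_regions and the toy signal. Only the Python standard library is used.

```python
import math

def fit_gamma_moments(values):
    """Method-of-moments gamma fit (an assumed fitting method): returns (shape, scale)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean * mean / var, var / mean

def gamma_sf(x, shape, scale):
    """Upper-tail probability P(X >= x) of a gamma distribution.

    Uses the power series of the regularized lower incomplete gamma function.
    Adequate for thresholds such as 0.05 or 0.01; tail values much smaller
    than about 1e-10 lose precision.
    """
    if x <= 0:
        return 1.0
    z = x / scale
    term = 1.0 / shape
    total = term
    for n in range(1, 1000):
        term *= z / (shape + n)
        total += term
        if term < total * 1e-12:
            break
    log_lower = shape * math.log(z) - z - math.lgamma(shape) + math.log(total)
    return max(0.0, 1.0 - math.exp(log_lower))

def call_regions(signal, threshold=0.05):
    """Return half-open (start, end) index ranges where the p-value is below threshold."""
    shape, scale = fit_gamma_moments(signal)
    regions, start = [], None
    for i, value in enumerate(signal):
        significant = gamma_sf(value, shape, scale) < threshold
        if significant and start is None:
            start = i                      # a significant run begins
        elif not significant and start is not None:
            regions.append((start, i))     # the run ended at the previous base
            start = None
    if start is not None:
        regions.append((start, len(signal)))
    return regions

# Toy density signal: flat background with one 5-base enrichment (hypothetical data).
signal = [(0.8, 1.0, 1.2, 1.0)[i % 4] for i in range(200)]
signal[100:105] = [5.0] * 5
print(call_regions(signal))  # prints [(100, 105)]
```

The half-open intervals match the BED coordinate convention, so a region (100, 105) covers five bases. A stricter call corresponds to passing threshold=0.01.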

Project description: The human body contains thousands of unique cell types, each with specialized functions. Cell identity is governed in large part by gene transcription programs, which are determined by regulatory elements encoded in DNA. To identify regulatory elements active in seven cell lines representative of diverse human cell types, we used DNase-seq and FAIRE-seq to map "open chromatin". Over 870,000 DNaseI or FAIRE sites, which correspond largely to nucleosome-depleted regions (NDRs), were identified across the seven cell lines, covering nearly 9% of the genome. The combination of DNaseI and FAIRE is more effective than either assay alone in identifying likely regulatory elements, as judged by coincidence with transcription factor binding locations determined in the same cells. Open chromatin common to all seven cell types tended to be at or near transcription start sites and encompassed more CTCF binding sites, while open chromatin sites found in only one cell type were typically located away from transcription start sites, and contained DNA motifs recognized by regulators of cell-type identity. As one example of its ability to identify functional DNA, we show that open chromatin regions bound by CTCF are potent insulators. We identified clusters of open regulatory elements (COREs) that were physically near each other and whose appearance was coordinated among one or more cell types. Gene expression and RNA Pol II binding data support the hypothesis that COREs control gene activity required for the maintenance of cell-type identity. This publicly available atlas of regulatory elements may prove valuable in identifying non-coding DNA sequence variants that are causally linked to human disease.
For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf
DNase-seq, FAIRE-seq, and ChIP-seq were performed on seven human cell lines: GM12878 (lymphoblastoid), K562 (leukemia), HepG2 (hepatocellular carcinoma), HelaS3 (cervical carcinoma), HUVEC (human umbilical vein endothelial cells), NHEK (keratinocytes), and H1-ES (embryonic stem cells). For each cell line, two or three replicates were independently grown and each was split into three portions, one for each of the three experimental methods. Control ChIP experiments were performed on five of the cell lines; NHEK and H1-ES were excluded due to lack of material.

	ERR707114.fastq.gz	Fastqsanger.gz
	ERR707115.fastq.gz	Fastqsanger.gz
	ERR707116.fastq.gz	Fastqsanger.gz

Dataset Information

Open chromatin in human pancreatic islets (FAIRE-seq) - Appendix to E-GEOD-17616

Dataset's files
