Open chromatin defined by DNaseI and FAIRE identifies regulatory elements that shape cell-type identity [FAIRE_seq]
ABSTRACT: The human body contains thousands of unique cell types, each with specialized functions. Cell identity is governed in large part by gene transcription programs, which are determined by regulatory elements encoded in DNA. To identify regulatory elements active in seven cell lines representative of diverse human cell types, we used DNase-seq and FAIRE-seq to map "open chromatin". Over 870,000 DNaseI or FAIRE sites, which correspond largely to nucleosome-depleted regions (NDRs), were identified across the seven cell lines, covering nearly 9% of the genome. The combination of DNaseI and FAIRE is more effective than either assay alone in identifying likely regulatory elements, as judged by coincidence with transcription factor binding locations determined in the same cells. Open chromatin common to all seven cell types tended to be at or near transcription start sites and encompassed more CTCF binding sites, while open chromatin sites found in only one cell type were typically located away from transcription start sites, and contained DNA motifs recognized by regulators of cell-type identity. As one example of its ability to identify functional DNA, we show that open chromatin regions bound by CTCF are potent insulators. We identified clusters of open regulatory elements (COREs) that were physically near each other and whose appearance was coordinated among one or more cell types. Gene expression and RNA Pol II binding data support the hypothesis that COREs control gene activity required for the maintenance of cell-type identity. This publicly available atlas of regulatory elements may prove valuable in identifying non-coding DNA sequence variants that are causally linked to human disease.
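The abstract reports the share of the genome covered by the union of DNaseI and FAIRE sites. As a rough illustration of that kind of calculation only, the sketch below merges overlapping intervals from two site lists and computes the covered fraction of a region. The coordinates, the region length, and the function names are hypothetical and are not taken from the dataset or from the authors' pipeline.

```python
from typing import Iterable

# Half-open interval [start, end) on a single chromosome.
Interval = tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> list[Interval]:
    """Collapse overlapping or touching intervals into a sorted, disjoint list."""
    merged: list[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(intervals: Iterable[Interval], region_length: int) -> float:
    """Fraction of a region of the given length covered by the union of intervals."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / region_length


# Toy, made-up site calls on a 10,000 bp region (not real data).
dnase_sites = [(100, 300), (1000, 1200)]
faire_sites = [(250, 400), (5000, 5100)]

open_chromatin = merge_intervals(dnase_sites + faire_sites)
fraction = covered_fraction(dnase_sites + faire_sites, 10_000)
print(open_chromatin)
print(fraction)
```

For genome-scale peak files, a dedicated interval tool would normally be used per chromosome instead of plain Python lists; the sketch only shows the union-then-measure logic.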
For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf
DNase-seq, FAIRE-seq, and ChIP-seq were performed on seven human cell lines: GM12878 (lymphoblastoid), K562 (leukemia), HepG2 (hepatocellular carcinoma), HelaS3 (cervical carcinoma), HUVEC (human umbilical vein endothelial cells), NHEK (keratinocytes), and H1-ES (embryonic stem cells). For each cell line, two or three replicates were grown independently, and each replicate was split into three portions, one for each of the three experimental methods. Control ChIP experiments were performed on five of the cell lines; NHEK and H1-ES were excluded due to lack of material.
ORGANISM(S): Homo sapiens
SUBMITTER: Terry Furey
PROVIDER: E-GEOD-30225 | biostudies-arrayexpress
REPOSITORIES: biostudies-arrayexpress