Project description: A catalog of transcription factor (TF) binding sites in the genome is critical for deciphering regulatory relationships. Here we present the culmination of the efforts of the modENCODE (Model Organism ENCyclopedia Of DNA Elements) and modERN (model organism Encyclopedia of Regulatory Networks) consortia to systematically assay TF binding events in vivo in two major model organisms, Drosophila melanogaster (fly) and Caenorhabditis elegans (worm). These datasets comprise 605 TFs identifying 3.6 M sites in the fly and 356 TFs identifying 0.9 M sites in the worm, and represent the majority of the regulatory space in each genome. We demonstrate that TFs associate with chromatin in clusters termed “metapeaks”, that larger metapeaks have characteristics of high-occupancy target (HOT) regions, and that the importance of consensus sequence motifs bound by TFs depends on metapeak size and complexity. Combining ChIP-seq data with single-cell RNA-seq data in a machine learning model identifies particular TFs with a prominent role in promoting target gene expression in specific cell types, even differentiating between parent and daughter cells during embryogenesis. These data are a rich resource for the community that should fuel and guide future investigations into TF function. To facilitate data accessibility and utility, all strains expressing GFP-tagged TFs are available at the stock centers for each organism. The chromatin immunoprecipitation sequencing data are available through the ENCODE Data Coordinating Center, through GEO, and through a direct interface (http://epic.gs.washington.edu/modERN/) that provides rapid access to processed data sets and summary analyses, as well as widgets to probe the cell type-specific TF-target relationships.
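The description above does not say how metapeaks are computed. As a rough illustration of the idea of clustering binding sites from many TFs, the following minimal Python sketch merges overlapping peaks on the same chromosome into one cluster and records which TFs contribute to it. The function name `cluster_metapeaks`, the `max_gap` parameter, the tuple layout, and the example coordinates are all hypothetical and are not taken from the modENCODE/modERN pipeline.

```python
from collections import defaultdict


def cluster_metapeaks(peaks, max_gap=0):
    """Merge peaks into clusters ("metapeaks") per chromosome.

    peaks: iterable of (chrom, start, end, tf) tuples.
    max_gap: hypothetical tolerance; a peak joins the current cluster when
             its start is at most max_gap past the cluster end, so touching
             intervals merge when max_gap is 0.
    This is an illustrative assumption, not the consortia's actual method.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, tf in peaks:
        by_chrom[chrom].append((start, end, tf))

    metapeaks = []
    for chrom in sorted(by_chrom):
        current = None
        # Sorting by start lets a single left-to-right sweep find clusters.
        for start, end, tf in sorted(by_chrom[chrom]):
            if current is not None and start <= current["end"] + max_gap:
                # Peak overlaps (or is close enough to) the open cluster.
                current["end"] = max(current["end"], end)
                current["tfs"].add(tf)
            else:
                if current is not None:
                    metapeaks.append(current)
                current = {"chrom": chrom, "start": start, "end": end, "tfs": {tf}}
        if current is not None:
            metapeaks.append(current)
    return metapeaks


# Made-up example peaks, for illustration only.
example_peaks = [
    ("chr2L", 100, 200, "TF_A"),
    ("chr2L", 150, 260, "TF_B"),
    ("chr2L", 900, 950, "TF_A"),
    ("chrX", 10, 50, "TF_C"),
]
metapeaks = cluster_metapeaks(example_peaks)
for mp in metapeaks:
    print(mp["chrom"], mp["start"], mp["end"], sorted(mp["tfs"]))
```

In this sketch, the number of distinct TFs in a cluster (`len(mp["tfs"])`) is one simple proxy for the metapeak size that the description relates to HOT regions.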
Project description: ChIP-seq study of RNA Pol II and H2A.v binding in adult Drosophila melanogaster head, glia, neurons, and fat body, as well as in embryos, using the GAL4-UAS system to express GFP-fusion proteins.
Project description: Control of RNA transcription is critical for the development and homeostasis of all organisms, and can occur at multiple steps of the transcription cycle, including RNA polymerase II (Pol II) recruitment, initiation, promoter-proximal pausing, and elongation. That Pol II accumulates on many promoters in metazoans implies that steps other than Pol II recruitment are rate-limiting and regulated [1-6]. By integrating genome-wide Pol II chromatin immunoprecipitation (ChIP) and Global Run-On (GRO) genomic data sets from Drosophila cells, we examined critical features of Pol II near promoters. The accumulation of promoter-proximal polymerase is widespread, occurring on 70% of active genes. Unlike elongating Pol II within the body of genes, promoter-proximal Pol II is held paused by factors such as NELF and is unable to transcribe unless nuclei are treated with strong detergent. Notably, we find that the vast majority of promoter-proximal Pol II detected by ChIP is paused, thereby identifying the biochemical nature of this rate-limiting step in transcription. Finally, we demonstrate that Drosophila promoters do not have the upstream divergent Pol II that is seen so broadly and prominently on mammalian promoters. We postulate that this is a consequence of Drosophila’s extensive use of directional core promoter sequence elements, which contrasts with mammals’ lack of directional elements and prevalence of CpG island core promoters. In support of this idea, we show that the fraction of mammalian promoters containing a TATA box core element is dramatically depleted of upstream divergent transcription. ChIP-seq data set for Pol II (rpb3) (2 replicates).
Project description: Control of RNA transcription is critical for the development and homeostasis of all organisms, and can occur at multiple steps of the transcription cycle, including RNA polymerase II (Pol II) recruitment, initiation, promoter-proximal pausing, and elongation. That Pol II accumulates on many promoters in metazoans implies that steps other than Pol II recruitment are rate-limiting and regulated [1-6]. By integrating genome-wide Pol II chromatin immunoprecipitation (ChIP) and Global Run-On (GRO) genomic data sets from Drosophila cells, we examined critical features of Pol II near promoters. The accumulation of promoter-proximal polymerase is widespread, occurring on 70% of active genes. Unlike elongating Pol II within the body of genes, promoter-proximal Pol II is held paused by factors such as NELF and is unable to transcribe unless nuclei are treated with strong detergent. Notably, we find that the vast majority of promoter-proximal Pol II detected by ChIP is paused, thereby identifying the biochemical nature of this rate-limiting step in transcription. Finally, we demonstrate that Drosophila promoters do not have the upstream divergent Pol II that is seen so broadly and prominently on mammalian promoters. We postulate that this is a consequence of Drosophila’s extensive use of directional core promoter sequence elements, which contrasts with mammals’ lack of directional elements and prevalence of CpG island core promoters. In support of this idea, we show that the fraction of mammalian promoters containing a TATA box core element is dramatically depleted of upstream divergent transcription.