Project description:A catalog of transcription factor (TF) binding sites in the genome is critical for deciphering regulatory relationships. Here we present the culmination of the efforts of the modENCODE (Model Organism ENCyclopedia Of DNA Elements) and modERN (model organism Encyclopedia of Regulatory Networks) consortia to systematically assay TF binding events in vivo in two major model organisms, Drosophila melanogaster (fly) and Caenorhabditis elegans (worm). These datasets comprise 605 TFs identifying 3.6M sites in the fly and 356 TFs identifying 0.9 M sites in the worm, and represent the majority of the regulatory space in each genome. We demonstrate that TFs associate with chromatin in clusters termed “metapeaks”, that larger metapeaks have characteristics of high occupancy target (HOT) regions, and that the importance of consensus sequence motifs bound by TFs depends on metapeak size and complexity. Combining ChIP-seq data with single cell RNA-seq data in a machine learning model identifies particular TFs with a prominent role in promoting target gene expression in specific cell types, even differentiating between parent-daughter cells during embryogenesis. These data are a rich resource for the community that should fuel and guide future investigations into TF function. To facilitate data accessibility and utility, all strains expressing GFP-tagged TFs are available at the stock centers for each organism. The chromatin immunoprecipitation sequencing data are available through the ENCODE Data Coordinating Center, GEO, and through a direct interface (http://epic.gs.washington.edu/modERN/) that provides rapid access to processed data sets and summary analyses, as well as widgets to probe the cell type-specific TF-target relationships.
Project description:A catalog of transcription factor (TF) binding sites in the genome is critical for deciphering regulatory relationships. Here we present the culmination of the efforts of the modENCODE (Model Organism ENCyclopedia Of DNA Elements) and modERN (model organism Encyclopedia of Regulatory Networks) consortia to systematically assay TF binding events in vivo in two major model organisms, Drosophila melanogaster (fly) and Caenorhabditis elegans (worm). These datasets comprise 605 TFs identifying 3.6M sites in the fly and 356 TFs identifying 0.9 M sites in the worm, and represent the majority of the regulatory space in each genome. We demonstrate that TFs associate with chromatin in clusters termed “metapeaks”, that larger metapeaks have characteristics of high occupancy target (HOT) regions, and that the importance of consensus sequence motifs bound by TFs depends on metapeak size and complexity. Combining ChIP-seq data with single cell RNA-seq data in a machine learning model identifies particular TFs with a prominent role in promoting target gene expression in specific cell types, even differentiating between parent-daughter cells during embryogenesis. These data are a rich resource for the community that should fuel and guide future investigations into TF function. To facilitate data accessibility and utility, all strains expressing GFP-tagged TFs are available at the stock centers for each organism. The chromatin immunoprecipitation sequencing data are available through the ENCODE Data Coordinating Center, GEO, and through a direct interface (http://epic.gs.washington.edu/modERN/) that provides rapid access to processed data sets and summary analyses, as well as widgets to probe the cell type-specific TF-target relationships.
Project description:This project aims to identify novel RNA binding proteins in the nematode, Caenorhabditis elegans. Since interactions between RNAs and proteins may be transient, these animals were crosslinked with UV light at 254 nm which promotes the covalent link between proteins and RNAs. After this, polyadenylated mRNAs were purified via oligo(dT) coupled to magentic beads under stringent conditions. Finally, samples were subjected to mass spectrometry analysis. To rule out the possibility of RNA-independent binding we also analysed other samples: i) samples digested with RNase one; ii) samples where we performed competition assays with polyadenylic acid