Project description: Understanding the regulatory genome remains a significant challenge. Annotation of regulatory elements and identification of the transcription factors (TFs) targeting these elements are key steps in understanding how a given cell interprets its genetic blueprint. One goal of the modENCODE (model organism Encyclopedia of DNA Elements) project is to survey a diverse sampling of TFs, both DNA-binding and non-DNA-binding factors, to provide a framework for the subsequent study of the mechanisms by which transcriptional regulators target the genome. Here we provide an updated map of the Drosophila melanogaster regulatory genome based on the location of 84 TFs at various stages of development. This regulatory map reveals a variety of genomic targeting patterns, including factors with strong preferences toward proximal promoter binding, factors that target intergenic and intronic DNA, and factors with distinct chromatin state preferences. The data also suggest the existence of a partially self-contained Polycomb regulatory network, and highlight the importance of Trithorax-like (Trl) in maintaining hotspots of DNA binding throughout development. Furthermore, the data identify over 5,800 instances in which TFs target DNA regions with demonstrated enhancer activity. Regions of high TF co-occupancy are more likely to be associated with open enhancers used across cell types, while lower TF occupancy regions are associated with complex enhancers that are also regulated at the epigenetic level. A putative regulatory network generated from these 84 regulators reveals hundreds of co-binding events, thousands of potential regulatory interactions, and distinct regulatory strategies at developmental and housekeeping genes. These data serve as a resource for the research community in the continued effort to dissect transcriptional regulatory mechanisms directing Drosophila development.
For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf . This dataset was generated by the Drosophila Regulatory Elements modENCODE Project, led by Kevin P. White at the University of Chicago.
Project description: The model organism Encyclopedia of DNA Elements project (modENCODE) has produced a comprehensive annotation of D. melanogaster transcript models based on an enormous amount of high-throughput experimental data. However, some transcribed elements may not be functional, and technical artifacts may lead to erroneous inference of transcription. Inter-species comparison lends confidence to predicted annotations, since transcriptional activity that has been evolutionarily conserved is likely to have an advantageous function. We have performed RNA-Seq and CAGE-Seq experiments on more than 80 samples from multiple tissues and stages of 15 Drosophila species, including 8 previously unsequenced genomes. We have found strikingly conserved sequence, expression, and splicing for the vast majority of transcript models in the modENCODE annotation (e.g. 99% of coding sequence (CDS) exons, 88% of untranslated region (UTR) exons, and 87% of splicing events), indicating that the transcriptome annotation is of very high quality. We also describe dynamic transcriptome evolution within the Drosophila genus, including conserved promoter structure, labile positions of transcription start sites, and rapidly evolving RNA-editing events. We demonstrate how this phylogenetic approach to DNA element validation will prove useful in the annotation of other high-priority genomes, especially genomes that are less compact than that of Drosophila (e.g. the vast majority of vertebrate genomes). Refer to individual Series (listed below).