Project description:Here, we demonstrate that Nematostella vectensis, Ciona intestinalis, Apis mellifera, and B. mori show two distinct populations of genes differentiated by gene-body CpG density. Genome-scale DNA methylation profiles for A. mellifera spermatozoa reveal CpG-poor genes are methylated in the germ line, as predicted by the depletion of CpGs. We find an evolutionarily conserved distinction between CpG-poor and -rich genes: the former are associated with basic biological processes, the latter with more specialized functions. This distinction is strikingly similar to that recently observed between euchromatin-associated genes in Drosophila that contain intragenic histone 3 lysine 36 trimethylation (H3K36me3) and those that do not, even though Drosophila doesn't display CpG density bimodality or methylation. We confirm that a significant number of CpG-poor genes in N. vectensis, C. intestinalis, A. mellifera and B. mori are orthologs of H3K36me3-rich genes in Drosophila. We propose that over evolutionary time, gene-body H3K36me3 has influenced gene-body DNA methylation levels, and consequently the gene-body CpG density bimodality characteristic of invertebrates that harbor CpG methylation. Examination of DNA methylation in Apis mellifera sperm
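The description above does not state how gene-body CpG density was scored. As a minimal sketch, assuming the commonly used observed/expected CpG ratio (the function name `cpg_obs_exp` and the choice of formula are illustrative assumptions, not taken from the study), the per-gene value could be computed like this:

```python
def cpg_obs_exp(seq: str) -> float:
    """Return the CpG observed/expected ratio of a DNA sequence.

    Assumed formula (a common convention, not confirmed by the source):
        (number of CpG dinucleotides * sequence length) / (number of C * number of G)
    Returns 0.0 when the ratio is undefined (no C or no G present).
    """
    s = seq.upper()
    n_c = s.count("C")
    n_g = s.count("G")
    n_cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if n_c == 0 or n_g == 0:
        return 0.0
    return n_cpg * len(s) / (n_c * n_g)


if __name__ == "__main__":
    # Toy sequences only; real input would be gene-body sequences.
    for name, seq in [("depleted", "CCAATTGGAATTCCAAGG"), ("rich", "ACGTCGACGCGT")]:
        print(name, round(cpg_obs_exp(seq), 3))
```

A histogram of this ratio across all gene bodies is one way a bimodal distribution of CpG-poor and CpG-rich genes would become visible.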
Project description:Chromosome conformation capture (4C-Seq) in Drosophila embryos from a wild-type line and from transgenic fly lines carrying the E3 enhancer of twist at ectopic locations. Two time points (2-5 hrs and 5-8 hrs after egg lay) and two viewpoints located near the twist promoter were assayed. Two independent collections were performed at each timepoint and each viewpoint.
Project description:The immunoglobulin heavy-chain (Igh) locus undergoes large-scale contraction in pro-B cells, which facilitates VH-DJH recombination by juxtaposing distal VH genes next to the DJH-rearranged gene segment in the proximal Igh domain. By high-resolution mapping of long-range interactions, we now demonstrate that an array of local interaction domains establishes the three-dimensional structure of the extended Igh locus in lymphoid progenitors and thymocytes. In pro-B cells, these local domains engage in long-range interactions across the entire Igh locus, which depend on the transcription factors Pax5, YY1 and CTCF. The large VH gene cluster thereby undergoes flexible long-range interactions with the more rigidly structured 3' proximal domain, which ensures that all VH genes can participate with similar probability in VH-DJH recombination to generate a diverse antibody repertoire. Notably, these long-range interactions appear to be an intrinsic feature of the VH gene cluster, as they are still generated upon mutation of the Eμ enhancer, IGCR1 insulator or 3' regulatory region present in the 3' proximal Igh domain. 4C sequencing from multiple cell types with multiple viewpoints; uneven number of replicates. ChIP-Seq
Project description:In many metazoans, such as humans and Drosophila, homeodomain proteins comprise the second largest family of sequence-specific transcription factors. In Drosophila, homeodomains play an important role in development. Many homeodomain proteins display a high level of homology across metazoans, presumably due to the importance of their functional roles. We comprehensively characterized the DNA binding preferences of all 84 Drosophila homeodomain transcription factors which contain a single DNA binding domain. Previously, we employed a bacterial one-hybrid (B1H) assay to select for 20 to 40 high-affinity transcription factor binding sites [Noyes et al. (2008). Cell. 133(7):1277-1289]. In this system, E. coli are transformed with two plasmids. One plasmid encodes the DNA binding domain of a homeodomain fused to two zinc finger domains and the omega subunit of RNAP. The other plasmid is drawn from a library of prey plasmids which contain a 10 bp randomized transcription factor binding site (TFBS) region in the promoter of the reporter gene His3. The E. coli strain used is a His3 homolog and omega subunit knockout strain. If a transcription factor has high affinity for a TFBS, more His3 will be produced, leading to the production of more histidine and an increase in the growth rate. If the transcription factor does not bind with sufficient affinity to the TFBS, little or no histidine will be produced, resulting in little or no growth. The stringency of the B1H system can be tuned using the chemicals IPTG and 3-AT. IPTG induces production of the chimeric transcription factor, and 3-AT is a competitive inhibitor of the enzyme encoded by His3. One of the advantages of the B1H system is that the transcription factor does not have to be purified and that many experiments can be easily conducted in parallel.
In this study, instead of picking 20 to 40 colonies and sequencing their TFBSs, we used high-throughput Illumina sequencing to sequence the selected sites of all of the colonies growing on a plate. This provided quantitative data regarding the growth rate of cells possessing each selected TFBS variant, which is a function of the affinity of the transcription factor for the binding site. With these quantitative data, we can build more accurate models of transcription factor binding. All 84 of the Drosophila homeodomain proteins that contain a single DNA binding domain were analyzed using the B1H assay. The same selection stringency was used for all experiments (10 µM IPTG and 5 mM 3-AT). All experiments were run for 36 to 48 hours. Ten mutants were also assayed: 3 Caup, 3 Bcd and 4 En mutants. The bait plasmid omegaUV2zf was used in all but 3 cases. In these instances, a slightly different bait plasmid, omegaUV5zf, with a stronger promoter was used. Thirty different replicates were performed in order to ensure that a sufficient number of reads was obtained for each protein. In total, 126 experiments were performed.
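The description says that read counts per selected TFBS variant feed into binding models but does not specify the model. The sketch below shows one simple possibility, a count-weighted position frequency matrix. The function name `weighted_frequency_matrix`, the pseudocount, and the toy sites are all hypothetical, and this is not necessarily the authors' approach.

```python
BASES = "ACGT"


def weighted_frequency_matrix(site_counts, pseudocount=1.0):
    """Build per-position base frequencies from a {site: read_count} mapping.

    Each site contributes to every column in proportion to its read count,
    so highly enriched sites dominate the resulting matrix.
    """
    lengths = {len(site) for site in site_counts}
    if len(lengths) != 1:
        raise ValueError("need at least one site, and all sites must share one length")
    width = lengths.pop()

    matrix = []
    for pos in range(width):
        totals = {base: pseudocount for base in BASES}
        for site, count in site_counts.items():
            base = site[pos].upper()
            if base in totals:  # skip ambiguous calls such as N
                totals[base] += count
        column_sum = sum(totals.values())
        if column_sum == 0:
            raise ValueError(f"no usable counts at position {pos}")
        matrix.append({base: totals[base] / column_sum for base in BASES})
    return matrix


if __name__ == "__main__":
    # Hypothetical read counts for 4 bp toy sites (the assay used a 10 bp region).
    toy_counts = {"TAAT": 600, "TAAG": 200, "CAAT": 50}
    for pos, column in enumerate(weighted_frequency_matrix(toy_counts)):
        print(pos, {base: round(freq, 3) for base, freq in column.items()})
```

Weighting by read count rather than by unique site is the design choice that uses the quantitative signal; dropping the weights would reduce this to the colony-picking style of analysis.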
Project description:Chromosomes are the physical realization of genetic information and thus form the basis for its reading, hindering and propagation. Here we present a high-resolution chromosomal contact map derived from a new genome-wide chromosome conformation capture approach (a simplified version of the Hi-C method) applied to Drosophila embryonic nuclei. The data show that the entire genome is linearly partitioned into well-demarcated physical domains that overlap extensively with active and repressive epigenetic marks. Chromosomal contacts are hierarchically organized between domains. Global modeling of compaction and clustering of domains show that inactive domains are condensed and confined to their chromosomal territories, while active domains reach out of the territory to form remote intra- and inter-chromosomal contacts. One pilot sample, comprising seven paired-end Illumina GA-II lanes; one deep-sequenced sample, comprising seven paired-end Illumina HiSeq lanes
Project description:Chromatin accessibility is a key determinant of cell-type-specific gene expression. Here, we have investigated the chromatin architecture of different acute myeloid leukemia (AML) cells and the changes in accessibility when NB4 (APL) cells undergo differentiation. For nuclease-accessible site sequencing (NA-seq; Gargiulo et al. 2009), chromatin-accessible libraries were generated in different AML leukemic cells by using the restriction enzymes NlaIII and HpaII. In the case of NB4 cells, accessibility was mapped both before and after treatment with all-trans retinoic acid (ATRA) for 48 hr. Differences were observed between the two conditions, and chromatin accessibility was correlated with underlying epigenetic modifications. For validation purposes, NA-seq libraries (using the NlaIII enzyme) were generated in blasts from APL and AML M1 patients. All of the ChIP-seq (Martens et al. 2010) studies were performed in leukemic NB4 and SKNO-1 cells. Supplementary file 'GSE30254_All_accessibleregions_ATRA_NB4_fseq.wig' includes data for Samples GSM749512, GSM749513, GSM749516, and GSM749517. Supplementary file 'GSE30254_All_accessibleregions_untreated_NB4_fseq.wig' includes data for Samples GSM749510, GSM749511, GSM749514, and GSM749515.
Project description:The 3' untranslated region (3'UTR) constitutes a major site of post-transcriptional regulation of gene expression. Sequence elements in the 3'UTR interact with trans-acting regulators such as microRNAs that affect translation and stability. The overall aim is to use a 3'RACE cloning-sequencing strategy to identify the 3'UTRs of C. elegans transcripts and explore their heterogeneity in different developmental stages and tissues. Keywords: Transcriptome analysis. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf ; raw data files are available on our FTP site: ftp://ftp.ncbi.nlm.nih.gov/pub/geosup/Series/GSE17781 ; pilot study [GSM443959..GSM443964]: N2 wildtype worms staged at embryo, L1, L2, L3, L4, and adult; full experiment [GSM446651..GSM446661]: N2 wildtype worms staged at embryo, L1, L2, L3, L4, dauer, and adult; Illumina Genome Analyzer sequencing of isolated clones [GSM469439]; 454 sequencing of RACE clones [GSM469976]
Project description:Exploiting the full potential of insertional mutagenesis screens with retroviruses and transposons requires methods for distinguishing clonal from subclonal insertion events within heterogeneous tumor cell populations. Current protocols, based on ligation-mediated PCR, depend on endonuclease-based fragmentation of genomic DNA, resulting in strong biases in amplification and sequencing due to the fixed product sizes of the amplicons. We have developed a method called shear-splink, which enables semi-quantitative high-throughput sequence analysis of insertional mutations, allowing us to count the number of cells harboring a given integration within a heterogeneous sample. The shear-splink method enriches for (sub)clonal integrations, thereby reducing the contribution of irrelevant passenger mutations that normally hamper reliable identification of common integration sites. Additionally, this improvement allows us to identify genetic interactions between affected genes and co-occurring mutations, and to study acquired resistance mechanisms both in vivo and in vitro. Sequencing of retroviral integration sites by LM-PCR. The associated manuscript describes a new method to quantitatively determine retroviral integration sites using an improved ligation-mediated PCR approach and subsequent 454 pyrosequencing. [GSM562151 to GSM562159]: Sequence data from different mixtures of 2 different cell lines (called AE6 and BB12) which were processed without a restriction enzyme. These cell lines are derived from an MMTV-induced mammary tumor, for which we amplify the MMTV integration sites using a ligation-mediated PCR setup. We mixed these 2 cell lines, each with a different integration spectrum, to determine whether our amplification and sequencing protocol is quantitative, meaning that the coverage per integration site decreases upon further dilution of the sample.
[GSM641935 to GSM641950]: Unique Sleeping Beauty-induced lymphoma specimens (spleen) obtained from a cohort of 16 wild-type mice with the 129P2/C57BL/6J mixed background. [GSM776576 to GSM776956]: The 379 submitted specimens originate from 127 unique leukemia/lymphoma samples, processed using 3 different techniques in order to identify Sleeping Beauty integration sites. We compared restriction enzyme-based LM-PCR (RE-splink) with shearing-based LM-PCR (shear-splink) on 127 unique Sleeping Beauty (SB)-induced leukemias/lymphomas. All sequence data generated by the 454 sequencing platform are submitted to GEO, including the final output of our sequence analysis pipeline (in BED format; see Supplementary files linked below). Previous submissions contained similar sequence information (integration sites of viruses or transposons driving tumorigenesis) and are all part of the same manuscript.
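The description states that shearing-based fragmentation lets the method count cells harboring a given integration, but it does not spell out the counting rule. The sketch below assumes that each distinct shear breakpoint observed for an integration site stands for one independent template molecule. The read tuple layout and the function name `count_shear_breakpoints` are hypothetical and are not the authors' pipeline.

```python
from collections import defaultdict


def count_shear_breakpoints(reads):
    """Count distinct shear breakpoints per integration site.

    `reads` is an iterable of (chrom, integration_pos, strand, shear_pos)
    tuples. PCR duplicates share the same shear_pos and collapse to one
    entry, so the count is less sensitive to amplification bias than raw
    read depth. Assumption: one distinct breakpoint ~ one template molecule.
    """
    breakpoints = defaultdict(set)
    for chrom, integration_pos, strand, shear_pos in reads:
        breakpoints[(chrom, integration_pos, strand)].add(shear_pos)
    return {site: len(ends) for site, ends in breakpoints.items()}


if __name__ == "__main__":
    # Toy reads: the first site is seen with three distinct shear ends
    # (one of them amplified twice), the second with a single end.
    toy_reads = [
        ("chr1", 1000, "+", 1210),
        ("chr1", 1000, "+", 1210),
        ("chr1", 1000, "+", 1342),
        ("chr1", 1000, "+", 1405),
        ("chr7", 5000, "-", 4820),
    ]
    for site, n in sorted(count_shear_breakpoints(toy_reads).items()):
        print(site, n)
```

Under this assumption, sites with many distinct breakpoints would be candidates for clonal integrations, while sites supported by a single breakpoint are more likely subclonal or passenger events.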
Project description:Alternative splicing, the production of multiple mRNA isoforms from a single gene, is regulated in part by RNA-binding proteins (RBPs). While the RBPs Tra2α and Tra2β have both been implicated in the regulation of alternative splicing, their relative contributions to this process are not well understood. Here we use iCLIP to identify Tra2β target exons in MDA-MB-231 cells. We find that simultaneous, but not individual, depletion of Tra2α and Tra2β induces substantial shifts in the splicing pattern of endogenous Tra2β target exons identified by iCLIP. We next use RNA-seq following joint Tra2 protein depletion to comprehensively identify Tra2 protein-dependent exons in MDA-MB-231 cells. Endogenous Tra2β binding sites were mapped across the MDA-MB-231 cell transcriptome in biological triplicate iCLIP experiments. RNA-seq was performed using three biological replicates of negative control siRNA-treated MDA-MB-231 cells and three biological replicates of TRA2A and TRA2B siRNA-treated MDA-MB-231 cells.