Dual threshold optimization and network inference reveal convergent evidence from TF binding locations and TF perturbation responses
ABSTRACT: A high-confidence map of the direct, functional targets of each transcription factor (TF) requires convergent evidence from independent sources. Two significant sources of evidence are TF binding locations and the transcriptional responses to direct TF perturbations. Systematic data sets of both types exist for yeast and human. Standard analysis of the genes whose regulatory DNA is bound by a TF, assayed by ChIP-chip/seq, and the genes that respond to a perturbation of that TF, shows that these two data sources rarely converge on a common set of direct, functional targets. Even taking the few genes that are both bound and responsive as direct functional targets is not safe -- when there are many non-functional binding sites and many indirect targets, non-functional sites are expected to occur in the cis-regulatory DNA of indirect targets by chance. To address this problem, we introduce Dual Threshold Optimization, a new method for setting significance thresholds on binding and response data, and show that it improves convergence. It also enables comparison of binding data to perturbation-response data that has been processed by network inference algorithms, which further improves convergence. Next, we analyze a comprehensive new data set measuring the transcriptional response shortly after inducing overexpression of a yeast TF. We also present a new yeast binding location data set obtained by transposon calling cards and compare it to recent ChIP-exo data. The combination of dual threshold optimization and network inference greatly expands the high-confidence TF network map in both yeast and human. In yeast, measuring the response shortly after inducing TF overexpression and measuring binding locations by using transposon calling cards or ChIP-exo improve the network synergistically.
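The abstract does not say how the two thresholds are chosen, so the following is only a minimal sketch of one way a dual threshold search could work, not the authors' implementation. It assumes that each gene has a binding score and a response score (higher means stronger evidence), that candidate thresholds are expressed as "top n genes" cutoffs, and that convergence is scored by a hypergeometric test on the overlap between the bound and responsive sets. The function names (`hypergeom_sf`, `expected_chance_overlap`, `dual_threshold_search`) and the toy scores are hypothetical. The sketch also computes the overlap expected by chance, which illustrates why a handful of bound-and-responsive genes is weak evidence on its own.

```python
from math import comb


def hypergeom_sf(overlap, n_bound, n_responsive, n_genes):
    """P(X >= overlap) when n_bound genes are drawn at random from n_genes,
    of which n_responsive are responsive (hypergeometric upper tail)."""
    upper = min(n_bound, n_responsive)
    tail = sum(
        comb(n_responsive, k) * comb(n_genes - n_responsive, n_bound - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(n_genes, n_bound)


def expected_chance_overlap(n_bound, n_responsive, n_genes):
    """Mean overlap if bound and responsive sets were independent."""
    return n_bound * n_responsive / n_genes


def dual_threshold_search(binding, response, sizes):
    """Try every pair of 'top n' cutoffs (assumed scoring scheme) and return
    (p_value, n_bound, n_responsive, overlap) for the most significant pair."""
    genes = sorted(set(binding) & set(response))
    n_genes = len(genes)
    by_binding = sorted(genes, key=lambda g: -binding[g])
    by_response = sorted(genes, key=lambda g: -response[g])
    best = None
    for n_b in (s for s in sizes if s <= n_genes):
        bound = set(by_binding[:n_b])
        for n_r in (s for s in sizes if s <= n_genes):
            responsive = set(by_response[:n_r])
            overlap = len(bound & responsive)
            p = hypergeom_sf(overlap, n_b, n_r, n_genes)
            if best is None or p < best[0]:
                best = (p, n_b, n_r, overlap)
    return best


# Toy data (hypothetical): 20 genes, g0..g4 score highest in both assays.
toy_genes = [f"g{i}" for i in range(20)]
toy_binding = {g: 20 - i for i, g in enumerate(toy_genes)}
toy_response = {g: (100 - i if i < 5 else i) for i, g in enumerate(toy_genes)}

best = dual_threshold_search(toy_binding, toy_response, sizes=[2, 5, 10])
chance = expected_chance_overlap(best[1], best[2], len(toy_genes))
print(best, chance)
```

In this toy case the search settles on the top 5 genes in each data set, where all 5 overlap although only 1.25 would be expected by chance. Searching the two cutoffs jointly, rather than fixing each one independently, is the point of the sketch; multiple-testing correction over the threshold grid is deliberately left out.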
Project description:We measured transcription factor binding and expression of designed synthetic promoter libraries using Calling Cards Reporter Arrays (CCRAs). In this study, we show that CCRAs can make quantitative measurements for many TFs in yeast. We then demonstrate the quantitative analysis of cooperative interactions by measuring Cbf1p binding at synthetic promoters with multiple sites. Finally, we characterize the binding and expression of a group of TFs, Tye7p, Gcr1p, and Gcr2p, that act together as a “TF collective”, an important but poorly characterized model of TF cooperativity. We demonstrate that Tye7p often binds promoters without its recognition site because it is recruited by other collective members, whereas these other members require their recognition sites, suggesting a hierarchy where these factors recruit Tye7p but not vice versa. Our experiments establish CCRA as a useful tool for quantitative investigations into TF binding and function.
Project description:Pseudohyphal growth is a developmental pathway seen in some strains of yeast, in which cells elongate and form multicellular filaments in response to environmental stresses. Regulation of this process is complex and involves multiple signaling pathways and a large number of transcription factors (TFs). We used multiplexed transposon “Calling Cards” to simultaneously record the genome-wide binding patterns of 28 TFs in nitrogen-starved yeast. We were able to identify TF targets relevant for pseudohyphal growth, producing a detailed map of the regulatory network that governs this process. Using tools from graph theory, we identified 8 transcription factors that lie at the center of this network, including Flo8, Mss11, and Mfg1, which bind as a complex. Surprisingly, the DNA binding preferences for these key transcription factors were not known. Using Calling Card data, we predicted the in vivo DNA binding motif for the Flo8-Mss11-Mfg1 complex and validated it using a reporter assay. We found that this complex binds several important targets, including FLO11, at both their promoter and termination sequences. We demonstrated that this binding pattern is the result of DNA-looping, which regulates the transcription of these targets and is stabilized by an interaction with the nuclear pore complex. This looping provides yeast cells with a transcriptional memory, enabling them to more rapidly execute the filamentous growth program when nitrogen-starved if they have been previously exposed to this condition. These data contain mapped insertion sites from Ty5 transposon "Calling Cards" in Σ1278b yeast sequenced using an Illumina Hiseq. Other sequencing data present are RNA-seq reads.
Project description:Pseudohyphal growth is a developmental pathway seen in some strains of yeast, in which cells elongate and form multicellular filaments in response to environmental stresses. Regulation of this process is complex and involves multiple signaling pathways and a large number of transcription factors (TFs). We used multiplexed transposon “Calling Cards” to simultaneously record the genome-wide binding patterns of 28 TFs in nitrogen-starved yeast. We were able to identify TF targets relevant for pseudohyphal growth, producing a detailed map of the regulatory network that governs this process. Using tools from graph theory, we identified 8 transcription factors that lie at the center of this network, including Flo8, Mss11, and Mfg1, which bind as a complex. Surprisingly, the DNA binding preferences for these key transcription factors were not known. Using Calling Card data, we predicted the in vivo DNA binding motif for the Flo8-Mss11-Mfg1 complex and validated it using a reporter assay. We found that this complex binds several important targets, including FLO11, at both their promoter and termination sequences. We demonstrated that this binding pattern is the result of DNA-looping, which regulates the transcription of these targets and is stabilized by an interaction with the nuclear pore complex. This looping provides yeast cells with a transcriptional memory, enabling them to more rapidly execute the filamentous growth program when nitrogen-starved if they have been previously exposed to this condition.
Project description:Transcription factor (TF)-mediated gene regulation is critical to cellular development and function, an understanding of which has been promoted by the advent of genome-wide TF profiling methodologies. While traditional TF profiling methods in a variety of cell lines have made clear that TF binding is highly diverse amongst cell types, these techniques become challenging to interpret in vivo in complex tissues, such as the brain, which are composed of numerous, distinct cell types. Here we present FLEX calling cards, a virally-mediated system for genome-wide, longitudinal, cell type-specific recording of TF occupancy. We generated cell type-specific TF occupancy profiles in multiple cell types of the mouse brain and demonstrated the ability of this system to record and integrate historical TF binding events across time. FLEX calling cards is now ready for adaptation to any AAV-tractable animal model to investigate cell type-specific, TF-mediated gene regulation in vivo.
Project description:Genome-wide identification of transcription factor (TF) binding sites is pivotal to our understanding of gene expression regulation. Although much progress has been made in the determination of potential binding regions of proteins by chromatin immunoprecipitation (ChIP), this method has some inherent limitations regarding DNA enrichment efficiency and antibody necessity. Here, we report an alternative strategy for assaying in vivo TF-DNA binding in Arabidopsis thaliana cells by tandem chromatin affinity purification (TChAP). Evaluation of TChAP using the E2Fa TF and comparison with traditional ChIP and single chromatin affinity purification illustrates the suitability of TChAP and provides a resource for exploring the E2Fa transcriptional network. Integration with transcriptome, cis-regulatory element, functional enrichment, and co-expression network analyses demonstrates the quality of the E2Fa TChAP-seq data and validates the identification of new direct E2Fa targets. TChAP enhances both TF target mapping throughput, by circumventing issues related to antibody availability, and output, by improving DNA enrichment efficiency. Illumina Seq analysis of E2Fa bound DNA elements isolated using different chromatin isolation methods. BioProject PRJNA172013; SRA study ID SRP014713
Project description:Mammalian transcriptomes display complex circadian rhythms with multiple phases of gene expression that cannot be accounted for by current models of the molecular clock. We have determined the underlying mechanisms by measuring nascent RNA transcription around the clock in mouse liver. Unbiased examination of eRNAs that cluster in specific circadian phases identified functional enhancers driven by distinct transcription factors (TFs). We further identify on a global scale the components of the TF cistromes that function to orchestrate circadian gene expression. Integrated genomic analyses also revealed novel mechanisms by which a single circadian factor controls opposing transcriptional phases. These findings shed new light on the diversity and specificity of TF function in the generation of multiple phases of circadian gene transcription in a mammalian organ. The goal of this experiment was to determine direct targets of Rev-erbα in mouse liver. All samples were collected at ZT10, when Rev-erbα protein levels and genomic binding are maximal. All mice were housed and harvested together (n=5 per genotype). All mice were male, 10-12 weeks old, on a C57Bl/6 background. RNA was extracted, processed, and hybridized from each mouse liver individually (each sample represents a single mouse).
Project description:Mammalian transcriptomes display complex circadian rhythms with multiple phases of gene expression that cannot be accounted for by current models of the molecular clock. We have determined the underlying mechanisms by measuring nascent RNA transcription around the clock in mouse liver. Unbiased examination of eRNAs that cluster in specific circadian phases identified functional enhancers driven by distinct transcription factors (TFs). We further identify on a global scale the components of the TF cistromes that function to orchestrate circadian gene expression. Integrated genomic analyses also revealed novel mechanisms by which a single circadian factor controls opposing transcriptional phases. These findings shed new light on the diversity and specificity of TF function in the generation of multiple phases of circadian gene transcription in a mammalian organ. The goal of this experiment was to determine direct targets of Rev-erbα in mouse liver. All samples were collected at ZT10, when Rev-erbα protein levels and genomic binding are maximal.
Project description:To determine the complement of Ume6-dependent genes expressed during mitosis and/or meiosis in budding yeast we compared wild-type and ume6 deletion strains using Yeast 2.0 high density oligonucleotide microarrays (GeneChips). Samples were analysed from cells growing in rich medium with fermentable (glucose) and non-fermentable (acetate) carbon sources and from cells undergoing meiosis and spore formation in sporulation medium. Expression data were combined with data from a genome-wide Ume6 DNA binding assay and Ume6-target site prediction to identify the most likely direct target genes of Ume6.
Project description:A core task in understanding the consequences of non-coding single nucleotide polymorphisms (SNPs) is to identify genotype-specific transcription factor (TF) binding. Here, we generate a large-scale TF-SNP interaction map for a selection of 116 colorectal cancer (CRC) risk loci and validate TF binding to 10 putatively functional SNPs. Our data further reveal TF binding complexity adjacent to the 116 risk loci, adding a layer of understanding to regulatory networks associated with CRC-relevant loci.
Project description:Genome control is exerted by transcription factors (TFs) that regulate their target genes by binding to promoters and enhancers. Conceptually, the interactions between TFs, their binding sites, and their functional targets are represented by gene regulatory networks (GRNs). Deciphering in vivo GRNs underlying organ development in an unbiased genome-wide setting involves identifying both functional TF-gene interactions and physical TF-DNA interactions. To reverse-engineer the GRN of eye development in Drosophila, we performed RNA-seq across 72 genetic perturbations and sorted cell types, and inferred a co-expression network. Next, we derived direct TF-DNA interactions using computational motif inference, ultimately connecting 241 TFs to 5632 direct target genes through 24926 enhancers. Using this network we found network motifs, cis-regulatory codes, and new regulators of eye development. We validate the predicted target regions of Grainyhead by ChIP-seq and identify this factor as a general co-factor in the eye network, being bound to thousands of nucleosome-free regions.