Project description:Genomic regions flanking E-box sites influence DNA binding specificity of bHLH transcription factors through DNA shape (different concentrations)
Project description:DNA sequence is a major determinant of the binding specificity of transcription factors (TFs) for their genomic targets. However, eukaryotic cells often express, at the same time, TFs with highly similar DNA binding motifs but distinct in vivo targets. Currently, it is not well understood how TFs with seemingly identical DNA motifs achieve unique specificities in vivo. Here, we used custom protein binding microarrays to analyze TF specificity for putative binding sites in their genomic sequence context. Using yeast TFs Cbf1 and Tye7 as our case study, we found that binding sites of these bHLH TFs (i.e., E-boxes) are bound differently in vitro and in vivo, depending on their genomic context. Computational analyses suggest that nucleotides outside E-box binding sites contribute to specificity by influencing the 3D structure of DNA binding sites. Thus, local shape of target sites might play a widespread role in achieving regulatory specificity within TF families. Three protein binding microarray (PBM) experiments of Saccharomyces cerevisiae transcription factors were performed. Briefly, the PBMs involved binding GST-tagged yeast transcription factors Cbf1 and Tye7 to double-stranded 44K Agilent microarrays in order to determine their binding specificity for putative DNA binding sites in native genomic context. The arrays represent three categories of 30-bp genomic sequences: 1) ChIP-chip bound probes, 2) ChIP-chip unbound probes, and 3) negative control probes. ChIP-chip bound probes corresponded to genomic regions that were bound in vivo by Cbf1 or Tye7 (ChIP-chip P < 0.005 in rich medium (YPD) (Harbison et al., Nature 2004, PMID 15343339)) and contained at least two consecutive 8-mers with universal PBM E-score > 0.35 (Zhu et al., Genome Research 2009, PMID 19158363). All putative binding sites occurred at the same position within the probes on the array.
"ChIP-chip unbound" probes corresponded to genomic regions with ChIP-chip P > 0.5 and at least two consecutive 8-mers at a more stringent universal PBM E-score threshold of 0.4. Negative control probes corresponded to S. cerevisiae intergenic regions with a maximum 8-mer E-score < 0.3. We also designed probes that contain, within constant flanking regions, all 10-bp sequences that occur within the "ChIP-chip bound" probes and contain the E-box CACGTG, but are flanked by synthetic rather than native genomic sequence. Each DNA sequence represented on the array is present in 4 replicate spots. We report the PBM signal intensity for each spot. The PBM protocol is described in Berger et al., Nature Biotechnology 2006 (PMID 16998473).
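The probe categories described above can be sketched as a small classifier. This is only an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the E-scores are assumed to be given in order for the overlapping 8-mers along a probe, and the intergenic-region requirement for negative controls is not checked.

```python
def has_consecutive_hits(escores, threshold, run_length=2):
    """Return True if at least `run_length` consecutive 8-mers exceed `threshold`."""
    run = 0
    for score in escores:
        run = run + 1 if score > threshold else 0
        if run >= run_length:
            return True
    return False


def classify_probe(chip_p, escores):
    """Assign a probe category from its ChIP-chip P-value and 8-mer E-scores.

    Thresholds are taken from the description; everything else is assumed.
    Returns None when the probe fits none of the three categories.
    """
    if chip_p < 0.005 and has_consecutive_hits(escores, 0.35):
        return "bound"
    if chip_p > 0.5 and has_consecutive_hits(escores, 0.40):
        return "unbound"
    # Negative controls must also be intergenic; that is not modeled here.
    if max(escores) < 0.3:
        return "negative_control"
    return None
```

For example, `classify_probe(0.001, [0.10, 0.36, 0.40, 0.20])` falls in the bound category, whereas the same E-scores with a ChIP-chip P of 0.8 fit no category, because the unbound class uses the stricter 0.4 threshold.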
Project description:DNA sequence is a major determinant of the binding specificity of transcription factors (TFs) for their genomic targets. However, eukaryotic cells often express, at the same time, TFs with highly similar DNA binding motifs but distinct in vivo targets. Currently, it is not well understood how TFs with seemingly identical DNA motifs achieve unique specificities in vivo. Here, we used custom protein binding microarrays to analyze TF specificity for putative binding sites in their genomic sequence context. Using yeast TFs Cbf1 and Tye7 as our case study, we found that binding sites of these bHLH TFs (i.e., E-boxes) are bound differently in vitro and in vivo, depending on their genomic context. Computational analyses suggest that nucleotides outside E-box binding sites contribute to specificity by influencing the 3D structure of DNA binding sites. Thus, local shape of target sites might play a widespread role in achieving regulatory specificity within TF families.
Project description:DNA sequence is a major determinant of the binding specificity of transcription factors (TFs) for their genomic targets. However, eukaryotic cells often express, at the same time, TFs with highly similar DNA binding motifs but distinct in vivo targets. Currently, it is not well understood how TFs with seemingly identical DNA motifs achieve unique specificities in vivo. Here, we used custom protein binding microarrays to analyze TF specificity for putative binding sites in their genomic sequence context. Using yeast TFs Cbf1 and Tye7 as our case study, we found that binding sites of these bHLH TFs (i.e., E-boxes) are bound differently in vitro and in vivo, depending on their genomic context. Computational analyses suggest that nucleotides outside E-box binding sites contribute to specificity by influencing the 3D structure of DNA binding sites. Thus, local shape of target sites might play a widespread role in achieving regulatory specificity within TF families. Two protein binding microarray (PBM) experiments of Saccharomyces cerevisiae transcription factors were performed. Briefly, the PBMs involved binding GST-tagged yeast transcription factors Cbf1 and Tye7 to double-stranded 44K Agilent microarrays in order to determine the accuracy of our regression models for TF-DNA binding specificity. This array contains 30-bp genomic sequences from our initial custom array (Gordan et al. 2013, submitted), with 0 through 4 mutations designed at various positions in the genomic sequences. Each sequence is represented in 6 replicate spots. We report the PBM signal intensity for each spot. The PBM protocol is described in Berger et al., Nature Biotechnology 2006 (PMID 16998473).
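The description does not say which positions or substitutions were chosen for the mutated probes, so the following is only a hedged sketch of how a variant carrying 0 through 4 point mutations could be derived from a genomic sequence. The helper names and the example sequence are hypothetical.

```python
def apply_mutations(seq, substitutions):
    """Return `seq` with the given {position: new_base} substitutions applied.

    At most 4 substitutions are allowed, mirroring the 0-4 mutations in the
    description; each one must actually change the base at its position.
    """
    if len(substitutions) > 4:
        raise ValueError("at most 4 mutations per probe")
    bases = list(seq)
    for pos, base in substitutions.items():
        if base not in "ACGT":
            raise ValueError(f"invalid base: {base!r}")
        if bases[pos] == base:
            raise ValueError(f"position {pos} already holds {base}")
        bases[pos] = base
    return "".join(bases)


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


wild_type = "ACGTCACGTGACGT"  # hypothetical sequence, not taken from the array
variant = apply_mutations(wild_type, {0: "T", 13: "A"})
```

Checking `hamming(wild_type, variant)` against the intended mutation count is a cheap way to confirm that a designed variant differs from its parent at exactly the planned number of positions.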
Project description:MyoD and NeuroD2 are master regulators of myogenesis and neurogenesis and bind to a "shared" E-box sequence (CAGCTG) and a "private" sequence (CAGGTG or CAGATG, respectively). To determine whether private-site recognition is sufficient to confer lineage-specification, we generated a MyoD-mutant with the DNA binding specificity of NeuroD2. Our results demonstrate that redirecting MyoD binding from MyoD-private sites to NeuroD2-private sites, despite preserved binding to the MyoD/NeuroD2-shared sites, is sufficient to change MyoD from a master regulator of myogenesis to a master regulator of neurogenesis. RNA-seq profiling of mouse P19 cells transfected with MyoD, NeuroD2 and chimera mutants. The chimeric mutants are MyoD with the bHLH domain replaced with the NeuroD2 bHLH domain.
Project description:MyoD and NeuroD2 are master regulators of myogenesis and neurogenesis and bind to a "shared" E-box sequence (CAGCTG) and a "private" sequence (CAGGTG or CAGATG, respectively). To determine whether private-site recognition is sufficient to confer lineage-specification, we generated a MyoD-mutant with the DNA binding specificity of NeuroD2. Our results demonstrate that redirecting MyoD binding from MyoD-private sites to NeuroD2-private sites, despite preserved binding to the MyoD/NeuroD2-shared sites, is sufficient to change MyoD from a master regulator of myogenesis to a master regulator of neurogenesis. ChIP-seq profiling of MyoD, NeuroD2 and chimera mutants in mouse P19 cells transfected with these genes. The chimeric mutants are MyoD with the bHLH domain replaced with the NeuroD2 bHLH domain.
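The shared and private E-box classes named above can be illustrated with a short scan of a DNA sequence. This is a minimal sketch: the function names and class labels are hypothetical, and each private hexamer is matched on either strand by also accepting its reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hexamers from the description: CAGCTG is shared, CAGGTG is MyoD-private,
# CAGATG is NeuroD2-private.
EBOX_CLASSES = {
    "CAGCTG": "shared",
    "CAGGTG": "MyoD-private",
    "CAGATG": "NeuroD2-private",
}


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def classify_ebox(hexamer):
    """Classify a hexamer; returns None if it is not an E-box (CANNTG)."""
    if len(hexamer) != 6 or hexamer[:2] != "CA" or hexamer[4:] != "TG":
        return None
    for site in (hexamer, reverse_complement(hexamer)):
        if site in EBOX_CLASSES:
            return EBOX_CLASSES[site]
    return "other"


def find_eboxes(seq):
    """List (position, hexamer, class) for every E-box in `seq`."""
    hits = []
    for i in range(len(seq) - 5):
        label = classify_ebox(seq[i:i + 6])
        if label is not None:
            hits.append((i, seq[i:i + 6], label))
    return hits
```

Because the CANNTG pattern is its own reverse complement, scanning one strand finds the sites on both strands; only the central dinucleotide has to be compared in both orientations.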
Project description:Accurate predictions of the DNA binding specificities of transcription factors (TFs) are necessary for understanding gene regulatory mechanisms. Traditionally, predictive models are built based on nucleotide sequence features. Here, we employed three-dimensional DNA shape information obtained on a high-throughput basis to integrate intuitive DNA structural features into the modeling of TF binding specificities using support vector regression. We performed quantitative predictions of DNA binding specificities, using the DREAM5 dataset for 65 mouse TFs and genomic-context protein binding microarray data for three human basic helix-loop-helix TFs. DNA shape-augmented models compared favorably with sequence-based models for these predictions. Although both k-mer and DNA shape features encoded the interdependencies between nucleotide positions of the binding site, using DNA shape features reduced the dimensionality of the feature space compared to k-mer use. Finally, analyzing the weights of DNA shape-augmented models uncovered TF family-specific structural readout mechanisms that were not obvious from the nucleotide sequence. Three genomic-context protein binding microarray (gcPBM) experiments of human transcription factors were performed. Briefly, the gcPBMs involved binding His-tagged transcription factors c-Myc, Max, and Mad1 (Mxd1) to double-stranded 180K Agilent microarrays in order to determine their binding specificity for putative DNA binding sites in native genomic context. The arrays represent three categories of 36-bp sequences: 1) bound probes, 2) unbound probes (or negative controls), and 3) test probes. Bound probes corresponded to genomic regions bound in vivo by c-Myc, Max, or Mad2 (ChIP-seq P < 10^(-10) in HeLaS3 or K562 cells (ENCODE)) that contain at least two consecutive 8-mers with universal PBM E-score > 0.4 (Munteanu and Gordan, LNCS 2013). All putative binding sites occur at the same position within the probes on the array.
"Unbound" probes corresponded to genomic regions with ChIP-seq P < 10^(-10) and a maximum 8-mer E-score < 0.2. We also designed test probes that contain, within constant flanking regions, all nnCACGTGnn 10-mers and 18 nnnCACGTGnnn 12-mers (where n = A, C, G, or T). Each DNA sequence represented on the array is present in 6 replicate spots. We report the gcPBM signal intensity for each spot. The PBM protocol is described in Berger et al., Nature Biotechnology 2006 (PMID 16998473).
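To make the size of the test-probe set concrete, the nnCACGTGnn 10-mers can be enumerated directly: two free positions on each side of the core give 4^4 = 256 sequences. The sketch below is illustrative only; the function name is hypothetical, and the constant flanking regions and the choice of the 18 12-mers are not given in the description, so they are not modeled.

```python
from itertools import product

CORE = "CACGTG"  # E-box core named in the description


def flanked_cores(flank_len):
    """All sequences with `flank_len` free bases on each side of CORE."""
    flanks = ["".join(p) for p in product("ACGT", repeat=flank_len)]
    return [left + CORE + right for left in flanks for right in flanks]


ten_mers = flanked_cores(2)  # every nnCACGTGnn 10-mer
print(len(ten_mers))         # 256
```

Calling `flanked_cores(3)` would give all 4096 nnnCACGTGnnn 12-mers, of which the array carries only 18.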
Project description:Basic helix-loop-helix (bHLH) proneural transcription factors (TFs) Ascl1 and Neurog2 are integral to the development of the nervous system. Here, we investigated the molecular mechanisms by which Ascl1 and Neurog2 control the acquisition of generic neuronal fate and impose neuronal subtype identity. Using direct neuronal programming of embryonic stem cells, we found that Ascl1 and Neurog2 regulate distinct targets by binding to largely different sets of sites. Their divergent binding pattern is not determined by the previous chromatin state but distinguished by specific E-box enrichments which reflect the DNA sequence preference of the bHLH domain. The divergent Ascl1 and Neurog2 binding patterns result in distinct chromatin accessibility and enhancer activity landscapes that shape the binding and activity of downstream TFs during neuronal specification. Our findings suggest that proneural factors contribute to neuronal diversity by differentially altering the chromatin landscapes that shape the binding of neuronally expressed TFs.