Project description:Sea stars and sea urchins are model systems for interrogating the types of deep evolutionary changes that have restructured developmental gene regulatory networks (GRNs). Although cis-regulatory DNA evolution is likely the predominant mechanism of change, it was recently shown that Tbrain, a Tbox transcription factor protein, has evolved a changed preference for a low-affinity, secondary binding motif. The primary, high-affinity motif is conserved. To date, however, no genome-wide comparisons have been performed to provide an unbiased assessment of the evolution of GRNs between these taxa, and no study has attempted to determine the interplay between transcription factor binding motif evolution and GRN topology. The study here measures genome-wide binding of Tbrain orthologs by using ChIP-sequencing and associates these orthologs with putative target genes to assess global function. Targets of both factors are enriched for other regulatory genes, although nonoverlapping sets of functional enrichments in the two datasets suggest a much diverged function. The number of low-affinity binding motifs is significantly depressed in sea urchins compared with sea stars, but both motif types are associated with genes from a range of functional categories. Only a small fraction (∼10%) of genes are predicted to be orthologous targets. Collectively, these data indicate that Tbr has evolved significantly different developmental roles in these echinoderms and that the targets and the binding motifs in associated cis-regulatory sequences are dispersed throughout the hierarchy of the GRN, rather than being biased toward terminal processes or discrete functional blocks, which suggests extensive evolutionary tinkering.
Project description:Eukaryotic transcription factors (TFs) from the same structural family tend to bind similar DNA sequences, despite the ability of these TFs to execute distinct functions in vivo. The cell partly resolves this specificity paradox through combinatorial strategies and the use of low-affinity binding sites, which are better able to distinguish between similar TFs. However, because these sites have low affinity, it is challenging to understand how TFs recognize them in vivo. Here, we summarize recent findings and technological advancements that allow for the quantification and mechanistic interpretation of TF recognition across a wide range of affinities. We propose a model that integrates insights from the fields of genetics and cell biology to provide further conceptual understanding of TF binding specificity. We argue that in eukaryotes, target specificity is driven by an inhomogeneous 3D nuclear distribution of TFs and by variation in DNA binding affinity such that locally elevated TF concentration allows low-affinity binding sites to be functional.
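The concentration argument in the preceding description can be illustrated with a minimal sketch. It assumes a simple single-site equilibrium binding model, occupancy = [TF] / ([TF] + Kd), which is a textbook simplification and not the model proposed by the authors. The function name, the Kd values, and the concentrations below are all hypothetical and chosen only for illustration.

```python
def occupancy(tf_conc_nm, kd_nm):
    """Fraction of time a single site is bound at equilibrium.

    Assumes non-cooperative, single-site binding (an illustrative
    simplification, not the authors' model).
    """
    return tf_conc_nm / (tf_conc_nm + kd_nm)


# Hypothetical dissociation constants (nM) for two kinds of site.
KD_HIGH_AFFINITY = 1.0
KD_LOW_AFFINITY = 100.0

# Hypothetical TF concentrations (nM): nuclear average vs. a local hub.
AVERAGE_CONC = 10.0
LOCAL_HUB_CONC = 1000.0

for label, conc in [("average", AVERAGE_CONC), ("local hub", LOCAL_HUB_CONC)]:
    high = occupancy(conc, KD_HIGH_AFFINITY)
    low = occupancy(conc, KD_LOW_AFFINITY)
    print(f"{label:>9}: high-affinity {high:.2f}, low-affinity {low:.2f}")
```

Under these assumed numbers, the low-affinity site is mostly unoccupied at the average concentration but mostly occupied inside the locally enriched region, while the high-affinity site is largely occupied in both cases.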
Project description:Sequence-specific binding by transcription factors (TFs) interprets regulatory information encoded in the genome. Using recently published universal protein binding microarray (PBM) data on the in vitro DNA binding preferences of these proteins for all possible 8-base-pair sequences, we examined the evolutionary conservation and enrichment within putative regulatory regions of the binding sequences of a diverse library of 104 nonredundant mouse TFs spanning 22 different DNA-binding domain structural classes. We found that not only high affinity binding sites, but also numerous moderate and low affinity binding sites, are under negative selection in the mouse genome. These 8-mers occur preferentially in putative regulatory regions of the mouse genome, including CpG islands and non-exonic ultraconserved elements (UCEs). Of TFs whose PBM "bound" 8-mers are enriched within sets of tissue-specific UCEs, many are expressed in the same tissue(s) as the UCE-driven gene expression. Phylogenetically conserved motif occurrences of various TFs were also enriched in the noncoding sequence surrounding numerous gene sets corresponding to Gene Ontology categories and tissue-specific gene expression clusters, suggesting involvement in transcriptional regulation of those genes. Altogether, our results indicate that many of the sequences bound by these proteins in vitro, including lower affinity DNA sequences, are likely to be functionally important in vivo. This study not only provides an initial analysis of the potential regulatory associations of 104 mouse TFs, but also presents an approach for the functional analysis of TFs from any other metazoan genome as their DNA binding preferences are determined by PBMs or other technologies.
Project description:Growth hormone regulates its biological properties via a sequential hormone-induced receptor homodimerization mechanism. Using a mutagenesis-scanning analysis of 81 single and 32 pairwise double mutations, we show that the hormone's two spatially distal receptor binding sites (Site1 and Site2) are allosterically coupled. These allosteric effects are focused among relatively few residues centered around the interaction between Asp-116 of the hormone and Trp-169 of the receptor in Site2. A rearrangement of this interaction triggered by mutations in Site1 produces both a major conformational and energetic reorganization of Site2, surprisingly without a reduction in overall binding affinity. Additionally, the data suggest a change in the conformational dynamics of several groups in Site2 that appear to be important in defining the Site2 interaction. Changes in binding energy of the affected Site2 residues usually range in magnitude from 3- to 60-fold, but in one case are as large as 10^4.
Project description:Background: High-throughput in vivo protein-DNA interaction experiments are currently widely used in gene regulation studies. Hitherto, comprehensive data analysis remains a challenge and for that reason most computational methods only consider the top few hundred or thousand strongest protein binding sites whereas weak protein binding sites are completely ignored. Results: A new biophysical model of protein-DNA interactions, BayesPI2+, was developed to address the above-mentioned challenges. BayesPI2+ can be run in either a serial computation model or a parallel ensemble learning framework. BayesPI2+ allowed us to analyze all binding sites of the transcription factors, including weak binding that cannot be analyzed by other models. It is evaluated in both synthetic and real in vivo protein-DNA binding experiments. Analysing ESR1 and SPIB in breast carcinoma and activated B cell-like diffuse large B-cell lymphoma cell lines, respectively, revealed that the concerted binding to high and low affinity sites correlates best with gene expression. Conclusions: BayesPI2+ allows us to analyze transcription factor binding on a larger scale than hitherto achieved. By this analysis, we were able to demonstrate that genes are regulated by concerted binding to high and low affinity binding sites. The program and output results are publicly available at: http://folk.uio.no/junbaiw/BayesPI2Plus.
Project description:Position weight matrix (PWM) is the traditional motif model representing transcription factor (TF) binding sites. It assumes that positions contribute independently to TF binding affinity, although this hypothesis does not fit the data perfectly. This explains why PWM hits are missing in a substantial fraction of ChIP-seq peaks. To study various modes of the direct binding of plant TFs, we compiled a benchmark collection of 111 ChIP-seq datasets for Arabidopsis thaliana and applied the traditional PWM and two alternative motif models, BaMM and SiteGA, which account for dependencies between positions. Varying the stringency of the recognition thresholds suggested that the hits of the PWM, BaMM, and SiteGA models are associated with sites of high/medium, any, and low affinity, respectively. At the medium recognition threshold, about 60% of ChIP-seq peaks contain PWM hits consisting of conserved core consensuses, while BaMM and SiteGA provide hits for an additional 15% of peaks in which a weaker core consensus is compensated through intra-motif dependencies. The presence/absence of these dependencies in the motifs of alternative/traditional models was confirmed by the dependency logo DepLogo, which visualizes the position-wise partitioning of the alignments of predicted sites. We exemplify the detailed analysis of ChIP-seq profiles for the plant TFs CCA1, MYC2, and SEP3. Gene ontology (GO) enrichment analysis revealed that, among the three motif models, SiteGA had the highest proportion of genes with significantly enriched GO terms among all predicted genes. We showed that both alternative motif models extend the sites predicted by the traditional PWM to a greater degree for the TFs MYC2/SEP3, which have condition/tissue-specific functions, than for the TF CCA1, which has housekeeping functions. Overall, the combined application of standard and alternative motif models is beneficial for detecting various modes of direct TF-DNA interactions in the maximal portion of ChIP-seq loci.
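The position-independence assumption of the PWM model described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the count matrix, pseudocount, uniform background, threshold, and all function names are hypothetical. Each position contributes its own log-odds weight, and a site's score is simply the sum of those weights, so no dependency between positions is modeled.

```python
import math

BASES = "ACGT"


def counts_to_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds weights."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2((column[b] + pseudocount) / total / background)
            for b in BASES
        })
    return pwm


def score_site(pwm, site):
    """Sum the independent per-position weights for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, site))


def scan(pwm, sequence, threshold):
    """Return (offset, site, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        score = score_site(pwm, window)
        if score >= threshold:
            hits.append((i, window, score))
    return hits


# Hypothetical 4-bp motif with consensus TGCA (toy counts, not real data).
TOY_COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]

toy_pwm = counts_to_pwm(TOY_COUNTS)
toy_hits = scan(toy_pwm, "AATGCATT", threshold=5.0)
print(toy_hits)
```

Lowering the threshold in such a sketch admits weaker sites with mismatches to the core consensus, which is the stringency trade-off the description refers to. Models such as BaMM and SiteGA go further by modeling dependencies between positions, which this sum cannot capture.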
Project description:Myosin VI is involved in many cellular processes ranging from endocytosis to transcription. This multifunctional potential is achieved through alternative isoform splicing and through interactions of myosin VI with a diverse network of binding partners. However, the interplay between these two modes of regulation remains unexplored. To this end, we compared two different binding partners and their interactions with myosin VI by exploring the kinetic properties of recombinant proteins and their distribution in mammalian cells using fluorescence imaging. We found that selectivity for these binding partners is achieved through a high-affinity motif and a low-affinity motif within myosin VI. These two motifs allow competition among partners for myosin VI. Exploring how this competition affects the activity of nuclear myosin VI, we demonstrate the impact of a concentration-driven interaction with the low-affinity binding partner DAB2, finding that this interaction blocks the ability of nuclear myosin VI to bind DNA and its transcriptional activity in vitro. We conclude that loss of DAB2, a tumor suppressor, may enhance myosin VI-mediated transcription. We propose that the frequent loss of specific myosin VI partner proteins during the onset of cancer leads to a higher level of nuclear myosin VI activity.
Project description:This SuperSeries is composed of the SubSeries listed below. CTCF is a DNA-binding protein which plays critical roles in chromatin structure organization and transcriptional regulation; however, little is known about the functional determinants of different CTCF binding sites (CBS). Using a conditional mouse model, we have identified one set of CBSs that are lost upon CTCF depletion (lost CBSs) and another set that persists (retained CBSs). Retained CBSs are more similar to the consensus CTCF binding sequence and usually span tandem CTCF peaks. Lost CBSs are enriched at enhancers and promoters and associate with active chromatin marks and higher transcriptional activity. In contrast, retained CBSs are enriched at TAD and loop boundaries. Integration of ChIP-seq and RNA-seq data has revealed that retained CBSs are located at the boundaries between distinct chromatin states, acting as chromatin barriers. Our results provide evidence that transient, lost CBSs are involved in transcriptional regulation, whereas retained CBSs are critical for establishing higher-order chromatin architecture.
Project description:CTCF is a DNA-binding protein which plays critical roles in chromatin structure organization and transcriptional regulation; however, little is known about the functional determinants of different CTCF-binding sites (CBS). Using a conditional mouse model, we have identified one set of CBSs that are lost upon CTCF depletion (lost CBSs) and another set that persists (retained CBSs). Retained CBSs are more similar to the consensus CTCF-binding sequence and usually span tandem CTCF peaks. Lost CBSs are enriched at enhancers and promoters and associate with active chromatin marks and higher transcriptional activity. In contrast, retained CBSs are enriched at TAD and loop boundaries. Integration of ChIP-seq and RNA-seq data has revealed that retained CBSs are located at the boundaries between distinct chromatin states, acting as chromatin barriers. Our results provide evidence that transient, lost CBSs are involved in transcriptional regulation, whereas retained CBSs are critical for establishing higher-order chromatin architecture.