Project description:Plant development is controlled by transcription factors (TFs), which form complex gene-regulatory networks. Genome-wide TF DNA-binding studies have revealed that these TFs have several thousand binding sites (BSs) in the Arabidopsis genome and may directly regulate the expression of many genes. Given the importance of natural variation in plant developmental programs, there is a need to understand the molecular basis of this variation at the level of developmental gene regulation. However, the evolutionary turnover and dynamics of TF BSs among plant species have not yet been experimentally determined. Here, we performed comparative ChIP-seq studies of the MADS-box TF SEPALLATA3 (SEP3) in inflorescences of two Arabidopsis species: A. thaliana and A. lyrata. Comparative RNA-seq analysis shows that the loss/gain of BSs is often followed by a change in gene expression.
Project description:Flower development is controlled by the action of key regulatory transcription factors of the MADS-domain family. The function of these factors appears to be highly conserved among species based on mutant phenotypes. However, the conservation of their downstream processes is much less well understood, mostly because the evolutionary turnover and variation of their DNA-binding sites (BSs) among plant species have not yet been experimentally determined. Here, we performed comparative ChIP (chromatin immunoprecipitation)-seq experiments of the MADS-domain transcription factor SEPALLATA3 (SEP3) in two closely related Arabidopsis species, Arabidopsis thaliana and A. lyrata, which have very similar floral organ morphology. We found that BS conservation is associated with DNA sequence conservation, the presence of the CArG-box BS motif, and the position of the BS relative to its potential target gene. Differences in genome size and structure can explain why SEP3 BSs in A. lyrata can be located farther from their potential target genes than their counterparts in A. thaliana. In A. lyrata, we identified transposition as a mechanism to generate novel SEP3 binding locations in the genome. Comparative gene expression analysis shows that the loss/gain of BSs is associated with a change in gene expression. In summary, this study investigates the evolutionary dynamics of DNA BSs of a floral key-regulatory transcription factor and explores factors affecting this phenomenon.
Project description:Protein-protein interactions (PPIs) have widely acknowledged roles in the regulation of development, but few studies have addressed the timing and mechanism of shifting PPIs over evolutionary history. The B-class MADS-box transcription factors PISTILLATA (PI) and APETALA3 (AP3) are key regulators of floral development. PI-like (PI(L)) and AP3-like (AP3(L)) proteins from a number of plants, including Arabidopsis thaliana (Arabidopsis) and the grass Zea mays (maize), bind DNA as obligate heterodimers. However, a PI(L) protein from the grass relative Joinvillea can bind DNA as a homodimer. To ascertain whether Joinvillea PI(L) homodimerization is an anomaly or indicative of broader trends, we characterized PI(L) dimerization across the Poales and uncovered unexpected evolutionary lability. Both obligate B-class heterodimerization and PI(L) homodimerization have evolved multiple times in the order, by distinct molecular mechanisms. For example, obligate B-class heterodimerization in maize evolved very recently from PI(L) homodimerization. A single amino acid change, fixed during domestication, is sufficient to toggle one maize PI(L) protein between homodimerization and obligate heterodimerization. We detected a signature of positive selection acting on residues preferentially clustered in predicted sites of contact between MADS-box monomers and dimers, and in motifs that mediate MADS PPI specificity in Arabidopsis. Changing one positively selected residue can alter PI(L) dimerization activity. Furthermore, ectopic expression of a Joinvillea PI(L) homodimer in Arabidopsis can homeotically transform sepals into petals. Our results provide a window into the evolutionary remodeling of PPIs, and show that novel interactions have the potential to alter plant form in a context-dependent manner.
Project description:Binding sites in proteins can be either specifically functional binding sites (active sites) that bind specific substrates with high affinity or regulatory binding sites (allosteric sites), that modulate the activity of functional binding sites through effector molecules. Owing to their significance in determining protein function, the identification of protein functional and regulatory binding sites is widely acknowledged as an important biological problem. In this work, we present a novel binding site prediction method, Active and Regulatory site Prediction (AR-Pred), which supplements protein geometry, evolutionary, and physicochemical features with information about protein dynamics to predict putative active and allosteric site residues. As the intrinsic dynamics of globular proteins plays an essential role in controlling binding events, we find it to be an important feature for the identification of protein binding sites. We train and validate our predictive models on multiple balanced training and validation sets with random forest machine learning and obtain an ensemble of discrete models for each prediction type. Our models for active site prediction yield a median area under the curve (AUC) of 91% and Matthews correlation coefficient (MCC) of 0.68, whereas the less well-defined allosteric sites are predicted at a lower level with a median AUC of 80% and MCC of 0.48. When tested on an independent set of proteins, our models for active site prediction show comparable performance to two existing methods and gains compared to two others, while the allosteric site models show gains when tested against three existing prediction methods. AR-Pred is available as a free downloadable package at https://github.com/sambitmishra0628/AR-PRED_source.
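The Matthews correlation coefficient quoted in the entry above can be illustrated with a short sketch. This is not code from AR-Pred; the function name `mcc_from_counts` and the example counts are hypothetical, and returning 0.0 when the denominator vanishes is a common convention assumed here rather than something stated in the description.

```python
import math


def mcc_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives and false negatives for a binary site/non-site label.
    Returns 0.0 when the denominator is zero (assumed convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts on a balanced set of 50 site and 50 non-site residues.
    print(round(mcc_from_counts(tp=40, tn=45, fp=5, fn=10), 3))
```

Unlike accuracy, the coefficient uses all four cells of the confusion matrix, so a model that labels every residue as a non-site scores 0 rather than appearing accurate.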
Project description:MicroRNAs (miRNAs) control the abundance of the majority of the vertebrate transcriptome. The recognition sequences, or target sites, for bilaterian miRNAs are found predominantly in the 3' untranslated regions (3'UTRs) of mRNAs, and are amongst the most highly conserved motifs within 3'UTRs. However, little is known regarding the evolutionary pressures that lead to loss and gain of such target sites. Here, we quantify the selective pressures that act upon miRNA target sites. Notably, selective pressure extends beyond deeply conserved binding sites to those that have undergone recent substitutions. Our approach reveals that even amongst ancient animal miRNAs, which exert the strongest selective pressures on 3'UTR sequences, there are striking differences in patterns of target site evolution between miRNAs. Considering only ancient animal miRNAs, we find three distinct miRNA groups, each exhibiting characteristic rates of target site gain and loss during mammalian evolution. The first group both loses and gains sites rarely. The second group shows selection only against site loss, with site gains occurring at a neutral rate, whereas the third loses and gains sites at neutral or above expected rates. Furthermore, mutations that alter the strength of existing target sites are disfavored. Applying our approach to individual transcripts reveals variation in the distribution of selective pressure across the transcriptome and between miRNAs, ranging from strong selection acting on a small subset of targets of some miRNAs, to weak selection on many targets for other miRNAs. miR-20 and miR-30, and many other miRNAs, exhibit broad, deeply conserved targeting, while several other comparably ancient miRNAs show a lack of selective constraint, and a small number, including mir-146, exhibit evidence of rapidly evolving target sites. Our approach adds valuable perspective on the evolution of miRNAs and their targets, and can also be applied to characterize other 3'UTR regulatory motifs.
Project description:Tandem repeats of DNA that contain transcription factor (TF) binding sites could serve as decoys, competitively binding to TFs and affecting target gene expression. Using a synthetic system in budding yeast, we demonstrate that repeated decoy sites inhibit gene expression by sequestering a transcriptional activator and converting the graded dose-response of target promoters to a sharper, sigmoidal-like response. On the basis of both modeling and chromatin immunoprecipitation measurements, we attribute the altered response to TF binding decoy sites more tightly than promoter binding sites. Tight TF binding to arrays of contiguous repeated decoy sites only occurs when the arrays are mostly unoccupied. Finally, we show that the altered sigmoidal-like response can convert the graded response of a transcriptional positive-feedback loop to a bimodal response. Together, these results show how changing numbers of repeated TF binding sites lead to qualitative changes in behavior and raise new questions about the stability of TF/promoter binding.
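The sequestration effect described in the entry above can be sketched with a textbook equilibrium titration model. This is an illustrative assumption, not the model used in the study: decoys are treated as independent identical sites with one dissociation constant, TF bound at the promoter is neglected, and all names (`free_tf`, `promoter_output`, `kd_decoy`, `kd_promoter`) and parameter values are hypothetical. In particular, it does not capture the occupancy-dependent binding to contiguous arrays that the description reports.

```python
import math


def free_tf(total_tf, n_decoys, kd_decoy):
    """Free TF concentration at equilibrium with n_decoys identical decoy sites.

    Solves total = free + n_decoys * free / (kd_decoy + free) for free,
    which reduces to a quadratic with one non-negative root.
    """
    b = total_tf - n_decoys - kd_decoy
    return (b + math.sqrt(b * b + 4.0 * kd_decoy * total_tf)) / 2.0


def promoter_output(total_tf, n_decoys, kd_decoy, kd_promoter):
    """Fractional promoter occupancy driven by whatever TF the decoys leave free."""
    f = free_tf(total_tf, n_decoys, kd_decoy)
    return f / (kd_promoter + f)


if __name__ == "__main__":
    # Hypothetical, unitless values: tight decoys (kd 0.1), weaker promoter (kd 1.0).
    for total in (5, 10, 20, 30, 40):
        no_decoy = promoter_output(total, 0, 0.1, 1.0)
        with_decoy = promoter_output(total, 20, 0.1, 1.0)
        print(total, round(no_decoy, 3), round(with_decoy, 3))
```

In this sketch, tight decoys hold promoter output near zero until the total TF level exceeds the number of decoy sites, after which output rises steeply; that threshold is one simple way a graded response can become sigmoidal-like.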