Project description:Decoding post-transcriptional regulatory programs underlying gene expression is a crucial step toward a predictive dynamical understanding of cellular state transitions. Despite recent systematic efforts, the sequence determinants of such mechanisms remain largely uncharacterized. An important obstacle in revealing these elements stems from the contribution of local secondary structures in defining interaction partners in a variety of regulatory contexts, including but not limited to transcript stability, alternative splicing and localization. There are many documented instances where the presence of a structural regulatory element dictates alternative splicing patterns (e.g. human cardiac troponin T) or affects other aspects of RNA biology. Thus, a full characterization of post-transcriptional regulatory programs requires capturing information provided by both local secondary structures and the underlying sequence. We have developed a computational framework based on context-free grammars and mutual information that systematically explores the immense space of structural elements and reveals motifs that are significantly informative of genome-wide measurements of RNA behavior. The application of this framework to genome-wide mammalian mRNA stability data revealed eight highly significant elements with substantial structural information, for the strongest of which we showed a major role in global mRNA regulation. Through biochemistry, mass-spectrometry, and in vivo binding studies, we identified HNRPA2B1 as the key regulator that binds this element and stabilizes a large number of its target genes. Ultimately, we created a global post-transcriptional regulatory map based on the identity of the discovered linear and structural cis-regulatory elements, their regulatory interactions and their target pathways. This approach can also be employed to reveal the structural elements that modulate other aspects of RNA behavior. 
This SuperSeries is composed of the following subset Series:
GSE35749: sRSM1 synthetic decoy vs. scrambled transfections in MDA-MB-231 cells
GSE35753: HNRPA2B1 RIP-chip
GSE35756: Whole-genome decay rate measurements in MDA-MB-231 cells transfected with HNRPA2B1 siRNAs versus controls
GSE35757: siRNA-mediated HNRPA2B1 knock-down in MDA-MB-231 cells
GSE35799: HNRPA2B1 HITS-CLIP
Refer to individual Series
Project description:Decoding post-transcriptional regulatory programs in RNA is a critical step towards the larger goal of developing predictive dynamical models of cellular behaviour. Despite recent efforts, the vast landscape of RNA regulatory elements remains largely uncharacterized. A long-standing obstacle is the contribution of local RNA secondary structure to the definition of interaction partners in a variety of regulatory contexts, including--but not limited to--transcript stability, alternative splicing and localization. There are many documented instances where the presence of a structural regulatory element dictates alternative splicing patterns (for example, human cardiac troponin T) or affects other aspects of RNA biology. Thus, a full characterization of post-transcriptional regulatory programs requires capturing information provided by both local secondary structures and the underlying sequence. Here we present a computational framework based on context-free grammars and mutual information that systematically explores the immense space of small structural elements and reveals motifs that are significantly informative of genome-wide measurements of RNA behaviour. By applying this framework to genome-wide human mRNA stability data, we reveal eight highly significant elements with substantial structural information, for the strongest of which we show a major role in global mRNA regulation. Through biochemistry, mass spectrometry and in vivo binding studies, we identified human HNRPA2B1 (heterogeneous nuclear ribonucleoprotein A2/B1, also known as HNRNPA2B1) as the key regulator that binds this element and stabilizes a large number of its target genes. We created a global post-transcriptional regulatory map based on the identity of the discovered linear and structural cis-regulatory elements, their regulatory interactions and their target pathways. 
This approach could also be used to reveal the structural elements that modulate other aspects of RNA behaviour.
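The abstract above scores candidate motifs by how informative they are about genome-wide measurements. As a minimal sketch of that idea only, and not the authors' implementation, the following computes the mutual information between a binary motif-presence vector and discretized stability bins. The function name `mutual_information` and the toy vectors are hypothetical; the grammar-based motif search and any significance testing are assumed to happen elsewhere.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences.

    xs: e.g. 1/0 flags saying whether a candidate motif occurs in each transcript
        (hypothetical input; how the motif is matched is outside this sketch).
    ys: e.g. a discretized stability bin for each transcript.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0

    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts of x
    count_y = Counter(ys)         # marginal counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy data: the motif perfectly tracks the stability bin.
motif_present = [1, 1, 0, 0]
stability_bin = [1, 1, 0, 0]
print(mutual_information(motif_present, stability_bin))  # 1.0

# Toy data: the motif is independent of the stability bin.
print(mutual_information([1, 0, 1, 0], [1, 1, 0, 0]))  # 0.0
```

In practice, an observed score like this would typically be compared against scores from shuffled labels to judge significance; that step is omitted here.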
Project description:The specific recognition of splice signals at or near the exon-intron junctions is not explained by their weak conservation across the mammalian transcriptome and is postulated to require a multitude of features embedded in the pre-mRNA strand. We explored the possibility that a three-dimensional structural scaffold of a pre-mRNA guides early spliceosomal components to the splice signal sequences. We find that mutations in non-cognate splice signal sequences of a model pre-mRNA substrate could impede recruitment of early spliceosomal components by disrupting the global structure of the pre-mRNA. We also find that pre-mRNA segments potentially interacting with the early spliceosomal component U1 snRNP are distributed across the intron, that the 5′ and 3′ splice sites are spatially close within the pre-mRNA scaffold, and that the structural scaffold and splicing regulatory elements interact in recruiting early spliceosomal components. These results suggest that early spliceosomal components could recognize a three-dimensional structural scaffold beyond the short splice signal sequences and that, in our model pre-mRNA, this scaffold is formed across the intron and involves the major splice signals. This work provides a conceptual basis for extending our understanding of the prevalence, distribution, and splicing regulatory potential of recognizable three-dimensional structural scaffolds across the mammalian transcriptome.
Project description:The specific recognition of splice signals at or near exon-intron junctions is not explained by their weak conservation and instead is postulated to require a multitude of features embedded in the pre-mRNA strand. We explored the possibility of a three-dimensional structural scaffold of AdML (a model pre-mRNA substrate) guiding early spliceosomal components to the splice signal sequences. We find that mutations in the non-cognate splice signal sequences impede recruitment of early spliceosomal components due to disruption of the global structure of the pre-mRNA. We further find that the pre-mRNA segments potentially interacting with the early spliceosomal component U1 snRNP are distributed across the intron, that the 5′ and 3′ splice sites are spatially close within the pre-mRNA scaffold, and that an interplay exists between the structural scaffold and splicing regulatory elements in recruiting early spliceosomal components. These results suggest that early spliceosomal components can recognize a three-dimensional structural scaffold beyond the short splice signal sequences, and that in our model pre-mRNA, this scaffold is formed across the intron and involves the major splice signals. This provides a conceptual basis to analyze the contribution of recognizable three-dimensional structural scaffolds to the splicing code across the mammalian transcriptome.
Project description:RNA molecules fold into characteristic secondary and tertiary structures that account for their diverse functional activities. Many of these RNA structures, or certain structural motifs within them, are thought to recur in multiple genes within a single organism or across the same gene in several organisms and provide a common regulatory mechanism. Search algorithms, such as RNAMotif, can be used to mine nucleotide sequence databases for these repeating motifs. RNAMotif allows users to capture essential features of known structures in detailed descriptors and can be used to identify, with high specificity, other similar motifs within the nucleotide database. However, when the descriptor constraints are relaxed to provide more flexibility, or when there is very little a priori information about hypothesized RNA structures, the number of motif 'hits' may become very large. Exhaustive methods to search for similar RNA structures over these large search spaces are likely to be computationally intractable. Here we describe a powerful new algorithm based on evolutionary computation to solve this problem. A series of experiments using ferritin IRE and SRP RNA stem-loop motifs was used to verify the method. We demonstrate that even when searching extremely large search spaces, of the order of 10^23 potential solutions, we could find the correct solution in a fraction of the time it would have taken for exhaustive comparisons.
Project description:mRNA molecules are generally thought to be messengers of genetic information in the cell. Stretches of RNA that are complementary in sequence have a propensity to pair, forming elements of secondary structure within RNA molecules. Although these structures exist in every mRNA molecule, the role they play in gene regulation is not well understood. Currently, two techniques are available to profile cellular RNA structure in vivo in an unbiased manner. We applied one of those techniques, DMS-seq, to probe human mRNA structure in primary human foreskin fibroblasts (HFFs) during human cytomegalovirus (HCMV) infection. As a proof of concept, using DMS-seq, we predicted the already solved human 28S rRNA structure with high accuracy. Using our data, we show for the first time in vivo that human coding sequences (CDSs) are less structured than UTRs. Additionally, we provide systematic in vivo evidence for unwinding of the mRNA by ribosomes during translation. Intriguingly, we also found structural changes in human CDSs around the start and stop codons, and in 3′ UTRs. The combination of accurate measurements of translation regulation and mapping of changes in mRNA structure along a dynamic process can be used as a platform for deciphering cis-regulatory elements that control gene expression in various cell types, organisms and biological processes.
Project description:Here we utilized large-scale systematic probing and screening of ~2000 sequence and structural variants based on two long, perfect RNA hairpins to explore the structure and sequence context determining editability.