Project description:Decoding post-transcriptional regulatory programs underlying gene expression is a crucial step toward a predictive dynamical understanding of cellular state transitions. Despite recent systematic efforts, the sequence determinants of such mechanisms remain largely uncharacterized. An important obstacle in revealing these elements stems from the contribution of local secondary structures in defining interaction partners in a variety of regulatory contexts, including but not limited to transcript stability, alternative splicing and localization. There are many documented instances where the presence of a structural regulatory element dictates alternative splicing patterns (e.g. human cardiac troponin T) or affects other aspects of RNA biology. Thus, a full characterization of post-transcriptional regulatory programs requires capturing information provided by both local secondary structures and the underlying sequence. We have developed a computational framework based on context-free grammars and mutual information that systematically explores the immense space of structural elements and reveals motifs that are significantly informative of genome-wide measurements of RNA behavior. The application of this framework to genome-wide mammalian mRNA stability data revealed eight highly significant elements with substantial structural information, for the strongest of which we showed a major role in global mRNA regulation. Through biochemistry, mass-spectrometry, and in vivo binding studies, we identified HNRPA2B1 as the key regulator that binds this element and stabilizes a large number of its target genes. Ultimately, we created a global post-transcriptional regulatory map based on the identity of the discovered linear and structural cis-regulatory elements, their regulatory interactions and their target pathways. This approach can also be employed to reveal the structural elements that modulate other aspects of RNA behavior. 
This SuperSeries is composed of the following subset Series:
GSE35749: sRSM1 synthetic decoy vs. scrambled transfections in MDA-MB-231 cells
GSE35753: HNRPA2B1 RIP-chip
GSE35756: Whole-genome decay rate measurements in MDA-MB-231 cells transfected with HNRPA2B1 siRNAs versus controls
GSE35757: siRNA-mediated HNRPA2B1 knock-down in MDA-MB-231 cells
GSE35799: HNRPA2B1 HITS-CLIP
Refer to individual Series
Project description:Decoding post-transcriptional regulatory programs in RNA is a critical step towards the larger goal of developing predictive dynamical models of cellular behaviour. Despite recent efforts, the vast landscape of RNA regulatory elements remains largely uncharacterized. A long-standing obstacle is the contribution of local RNA secondary structure to the definition of interaction partners in a variety of regulatory contexts, including, but not limited to, transcript stability, alternative splicing and localization. There are many documented instances where the presence of a structural regulatory element dictates alternative splicing patterns (for example, human cardiac troponin T) or affects other aspects of RNA biology. Thus, a full characterization of post-transcriptional regulatory programs requires capturing information provided by both local secondary structures and the underlying sequence. Here we present a computational framework based on context-free grammars and mutual information that systematically explores the immense space of small structural elements and reveals motifs that are significantly informative of genome-wide measurements of RNA behaviour. By applying this framework to genome-wide human mRNA stability data, we reveal eight highly significant elements with substantial structural information, for the strongest of which we show a major role in global mRNA regulation. Through biochemistry, mass spectrometry and in vivo binding studies, we identified human HNRPA2B1 (heterogeneous nuclear ribonucleoprotein A2/B1, also known as HNRNPA2B1) as the key regulator that binds this element and stabilizes a large number of its target genes. We created a global post-transcriptional regulatory map based on the identity of the discovered linear and structural cis-regulatory elements, their regulatory interactions and their target pathways. This approach could also be used to reveal the structural elements that modulate other aspects of RNA behaviour.
Project description:The specific recognition of splice signals at or near the exon-intron junctions is not explained by their weak conservation across the mammalian transcriptome and is postulated to require a multitude of features embedded in the pre-mRNA strand. We explored the possibility of a three-dimensional structural scaffold of a pre-mRNA guiding early spliceosomal components to the splice signal sequences. We find that mutations in non-cognate splice signal sequences of a model pre-mRNA substrate could impede recruitment of early spliceosomal components due to disruption of the global structure of the pre-mRNA. We also find a distribution of pre-mRNA segments potentially interacting with the early spliceosomal component U1 snRNP across the intron, spatial proximity of the 5′ and 3′ splice sites within the pre-mRNA scaffold, and an interplay between the structural scaffold and splicing regulatory elements in recruiting early spliceosomal components. These results suggest that early spliceosomal components could recognize a three-dimensional structural scaffold beyond the short splice signal sequences and that, in our model pre-mRNA, this scaffold is formed across the intron involving the major splice signals. This work provides a conceptual basis to extend our understanding of the prevalence, distribution, and splicing regulatory potential of recognizable three-dimensional structural scaffolds across the mammalian transcriptome.
Project description:The specific recognition of splice signals at or near exon-intron junctions is not explained by their weak conservation and instead is postulated to require a multitude of features embedded in the pre-mRNA strand. We explored the possibility of a three-dimensional structural scaffold of AdML, a model pre-mRNA substrate, guiding early spliceosomal components to the splice signal sequences. We find that mutations in the non-cognate splice signal sequences impede recruitment of early spliceosomal components due to disruption of the global structure of the pre-mRNA. We further find that the pre-mRNA segments potentially interacting with the early spliceosomal component U1 snRNP are distributed across the intron, that the 5′ and 3′ splice sites are in spatial proximity within the pre-mRNA scaffold, and that an interplay exists between the structural scaffold and splicing regulatory elements in recruiting early spliceosomal components. These results suggest that early spliceosomal components can recognize a three-dimensional structural scaffold beyond the short splice signal sequences, and that in our model pre-mRNA, this scaffold is formed across the intron involving the major splice signals. This provides a conceptual basis to analyze the contribution of recognizable three-dimensional structural scaffolds to the splicing code across the mammalian transcriptome.
Project description:Studying the functional consequences of structural variants (SVs) in mammalian genomes is challenging because: 1) SVs arise much less commonly than single nucleotide variants or small indels; and 2) methods to generate, map and characterize SVs in model systems are underdeveloped. To address these challenges, we developed Genome-Shuffle-seq, a method that enables the multiplex generation and mapping of thousands of SVs (deletions, inversions, translocations, extrachromosomal circles) throughout mammalian genomes. We also demonstrate the co-capture of SV identity with single-cell transcriptomes, facilitating the measurement of SVs’ impact on gene expression. We anticipate Genome-Shuffle-seq will be broadly useful for the systematic exploration of the functional consequences of SVs on gene expression, chromatin landscape, and 3D nuclear architecture, while also initiating a path towards a minimal mammalian genome.