Project description:MicroRNA (miRNA) maturation is critically dependent on structural features of primary transcripts (pri-miRNAs). However, the scarcity of determined pri-miRNA structures has limited our understanding of miRNA maturation. Here we employed SHAPE-MaP, a high-throughput RNA structure probing method, to unravel the secondary structures of 476 high-confidence human pri-miRNAs. Our SHAPE-based structures diverge substantially from those inferred solely from computation, particularly in the apical loop and basal segments, underlining the need for experimental data in RNA structure prediction. By comparing the structures with high-throughput processing data, we determined the optimal structural features of pri-miRNAs. The sequence determinants are influenced substantially by their structural contexts. Moreover, we identified an element termed the bulged GWG motif (bGWG) with a 3′ bulge in the lower stem, which promotes processing. Our structure-function mapping better annotates the determinants of pri-miRNA processing and offers practical implications for designing small hairpin RNAs and predicting the impacts of miRNA mutations.
Project description:Many lncRNAs have been discovered using transcriptomic data; however, it is unclear what fraction of lncRNAs is functional and what structural properties affect their phenotype. MUNC lncRNA (also known as DRReRNA) acts as an enhancer RNA for the Myod1 gene in cis and stimulates the expression of other promyogenic genes in trans by recruiting the cohesin complex. Here, experimental probing of the RNA structure revealed that MUNC contains multiple structural domains not detected by prediction algorithms in the absence of experimental information. We show that these specific and structurally distinct domains are required for induction of promyogenic genes, for binding genomic sites and regulating gene expression, and for binding the cohesin complex. Myod1 induction and cohesin interaction comprise only a subset of the MUNC phenotype. Our study reveals unexpectedly complex, structure-driven functions for the MUNC lncRNA and emphasizes the importance of experimentally determined structures for understanding structure-function relationships in lncRNAs.
Project description:New regulatory roles continue to emerge for both natural and engineered noncoding RNAs, many of which have specific secondary and tertiary structures essential to their function. Thus, there is a growing need to develop technologies that enable rapid characterization of structural features within complex RNA populations. We have developed a high-throughput technique, SHAPE-Seq, that can simultaneously measure quantitative, single-nucleotide-resolution secondary and tertiary structural information for hundreds of RNA molecules of arbitrary sequence. SHAPE-Seq combines selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) chemistry with multiplexed paired-end deep sequencing of primer extension products. This generates millions of sequencing reads, which are then analyzed using a fully automated data analysis pipeline based on a rigorous maximum likelihood model of the SHAPE-Seq experiment. We demonstrate the ability of SHAPE-Seq to accurately infer secondary and tertiary structural information, detect subtle conformational changes due to single-nucleotide point mutations, and simultaneously measure the structures of a complex pool of different RNA molecules. SHAPE-Seq thus represents a powerful step toward making the study of RNA secondary and tertiary structures high throughput and accessible to a wide array of scientific pursuits, from fundamental biological investigations to engineering RNA for synthetic biological systems. We sequenced in vitro transcribed RNaseP in wild-type and barcoded variants and probed them with SHAPE-Seq. Comparison to existing crystallography data and previous SHAPE experiments verified the accuracy of the technique and the absence of bias due to our multiplexing strategy.
Project description:Accurate predictions of the DNA binding specificities of transcription factors (TFs) are necessary for understanding gene regulatory mechanisms. Traditionally, predictive models are built based on nucleotide sequence features. Here, we employed three-dimensional DNA shape information obtained on a high-throughput basis to integrate intuitive DNA structural features into the modeling of TF binding specificities using support vector regression. We performed quantitative predictions of DNA binding specificities, using the DREAM5 dataset for 65 mouse TFs and genomic-context protein binding microarray data for three human basic helix-loop-helix TFs. DNA shape-augmented models compared favorably with sequence-based models for these predictions. Although both k-mer and DNA shape features encoded the interdependencies between nucleotide positions of the binding site, using DNA shape features reduced the dimensionality of the feature space compared to k-mer use. Finally, analyzing the weights of DNA shape-augmented models uncovered TF family-specific structural readout mechanisms that were not obvious from the nucleotide sequence.
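The dimensionality claim in the description above can be illustrated with a small counting sketch. It assumes one-hot encoding of every k-mer window in a binding site and a fixed number of shape values per position. The function names and the default of four shape features per position are assumptions made for illustration; the description does not state the encoding or the number of shape features used.

```python
def kmer_feature_count(site_length: int, k: int) -> int:
    """Count one-hot features when every k-mer window of a site is encoded.

    Each of the (site_length - k + 1) windows needs 4**k indicator features.
    """
    if k < 1 or k > site_length:
        raise ValueError("k must be between 1 and site_length")
    return (4 ** k) * (site_length - k + 1)


def shape_feature_count(site_length: int, n_shape_features: int = 4) -> int:
    """Count shape features, assuming one value per feature per position.

    n_shape_features=4 is an illustrative assumption, not a value from the text.
    """
    if site_length < 1 or n_shape_features < 1:
        raise ValueError("site_length and n_shape_features must be positive")
    return n_shape_features * site_length


if __name__ == "__main__":
    length = 10  # hypothetical binding-site length
    sequence_only = sum(kmer_feature_count(length, k) for k in (1, 2, 3))
    shape_augmented = kmer_feature_count(length, 1) + shape_feature_count(length)
    print("1-mer + 2-mer + 3-mer features:", sequence_only)
    print("1-mer + shape features:", shape_augmented)
```

Under these assumptions the k-mer feature count grows as 4^k per window, while the shape feature count grows only linearly with site length, which is one way to read the reduction in dimensionality that the description reports.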
Project description:A SHAPE-MaP structure probing experiment was performed on SARS-CoV-2-infected Vero cells at 4 days post infection with two biological replicates. For each replicate, SHAPE-MaP includes a sample treated with 2-methylnicotinic acid imidazolide (NAI; modified) and a minus-reagent control (unmodified). NAI preferentially reacts with unpaired nucleotides in RNA, acylating their 2′-hydroxyl groups. These modifications are encoded as mutations during reverse transcription and library preparation. After sequencing and alignment, the mutation rate profiles of the 'modified' and 'unmodified' samples are used to calculate the SHAPE reactivity of each base.
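The final step of this description, combining the 'modified' and 'unmodified' samples into a per-base reactivity, can be sketched as a background subtraction of mutation rates. This is a minimal sketch and not the pipeline used for this dataset: the description does not give the exact formula, the function names are hypothetical, the `min_depth` cutoff is an illustrative assumption, and any normalization step is omitted.

```python
from typing import List, Optional, Sequence


def mutation_rate(mutations: int, depth: int) -> Optional[float]:
    """Fraction of reads carrying a mutation at one position (None if no coverage)."""
    if depth <= 0:
        return None
    return mutations / depth


def shape_reactivity(
    mod_mutations: Sequence[int],
    mod_depths: Sequence[int],
    unmod_mutations: Sequence[int],
    unmod_depths: Sequence[int],
    min_depth: int = 1000,  # hypothetical coverage cutoff, not from the text
) -> List[Optional[float]]:
    """Per-position raw reactivity: modified minus unmodified mutation rate.

    Positions with fewer than min_depth reads in either sample are
    reported as None rather than given an unreliable value.
    """
    lengths = {len(mod_mutations), len(mod_depths),
               len(unmod_mutations), len(unmod_depths)}
    if len(lengths) != 1:
        raise ValueError("all inputs must cover the same positions")

    reactivities: List[Optional[float]] = []
    for mm, md, um, ud in zip(mod_mutations, mod_depths,
                              unmod_mutations, unmod_depths):
        if md < min_depth or ud < min_depth:
            reactivities.append(None)
            continue
        reactivities.append(mutation_rate(mm, md) - mutation_rate(um, ud))
    return reactivities


if __name__ == "__main__":
    # Toy counts for three positions; the last lacks coverage in the modified sample.
    profile = shape_reactivity(
        mod_mutations=[50, 5, 0], mod_depths=[5000, 5000, 100],
        unmod_mutations=[5, 5, 0], unmod_depths=[5000, 5000, 5000],
    )
    print(profile)
```

In this toy input the first position has a higher mutation rate in the modified sample than in the control and so receives a positive reactivity, the second shows no signal above background, and the third is masked for low coverage.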