Project description:We have recently shown that transcription initiation RNAs (tiRNAs) are derived from sequences downstream of transcription start sites. Here we report the identification of a second class of nuclear-specific ~17-18 nucleotide small RNA whose 3’ ends map precisely to the splice donor site of internal exons in animals. These splice-site RNAs (spliRNAs) are associated with highly expressed genes, and show evidence of developmental stage- and region-specific expression. We also confirm that tiRNAs are nuclear localized, enriched at chromatin marks associated with transcription initiation, and possess a 3’ nucleotide bias. Additionally, we find that microRNA-offset RNAs (moRNAs), the oncogenic miR-15/16 cluster and most snoRNA-derived small RNAs (sdRNAs) are enriched in the nucleus, whereas most miRNAs and two H/ACA sdRNAs are cytoplasmically enriched. We propose that nuclear localized tiny RNAs are involved in epigenetic regulation of gene expression. Discovery and characterization of small RNA species through high-throughput deep sequencing of nuclear, cytoplasmic and total small RNA fractions from THP-1 cells, a monocytic leukemia cell line. Additional files and information are available at http://matticklab.com/index.php?title=NuclearTinyRNAs
Project description:We have recently shown that transcription initiation RNAs (tiRNAs) are derived from sequences downstream of transcription start sites. Here we report the identification of a second class of nuclear-specific ~17-18 nucleotide small RNA whose 3’ ends map precisely to the splice donor site of internal exons in animals. These splice-site RNAs (spliRNAs) are associated with highly expressed genes, and show evidence of developmental stage- and region-specific expression. We also confirm that tiRNAs are nuclear localized, enriched at chromatin marks associated with transcription initiation, and possess a 3’ nucleotide bias. Additionally, we find that microRNA-offset RNAs (moRNAs), the oncogenic miR-15/16 cluster and most snoRNA-derived small RNAs (sdRNAs) are enriched in the nucleus, whereas most miRNAs and two H/ACA sdRNAs are cytoplasmically enriched. We propose that nuclear localized tiny RNAs are involved in epigenetic regulation of gene expression.
Project description:Pervasive transcription in the mammalian genome produces thousands of long noncoding RNAs (lncRNAs) and promoter- or enhancer-associated unstable transcripts. They preferentially locate to chromatin, at which some regulate chromatin structure, transcription and RNA processing. While several RNA sequences responsible for nuclear localization have been identified, such as repeats in the lncRNA Xist and Alu-like elements for long RNAs, how lncRNAs as a class are enriched on chromatin remains elusive. To screen for cis-elements that contribute to RNA-chromatin localization, we developed a high-throughput method named RNA elements for subcellular localization by sequencing (REL-seq), and discovered a U1 small nuclear ribonucleoprotein (snRNP)-recognition motif being critical for chromatin localization of reporter RNAs. Across the genome, chromatin-bound lncRNAs, which are enriched with 5’ splice sites and depleted of 3’ splice sites, exhibit high levels of U1 snRNA binding compared to cytoplasm-localized protein-coding mRNAs. Acute depletion of U1 snRNA, or U1 snRNP protein component SNRNP70, drastically reduces the chromatin association of hundreds of lncRNAs and unstable transcripts without altering the overall transcription rate in cells. In addition, rapid degradation of SNRNP70 reduces the localization of both nascent and polyadenylated lncRNA transcripts to chromatin, and disrupts the nuclear-speckles and genome-wide localization of Malat1, a highly conserved and abundant lncRNA. Moreover, chromatin-bound U1 snRNP interacts with transcriptionally engaged RNA polymerase (Pol) II. Together, these results demonstrate that U1 snRNP acts widely to tether and mobilize lncRNAs to chromatin in a Pol II transcription-dependent manner. Our findings uncover a novel role of U1 snRNP beyond pre-mRNA processing and provide molecular insights into how lncRNAs are recruited to Pol II-transcribed genes and have a propensity for chromatin-associated functions.
Project description:mRNA processing is critical for gene expression. A challenge in regulating mRNA processing is how to recognize the actual mRNA processing sites such as splice sites and polyadenylation sites when the sequence content is insufficient for this purpose. Previous studies suggested that RNA structure affects mRNA processing. However, the regulatory role of RNA structure in mRNA processing remains unclear. Here, we performed in vivo SHAPE chemical profiling on Arabidopsis nuclear RNAs and generated the in vivo nuclear RNA structure landscape. We found that nuclear mRNAs fold differently from cytosolic mRNAs. Notably, we discovered a two-nucleotide single-stranded RNA structure feature upstream of 5’ splice site that regulates splicing and is responsible for the selection of alternative 5’ splice sites. Moreover, we found branch point single-strandedness is associated with 3’ splice site recognition. We also identified an RNA structure feature comprising two close-by single-stranded regions that is specifically associated with both polyadenylation and alternative polyadenylation events. Our work demonstrates an RNA structure regulatory mechanism for mRNA processing.
Project description:Adenovirus is a common human pathogen that relies on host cell processes for transcription and processing of viral RNA and protein production. Although adenoviral promoters, splice junctions, and cleavage and polyadenylation sites have been characterized using low-throughput biochemical techniques or short read cDNA-based sequencing, these technologies do not fully capture the complexity of the adenoviral transcriptome. By combining Illumina short-read and nanopore long-read direct RNA sequencing approaches, we mapped transcription start sites and cleavage and polyadenylation sites across the adenovirus genome. In addition to confirming the known canonical viral early and late RNA cassettes, our analysis of splice junctions within long RNA reads revealed an additional 35 novel viral transcripts. These RNAs include fourteen new splice junctions which lead to expression of canonical open reading frames (ORF), six novel ORF-containing transcripts, and fifteen transcripts encoding for messages that potentially alter protein functions through truncations or fusion of canonical ORFs. In addition, we also detect RNAs that bypass canonical cleavage sites and generate potential chimeric proteins by linking separate gene transcription units. Of these, an evolutionary conserved protein was detected containing the N-terminus of E4orf6 fused to the downstream DBP/E2A ORF. Loss of this novel protein, E4orf6/DBP, was associated with aberrant viral replication center morphology and poor viral spread. Our work highlights how long-read sequencing technologies can reveal further complexity within viral transcriptomes.
Project description:It has recently been shown that RNA Polymerase II transcription is far more extensive than previously thought, much of it not associated with protein-coding genes. To investigate this phenomenon, we determined the genome-wide landscape of RNA Polymerase II transcription initiation and elongation in C. elegans. We identify 73,500 distinct clusters of transcription initiation and find that initiation is often bidirectional. Strikingly, the majority of initiation events occur in regions with enhancer-like chromatin signatures. We also assign transcription initiation sites to 7691 protein coding genes, the majority previously unknown because of trans-splicing. Through mapping RNA PolII initiation (short capped RNAs) and elongation (long capped RNAs), we provide identification of transcription start sites.
Project description:This is a custom Affymetrix resequencing array for DNA sequencing of the entire coding region and exon-splice sites of 524 human genes (5471 exons; 0.8Mb genomic DNA). These nuclear genes encode proteins localized to mitochondria and include known disease genes (e.g. POLG, LRRK2) and new candidate genes for mitochondrial disorders. We sequenced 63 samples including both cases and controls using this array.
Project description:This is a custom Affymetrix resequencing array for DNA sequencing of the entire coding region and exon-splice sites of 39 human genes (452 exons; 106,337 bases). These nuclear genes encode proteins localized to mitochondria and include known disease genes (e.g. POLG, C10orf2) and new candidate genes for mtDNA maintenance disorders. We sequenced 27 patient (P) and 13 control (C) samples using this array.
Project description:C. elegans 21U-RNAs are equivalent to the piRNAs discovered in other metazoans and have important roles in gametogenesis and transposon control. The biogenesis and molecular function of 21U-RNAs and piRNAs are poorly understood. Here, we demonstrate that transcription of each 21U-RNA is regulated separately through a conserved upstream DNA motif. We use genomic analysis to show that this motif is associated with low nucleosome occupancy, a characteristic of many promoters that drive expression of protein-coding genes, and that RNA polymerase II is localized to this nucleosome-depleted region. We establish that the most conserved 8-mer sequence in the upstream region of 21U-RNAs, CTGTTTCA, is absolutely required for their individual expression. Furthermore, we demonstrate that the 8-mer is specifically recognized by Forkhead family (FKH) transcription factors and that 21U-RNA expression is diminished in several FKH mutants. Our results demonstrate a novel paradigm for simultaneous regulation of thousands of small non-coding transcription units. Comparisons of H3 and H2B positions on Chromosome 4 relative to the positions of 21U-RNAs.
Project description:The RNA exosome is an RNA degradation machine that is critical for eukaryotic transcriptome surveillance and mutations in exosome components cause numerous human diseases. The exosome is directed to specific RNAs by adaptor protein complexes. However, it remains unclear how these adaptors specifically recognize their target RNAs. The PAXT connection is an adaptor that recruits the exosome to polyadenylated RNAs in the nucleus, especially transcripts polyadenylated at intronic poly(A) sites. Here we show that PAXT-mediated degradation is induced by the combination of a 5′ splice site and a poly(A) junction, which includes the poly(A) signal and the poly(A) tail, but not by either sequence alone. These two sequences are bound by the splicing factor U1 snRNP and pre-mRNA 3′ processing factors, which in turn cooperatively recruit PAXT. As 5′ splice sites and poly(A) junctions are typically found on unspliced precursors and processed RNAs respectively, we propose that their presence on the same RNA molecule constitutes an “RNA degradation code”. Consistent with this model, disease-associated single nucleotide polymorphisms that create novel 5′ splice sites near poly(A) sites induce aberrant RNA degradation. Our results revealed the first nuclear RNA degradation code that plays important roles in transcriptome surveillance and may also influence gene evolution.