Project description:Motivation: Alternative cleavage and polyadenylation generates mRNA 3' isoforms in a cell type- and tissue-specific manner. Due to finite available RNA sequencing data of organisms with vast cell type complexity, currently available gene annotation resources are incomplete, which poses significant challenges to the comprehensive interpretation and quantification of transcriptomes. Results: We developed 3'GAmES, a stand-alone analysis pipeline to identify and annotate novel (cell-type-specific) mRNA 3' isoforms from 3' mRNA sequencing datasets. When applied to mouse embryonic stem cells or zebrafish embryos, 3'GAmES expands currently available mRNA 3' annotations by 47% and 57%, respectively, and the resulting annotations significantly improve comprehensive gene-tag counting by cost-effective 3' mRNA sequencing to more accurately mirror whole-transcriptome RNAseq measurements. As a stand-alone analysis tool, 3'GAmES systematically augments cell type-specific transcript annotations and increases the robustness of quantitative gene expression profiling by 3' mRNA sequencing.
Project description:Alternative cleavage and polyadenylation generates mRNA 3' isoforms in a cell type-specific manner. Due to finite available RNA sequencing data of organisms with vast cell type complexity, currently available gene annotation resources are incomplete, which poses significant challenges to the comprehensive interpretation and quantification of transcriptomes. In this chapter, we introduce 3'GAmES, a stand-alone computational pipeline for the identification and quantification of novel mRNA 3' end isoforms from 3' mRNA sequencing data. 3'GAmES expands available repositories and improves comprehensive gene-tag counting by cost-effective 3' mRNA sequencing, faithfully mirroring whole-transcriptome RNAseq measurements. By employing R and bash shell scripts (assembled in a Singularity container), 3'GAmES systematically augments cell type-specific 3' ends of RNA polymerase II transcripts and increases the sensitivity of quantitative gene expression profiling by 3' mRNA sequencing. Public access: https://github.com/AmeresLab/3-GAmES.git.
Project description:The purpose of these experiments was to determine the 5' and 3' transcriptional termini of genes on chromosomes 21-22. Towards this end, we mapped the poly A + RNA isolated from 12 normal tissues and four cell lines. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf
Project description:Stem cell differentiation involves a global increase in protein synthesis to meet the demands of specialized cell types. However, the molecular mechanisms underlying this translational burst and the involvement of initiation factors remain largely unknown. Here, we investigate the roles of eukaryotic initiation factor 3 (eIF3) in early differentiation of human pluripotent stem cell (hPSC)-derived neural progenitor cells (NPCs). Using Quick-irCLIP and alternative polyadenylation (APA)-Seq, we show eIF3 crosslinks to many neurologically relevant mRNAs in NPCs. Our data reveal eIF3 predominantly interacts with 3’ untranslated region (3’-UTR) termini of multiple mRNA isoforms, adjacent to the poly(A) tail. High eIF3 crosslinking at 3’-UTR termini of mRNAs correlates with high translational activity, as determined by ribosome profiling. We identify the transcriptional regulator inhibitor of DNA binding 2 (ID2) mRNA as a case in which active translation levels and eIF3 crosslinking are dramatically increased upon early NPC differentiation. Furthermore, we find that eIF3 engagement at 3’-UTR ends is dependent on polyadenylation. The results presented here show that eIF3 engages with 3’-UTR termini in highly translated mRNAs, supporting a role of mRNA circularization in the mechanisms governing mRNA translation in NPCs.
Project description:The purpose of these experiments was to determine the 5' and 3' transcriptional termini of genes on chromosomes 21-22. Towards this end, we mapped the polyA+ RNA isolated from 12 normal tissues and four cell lines. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf. PolyA+ RNAs from 11 tissues (Brain Frontal Lobe, Brain Hippocampus, Brain Hypothalamus, Cerebellum, Ovary, Placenta, Prostate, Testis, Fetal Kidney, Fetal Spleen, Fetal Thymus; all from BD Clontech) and 5 cell lines (GM06990, HeLaS3, HepG2, K562, tert-BJ) were processed using the BD SMART™ RACE cDNA amplification kit (BD Clontech Cat. No. 634914). A single array with no replicates was run for each sample with a pool of ~23-24 RACE reactions per sample.
Project description:Forum domains are stretches of chromosomal DNA that are excised from eukaryotic chromosomes during their spontaneous non-random fragmentation. Most forum domains are 50-200 kb in length, although larger domains, up to 500-700 kb, are also observed. We performed a genome-wide mapping of forum domain termini in cultured human HEK293T cells using deep sequencing of the termini. We found that forum domain termini correspond to the fragile sites in human chromosomes and that forum domains contain clusters of several or many genes. The largest forum domains correspond to the coordinately expressed main clusters of HOX genes. Our results indicate that forum domains correspond to large multi-gene chromosomal units, some of which could be coordinately activated or repressed. Two samples examined: forum termini from HEK293T cells in two independent experiments.
Project description:Understanding the physiological relevance of structures in mammalian mRNAs remains elusive, especially considering the global unfolding of mRNA structures in eukaryotic organisms recently examined, as well as the decade-long observation that mRNAs generally seem no more likely than random sequences to be stably folded. Here we show that RNA secondary structures, mostly weak and close-to-random, facilitate the 3′-end processing of thousands of human mRNAs by juxtaposing poly(A) signals (PASs) and cleavage sites that are otherwise too far apart. Folding of these 3′-end structures also enhances mRNA stability. Global structure probing shows that 3′-end regions are indeed folded in cells despite substantial unfolding of PAS-upstream regions. Analyses of thousands of ectopically expressed variants prove that folding both enhances processing and increases stability. Mutagenesis of a genomic locus further implicates structure-controlled processing in regulating neighboring gene expression. These results reveal widespread roles for RNA structure in mammalian mRNA biogenesis and metabolism.