Project description:Alternative transcription start site (TSS) usage plays a critical role in regulating gene transcription in mammals. However, precisely identifying alternative TSSs at the genome-wide level remains challenging. Here, we report a single-cell genomic technology for alternative TSS annotation and cell heterogeneity detection. We use the Fluidigm C1 system to capture individual cells of interest, the SMARTer cDNA synthesis kit to recover full-length cDNA, and a dual priming oligonucleotide system to specifically enrich 5’-end tags for genomic analysis.
Project description:More than half of human protein-coding genes have an alternative transcription start site (TSS). We aimed to investigate the contribution of alternative TSSs to the acute-stress–induced transcriptome response in human tissue (skeletal muscle) using the cap analysis of gene expression (CAGE) approach. TSSs were examined at baseline and during recovery after acute stress (a cycling exercise). We identified 44,680 CAGE TSS clusters (including 3,764 defined for the first time) belonging to 12,268 genes and annotated for the first time 290 TSSs belonging to 163 genes. The transcriptome changes dynamically during the first hours after acute stress; the change in the expression of 10% of genes was associated with the activation of alternative TSSs, indicating differential TSS usage. The majority of the alternative TSSs do not increase proteome complexity, suggesting that the function of thousands of alternative TSSs is associated with the fine regulation of mRNA isoform expression from a gene due to the transcription factor-specific activation of various alternative TSSs. We identified individual muscle promoter regions for each TSS using muscle open chromatin data (ATAC-seq and DNase-seq). Then, using the positional weight matrix approach, we predicted the time course of activation of “classic” transcription factors involved in the response of skeletal muscle to contractile activity, as well as a variety of less-studied or uninvestigated factors. The association of the acute-stress–induced transcriptome response with the activation of alternative TSSs indicates that differential TSS usage is an essential mechanism for fine regulation of the gene response to a stress stimulus. A comprehensive resource of accurate TSSs and individual promoter regions for each TSS in muscle was created.
This resource, together with the positional weight matrix approach, can be used to accurately predict transcription factors (TFs) for any gene(s) of interest involved in the response to various stimuli, interventions or pathological conditions in human skeletal muscle.
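As an illustration of what a positional weight matrix scan involves, the following is a minimal sketch, not the pipeline used in this project. The count matrix, the uniform background frequency, the pseudocount and the function names are all hypothetical choices made for the example.

```python
import math

# Hypothetical base counts per motif position (4-bp motif); illustrative only,
# not taken from the project described above.
COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 9, 0, 1],
    "G": [1, 0, 10, 0],
    "T": [1, 1, 0, 8],
}
BACKGROUND = 0.25   # assumed uniform base frequency
PSEUDOCOUNT = 0.5   # assumed smoothing value to avoid log(0)


def build_pwm(counts, background=BACKGROUND, pseudocount=PSEUDOCOUNT):
    """Convert per-position base counts into log2-odds weights."""
    length = len(next(iter(counts.values())))
    pwm = []
    for i in range(length):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; returns (offset, score) pairs."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


def best_hit(sequence, pwm):
    """Return the highest-scoring (offset, score) pair."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])


pwm = build_pwm(COUNTS)
offset, score = best_hit("TTACGTAA", pwm)
```

In practice, a promoter region would be scanned on both strands and the scores compared against a motif-specific threshold; this sketch handles only the forward strand and the bases A, C, G and T.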
Project description:We addressed the lack of experimentally supported transcript annotations in the Rhesus macaque genome by ab initio identification of the transcription start sites (TSSs). We took advantage of the ability of histone H3 lysine 4 trimethylation (H3K4me3) to mark TSSs and of the recently developed ChIP-Seq and RNA-Seq technologies to survey the transcript structures in the macaque brain. We then integrated our two newly generated data types with genomic sequence features and extended a TSS prediction algorithm to ab initio predict and verify 16,833 previously electronically annotated TSSs at 500 bp resolution, and predicted ~10,000 new TSSs. By integrating the ChIP-seq, RNA-seq and small RNA-seq data (previously uploaded to GEO as GSM450615 by our collaborator) with genomic sequence features and extending and improving a state-of-the-art TSS prediction algorithm, we ab initio predicted and verified previously electronically annotated TSSs at a high resolution, and predicted some novel TSSs.
Project description:We addressed the lack of experimentally supported transcript annotations in the Rhesus macaque genome by ab initio identification of the transcription start sites (TSSs). We took advantage of the ability of histone H3 lysine 4 trimethylation (H3K4me3) to mark TSSs and of the recently developed ChIP-Seq and RNA-Seq technologies to survey the transcript structures in the macaque brain. We then integrated our two newly generated data types with genomic sequence features and extended a TSS prediction algorithm to ab initio predict and verify 16,833 previously electronically annotated TSSs at 500 bp resolution, and predicted ~10,000 new TSSs.
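One way to read "verified at 500 bp resolution" is that an annotated TSS counts as supported when a predicted TSS lies within 500 bp of it. The sketch below encodes that reading only as an assumption; the description does not state the actual criterion. The function name and the coordinates are hypothetical, and the example treats a single chromosome and strand.

```python
from bisect import bisect_left


def supported_annotations(annotated, predicted, window=500):
    """Return annotated TSS positions with a predicted TSS within `window` bp.

    Assumes all positions lie on the same chromosome and strand.
    """
    predicted = sorted(predicted)
    supported = []
    for pos in annotated:
        # Index of the first predicted position >= pos; only that position
        # and the one before it can be the nearest neighbour.
        i = bisect_left(predicted, pos)
        neighbours = predicted[max(i - 1, 0):i + 1]
        if any(abs(pos - p) <= window for p in neighbours):
            supported.append(pos)
    return supported


# Hypothetical coordinates, for illustration only.
annotated_tss = [1000, 5000, 9000]
predicted_tss = [1400, 5600, 8500]
result = supported_annotations(annotated_tss, predicted_tss)
```

Sorting the predictions once and using a binary search keeps the lookup efficient when there are tens of thousands of sites, as in the counts reported above.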