
Dataset Information


Large-scale analysis of branchpoint usage across species and cell lines.


ABSTRACT: The coding sequence of each human pre-mRNA is interrupted, on average, by 11 introns that must be spliced out for proper gene expression. Each intron contains three obligate signals: a 5' splice site, a branch site, and a 3' splice site. Splice site usage has been mapped exhaustively across different species, cell types, and cellular states. In contrast, only a small fraction of branch sites have been identified even once. The few reported branch site annotations are imprecise because reverse transcriptase skips several nucleotides while traversing a 2'-5' linkage. Here, we report large-scale mapping of branchpoints from deep sequencing data in three different species and in the SF3B1 K700E oncogenic mutant background. We have developed a novel method whereby raw lariat reads are refined by U2 snRNP/pre-mRNA base-pairing models to return the largest current data set of branchpoint sequences with quality metrics. This analysis discovers novel modes of U2 snRNA:pre-mRNA base-pairing conserved in yeast and provides insight into the biogenesis of intron circles. Finally, matching branch site usage with isoform selection across the extensive panel of ENCODE RNA-seq data sets offers insight into the mechanisms by which branchpoint usage drives alternative splicing.

SUBMITTER: Taggart AJ 

PROVIDER: S-EPMC5378181 | biostudies-literature | 2017 Apr

REPOSITORIES: biostudies-literature


Publications

Large-scale analysis of branchpoint usage across species and cell lines.

Taggart AJ, Lin CL, Shrestha B, Heintzelman C, Kim S, Fairbrother WG

Genome Research, 2017-01-24, issue 4



Similar Datasets

| S-EPMC4802229 | biostudies-literature
| S-ECPF-GEOD-24737 | biostudies-other
| S-EPMC4743009 | biostudies-literature
| S-EPMC4510541 | biostudies-literature
2014-05-30 | E-GEOD-49379 | biostudies-arrayexpress
2014-05-30 | GSE49379 | GEO
| S-EPMC6724501 | biostudies-literature
| S-EPMC4124757 | biostudies-literature
| S-EPMC5802054 | biostudies-other
| S-EPMC3422281 | biostudies-literature