
Dataset Information


Discover hidden splicing variations by mapping personal transcriptomes to personal genomes.


ABSTRACT: RNA-seq has become a popular technology for studying genetic variation of pre-mRNA alternative splicing. Commonly used RNA-seq aligners rely on the consensus splice site dinucleotide motifs to map reads across splice junctions. Consequently, genomic variants that create novel splice site dinucleotides may produce splice junction RNA-seq reads that cannot be mapped to the reference genome. We developed and evaluated an approach to identify 'hidden' splicing variations in personal transcriptomes, by mapping personal RNA-seq data to personal genomes. Computational analysis and experimental validation indicate that this approach identifies personal specific splice junctions at a low false positive rate. Applying this approach to an RNA-seq data set of 75 individuals, we identified 506 personal specific splice junctions, among which 437 were novel splice junctions not documented in current human transcript annotations. 94 splice junctions had splice site SNPs associated with GWAS signals of human traits and diseases. These involve genes whose splicing variations have been implicated in diseases (such as OAS1), as well as novel associations between alternative splicing and diseases (such as ICA1). Collectively, our work demonstrates that the personal genome approach to RNA-seq read alignment enables the discovery of a large but previously unknown catalog of splicing variations in human populations.
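The central idea of the abstract, that a variant can create a splice site dinucleotide the reference genome lacks, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names, the toy sequence, the 0-based half-open intron coordinates, and the restriction to plus-strand GT-AG introns. It is not the authors' pipeline, and real aligners also recognise less common motifs such as GC-AG and AT-AC.

```python
# Illustrative sketch only (hypothetical helpers, toy data); not the published method.

def apply_snps(reference, snps):
    """Return a 'personal' copy of the reference with SNPs applied.

    snps maps a 0-based position to the alternate base.
    """
    seq = list(reference)
    for pos, alt in snps.items():
        seq[pos] = alt
    return "".join(seq)


def junction_motif(genome, intron_start, intron_end):
    """Return the (donor, acceptor) dinucleotides of genome[intron_start:intron_end]."""
    donor = genome[intron_start:intron_start + 2]
    acceptor = genome[intron_end - 2:intron_end]
    return donor, acceptor


def is_canonical(genome, intron_start, intron_end):
    """True if the intron has the GT-AG motif (plus strand only, a simplification)."""
    return junction_motif(genome, intron_start, intron_end) == ("GT", "AG")


def hidden_junctions(reference, snps, junctions):
    """Junctions whose motif is canonical in the personal genome but not in the reference."""
    personal = apply_snps(reference, snps)
    return [
        (start, end)
        for start, end in junctions
        if is_canonical(personal, start, end) and not is_canonical(reference, start, end)
    ]


# Toy example: the intron at [5, 14) starts with "GA" in the reference;
# an A>T SNP at position 6 turns the donor into "GT".
reference = "AAACC" + "GATTTTTAG" + "CCC"
snps = {6: "T"}
junctions = [(5, 14), (0, 5)]

print(hidden_junctions(reference, snps, junctions))  # [(5, 14)]
```

A read spanning the junction at (5, 14) would lack a recognised motif against the reference sequence but gain one against the personal sequence, which is the kind of 'hidden' junction the abstract describes.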

SUBMITTER: Stein S 

PROVIDER: S-EPMC4678817 | biostudies-literature | 2015 Dec

REPOSITORIES: biostudies-literature


Publications

Discover hidden splicing variations by mapping personal transcriptomes to personal genomes.

Stein S, Lu ZX, Bahrami-Samani E, Park JW, Xing Y

Nucleic Acids Research, 2015-11-17, issue 22



Similar Datasets

S-EPMC3232370 | biostudies-literature
S-EPMC8466565 | biostudies-literature
S-EPMC3936741 | biostudies-literature
S-EPMC3595401 | biostudies-other
S-EPMC3083083 | biostudies-other
S-EPMC2901138 | biostudies-literature
S-EPMC4702856 | biostudies-literature
S-EPMC4856438 | biostudies-literature
S-EPMC4713245 | biostudies-literature
S-EPMC4110416 | biostudies-literature