Dataset Information

A probabilistic framework for aligning paired-end RNA-seq data.

ABSTRACT:

Motivation

The RNA-seq paired-end read (PER) protocol samples transcript fragments longer than the sequencing capability of today's technology by sequencing just the two ends of each fragment. Deep sampling of the transcriptome using the PER protocol presents the opportunity to reconstruct the unsequenced portion of each transcript fragment using end reads from overlapping PERs, guided by the expected length of the fragment.

Methods

A probabilistic framework is described to predict the alignment to the genome of all PER transcript fragments in a PER dataset. Starting from possible exonic and spliced alignments of all end reads, our method constructs potential splicing paths connecting paired ends. An expectation maximization method assigns likelihood values to all splice junctions and assigns the most probable alignment for each transcript fragment.
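The expectation maximization step described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simplified mixture model in which each fragment has a few candidate splicing paths, each path is a tuple of junction identifiers, and the expected fragment length enters as a Gaussian fit term. The names `em_align` and `length_likelihood`, the junction labels, and the mean and standard deviation are all hypothetical.

```python
import math
from collections import defaultdict

def length_likelihood(length, mean=200.0, sd=20.0):
    """Unnormalized Gaussian fit of an implied fragment length (mean/sd are assumed values)."""
    z = (length - mean) / sd
    return math.exp(-0.5 * z * z)

def em_align(fragments, n_iter=50):
    """Toy EM over candidate splicing paths.

    fragments: list of candidate lists; each candidate is (path, implied_length),
    where path is a tuple of hypothetical junction ids.
    Returns (path weights, expected junction support, most probable path per fragment).
    """
    paths = {path for cands in fragments for path, _ in cands}
    weight = {p: 1.0 / len(paths) for p in paths}
    resp = []
    for _ in range(n_iter):
        # E-step: posterior responsibility of each candidate within its fragment.
        resp = []
        for cands in fragments:
            scores = [weight[p] * length_likelihood(l) for p, l in cands]
            total = sum(scores)
            if total == 0.0:
                # Fall back to uniform responsibilities if every score underflows.
                resp.append([1.0 / len(cands)] * len(cands))
            else:
                resp.append([s / total for s in scores])
        # M-step: path weight is its expected share of all fragments.
        counts = defaultdict(float)
        for cands, r in zip(fragments, resp):
            for (p, _), ri in zip(cands, r):
                counts[p] += ri
        weight = {p: counts[p] / len(fragments) for p in paths}

    # Summarize: expected support per junction and the best candidate per fragment.
    junction_support = defaultdict(float)
    best = []
    for cands, r in zip(fragments, resp):
        for (p, _), ri in zip(cands, r):
            for j in p:
                junction_support[j] += ri
        best_idx = max(range(len(cands)), key=r.__getitem__)
        best.append(cands[best_idx][0])
    return weight, dict(junction_support), best

# Made-up example: the third fragment is ambiguous between junctions J1 and J2.
fragments = [
    [(("J1",), 200)],
    [(("J1",), 205)],
    [(("J1",), 210), (("J2",), 190)],
    [(("J2",), 200)],
]
weight, support, best = em_align(fragments)
```

In this toy input the two candidates of the ambiguous fragment fit the expected length equally well, so the fragment is resolved in favor of the junction with more support from unambiguous fragments.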

Results

The method was applied to 2 x 35 bp PER datasets from cancer cell lines MCF-7 and SUM-102. PER fragment alignment increased the coverage 3-fold compared to the alignment of the end reads alone, and increased the accuracy of splice detection. The accuracy of the expectation maximization (EM) algorithm in the presence of alternative paths in the splice graph was validated by qRT-PCR experiments on eight exon-skipping alternative splicing events. PER fragment alignment with long-range splicing confirmed 8 out of 10 fusion events identified in the MCF-7 cell line in an earlier study (Maher et al., 2009).

Availability

Software available at http://www.netlab.uky.edu/p/bioinfo/MapSplice/PER.

SUBMITTER: Hu Y

PROVIDER: S-EPMC2916723 | biostudies-literature | 2010 Aug

REPOSITORIES: biostudies-literature

Publications

A probabilistic framework for aligning paired-end RNA-seq data.

Hu Y, Wang K, He X, Chiang DY, Prins JF, Liu J

Bioinformatics (Oxford, England), 2010 Jun 23, issue 16

PMID: 20576625

Similar Datasets

A fast detection of fusion genes from paired-end RNA-seq data.
| S-EPMC6211471 | biostudies-literature

Detection of splice junctions from paired-end RNA-seq data by SpliceMap.
| S-EPMC2919714 | biostudies-literature

SOAPfuse: an algorithm for identifying fusion transcripts from paired-end RNA-Seq data.
| S-EPMC4054009 | biostudies-literature

FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data.
| S-EPMC3218660 | biostudies-literature

ChimeRScope: a novel alignment-free algorithm for fusion transcript prediction using paired-end RNA-Seq data.
| S-EPMC5737728 | biostudies-literature

Differential expression analysis for paired RNA-Seq data.
| S-EPMC3663822 | biostudies-literature

PASSion: a pattern growth algorithm-based pipeline for splice junction detection in paired-end RNA-Seq data.
| S-EPMC3278765 | biostudies-literature

Quantifying alternative splicing from paired-end RNA-sequencing data.
| S-EPMC4005600 | biostudies-literature

Improved characterization of single-cell RNA-seq libraries with paired-end avidity sequencing.
| S-EPMC11257511 | biostudies-literature

Analysis of paired end Pol II ChIP-seq and short capped RNA-seq in MCF-7 cells.
| S-EPMC4516138 | biostudies-literature