Dataset Information

Comparative Analysis of RNA-Seq Alignment Algorithms and the RNA-Seq Unified Mapper (RUM).


ABSTRACT: A critical task in high-throughput sequencing is aligning millions of short reads to a reference genome. Alignment is especially complicated for RNA sequencing (RNA-Seq) because of RNA splicing. A number of RNA-Seq algorithms are available, and they claim to align reads with high accuracy and efficiency while detecting splice junctions. RNA-Seq data are discrete in nature; therefore, with reasonable gene models and comparative metrics, RNA-Seq data can be simulated to sufficient accuracy to enable meaningful benchmarking of alignment algorithms. A rigorous comparison of all viable published RNA-Seq algorithms has not previously been performed. RESULTS: We developed an RNA-Seq simulator that models the main impediments to RNA alignment, including alternative splicing, insertions, deletions, substitutions, sequencing errors, and intron signal. We used this simulator to measure the accuracy and robustness of available algorithms at the base and junction levels. Additionally, we used RT-PCR and Sanger sequencing to validate the ability of the algorithms to detect novel transcript features, such as novel exons and alternative splicing, in RNA-Seq data from mouse retina. A pipeline based on BLAT was developed to explore the performance of established tools for this problem and to compare it with the recently developed methods. This pipeline, the RNA-Seq Unified Mapper (RUM), performs comparably to the best current aligners and provides an advantageous combination of accuracy, speed and usability. The dataset consists of RNA-Seq of mouse retinal RNA, as described.

ORGANISM(S): Mus musculus

SUBMITTER: Eric Pierce 

PROVIDER: E-GEOD-26248 | biostudies-arrayexpress

REPOSITORIES: biostudies-arrayexpress

Publications

Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM).

Grant GR, Farkas MH, Pizarro AD, Lahens NF, Schug J, Brunk BP, Stoeckert CJ, Hogenesch JB, Pierce EA

Bioinformatics (Oxford, England), 2011-07-19, issue 18


Similar Datasets

2011-08-03 | GSE26248 | GEO
2023-12-25 | GSE227911 | GEO
2014-06-10 | GSE45684 | GEO
2014-06-10 | E-GEOD-45684 | biostudies-arrayexpress
2021-02-28 | GSE116291 | GEO
| PRJNA135077 | ENA
2023-08-10 | GSE194237 | GEO
| PRJEB15488 | ENA
| PRJEB12852 | ENA
2014-03-18 | E-GEOD-50246 | biostudies-arrayexpress