Dataset Information

DART: a fast and accurate RNA-seq mapper with a partitioning strategy.


ABSTRACT: MOTIVATION: In recent years, massively parallel cDNA sequencing (RNA-seq) technologies have become a powerful tool for high-resolution measurement of expression and for sensitive detection of low-abundance transcripts. However, analyzing RNA-seq data requires a huge amount of computation. The most fundamental and critical step is to align each sequence fragment against the reference genome. Various de novo spliced RNA aligners have been developed in recent years. Although these aligners can handle spliced alignment and detect splice junctions, some challenges remain. With the advances in sequencing technologies and the ongoing collection of sequencing data in the ENCODE project, more efficient alignment algorithms are in high demand. Most read mappers follow the conventional seed-and-extend strategy to deal with inexact matches in sequence alignment. However, the extension step is much more time-consuming than the seeding step. RESULTS: We proposed a novel RNA-seq de novo mapping algorithm, called DART, which adopts a partitioning strategy to avoid the extension step. Experimental results on synthetic datasets and real NGS datasets showed that DART is a highly efficient aligner that yields the highest or comparable sensitivity and accuracy compared to most state-of-the-art aligners and, more importantly, takes the least time among the selected aligners. AVAILABILITY AND IMPLEMENTATION: https://github.com/hsinnan75/DART. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

SUBMITTER: Lin HN 

PROVIDER: S-EPMC5860201 | biostudies-literature | 2018 Jan

REPOSITORIES: biostudies-literature

Publications

DART: a fast and accurate RNA-seq mapper with a partitioning strategy.

Lin Hsin-Nan, Hsu Wen-Lian

Bioinformatics (Oxford, England), 2018-01-01, issue 2


Similar Datasets

| S-EPMC4411664 | biostudies-literature
| S-EPMC3167048 | biostudies-literature
| S-EPMC4673974 | biostudies-literature
| S-EPMC5720828 | biostudies-literature
| S-EPMC8024626 | biostudies-literature
| S-EPMC4881296 | biostudies-literature
| S-EPMC7276436 | biostudies-literature
| S-EPMC7320720 | biostudies-literature
| S-EPMC6798445 | biostudies-literature
| S-EPMC8825760 | biostudies-literature