
Dataset Information


Kart: a divide-and-conquer algorithm for NGS read alignment.


ABSTRACT: Next-generation sequencing (NGS) provides a great opportunity to investigate genome-wide variation at nucleotide resolution. Due to the huge amount of data, NGS applications require very fast and accurate alignment algorithms. Most existing read-mapping algorithms adopt a seed-and-extend strategy, which is sequential in nature and takes much longer on longer reads. We develop a divide-and-conquer algorithm, called Kart, which can process long reads as fast as short reads by dividing a read into small fragments that can be aligned independently. Our experimental results indicate that the average size of fragments requiring the more time-consuming gapped alignment is around 20 bp regardless of the original read length. Furthermore, Kart can tolerate much higher error rates. The experiments show that Kart spends much less time on longer reads than other aligners and still produces reliable alignments even when the error rate is as high as 15%. Kart is available at https://github.com/hsinnan75/Kart/. Contact: hsu@iis.sinica.edu.tw. Supplementary data are available at Bioinformatics online.
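
The divide-and-conquer idea described in the abstract can be illustrated with a small sketch. This is not Kart's implementation: the function names (`find_anchors`, `divide_and_align`), the greedy k-mer anchoring, the plain edit-distance routine and the parameter `k` are all assumptions chosen for illustration. The sketch splits a read into exactly matching fragments, which need no gapped alignment, and the short leftover fragments between them, which are aligned independently with dynamic programming.

```python
from collections import defaultdict


def find_anchors(read, ref, k=6):
    """Greedy, colinear exact matches (hypothetical helper, not Kart's seeding)."""
    index = defaultdict(list)
    for p in range(len(ref) - k + 1):
        index[ref[p:p + k]].append(p)
    anchors, i, j = [], 0, 0
    while i + k <= len(read):
        # Keep only hits at or after the current reference cursor (colinearity).
        hits = [p for p in index.get(read[i:i + k], ()) if p >= j]
        if not hits:
            i += 1
            continue
        p, n = hits[0], k
        # Extend the exact match to the right as far as it goes.
        while i + n < len(read) and p + n < len(ref) and read[i + n] == ref[p + n]:
            n += 1
        anchors.append((i, p, n))
        i, j = i + n, p + n
    return anchors


def edit_distance(a, b):
    """Unit-cost edit distance, standing in for a gapped aligner."""
    prev = list(range(len(b) + 1))
    for x, ca in enumerate(a, 1):
        cur = [x]
        for y, cb in enumerate(b, 1):
            cur.append(min(prev[y] + 1, cur[y - 1] + 1, prev[y - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def divide_and_align(read, ref, k=6):
    """Return (kind, read_piece, ref_piece, edits) for each fragment.

    Read bases before the first anchor or after the last one are ignored
    in this sketch.
    """
    fragments = []
    end_i = end_p = None
    for i, p, n in find_anchors(read, ref, k):
        if end_i is not None and (i > end_i or p > end_p):
            a, b = read[end_i:i], ref[end_p:p]
            # Each gap is aligned on its own, independent of the others.
            fragments.append(("gapped", a, b, edit_distance(a, b)))
        fragments.append(("exact", read[i:i + n], ref[p:p + n], 0))
        end_i, end_p = i + n, p + n
    return fragments


# Toy data: one substitution at reference position 14, one deleted base at 27.
ref = "ACGTACCTGAGGATCCATTGCAGTCTAAGCGTTCAGGACT"
read = ref[:14] + "G" + ref[15:27] + ref[28:]
for kind, read_piece, ref_piece, edits in divide_and_align(read, ref):
    print(kind, len(read_piece), len(ref_piece), edits)
```

In this toy case only the two one-base gaps go through the dynamic-programming step, and the rest of the read is covered by exact matches. That mirrors, at a very small scale, the abstract's observation that the fragments needing gapped alignment stay short regardless of read length.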

SUBMITTER: Lin HN 

PROVIDER: S-EPMC5860120 | biostudies-literature | 2017 Aug

REPOSITORIES: biostudies-literature


Publications

Kart: a divide-and-conquer algorithm for NGS read alignment.

Lin Hsin-Nan, Hsu Wen-Lian

Bioinformatics (Oxford, England), 2017 Aug 1, issue 15



Similar Datasets

| S-EPMC7446356 | biostudies-literature
| S-EPMC1952108 | biostudies-literature
| S-EPMC2853773 | biostudies-literature
| S-EPMC8036003 | biostudies-literature
| S-EPMC6781626 | biostudies-literature
| S-EPMC7727519 | biostudies-literature
| S-EPMC5679152 | biostudies-literature
2013-07-15 | E-MTAB-1728 | biostudies-arrayexpress
| S-EPMC3936251 | biostudies-literature
| S-EPMC3758368 | biostudies-literature