Dataset Information

BitMapper: an efficient all-mapper based on bit-vector computing.


ABSTRACT:

Background

With next-generation sequencing (NGS) technologies producing hundreds of millions of reads every day, mapping NGS reads to a given reference genome efficiently is a tremendous computational challenge. However, existing all-mappers, which aim to find all mapping locations of each read, are very time-consuming. Most existing all-mappers consist of two main stages, filtration and verification. This work significantly reduces verification time, which dominates the running time.

Results

An efficient all-mapper, BitMapper, was developed based on a new vectorized bit-vector algorithm that simultaneously calculates the edit distance of one read to multiple locations in a given reference genome. Experimental results on both simulated and real data sets show that BitMapper is several times to an order of magnitude faster than current state-of-the-art all-mappers while achieving higher sensitivity, i.e., better-quality solutions.
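
To illustrate the general idea of bit-vector edit-distance computation, the following is a minimal sketch of the classic single-word bit-vector recurrence (Myers' algorithm in Hyyrö's formulation). It is not BitMapper's vectorized algorithm, which the abstract does not spell out; it handles only one read against one candidate window at a time. The function name `bitvec_edit_distance`, the parameter names, and the 64-character read limit are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Edit (Levenshtein) distance between `read` and `window`, computed with the
 * classic single-word bit-vector recurrence. Assumption of this sketch: the
 * read has at most 64 characters, so one 64-bit word encodes a whole column
 * of the dynamic-programming matrix D. */
int bitvec_edit_distance(const char *read, const char *window)
{
    size_t m = strlen(read);
    size_t n = strlen(window);
    assert(m <= 64);
    if (m == 0)
        return (int)n;               /* only insertions are needed */

    /* peq[c] has bit i set when read[i] == c. */
    uint64_t peq[256] = {0};
    for (size_t i = 0; i < m; i++)
        peq[(unsigned char)read[i]] |= (uint64_t)1 << i;

    uint64_t pv = ~(uint64_t)0;      /* rows whose vertical delta is +1 */
    uint64_t mv = 0;                 /* rows whose vertical delta is -1 */
    uint64_t last = (uint64_t)1 << (m - 1);
    int score = (int)m;              /* D[m][0] */

    for (size_t j = 0; j < n; j++) {
        uint64_t eq = peq[(unsigned char)window[j]];
        uint64_t xv = eq | mv;
        uint64_t xh = (((eq & pv) + pv) ^ pv) | eq;
        uint64_t ph = mv | ~(xh | pv);   /* horizontal delta +1 */
        uint64_t mh = pv & xh;           /* horizontal delta -1 */

        /* Track the bottom row: score becomes D[m][j + 1]. */
        if (ph & last)
            score++;
        else if (mh & last)
            score--;

        /* Shift in +1 for row 0, because D[0][j] = j (global distance). */
        ph = (ph << 1) | 1;
        mh <<= 1;
        pv = mh | ~(xv | ph);
        mv = ph & xv;
    }
    return score;
}
```

Each column of the matrix is updated with a fixed number of word operations instead of one operation per cell, which is what makes bit-vector verification fast. In a mapper, a function like this would be called once per candidate location that survives filtration; processing several locations at once with SIMD registers, as the abstract describes, would be a further step not shown here.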

Conclusions

We present BitMapper, which is designed to return all mapping locations of raw reads containing indels as well as mismatches. BitMapper is implemented in C under a GPL license. Binaries are freely available at http://home.ustc.edu.cn/%7Echhy.

SUBMITTER: Cheng H 

PROVIDER: S-EPMC4462005 | biostudies-literature | 2015 Jun

REPOSITORIES: biostudies-literature

Publications

BitMapper: an efficient all-mapper based on bit-vector computing.

Haoyu Cheng, Huaipan Jiang, Jiaoyun Yang, Yun Xu, Yi Shang

BMC Bioinformatics, 2015-06-11


Similar Datasets

| S-EPMC4866519 | biostudies-literature
2009-12-31 | GSE19702 | GEO
| S-EPMC9455019 | biostudies-literature
| S-EPMC4217332 | biostudies-other
| S-EPMC5068316 | biostudies-literature
| S-EPMC92613 | biostudies-literature
| S-EPMC4688996 | biostudies-literature
| S-EPMC5774832 | biostudies-literature
| S-EPMC11018740 | biostudies-literature
| S-EPMC4538387 | biostudies-other