Dataset Information

GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.

ABSTRACT:

Motivation

High-throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments, called short reads, that impose a significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between the read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, is formulated as an approximate string matching problem and is the computational bottleneck because (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms.
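The filter-then-verify idea described above can be illustrated with a minimal Python sketch. It is not GateKeeper's algorithm, nor that of any mapper named in this record: the function names (`edit_distance`, `passes_count_filter`, `map_read`), the `max_edits` threshold, and the base-count filter are all assumptions chosen for illustration. The verification step is the textbook quadratic-time edit-distance recurrence. The filter uses the fact that every edit changes the base composition by at most one surplus base on each side, so the base-count surplus is a lower bound on the edit distance, and a candidate whose bound exceeds the threshold can be discarded without running the dynamic program.

```python
from collections import Counter


def edit_distance(read: str, ref: str) -> int:
    """Levenshtein distance via the classic O(len(read) * len(ref)) DP."""
    prev = list(range(len(ref) + 1))  # distances from the empty read prefix
    for i, r in enumerate(read, start=1):
        curr = [i]
        for j, g in enumerate(ref, start=1):
            cost = 0 if r == g else 1
            curr.append(min(prev[j] + 1,          # read base left unmatched
                            curr[j - 1] + 1,      # reference base left unmatched
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def passes_count_filter(read: str, ref: str, max_edits: int) -> bool:
    """Cheap illustrative filter (an assumption, not GateKeeper's method).

    The surplus of bases on either side is a lower bound on the edit
    distance, so rejecting on it never discards a true match; it can
    still let incorrect candidates through.
    """
    surplus_read = Counter(read) - Counter(ref)  # keeps positive counts only
    surplus_ref = Counter(ref) - Counter(read)
    lower_bound = max(sum(surplus_read.values()), sum(surplus_ref.values()))
    return lower_bound <= max_edits


def map_read(read, candidates, max_edits):
    """Return positions of candidate segments within max_edits of the read."""
    accepted = []
    for pos, segment in candidates:
        if not passes_count_filter(read, segment, max_edits):
            continue  # filtered out before the quadratic-time step
        if edit_distance(read, segment) <= max_edits:
            accepted.append(pos)
    return accepted
```

For example, calling `map_read("ACGT", [(0, "ACGT"), (10, "TTTT"), (20, "TGCA"), (30, "ACGA")], 1)` rejects the candidate at position 10 in the filter alone, while the candidate at position 20 passes the filter (it has the same base composition) and is rejected only by the alignment step. This is the sense in which a pre-alignment filter trades a small number of falsely accepted candidates for far fewer expensive alignments.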

Results

We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (>96% on average) while providing, on average, 90-fold and 130-fold speedups over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. Adding GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10.

Availability and implementation

https://github.com/BilkentCompGen/GateKeeper.

Contact

mohammedalser@bilkent.edu.tr or onur.mutlu@inf.ethz.ch or calkan@cs.bilkent.edu.tr.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Alser M

PROVIDER: S-EPMC5860160 | biostudies-literature | 2017 Nov

REPOSITORIES: biostudies-literature

Publications

GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.

Alser Mohammed M, Hassan Hasan H, Xin Hongyi H, Ergin Oguz O, Mutlu Onur O, Alkan Can C

Bioinformatics (Oxford, England), 2017 Nov 1, issue 21

PMID: 28575161

Similar Datasets

Performance optimization in DNA short-read alignment. | S-EPMC10060706 | biostudies-literature

Accelerating read mapping with FastHASH. | S-EPMC3549798 | biostudies-literature

Incorporating sequence quality data into alignment improves DNA read mapping. | S-EPMC2853142 | biostudies-literature

Improving PacBio long read accuracy by short read alignment. | S-EPMC3464235 | biostudies-literature

Short Read Mapping: An Algorithmic Tour. | S-EPMC5425171 | biostudies-other

SIGAR: Inferring Features of Genome Architecture and DNA Rearrangements by Split-Read Mapping. | S-EPMC7586852 | biostudies-literature

LSCplus: a fast solution for improving long read accuracy by short read alignment. | S-EPMC5103424 | biostudies-literature

Arioc: High-concurrency short-read alignment on multiple GPUs. | S-EPMC7676696 | biostudies-literature

Fast and accurate short read alignment with Burrows-Wheeler transform. | S-EPMC2705234 | biostudies-literature

Comprehensive benchmarking of software for mapping whole genome bisulfite data: from read alignment to DNA methylation analysis. | S-EPMC8425420 | biostudies-literature