Dataset Information

Rapid and precise alignment of raw reads against redundant databases with KMA.

ABSTRACT: BACKGROUND: As the cost of sequencing has declined, clinical diagnostics based on next generation sequencing (NGS) have become a reality. Diagnostics based on sequencing will require rapid and precise mapping against redundant databases because some of the most important determinants, such as antimicrobial resistance and core genome multilocus sequence typing (MLST) alleles, are highly similar to one another. To facilitate this, a novel mapping method, KMA (k-mer alignment), was designed. KMA is able to map raw reads directly against redundant databases, and it scales well for large redundant databases. KMA uses k-mer seeding to speed up mapping and the Needleman-Wunsch algorithm to accurately align extensions from k-mer seeds. Multi-mapping reads are resolved using a novel sorting scheme (ConClave scheme), ensuring an accurate selection of templates. RESULTS: The functionality of KMA was compared with SRST2, MGmapper, BWA-MEM, Bowtie2, Minimap2 and Salmon, using both simulated data and a dataset of Escherichia coli mapped against resistance genes and core genome MLST alleles. KMA outperforms current methods with respect to both accuracy and speed, while using a comparable amount of memory. CONCLUSION: With KMA, it was possible to map raw reads directly against redundant databases with high accuracy, speed and memory efficiency.
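
The two building blocks named in the abstract, k-mer seeding and Needleman-Wunsch alignment, can be illustrated with a minimal Python sketch. This is not KMA's implementation: the function names, the toy scoring values (match 1, mismatch -1, gap -1) and the dictionary-based index are assumptions made for illustration, and the ConClave scheme for resolving multi-mapping reads is not reproduced here.

```python
from collections import defaultdict


def build_kmer_index(templates, k):
    """Map each k-mer to the set of template names that contain it.

    `templates` is a hypothetical dict of name -> sequence.
    """
    index = defaultdict(set)
    for name, seq in templates.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def seed_candidates(read, index, k):
    """Count, per template, how many k-mers of the read hit the index."""
    hits = defaultdict(int)
    for i in range(len(read) - k + 1):
        # .get avoids creating empty entries for unseen k-mers
        for name in index.get(read[i:i + k], ()):
            hits[name] += 1
    return dict(hits)


def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of a and b (toy scoring, linear gap cost)."""
    # prev holds the previous row of the dynamic programming matrix
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]
```

In this sketch, seeding narrows the search to templates that share k-mers with the read, so the quadratic-time alignment only needs to run on a few candidates. With two near-identical templates, as in a redundant database, the hit counts alone may differ by a single k-mer, which is why a separate step for choosing among templates matters.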

SUBMITTER: Clausen PTLC

PROVIDER: S-EPMC6116485 | biostudies-literature | 2018 Aug

REPOSITORIES: biostudies-literature

Publications

Rapid and precise alignment of raw reads against redundant databases with KMA.

Clausen PTLC, Aarestrup FM, Lund O

BMC Bioinformatics, 2018 Aug 29, issue 1

PMID: 30157759

Similar Datasets

FASTQuick: rapid and comprehensive quality assessment of raw sequence reads.
| S-EPMC7844880 | biostudies-literature

Simultaneous alignment of short reads against multiple genomes.
| S-EPMC2768987 | biostudies-literature

PatMaN: rapid alignment of short sequences to large databases.
| S-EPMC2718670 | biostudies-literature

FastGT: an alignment-free method for calling common SNVs directly from raw sequencing reads.
| S-EPMC5451431 | biostudies-literature

Rapid and accurate alignment of nucleotide conversion sequencing reads with HISAT-3N.
| S-EPMC8256862 | biostudies-literature

Ophiocordyceps sinensis raw sequence reads
2019-05-14 | GSE123085 | GEO

Oryza sativa Raw sequence reads
2015-04-22 | E-MTAB-4312 | biostudies-arrayexpress

Oryza sativa Raw sequence reads
2015-07-31 | E-MTAB-4347 | biostudies-arrayexpress

Populus tricho Raw sequence reads
2015-06-01 | E-MTAB-4364 | biostudies-arrayexpress

Arabidopsis thaliana Raw sequence reads
2015-06-20 | E-MTAB-4396 | biostudies-arrayexpress