Dataset Information

Indexing Arbitrary-Length k-Mers in Sequencing Reads.


ABSTRACT: We propose a lightweight data structure for indexing and querying collections of NGS read data in main memory. The data structure supports the interface proposed in the pioneering work by Philippe et al. for counting and locating k-mers in sequencing reads. Our solution, PgSA (pseudogenome suffix array), is based on finding overlapping reads and is competitive with existing algorithms in space use, query time, or both. The main applications of our index include variant calling, error correction, and analysis of reads from RNA-seq experiments.
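
To make the "counting and locating k-mers" interface concrete, here is a minimal Python sketch. It is not the PgSA implementation: it assumes a naive suffix array built over the plain concatenation of reads (separated by a hypothetical `$` sentinel) and does not merge overlapping reads into a pseudogenome. All function names are illustrative assumptions.

```python
from bisect import bisect_right


def build_index(reads):
    """Concatenate reads and build a naive suffix array (illustration only)."""
    # '$' sorts before A, C, G, T and never occurs in a k-mer, so a match
    # cannot span two reads. Real indexes avoid this quadratic-space sort.
    text = "".join(read + "$" for read in reads)
    starts = []
    offset = 0
    for read in reads:
        starts.append(offset)
        offset += len(read) + 1
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    return text, suffix_array, starts


def _match_range(text, suffix_array, kmer):
    """Return the half-open suffix-array range whose suffixes start with kmer."""
    k = len(kmer)

    def first_index(accept):
        # Binary search for the first suffix whose k-prefix satisfies accept.
        lo, hi = 0, len(suffix_array)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = text[suffix_array[mid]:suffix_array[mid] + k]
            if accept(prefix):
                hi = mid
            else:
                lo = mid + 1
        return lo

    return first_index(lambda p: p >= kmer), first_index(lambda p: p > kmer)


def locate(index, kmer):
    """List (read_id, offset_in_read) pairs for every occurrence of kmer."""
    text, suffix_array, starts = index
    left, right = _match_range(text, suffix_array, kmer)
    hits = []
    for pos in suffix_array[left:right]:
        read_id = bisect_right(starts, pos) - 1
        hits.append((read_id, pos - starts[read_id]))
    return sorted(hits)


def count_occurrences(index, kmer):
    """Count all occurrences of kmer across the read collection."""
    left, right = _match_range(index[0], index[1], kmer)
    return right - left


def count_reads(index, kmer):
    """Count the distinct reads that contain kmer at least once."""
    return len({read_id for read_id, _ in locate(index, kmer)})


index = build_index(["ACGTAC", "GTACGT", "TTACGA"])
print(locate(index, "GT"))             # [(0, 2), (1, 0), (1, 4)]
print(count_occurrences(index, "GT"))  # 3
print(count_reads(index, "GT"))        # 2
```

Because the query length is not fixed at build time, the same index answers queries for any k, which is the sense in which such an index handles arbitrary-length k-mers.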

SUBMITTER: Kowalski T 

PROVIDER: S-EPMC4504488 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

Indexing Arbitrary-Length k-Mers in Sequencing Reads.

Tomasz Kowalski, Szymon Grabowski, Sebastian Deorowicz

PLoS ONE, 2015-07-16, issue 7



Similar Datasets

| S-EPMC6044908 | biostudies-literature
| S-EPMC7842384 | biostudies-literature
| S-EPMC4168702 | biostudies-literature
| S-EPMC9237683 | biostudies-literature
| S-EPMC6154894 | biostudies-literature
| S-EPMC3791270 | biostudies-literature
| S-EPMC7357769 | biostudies-literature
| S-EPMC5908213 | biostudies-literature
| S-EPMC8330242 | biostudies-literature
| S-EPMC8494230 | biostudies-literature