ABSTRACT:
SUBMITTER: Baharav TZ
PROVIDER: S-EPMC7660437 | biostudies-literature | 2020 Sep
REPOSITORIES: biostudies-literature
Tavor Z. Baharav, Govinda M. Kamath, David N. Tse, Ilan Shomorony
Patterns (New York, N.Y.), 2020-07-31, 6
Pairwise sequence alignment is often a computational bottleneck in genomic analysis pipelines, particularly in the context of third-generation sequencing technologies. To speed up this process, the pairwise k-mer Jaccard similarity is sometimes used as a proxy for alignment size in order to filter pairs of reads, and min-hashes are employed to efficiently estimate these similarities. However, when the k-mer distribution of a dataset is significantly non-uniform (e.g., due to GC bia ...
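
To make the baseline described in the abstract concrete, the following is a minimal Python sketch of the exact k-mer Jaccard similarity between two reads and a min-hash estimate of it. It illustrates only the standard proxy the abstract refers to, not the method the paper proposes. The function names (`kmers`, `jaccard`, `minhash_sketch`, `minhash_estimate`), the use of salted BLAKE2b as the hash family, and the toy sequences are all assumptions chosen for illustration.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity |a & b| / |a | b| of two sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def _hash(kmer, seed):
    """Hypothetical hash family: BLAKE2b salted with the seed, 64-bit output."""
    digest = hashlib.blake2b(
        kmer.encode(),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_sketch(kmer_set, num_hashes):
    """For each of num_hashes hash functions, keep the minimum hash value."""
    if not kmer_set:
        raise ValueError("cannot sketch an empty k-mer set")
    return [min(_hash(x, seed) for x in kmer_set) for seed in range(num_hashes)]


def minhash_estimate(sketch_a, sketch_b):
    """Estimate Jaccard similarity as the fraction of matching minima."""
    if len(sketch_a) != len(sketch_b):
        raise ValueError("sketches must have the same length")
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x == y)
    return matches / len(sketch_a)


if __name__ == "__main__":
    # Toy reads, assumed for illustration only.
    read_a, read_b = "ACGTA", "ACGTT"
    set_a, set_b = kmers(read_a, 2), kmers(read_b, 2)
    print("exact Jaccard:", jaccard(set_a, set_b))
    estimate = minhash_estimate(minhash_sketch(set_a, 128), minhash_sketch(set_b, 128))
    print("min-hash estimate:", estimate)
```

In a filtering pipeline of the kind the abstract mentions, each read would be sketched once and only the fixed-length sketches compared pairwise, so the cost per pair no longer depends on read length. The estimate becomes more accurate as the number of hash functions grows.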