Dataset Information

Effects of spaced k-mers on alignment-free genotyping.


ABSTRACT:

Motivation

Alignment-free, k-mer based genotyping methods are a fast alternative to alignment-based methods and are particularly well suited for genotyping larger cohorts. The sensitivity of algorithms that work with k-mers can be increased by using spaced seeds; however, the application of spaced seeds in k-mer based genotyping methods has not yet been studied.
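The mechanism behind the sensitivity claim can be illustrated with a toy example. The sketch below is not taken from PanGenie or MaskedPanGenie; the sequences, the mask `1101011`, and the helper names are assumptions chosen only for illustration. It compares a reference with a copy carrying one substitution: every contiguous k-mer covering the substituted base changes, whereas a spaced k-mer still matches whenever a "don't care" position of the mask falls on that base.

```python
def contiguous_kmers(seq, k):
    """Return the set of all contiguous k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def spaced_kmers(seq, mask):
    """Return the set of spaced k-mers in seq for a 0/1 mask string.

    Positions where the mask is '1' are kept; positions where it is '0'
    are ignored ("don't care").
    """
    span = len(mask)
    keep = [i for i, m in enumerate(mask) if m == "1"]
    return {
        "".join(seq[start + i] for i in keep)
        for start in range(len(seq) - span + 1)
    }


# Toy data (assumed for illustration only).
reference = "ACGTTGCAAGCTTAGGCTA"
# Same sequence with a single substitution at index 9 (G -> T).
read = reference[:9] + "T" + reference[10:]

k = 5
mask = "1101011"  # hypothetical mask: span 7, 5 kept positions

shared_contiguous = contiguous_kmers(reference, k) & contiguous_kmers(read, k)
shared_spaced = spaced_kmers(reference, mask) & spaced_kmers(read, mask)

# Contiguous 5-mers covering index 9 (e.g. "AAGCT") are lost in the read.
print("AAGCT" in shared_contiguous)
# The spaced k-mer from the window starting at index 7 ignores index 9
# (mask offset 2 is '0'), so it is shared despite the substitution.
print("AACTA" in shared_spaced)
```

Because the mask spans more bases than a contiguous k-mer with the same number of kept positions, more windows overlap any given variant, so this toy case shows only the mechanism and not a net gain in sensitivity.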

Results

We add spaced-seed functionality to the genotyping software PanGenie and use it to calculate genotypes. This significantly improves sensitivity and F-score when genotyping SNPs, indels, and structural variants on reads with low (5×) and high (30×) coverage. The improvements are greater than those achieved by simply increasing the length of contiguous k-mers. Effect sizes are particularly large for low-coverage data. If applications implement efficient algorithms for hashing spaced k-mers, spaced k-mers have the potential to become a useful technique in k-mer based genotyping.
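To make the remark about hashing concrete, here is a minimal sketch of a naive way to turn each spaced k-mer into an integer key. It is an assumption for illustration, not the method used in MaskedPanGenie; the 2-bit base encoding, the function names, and the mask `1011` are all hypothetical.

```python
# 2 bits per base; assumes the sequence contains only A, C, G, T.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_spaced_kmer(seq, start, keep):
    """Pack the bases at the kept offsets of one window into an integer."""
    value = 0
    for offset in keep:
        value = (value << 2) | BASE_CODE[seq[start + offset]]
    return value


def spaced_kmer_codes(seq, mask):
    """Encode every window of seq under a 0/1 mask, one integer per window."""
    keep = [i for i, m in enumerate(mask) if m == "1"]
    n_windows = len(seq) - len(mask) + 1
    return [encode_spaced_kmer(seq, s, keep) for s in range(n_windows)]


codes = spaced_kmer_codes("ACGTACGT", "1011")
print(codes)  # [11, 28, 33, 54, 11]
```

In this naive form every window re-reads all kept positions, so the cost per window grows with the number of kept positions in the mask. A contiguous k-mer code can instead be updated from the previous window with one shift and one new base, which is why efficient hashing schemes for spaced k-mers matter in practice.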

Availability and implementation

The source code of our proposed tool, MaskedPanGenie, is openly available at https://github.com/hhaentze/MaskedPangenie.

SUBMITTER: Hantze H 

PROVIDER: S-EPMC10311327 | biostudies-literature | 2023 Jun

REPOSITORIES: biostudies-literature

Publications

Effects of spaced k-mers on alignment-free genotyping.

Häntze H, Horton P

Bioinformatics (Oxford, England), 2023 Jun 1; 39(Suppl 1)


<h4>Motivation</h4>Alignment-free, k-mer based genotyping methods are a fast alternative to alignment-based methods and are particularly well suited for genotyping larger cohorts. The sensitivity of algorithms, that work with k-mers, can be increased by using spaced seeds, however, the application of spaced seeds in k-mer based genotyping methods has not been researched yet.<h4>Results</h4>We add a spaced seeds functionality to the genotyping software PanGenie and use it to calculate genotypes.  ...[more]

Similar Datasets

| S-EPMC4080745 | biostudies-literature
| S-EPMC7382288 | biostudies-literature
| S-EPMC9531401 | biostudies-literature
| S-EPMC11755882 | biostudies-literature
| S-EPMC6330006 | biostudies-literature
| S-EPMC3799466 | biostudies-literature
| S-EPMC3610899 | biostudies-literature
| S-EPMC10594774 | biostudies-literature
| S-EPMC5793812 | biostudies-literature
| S-EPMC5994939 | biostudies-literature