Dataset Information

Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ.


ABSTRACT: Pairwise whole-genome homology mapping is the problem of finding all pairs of homologous intervals between a pair of genomes. As the number of available whole genomes has risen dramatically in the last few years, there has been a need for more scalable homology mappers. In this paper, we develop an algorithm (BubbZ) for computing whole-genome pairwise homology mappings, especially in the context of all-to-all comparison for multiple genomes. BubbZ is based on an algorithm for computing chains in compacted de Bruijn graphs. We evaluate BubbZ on simulated datasets, a dataset composed of 16 long mouse genomes, and a large dataset of 1,600 Salmonella genomes. We show a speed improvement of up to approximately an order of magnitude compared with MashMap2 and Minimap2, while retaining similar accuracy.

SUBMITTER: Minkin I

PROVIDER: S-EPMC7303978 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC3065709 | biostudies-literature
S-EPMC4118782 | biostudies-literature
S-EPMC4103496 | biostudies-literature
S-EPMC4524009 | biostudies-literature
S-EPMC3762879 | biostudies-literature
S-EPMC5470542 | biostudies-literature
S-EPMC9271119 | biostudies-literature
S-EPMC5066648 | biostudies-other
S-EPMC3819389 | biostudies-literature
S-EPMC4856081 | biostudies-literature