Dataset Information

SPUMONI 2: improved classification using a pangenome index of minimizer digests.


ABSTRACT: Genomics analyses use large reference sequence collections, like pangenomes or taxonomic databases. SPUMONI 2 is an efficient tool for sequence classification of both short and long reads. It performs multi-class classification using a novel sampled document array. By incorporating minimizers, SPUMONI 2's index is 65 times smaller than minimap2's for a mock community pangenome. SPUMONI 2 achieves a speed improvement of 3-fold compared to SPUMONI and 15-fold compared to minimap2. We show SPUMONI 2 achieves an advantageous mix of accuracy and efficiency in practical scenarios such as adaptive sampling, contamination detection and multi-class metagenomics classification.
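The abstract credits minimizers for the smaller index. As a rough illustration of the general idea only, the sketch below builds a minimizer digest of a sequence: it slides a window over consecutive k-mers, keeps the smallest k-mer in each window, and concatenates the picks. The function name `minimizer_digest`, the parameter values, and the use of plain lexicographic order are assumptions made for illustration; this is not SPUMONI 2's actual implementation or parameter choice.

```python
def minimizer_digest(seq: str, k: int = 4, w: int = 3) -> list[str]:
    """Return the minimizers of seq, one per distinct selected position.

    Hypothetical helper for illustration: k is the k-mer length and w is
    the number of consecutive k-mers per window.
    """
    # All k-mers of the sequence, in order (empty if seq is shorter than k).
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    digest = []
    last_pos = -1
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        # Leftmost smallest k-mer in the window; lexicographic order is an
        # assumption, real tools often use a hash-based ordering instead.
        offset = min(range(w), key=lambda j: window[j])
        pos = start + offset
        # Adjacent windows often pick the same position; record it once.
        if pos != last_pos:
            digest.append(kmers[pos])
            last_pos = pos
    return digest


if __name__ == "__main__":
    print(minimizer_digest("ACGTACGTAC"))
```

Because neighbouring windows frequently select the same k-mer, the digest holds fewer k-mers than the original sequence does, which is the intuition behind indexing a digest rather than the full text.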

SUBMITTER: Ahmed OY 

PROVIDER: S-EPMC10197461 | biostudies-literature | 2023 May

REPOSITORIES: biostudies-literature

Publications

SPUMONI 2: improved classification using a pangenome index of minimizer digests.

Omar Y Ahmed, Massimiliano Rossi, Travis Gagie, Christina Boucher, Ben Langmead

Genome Biology, 2023-05-18 (1)


Similar Datasets

| S-EPMC11522871 | biostudies-literature
| S-EPMC10462034 | biostudies-literature
| S-EPMC10538492 | biostudies-literature
| S-EPMC7343994 | biostudies-literature
| S-EPMC8760417 | biostudies-literature
| S-EPMC7320612 | biostudies-literature
| S-EPMC10791072 | biostudies-literature
| S-EPMC11465122 | biostudies-literature
| S-EPMC11696632 | biostudies-literature
2008-01-22 | GSE8595 | GEO