Dataset Information

Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2.


ABSTRACT: The de Bruijn graph is a key data structure in modern computational genomics, and construction of its compacted variant sits upstream of many genomic analyses. As the quantity of genomic data grows rapidly, this construction often becomes a computational bottleneck. We present Cuttlefish 2, which significantly advances the state of the art for this problem. On a commodity server, it reduces the graph construction time for 661K bacterial genomes, totaling 2.58 Tbp, from 4.5 days to 17-23 h. It also constructs the graph for 1.52 Tbp of white spruce reads in approximately 10 h, whereas the closest competitor requires 54-58 h and considerably more memory.

SUBMITTER: Khan J 

PROVIDER: S-EPMC9454175 | biostudies-literature | 2022 Sep

REPOSITORIES: biostudies-literature

Publications

Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2.

Jamshed Khan, Marek Kokot, Sebastian Deorowicz, Rob Patro

Genome Biology, 2022-09-08, issue 1


Similar Datasets

S-EPMC8275350 | biostudies-literature
S-EPMC7499882 | biostudies-literature
S-EPMC8025321 | biostudies-literature
S-EPMC5872255 | biostudies-literature
S-EPMC4908363 | biostudies-literature
S-EPMC9528980 | biostudies-literature
S-EPMC4120145 | biostudies-literature
S-EPMC6612864 | biostudies-other
S-EPMC8326735 | biostudies-literature
S-EPMC4253301 | biostudies-literature