Dataset Information

VariantStore: an index for large-scale genomic variant search.


ABSTRACT: Efficiently scaling genomic variant search indexes to thousands of samples is computationally challenging because multiple coordinate systems must be supported to avoid reference bias. We present VariantStore, a system that indexes genomic variants from multiple samples using a variation graph and enables variant queries across any sample-specific coordinate system. We show the scalability of VariantStore by indexing genomic variants from the TCGA project in 4 h and from the 1000 Genomes project in 3 h. Querying for variants in a gene takes between 0.002 and 3 s, using memory equal to only 10% of the size of the full representation.

SUBMITTER: Pandey P

PROVIDER: S-EPMC8375130 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC6546127 | biostudies-literature
| S-EPMC4271471 | biostudies-literature
| S-EPMC5527877 | biostudies-other
| S-EPMC5795011 | biostudies-literature
| S-EPMC5872823 | biostudies-literature
| S-EPMC2736654 | biostudies-literature
| S-EPMC1361710 | biostudies-literature
| S-EPMC8469100 | biostudies-literature
| S-EPMC2986175 | biostudies-literature
| S-EPMC3294368 | biostudies-literature