Dataset Information

GABAC: an arithmetic coding solution for genomic data.


ABSTRACT: MOTIVATION: In an effort to provide a response to the ever-expanding generation of genomic data, the International Organization for Standardization (ISO) is designing a new solution for the representation, compression and management of genomic sequencing data: the Moving Picture Experts Group (MPEG)-G standard. This paper discusses the first implementation of an MPEG-G compliant entropy codec: GABAC. GABAC combines proven coding technologies, such as context-adaptive binary arithmetic coding, binarization schemes and transformations, into a straightforward solution for the compression of sequencing data. RESULTS: We demonstrate that GABAC outperforms well-established (entropy) codecs in a significant set of cases and thus can serve as an extension for existing genomic compression solutions, such as CRAM. AVAILABILITY AND IMPLEMENTATION: The GABAC library is written in C++. We also provide a command line application which exercises all features provided by the library. GABAC can be downloaded from https://github.com/mitogen/gabac. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

SUBMITTER: Voges J 

PROVIDER: S-EPMC7141842 | biostudies-literature | 2020 Apr

REPOSITORIES: biostudies-literature

Publications

GABAC: an arithmetic coding solution for genomic data.

Jan Voges, Tom Paridaens, Fabian Müntefering, Liudmila S. Mainzer, Brian Bliss, Mingyu Yang, Idoia Ochoa, Jan Fostier, Jörn Ostermann, Mikel Hernaez

Bioinformatics (Oxford, England), 2020 Apr 1, issue 7



Similar Datasets

| S-EPMC2792350 | biostudies-other
| S-EPMC5131820 | biostudies-literature
| S-EPMC2683180 | biostudies-literature
| S-EPMC3817272 | biostudies-literature
| PRJEB26324 | ENA
| S-EPMC6558229 | biostudies-literature
| S-EPMC4750293 | biostudies-literature
| S-EPMC2553244 | biostudies-literature
| S-EPMC7002335 | biostudies-literature
| S-EPMC5874180 | biostudies-literature