Dataset Information

Data compression for sequencing data.


ABSTRACT: Post-Sanger sequencing methods produce tons of data, and there is a general agreement that the challenge to store and process them must be addressed with data compression. In this review we first answer the question "why compression" in a quantitative manner. Then we also answer the questions "what" and "how", by sketching the fundamental compression ideas, describing the main sequencing data types and formats, and comparing the specialized compression algorithms and tools. Finally, we go back to the question "why compression" and give other, perhaps surprising answers, demonstrating the pervasiveness of data compression techniques in computational biology.

SUBMITTER: Deorowicz S 

PROVIDER: S-EPMC3868316 | biostudies-literature | 2013 Nov

REPOSITORIES: biostudies-literature

Publications

Data compression for sequencing data.

Deorowicz Sebastian, Grabowski Szymon

Algorithms for molecular biology : AMB, 2013-11-18, 1


Similar Datasets

S-EPMC3832420 | biostudies-literature
S-EPMC3606433 | biostudies-literature
S-EPMC6969201 | biostudies-literature
S-EPMC5946873 | biostudies-literature
S-EPMC4547610 | biostudies-literature
S-EPMC3592443 | biostudies-literature
S-EPMC3083090 | biostudies-literature
S-EPMC4570262 | biostudies-literature
S-EPMC6547476 | biostudies-literature
S-EPMC2705231 | biostudies-literature