ABSTRACT:
SUBMITTER: Pritt J
PROVIDER: S-EPMC5027496 | biostudies-literature | 2016 Sep
REPOSITORIES: biostudies-literature
Nucleic Acids Research, 2016 Jun 13, issue 16
We describe Boiler, a new software tool for compressing and querying large collections of RNA-seq alignments. Boiler discards most per-read data, keeping only a genomic coverage vector plus a few empirical distributions summarizing the alignments. Because most per-read data is discarded, the storage footprint is often much smaller than that achieved by other compression tools. Despite this, the most relevant per-read data can be recovered; we show that Boiler compression has only a slight negative imp ...
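
To make the idea of "a coverage vector plus a few empirical distributions" concrete, here is a minimal Python sketch. It is not Boiler's actual code, file format, or API: the function names (`coverage_vector`, `run_length_encode`, `length_distribution`), the half-open `(start, end)` interval representation, and the toy data are all assumptions made for illustration. It only shows how a set of alignments can be reduced to a per-base coverage vector and a read-length distribution, which is the kind of summary the abstract describes.

```python
from collections import Counter
from itertools import groupby


def coverage_vector(alignments, genome_length):
    """Per-base coverage from half-open (start, end) alignment intervals."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in alignments:
        diff[start] += 1
        diff[end] -= 1
    # A running sum of the difference array gives the depth at each position.
    coverage, depth = [], 0
    for delta in diff[:genome_length]:
        depth += delta
        coverage.append(depth)
    return coverage


def run_length_encode(values):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def length_distribution(alignments):
    """Empirical distribution of read lengths (hypothetical summary)."""
    return Counter(end - start for start, end in alignments)


# Toy example: four reads on a 10-base reference (illustrative data only).
alignments = [(0, 4), (2, 6), (2, 5), (8, 10)]
coverage = coverage_vector(alignments, 10)
encoded = run_length_encode(coverage)
lengths = length_distribution(alignments)
```

In this sketch the individual read boundaries are no longer stored once `coverage` and `lengths` are kept in place of `alignments`, which illustrates why such a summary can be much smaller than the per-read data and why reads can only be approximately reconstructed from it.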