ABSTRACT:
SUBMITTER: Jones DC
PROVIDER: S-EPMC3526293 | biostudies-literature | 2012 Dec
REPOSITORIES: biostudies-literature
Jones DC, Ruzzo WL, Peng X, Katze MG
Nucleic Acids Research, 2012 Aug 16, issue 22
We present Quip, a lossless compression algorithm for next-generation sequencing data in the FASTQ and SAM/BAM formats. In addition to implementing reference-based compression, we have developed, to our knowledge, the first assembly-based compressor, using a novel de novo assembly algorithm. A probabilistic data structure is used to dramatically reduce the memory required by traditional de Bruijn graph assemblers, allowing millions of reads to be assembled very efficiently. Re ...
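The excerpt above does not name the probabilistic data structure Quip uses. As an illustration only, the sketch below assumes a plain Bloom filter, a common choice for storing k-mer sets compactly, and shows how de Bruijn graph edges can be queried without storing the graph explicitly. The class `KmerBloomFilter`, the helpers `kmers` and `successors`, and all parameter values are hypothetical and are not taken from Quip.

```python
import hashlib


class KmerBloomFilter:
    """Illustrative Bloom filter over k-mers (not Quip's actual structure)."""

    def __init__(self, num_bits=1 << 20, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer):
        # Derive several bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(kmer.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, kmer):
        # May return a false positive, but never a false negative.
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(kmer)
        )


def kmers(read, k):
    """Yield every length-k substring of a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


def successors(kmer, bloom):
    """Implicit de Bruijn edges: try each base and keep k-mers the filter reports."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bloom]


if __name__ == "__main__":
    bloom = KmerBloomFilter()
    for read in ["ACGTACGGA", "CGTACGGAT"]:
        for km in kmers(read, 4):
            bloom.add(km)
    print(successors("ACGT", bloom))
```

The memory saving comes from storing only a fixed-size bit array instead of every k-mer; the cost is a small false-positive rate, so a traversal may occasionally follow an edge that no read supports.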