
Dataset Information


SEAL: a distributed short read mapping and duplicate removal tool.


ABSTRACT: SUMMARY: SEAL is a scalable tool for short read pair mapping and duplicate removal. It computes mappings that are consistent with those produced by BWA and removes duplicates according to the same criteria employed by Picard MarkDuplicates. On a 16-node Hadoop cluster, it is capable of processing about 13 GB per hour in map+rmdup mode, while reaching a throughput of 19 GB per hour in mapping-only mode. AVAILABILITY: SEAL is available online at http://biodoop-seal.sourceforge.net/.
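The duplicate-removal step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly described idea behind Picard MarkDuplicates: read pairs whose two ends share the same reference, 5' position and strand are treated as duplicates, and the pair with the highest base-quality sum is kept. The `ReadPair` type, its fields and the function names are hypothetical and are not taken from SEAL or Picard.

```python
from collections import defaultdict
from typing import NamedTuple


class ReadPair(NamedTuple):
    """Hypothetical, simplified record for one mapped read pair."""
    name: str
    chrom1: str
    pos1: int         # 5' coordinate of the first read
    reverse1: bool    # True if the first read maps to the reverse strand
    chrom2: str
    pos2: int         # 5' coordinate of the second read
    reverse2: bool
    quality_sum: int  # assumed precomputed sum of base qualities


def duplicate_key(pair):
    """Build an order-independent key from the two ends of a pair."""
    end1 = (pair.chrom1, pair.pos1, pair.reverse1)
    end2 = (pair.chrom2, pair.pos2, pair.reverse2)
    # Sorting makes the key identical whichever end is listed first.
    return tuple(sorted([end1, end2]))


def remove_duplicates(pairs):
    """Keep one pair per key: the one with the highest quality sum."""
    groups = defaultdict(list)
    for pair in pairs:
        groups[duplicate_key(pair)].append(pair)
    return [max(group, key=lambda p: p.quality_sum) for group in groups.values()]


pairs = [
    ReadPair("a", "chr1", 100, False, "chr1", 300, True, 50),
    ReadPair("b", "chr1", 100, False, "chr1", 300, True, 70),
    ReadPair("c", "chr1", 300, True, "chr1", 100, False, 60),
    ReadPair("d", "chr2", 10, False, "chr2", 200, True, 40),
]
kept_names = [p.name for p in remove_duplicates(pairs)]
```

In this toy input, pairs a, b and c cover the same two ends, so only the highest-scoring one survives alongside the unrelated pair d. A distributed tool would have to perform this grouping across cluster nodes; the sketch only shows the single-machine logic.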

SUBMITTER: Pireddu L 

PROVIDER: S-EPMC3137215 | biostudies-literature | 2011 Aug

REPOSITORIES: biostudies-literature


Publications

SEAL: a distributed short read mapping and duplicate removal tool.

Luca Pireddu, Simone Leo, Gianluigi Zanetti

Bioinformatics (Oxford, England), 2011-06-22, issue 15



Similar Datasets

| S-EPMC5425171 | biostudies-other
| S-EPMC3035802 | biostudies-other
| S-EPMC5657049 | biostudies-literature
| S-EPMC5846869 | biostudies-other
| S-EPMC4009243 | biostudies-literature
| S-EPMC4991843 | biostudies-literature
| S-EPMC7320608 | biostudies-literature
| S-EPMC4652484 | biostudies-literature
| S-EPMC4147885 | biostudies-literature
| S-EPMC5845352 | biostudies-literature