Dataset Information

PaCBAM: fast and scalable processing of whole exome and targeted sequencing data.


ABSTRACT: BACKGROUND: Interrogation of whole exome and targeted sequencing NGS data is rapidly becoming a preferred approach for the exploration of large cohorts in the research setting and importantly in the context of precision medicine. Single-base and genomic region level data retrieval and processing still constitute major bottlenecks in NGS data analysis. Fast and scalable tools are hence needed. RESULTS: PaCBAM is a command line tool written in C and designed for the characterization of genomic regions and single nucleotide positions from whole exome and targeted sequencing data. PaCBAM computes depth of coverage and allele-specific pileup statistics, implements a fast and scalable multi-core computational engine, introduces an innovative and efficient on-the-fly read duplicates filtering strategy and provides comprehensive text output files and visual reports. We demonstrate that PaCBAM exploits parallel computation resources better than existing tools, resulting in important reductions of processing time and memory usage, hence enabling an efficient and fast exploration of large datasets. CONCLUSIONS: PaCBAM is a fast and scalable tool designed to process genomic regions from NGS data files and generate coverage and pileup comprehensive statistics for downstream analysis. The tool can be easily integrated in NGS processing pipelines and is available from Bitbucket and Docker/Singularity hubs.
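To make the terms "depth of coverage" and "allele-specific pileup" concrete, the following is a minimal Python sketch on toy data. It is not PaCBAM's code or algorithm: the function names `pileup` and `depth`, the `(start, sequence)` read representation, and the duplicate rule (same start and same length) are all assumptions made for illustration, and real tools work on BAM alignments with gaps, strands, and quality filters.

```python
from collections import Counter


def pileup(reads, region_start, region_end, dedup=True):
    """Count bases per position over the half-open region [region_start, region_end).

    reads: iterable of (start, sequence) tuples with 0-based starts and
    ungapped alignments (a simplifying assumption of this sketch).
    """
    counts = [Counter() for _ in range(region_end - region_start)]
    seen = set()  # alignment keys met so far, used to drop duplicates on the fly
    for start, seq in reads:
        key = (start, len(seq))  # hypothetical duplicate definition
        if dedup:
            if key in seen:
                continue  # skip a read with the same start and length
            seen.add(key)
        for offset, base in enumerate(seq):
            pos = start + offset
            if region_start <= pos < region_end:
                counts[pos - region_start][base] += 1
    return counts


def depth(counts):
    """Depth of coverage is the total number of bases counted at each position."""
    return [sum(c.values()) for c in counts]


# Toy reads; the second read duplicates the first.
reads = [(0, "ACGT"), (0, "ACGT"), (2, "GTTA"), (1, "CGAT")]
filtered = pileup(reads, 0, 6)
unfiltered = pileup(reads, 0, 6, dedup=False)
print(depth(filtered))
print(depth(unfiltered))
print(dict(filtered[3]))  # allele counts at position 3
```

Filtering duplicates while reads are being counted, rather than in a separate pass over the file, is the general idea behind "on-the-fly" filtering; how PaCBAM actually identifies duplicates is not described in this record.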

SUBMITTER: Valentini S 

PROVIDER: S-EPMC6933905 | biostudies-literature | 2019 Dec

REPOSITORIES: biostudies-literature

Publications

PaCBAM: fast and scalable processing of whole exome and targeted sequencing data.

Samuel Valentini, Tarcisio Fedrizzi, Francesca Demichelis, Alessandro Romanel

BMC Genomics, 2019 Dec 26, issue 1

Similar Datasets

S-EPMC7783507 | biostudies-literature
S-EPMC4253833 | biostudies-other
S-EPMC4929867 | biostudies-other
S-EPMC5549930 | biostudies-other
S-EPMC5818140 | biostudies-literature
S-EPMC5673918 | biostudies-literature
S-EPMC3362899 | biostudies-other
S-EPMC4053953 | biostudies-literature
S-EPMC7359584 | biostudies-literature
S-EPMC4528333 | biostudies-literature