
Dataset Information


SIMPLEX: cloud-enabled pipeline for the comprehensive analysis of exome sequencing data.


ABSTRACT: In recent studies, exome sequencing has proven to be a successful screening tool for the identification of candidate genes causing rare genetic diseases. Although underlying targeted sequencing methods are well established, necessary data handling and focused, structured analysis still remain demanding tasks. Here, we present a cloud-enabled autonomous analysis pipeline, which comprises the complete exome analysis workflow. The pipeline combines several in-house developed and published applications to perform the following steps: (a) initial quality control, (b) intelligent data filtering and pre-processing, (c) sequence alignment to a reference genome, (d) SNP and DIP detection, (e) functional annotation of variants using different approaches, and (f) detailed report generation during various stages of the workflow. The pipeline connects the selected analysis steps, exposes all available parameters for customized usage, performs required data handling, and distributes computationally expensive tasks either on a dedicated high-performance computing infrastructure or on the Amazon cloud environment (EC2). The presented application has already been used in several research projects including studies to elucidate the role of rare genetic diseases. The pipeline is continuously tested and is publicly available under the GPL as a VirtualBox or Cloud image at http://simplex.i-med.ac.at; additional supplementary data is provided at http://www.icbi.at/exome.
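The workflow in the abstract, steps (a) through (f), can be pictured as an ordered chain of stages that each take the current analysis state and return an updated one. The following is only a minimal sketch of that idea, not SIMPLEX's actual code: every name in it (`make_stage`, `run_pipeline`, `STAGE_NAMES`, the sample label) is hypothetical, the stages are placeholders that merely record that they ran, and report generation is simplified to a single final stage even though the abstract states that reports are produced at various points of the workflow.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only; not part of SIMPLEX.
State = Dict[str, object]
Stage = Tuple[str, Callable[[State], State]]

# Stage names mirror steps (a)-(f) of the abstract.
STAGE_NAMES = [
    "quality_control",
    "filtering_and_preprocessing",
    "alignment",
    "variant_detection",
    "functional_annotation",
    "report_generation",
]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def run(state: State) -> State:
        completed = list(state.get("completed", []))
        completed.append(name)
        # Return a new state instead of mutating the input.
        return {**state, "completed": completed}
    return (name, run)


def run_pipeline(sample: str, stages: List[Stage]) -> State:
    """Run the stages in order, passing the state from one to the next."""
    state: State = {"sample": sample, "completed": []}
    for _name, run in stages:
        state = run(state)
    return state


STAGES = [make_stage(n) for n in STAGE_NAMES]
result = run_pipeline("sample_01", STAGES)
```

In a real pipeline each placeholder would wrap an external tool, and the runner would decide whether to dispatch the stage to a local cluster or to cloud instances; the sketch leaves both out.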

SUBMITTER: Fischer M 

PROVIDER: S-EPMC3411592 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature


Publications

SIMPLEX: cloud-enabled pipeline for the comprehensive analysis of exome sequencing data.

Maria Fischer, Rene Snajder, Stephan Pabinger, Andreas Dander, Anna Schossig, Johannes Zschocke, Zlatko Trajanoski, Gernot Stocker

PLoS ONE, 2012-08-01, issue 8



Similar Datasets

| S-EPMC5408420 | biostudies-literature
| S-EPMC4585643 | biostudies-literature
| S-EPMC4070549 | biostudies-literature
| S-EPMC3738164 | biostudies-literature
| S-EPMC5753366 | biostudies-literature
| S-EPMC4376134 | biostudies-literature
| S-EPMC8294687 | biostudies-literature
| S-EPMC8323418 | biostudies-literature
| S-EPMC4103589 | biostudies-literature
| S-EPMC3253117 | biostudies-literature