Dataset Information

Initial steps towards a production platform for DNA sequence analysis on the grid.


ABSTRACT:

Background

Bioinformatics is confronted with a new data explosion due to the availability of high-throughput DNA sequencers. Data storage and analysis become a problem on local servers, making it necessary to switch to other IT infrastructures. Grid and workflow technology can help to handle the data more efficiently and to facilitate collaboration. However, interfaces to grids are often unfriendly to novice users.

Results

In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available to members of the Dutch Life Science Grid and the VL-e Medical virtual organizations. All components are open source and can be ported to other grid infrastructures.

Conclusions

The availability of in-house expertise and tools facilitates the use of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to the capacity and collaboration issues raised by the deployment of next-generation sequencers. We currently use this methodology on a daily basis for DNA sequencing and other applications. More information and source code are available via http://www.bioinformaticslaboratory.nl/

SUBMITTER: Luyf AC 

PROVIDER: S-EPMC3018473 | biostudies-literature | 2010 Dec

REPOSITORIES: biostudies-literature

Publications

Initial steps towards a production platform for DNA sequence analysis on the grid.

Angela C M Luyf, Barbera D C van Schaik, Michel de Vries, Frank Baas, Antoine H C van Kampen, Silvia D Olabarriaga

BMC Bioinformatics, 2010-12-14

Similar Datasets

| S-EPMC9610979 | biostudies-literature
| S-EPMC4589982 | biostudies-literature
| S-EPMC7146564 | biostudies-literature
| S-EPMC9642037 | biostudies-literature
| S-EPMC3782456 | biostudies-other
| 2416973 | ecrin-mdr-crc
| S-EPMC6304159 | biostudies-literature
| S-EPMC4191396 | biostudies-literature
| S-EPMC1919486 | biostudies-literature
| S-EPMC3409857 | biostudies-literature