Dataset Information

High-throughput bioinformatics with the Cyrille2 pipeline system.


ABSTRACT:

Background

Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible.

Results

We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1) a web-based graphical user interface (GUI) that enables a pipeline operator to manage the system; 2) the Scheduler, which forms the functional core of the system, tracks what data enter the system and determines which jobs must be scheduled for execution; and 3) the Executor, which searches for scheduled jobs and executes them on a compute cluster.
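The division of labour between the Scheduler and the Executor can be illustrated with a minimal sketch. This is not the Cyrille2 implementation: the class names Node, Scheduler and Executor, the in-memory job queue, and the toy sequence-processing steps are all assumptions made for illustration, and jobs run in-process here rather than on a compute cluster.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class Node:
    """One pipeline step: a tool plus the names of the steps it depends on (hypothetical)."""
    name: str
    tool: Callable[[List[str]], str]
    depends_on: List[str] = field(default_factory=list)


class Scheduler:
    """Tracks which data are available and decides which jobs are ready to run."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes: Dict[str, Node] = {n.name: n for n in nodes}
        self.results: Dict[str, str] = {}   # step name -> output data
        self.queued: Set[str] = set()       # steps waiting in the job queue
        self.job_queue: deque = deque()

    def schedule(self) -> int:
        """Queue every step whose inputs all exist; return how many were queued."""
        added = 0
        for node in self.nodes.values():
            if node.name in self.results or node.name in self.queued:
                continue
            if all(dep in self.results for dep in node.depends_on):
                self.job_queue.append(node.name)
                self.queued.add(node.name)
                added += 1
        return added


class Executor:
    """Picks up scheduled jobs and runs them (in-process here, not on a cluster)."""

    def __init__(self, scheduler: Scheduler) -> None:
        self.scheduler = scheduler

    def run_pending(self) -> List[str]:
        executed = []
        s = self.scheduler
        while s.job_queue:
            name = s.job_queue.popleft()
            node = s.nodes[name]
            inputs = [s.results[dep] for dep in node.depends_on]
            s.results[name] = node.tool(inputs)
            s.queued.discard(name)
            executed.append(name)
        return executed


def run_pipeline(nodes: List[Node]) -> Tuple[List[str], Dict[str, str]]:
    """Alternate scheduling and execution until no new jobs can be scheduled."""
    scheduler = Scheduler(nodes)
    executor = Executor(scheduler)
    order: List[str] = []
    while scheduler.schedule():
        order.extend(executor.run_pending())
    return order, scheduler.results


def demo_pipeline() -> List[Node]:
    """A toy three-step chain: load a sequence, strip N bases, count what remains."""
    return [
        Node("load", lambda _: "ACGTNNACGT"),
        Node("mask", lambda ins: ins[0].replace("N", ""), ["load"]),
        Node("count", lambda ins: str(len(ins[0])), ["mask"]),
    ]


if __name__ == "__main__":
    order, results = run_pipeline(demo_pipeline())
    print(order, results["count"])
```

In this sketch the Scheduler never runs a tool and the Executor never decides what is runnable. That separation is the point of the example: either part could be replaced (for instance, by an Executor that submits to a cluster queue) without changing the other.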

Conclusion

The Cyrille2 system is an extensible, modular system that implements the stated requirements. Cyrille2 enables easy creation and execution of flexible, high-throughput bioinformatics pipelines.

SUBMITTER: Fiers MW 

PROVIDER: S-EPMC2268656 | biostudies-literature | 2008 Feb

REPOSITORIES: biostudies-literature

Publications

High-throughput bioinformatics with the Cyrille2 pipeline system.

Fiers, Mark W E J; van der Burgt, Ate; Datema, Erwin; de Groot, Joost C W; van Ham, Roeland C H J

BMC Bioinformatics, 2008-02-12


Similar Datasets

Date | Accession | Repository
- | S-EPMC1626092 | biostudies-literature
2020-06-12 | PXD017450 | Pride
- | S-EPMC6532147 | biostudies-literature
2020-07-31 | MSV000085861 | MassIVE
- | S-ECPF-GEOD-40617 | biostudies-other
2018-07-26 | PXD010510 | Pride
- | S-EPMC6793853 | biostudies-literature
- | S-EPMC4359755 | biostudies-literature
- | S-EPMC8636496 | biostudies-literature
2013-12-11 | E-GEOD-40617 | biostudies-arrayexpress