Dataset Information

Towards agile large-scale predictive modelling in drug discovery with flow-based programming design principles.


ABSTRACT: Predictive modelling in drug discovery is challenging to automate as it often contains multiple analysis steps and might involve cross-validation and parameter tuning that create complex dependencies between tasks. With large-scale data or when using computationally demanding modelling methods, e-infrastructures such as high-performance or cloud computing are required, adding to the existing challenges of fault-tolerant automation. Workflow management systems can aid in many of these challenges, but the currently available systems are lacking in the functionality needed to enable agile and flexible predictive modelling. We here present an approach inspired by elements of the flow-based programming paradigm, implemented as an extension of the Luigi system which we name SciLuigi. We also discuss the experiences from using the approach when modelling a large set of biochemical interactions using a shared computer cluster.

SUBMITTER: Lampa S 

PROVIDER: S-EPMC5123367 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature

Publications

Towards agile large-scale predictive modelling in drug discovery with flow-based programming design principles.

Samuel Lampa, Jonathan Alvarsson, Ola Spjuth

Journal of Cheminformatics, 2016-11-24


Similar Datasets

| S-EPMC2442223 | biostudies-literature
| S-EPMC4590513 | biostudies-literature
2011-12-04 | E-GEOD-32587 | biostudies-arrayexpress
2011-12-04 | GSE32587 | GEO
| S-EPMC7779114 | biostudies-literature
| S-EPMC6870376 | biostudies-literature
2021-07-22 | PXD019583 | Pride
| S-EPMC3175644 | biostudies-literature
| S-EPMC8475644 | biostudies-literature
2021-05-05 | GSE155490 | GEO