Dataset Information

Performance studies on distributed virtual screening.


ABSTRACT: Virtual high-throughput screening (vHTS) is an invaluable method in modern drug discovery. It permits screening large datasets or databases of chemical structures for those that may bind to a drug target. Virtual screening is typically performed by docking code, which often runs sequentially. Processing of huge vHTS datasets can be parallelized by chunking the data because individual docking runs are independent of each other. The goal of this work is to find an optimal split that maximizes the speedup while accounting for overhead and the cores available on Distributed Computing Infrastructures (DCIs). We have conducted thorough performance studies accounting not only for the runtime of the docking itself, but also for structure preparation. Performance studies were conducted via the workflow-enabled science gateway MoSGrid (Molecular Simulation Grid). As input we used benchmark datasets for protein kinases. Our performance studies show that docking workflows can be made to scale almost linearly up to 500 concurrent processes distributed even over large DCIs, thus accelerating vHTS campaigns significantly.
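
The chunking idea in the abstract can be illustrated with a minimal Python sketch. It is not the MoSGrid workflow or the authors' method: the function names, the contiguous near-equal split, and the toy cost model (a fixed per-chunk overhead, with chunks processed in waves across the available cores) are all assumptions made here for illustration.

```python
from math import ceil


def split_into_chunks(ligands, n_chunks):
    """Split a list into contiguous, near-equal chunks (hypothetical helper).

    At most n_chunks chunks are returned; chunk sizes differ by at most one.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    # Never create more chunks than there are items (but always at least one).
    n_chunks = min(n_chunks, len(ligands)) or 1
    size, extra = divmod(len(ligands), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(ligands[start:end])
        start = end
    return chunks


def estimated_speedup(n_ligands, n_chunks, cores, t_dock, t_overhead):
    """Toy speedup model (an assumption, not taken from the paper).

    Each chunk pays a fixed overhead t_overhead plus t_dock per ligand,
    and chunks run in ceil(n_chunks / cores) waves.
    """
    serial = n_ligands * t_dock
    chunk_size = ceil(n_ligands / n_chunks)
    waves = ceil(n_chunks / cores)
    parallel = waves * (t_overhead + chunk_size * t_dock)
    return serial / parallel
```

Under this toy model, raising the number of chunks shortens each chunk but adds more per-chunk overhead, and once there are more chunks than cores they queue in waves. That is the kind of trade-off an optimal split has to balance.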

SUBMITTER: Kruger J 

PROVIDER: S-EPMC4083208 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

Performance studies on distributed virtual screening.

Jens Krüger, Richard Grunzke, Sonja Herres-Pawlis, Alexander Hoffmann, Luis de la Garza, Oliver Kohlbacher, Wolfgang E. Nagel, Sandra Gesing

BioMed Research International, 2014-06-17


Similar Datasets

| S-EPMC4749411 | biostudies-literature
| S-EPMC4425459 | biostudies-other
| S-EPMC5323360 | biostudies-literature
| S-EPMC5404222 | biostudies-literature
| S-EPMC5942591 | biostudies-literature
| S-EPMC5818911 | biostudies-literature
| S-EPMC4084531 | biostudies-literature
| S-EPMC5872818 | biostudies-literature
| S-EPMC4265523 | biostudies-literature
| S-EPMC3929308 | biostudies-literature