Dataset Information

Integration of the Rosetta suite with the python software stack via reproducible packaging and core programming interfaces for distributed simulation.

ABSTRACT: The Rosetta software suite for macromolecular modeling is a powerful computational toolbox for protein design, structure prediction, and protein structure analysis. The development of novel Rosetta-based scientific tools requires two orthogonal skill sets: deep domain-specific expertise in protein biochemistry and technical expertise in development, deployment, and analysis of molecular simulations. Furthermore, the computational demands of molecular simulation necessitate large-scale cluster-based or distributed solutions for nearly all scientifically relevant tasks. To reduce the technical barriers to entry for new development, we integrated Rosetta with modern, widely adopted computational infrastructure. This allows simplified deployment in large-scale cluster and cloud computing environments, and effective reuse of common libraries for simulation execution and data analysis. To achieve this, we integrated Rosetta with the Conda package manager; this simplifies installation into existing computational environments and packaging as Docker images for cloud deployment. Then, we developed programming interfaces to integrate Rosetta with the PyData stack for analysis and distributed computing, including the popular tools Jupyter, Pandas, and Dask. We demonstrate the utility of these components by generating a library of a thousand de novo disulfide-rich miniproteins in a hybrid simulation that included cluster-based design and interactive notebook-based analyses. Our new tools enable users who would otherwise lack access to the necessary computational infrastructure to perform state-of-the-art molecular simulation and design with Rosetta.
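
The workflow the abstract describes (fan independent design tasks out to workers, then gather the results into a table for interactive analysis) can be illustrated with a minimal sketch. This is not the authors' code and makes no Rosetta calls: `toy_design_task`, `run_tasks`, and `top_records` are hypothetical names, the "score" is a toy cysteine count, and the standard-library `concurrent.futures` thread pool stands in for a Dask cluster so the example runs anywhere.

```python
from concurrent.futures import ThreadPoolExecutor
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_design_task(seed):
    """Hypothetical stand-in for one design trajectory (not a Rosetta call)."""
    rng = random.Random(seed)  # per-task seed keeps each task reproducible
    sequence = "".join(rng.choice(AMINO_ACIDS) for _ in range(30))
    # Toy metric: number of cysteines, loosely echoing a disulfide-rich target.
    return {"seed": seed, "sequence": sequence, "n_cys": sequence.count("C")}


def run_tasks(seeds, max_workers=4):
    """Run independent tasks on a worker pool; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(toy_design_task, seeds))


def top_records(records, key, n):
    """Return the n records with the largest value for `key`."""
    return sorted(records, key=lambda r: r[key], reverse=True)[:n]


records = run_tasks(range(100))
best = top_records(records, "n_cys", 5)
```

In a notebook, a list of dictionaries like `records` could be loaded into a Pandas DataFrame for filtering and plotting. On a cluster, the thread pool would be replaced by a distributed scheduler such as Dask, which follows the same submit-and-gather pattern.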

SUBMITTER: Ford AS

PROVIDER: S-EPMC6933847 | biostudies-literature | 2020 Jan

REPOSITORIES: biostudies-literature

Publications

Integration of the Rosetta suite with the python software stack via reproducible packaging and core programming interfaces for distributed simulation.

Ford AS, Weitzner BD, Bahl CD

Protein Science: a publication of the Protein Society, 2019-12-02, issue 1

PMID: 31495995

Similar Datasets

Programming biological models in Python using PySB.
| S-EPMC3588907 | biostudies-other

RosettaScripts: a scripting language interface to the Rosetta macromolecular modeling suite.
| S-EPMC3123292 | biostudies-literature

Metagenome analysis using the Kraken software suite.
| S-EPMC9725748 | biostudies-literature

Practically useful: what the Rosetta protein modeling suite can do for you.
| S-EPMC2850155 | biostudies-literature

Polyply; a python suite for facilitating simulations of macromolecules and nanomaterials.
| S-EPMC8748707 | biostudies-literature

Improvements to the APBS biomolecular solvation software suite.
| S-EPMC5734301 | biostudies-literature

The CCP4 suite: integrative software for macromolecular crystallography.
| S-EPMC10233625 | biostudies-literature

Language-Agnostic Reproducible Data Analysis Using Literate Programming.
| S-EPMC5053501 | biostudies-literature

The Rustenburg Layered Suite formed as a stack of mush with transient magma chambers.
| S-EPMC7820422 | biostudies-literature

A comprehensive software suite for the analysis of cDNAs.
| S-EPMC5172547 | biostudies-literature