Dataset Information

Trajectory-based training enables protein simulations with accurate folding and Boltzmann ensembles in cpu-hours.

ABSTRACT: An ongoing challenge in protein chemistry is to identify the underlying interaction energies that capture protein dynamics. The traditional trade-off in biomolecular simulation between accuracy and computational efficiency is predicated on the assumption that detailed force fields are typically well-parameterized, obtaining a significant fraction of possible accuracy. We re-examine this trade-off in the more realistic regime in which parameterization is a greater source of error than the level of detail in the force field. To address parameterization of coarse-grained force fields, we use the contrastive divergence technique from machine learning to train from simulations of 450 proteins. In our procedure, the computational efficiency of the model enables high accuracy through the precise tuning of the Boltzmann ensemble. This method is applied to our recently developed Upside model, where the free energy for side chains is rapidly calculated at every time-step, allowing for a smooth energy landscape without steric rattling of the side chains. After this contrastive divergence training, the model is able to de novo fold proteins up to 100 residues on a single core in days. This improved Upside model provides a starting point both for investigation of folding dynamics and as an inexpensive Bayesian prior for protein physics that can be integrated with additional experimental or bioinformatic data.
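The contrastive divergence idea named in the abstract can be illustrated with a minimal sketch on a toy problem. This is not the Upside training procedure. It assumes a hypothetical one-dimensional harmonic energy E(x) = 0.5 * k * x^2 with a single parameter k, overdamped Langevin dynamics at kT = 1, and made-up function names (`short_langevin`, `train_cd`). Short simulations are started from the data points, and k is moved so that the simulated ensemble stops drifting away from the data.

```python
import random


def short_langevin(x, k, steps, dt, rng):
    """Run a few overdamped Langevin steps (kT = 1) in E(x) = 0.5 * k * x**2."""
    for _ in range(steps):
        # Force is -dE/dx = -k * x; the noise amplitude sqrt(2 * dt) sets kT = 1.
        x = x - dt * k * x + (2.0 * dt) ** 0.5 * rng.gauss(0.0, 1.0)
    return x


def train_cd(data, k_init, epochs, lr, steps, dt, seed=0):
    """Toy contrastive divergence: fit the stiffness k to samples in `data`."""
    rng = random.Random(seed)
    k = k_init
    # dE/dk = 0.5 * x**2, averaged over the data (fixed across epochs).
    data_term = sum(0.5 * x * x for x in data) / len(data)
    for _ in range(epochs):
        # Short simulations started from the data points (the "contrastive" part).
        samples = [short_langevin(x, k, steps, dt, rng) for x in data]
        model_term = sum(0.5 * x * x for x in samples) / len(samples)
        # Approximate log-likelihood gradient: <dE/dk>_model - <dE/dk>_data.
        k += lr * (model_term - data_term)
    return k


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic "data": Boltzmann samples for a true stiffness of 4 (std = 0.5).
    data = [rng.gauss(0.0, 0.5) for _ in range(2000)]
    k_fit = train_cd(data, k_init=1.0, epochs=60, lr=10.0, steps=10, dt=0.01, seed=2)
    print("fitted stiffness:", k_fit)
```

In this sketch the fitted stiffness should settle near the value used to generate the data, up to sampling noise and a small bias from the finite time step. A real coarse-grained force field has many parameters and far more expensive simulations, but the update has the same form: the difference between energy derivatives averaged over simulated and reference structures.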

SUBMITTER: Jumper JM

PROVIDER: S-EPMC6307714 | biostudies-literature | 2018 Dec

REPOSITORIES: biostudies-literature

Publications

Trajectory-based training enables protein simulations with accurate folding and Boltzmann ensembles in cpu-hours.

John M. Jumper, Nabil F. Faruk, Karl F. Freed, Tobin R. Sosnick

PLoS Computational Biology, 2018-12-27, issue 12

PMID: 30589834

Similar Datasets

Project description: The earliest events in the aggregation process, such as single-molecule reconfiguration, are extremely important and the most difficult to characterize experimentally. To this end, we used well-tempered bias-exchange metadynamics simulations to determine the equilibrium ensembles of an insulin molecule under amyloidogenic conditions of low pH and high temperature. A bin-based clustering method that uses statistics accumulated in bias-exchange metadynamics trajectories was employed to construct a detailed thermodynamic and kinetic model of insulin folding. The longest-lived, lowest free-energy ensemble consisted of native conformations adopted by a folded insulin monomer in solution, namely the R-, Rf-, and T-states of insulin. The lowest free-energy structure had a root mean square deviation of only 0.15 nm from the native X-ray structure. The second longest-lived metastable state was an unfolded, compact monomer with little similarity to the native structure. We identified three additional long-lived metastable states from the bin-based model. We then carried out an exhaustive structural characterization of the metastable states on the basis of tertiary contact maps and per-residue accessible surface areas. We also determined the lowest free-energy path between the two longest-lived metastable states, which confirms earlier findings that insulin folds in a non-two-state manner through a folding intermediate. The ensemble containing the monomeric intermediate retained 58% of native hydrophobic contacts but showed a complete loss of native secondary structure. We discuss the relative importance of native-like versus non-native tertiary contacts for the folding transition, and we provide a simple measure of the importance of an individual residue for that transition. Finally, we compared this intermediate with experimental data from spectroscopic, crystallographic, and calorimetric measurements made during the early stages of insulin aggregation. We also determined the stability of monomeric insulin by incubation at a very low concentration to isolate protein-protein interaction effects.

