Project description:Protein design is a powerful tool for elucidating mechanisms of function and engineering new therapeutics and nanotechnologies. Although soluble protein design has advanced, membrane protein design remains challenging because of difficulties in modeling the lipid bilayer. In this work, we developed an implicit approach that captures the anisotropic structure, shape of water-filled pores, and nanoscale dimensions of membranes with different lipid compositions. The model improves performance in computational benchmarks against experimental targets, including prediction of protein orientations in the bilayer, ΔΔG calculations, native structure discrimination, and native sequence recovery. When applied to de novo protein design, this approach designs sequences with an amino acid distribution near the native amino acid distribution in membrane proteins, overcoming a critical flaw in previous membrane models that were prone to generating leucine-rich designs. Furthermore, the proteins designed in the new membrane model exhibit native-like features including interfacial aromatic side chains, hydrophobic lengths compatible with bilayer thickness, and polar pores. Our method advances high-resolution membrane protein structure prediction and design toward tackling key biological questions and engineering challenges.
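One of the benchmarks named above, native sequence recovery, can be illustrated with a minimal sketch. It assumes the common definition (the fraction of positions at which the designed sequence carries the same amino acid as the native one); the function names and the leucine-fraction helper are hypothetical and are not taken from the work described.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of aligned positions where the designed residue matches the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


def residue_fraction(sequence: str, residue: str = "L") -> float:
    """Fraction of a sequence made up of one residue type (leucine by default)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sequence.count(residue) / len(sequence)
```

A design that over-uses leucine would show a `residue_fraction` well above that of the native sequence even when `sequence_recovery` looks acceptable, which is why the two quantities are worth checking together.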
Project description:Solvent molecules interact intimately with proteins and can profoundly regulate their structure and function. However, accurately and efficiently modeling protein solvation effects at the molecular level has been challenging. Here, we present a method that improves the atomic-level modeling of soluble and membrane protein structures and binding by efficiently predicting de novo protein-solvent molecule interactions. The method predicted buried water molecule positions, solvated protein conformations, and challenging mutational effects on protein binding with unprecedented accuracy. When the method was applied to homology modeling, solvent-bound membrane protein structures, pockets, and cavities were recapitulated with near-atomic precision even from distant homologs. Blindly refined atomic-level structures of evolutionarily distant G protein-coupled receptors imply strikingly different functional roles of buried solvent between receptor classes. The method should prove useful for refining low-resolution protein structures, accurately modeling drug-binding sites in structurally uncharacterized receptors, and designing solvent-mediated protein catalysis, recognition, ligand binding, and membrane protein signaling.
Project description:The prediction of protein three-dimensional structure from amino acid sequence has been a grand challenge problem in computational biophysics for decades, owing to its intrinsic scientific interest and also to the many potential applications for robust protein structure prediction algorithms, from genome interpretation to protein function prediction. More recently, the inverse problem - designing an amino acid sequence that will fold into a specified three-dimensional structure - has attracted growing attention as a potential route to the rational engineering of proteins with functions useful in biotechnology and medicine. Methods for the prediction and design of protein structures have advanced dramatically in the past decade. Increases in computing power and the rapid growth in protein sequence and structure databases have fuelled the development of new data-intensive and computationally demanding approaches for structure prediction. New algorithms for designing protein folds and protein-protein interfaces have been used to engineer novel high-order assemblies and to design from scratch fluorescent proteins with novel or enhanced properties, as well as signalling proteins with therapeutic potential. In this Review, we describe current approaches for protein structure prediction and design and highlight a selection of the successful applications they have enabled.
Project description:We describe the adaptation of the Rosetta de novo structure prediction method for prediction of helical transmembrane protein structures. The membrane environment is modeled by embedding the protein chain into a model membrane represented by parallel planes defining hydrophobic, interface, and polar membrane layers for each energy evaluation. The optimal embedding is determined by maximizing the exposure of surface hydrophobic residues within the membrane and minimizing hydrophobic exposure outside of the membrane. Protein conformations are built up using the Rosetta fragment assembly method and evaluated using a new membrane-specific version of the Rosetta low-resolution energy function in which residue-residue and residue-environment interactions are functions of the membrane layer in addition to amino acid identity, distance, and density. We find that lower energy and more native-like structures are achieved by sequential addition of helices to a growing chain, which may mimic some aspects of helical protein biogenesis after translocation, rather than folding the whole chain simultaneously as in the Rosetta soluble protein prediction method. In tests on 12 membrane proteins for which the structure is known, between 51 and 145 residues were predicted with root-mean-square deviation <4 Å from the native structure.
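The layered membrane and the embedding search described above can be sketched in a few lines. This is a simplified illustration, not the Rosetta implementation: the layer boundaries (`core_half`, `interface_width`), the hydrophobic residue set, and the restriction to a shift along a fixed membrane normal are all assumptions made for the example.

```python
# Illustrative hydrophobic residue set (assumption, not the Rosetta definition).
HYDROPHOBIC = set("AILMFVW")


def layer_of(z, core_half=12.0, interface_width=6.0):
    """Classify a depth z (in Å from the membrane center) into one of three layers.

    The boundary values are placeholders chosen for illustration only.
    """
    d = abs(z)
    if d <= core_half:
        return "hydrophobic"
    if d <= core_half + interface_width:
        return "interface"
    return "polar"


def embedding_score(residues, shift=0.0):
    """Score an embedding: +1 per exposed hydrophobic residue in the core,
    -1 per exposed hydrophobic residue in the polar layer.

    residues: iterable of (amino_acid, z, is_surface_exposed) tuples.
    """
    score = 0
    for aa, z, exposed in residues:
        if not exposed or aa not in HYDROPHOBIC:
            continue
        layer = layer_of(z + shift)
        if layer == "hydrophobic":
            score += 1
        elif layer == "polar":
            score -= 1
    return score


def best_shift(residues, shifts):
    """Return the candidate shift along the normal with the highest score."""
    return max(shifts, key=lambda s: embedding_score(residues, s))
```

A fuller search would also vary the direction of the membrane normal; the one-dimensional shift is kept here only to make the scoring idea easy to follow.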
Project description:Four implicit membrane models [IMM1, generalized Born (GB)-surface area-implicit membrane (GBSAIM), GB with a simple switching (GBSW), and heterogeneous dielectric GB (HDGB)] were tested for their ability to discriminate the native conformation of five membrane proteins from 450 decoys generated by the Rosetta-Membrane program. The energy ranking of the native state and Z-scores were used to assess the performance of the models. The effect of membrane thickness was examined and was found to be substantial. Quite satisfactory discrimination was achieved with the all-atom IMM1 and GBSW models at 25.4 Å thickness and with the HDGB model at 28.5 Å thickness. The energy components by themselves were not discriminative. Both van der Waals and electrostatic interactions contributed to native state discrimination, to a different extent in each model. Computational efficiency of the models decreased in the order: extended-atom IMM1 > all-atom IMM1 > GBSAIM > GBSW > HDGB. These results encourage the further development and use of implicit membrane models for membrane protein structure prediction.
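The two discrimination measures mentioned above can be sketched as follows. The sketch assumes the usual conventions, which the text does not spell out: rank 1 means the native has the lowest energy, and the Z-score is the native energy minus the mean decoy energy, divided by the population standard deviation of the decoy energies. Function names are hypothetical.

```python
from statistics import mean, pstdev


def native_rank(native_energy, decoy_energies):
    """Rank of the native among the decoys (1 = lowest energy of all)."""
    return 1 + sum(e < native_energy for e in decoy_energies)


def native_z_score(native_energy, decoy_energies):
    """Z-score of the native relative to the decoy energy distribution.

    More negative values mean the native is better separated from the decoys.
    """
    spread = pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean(decoy_energies)) / spread
```

Under these conventions, a model discriminates well when the native has rank 1 and a strongly negative Z-score.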
Project description:The highly anisotropic environment of the lipid bilayer membrane imposes significant constraints on the structures and functions of membrane proteins. However, NMR structure calculations typically use a simple repulsive potential that neglects the effects of solvation and electrostatics, because explicit atomic representation of the solvent and lipid molecules is computationally expensive and impractical for routine NMR-restrained calculations that start from completely extended polypeptide templates. Here, we describe the extension of a previously described implicit solvation potential, eefxPot, to include a membrane model for NMR-restrained calculations of membrane protein structures in XPLOR-NIH. The key components of eefxPot are an energy term for solvation free energy that works together with other nonbonded energy functions, a dedicated force field for conformational and nonbonded protein interaction parameters, and a membrane function that modulates the solvation free energy and dielectric screening as a function of the atomic distance from the membrane center, relative to the membrane thickness. Initial results obtained for membrane proteins with structures determined experimentally in lipid bilayer membranes show that eefxPot affords significant improvements in structural quality, accuracy, and precision. Calculations with eefxPot are straightforward to implement and can be used to both fold and refine structures, as well as to run unrestrained molecular-dynamics simulations. The potential is entirely compatible with the full range of experimental restraints measured by various techniques. Overall, it provides a useful and practical way to calculate membrane protein structures in a physically realistic environment.
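A membrane function of the kind described above, one that depends on the distance from the membrane center relative to the membrane thickness, can be sketched as follows. The specific sigmoidal form and the exponent `n = 10` follow the IMM1 implicit membrane model and are used here as an assumption; the text does not state the exact form used by eefxPot. The function and parameter names are hypothetical.

```python
def membrane_switch(z, thickness, n=10):
    """IMM1-style switching function (assumed form).

    Returns a value near 0 at the membrane center and near 1 in bulk water.
    z is the distance from the membrane center, in the same units as thickness.
    """
    if thickness <= 0:
        raise ValueError("thickness must be positive")
    z_rel = abs(z) / (thickness / 2.0)  # distance relative to the half-thickness
    zn = z_rel ** n
    return zn / (1.0 + zn)


def solvation_reference(z, thickness, dg_water, dg_membrane, n=10):
    """Interpolate an atom's reference solvation free energy between its
    membrane-interior value and its aqueous value according to depth."""
    f = membrane_switch(z, thickness, n)
    return f * dg_water + (1.0 - f) * dg_membrane
```

With this form the transition is centered at the half-thickness, where the switch equals 0.5, and a larger exponent makes the transition sharper.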
Project description:Asymmetric multiprotein complexes that undergo subunit exchange play central roles in biology but present a challenge for design because the components must not only contain interfaces that enable reversible association but also be stable and well behaved in isolation. We use implicit negative design to generate β sheet-mediated heterodimers that can be assembled into a wide variety of complexes. The designs are stable, folded, and soluble in isolation and rapidly assemble upon mixing, and crystal structures are close to the computational models. We construct linearly arranged hetero-oligomers with up to six different components, branched hetero-oligomers, closed C4-symmetric two-component rings, and hetero-oligomers assembled on a cyclic homo-oligomeric central hub and demonstrate that such complexes can readily reconfigure through subunit exchange. Our approach provides a general route to designing asymmetric reconfigurable protein systems.
Project description:Ordered two-dimensional arrays such as S-layers [1,2] and designed analogues [3-5] have intrigued bioengineers [6,7], but with the exception of a single lattice formed with flexible linkers [8], they are constituted from just one protein component. Materials composed of two components have considerable potential advantages for modulating assembly dynamics and incorporating more complex functionality [9-12]. Here we describe a computational method to generate co-assembling binary layers by designing rigid interfaces between pairs of dihedral protein building blocks, and use it to design a p6m lattice. The designed array components are soluble at millimolar concentrations, but when combined at nanomolar concentrations, they rapidly assemble into nearly crystalline micrometre-scale arrays nearly identical to the computational design model in vitro and in cells without the need for a two-dimensional support. Because the material is designed from the ground up, the components can be readily functionalized and their symmetry reconfigured, enabling formation of ligand arrays with distinguishable surfaces, which we demonstrate can drive extensive receptor clustering, downstream protein recruitment and signalling. Using atomic force microscopy on supported bilayers and quantitative microscopy on living cells, we show that arrays assembled on membranes have component stoichiometry and structure similar to arrays formed in vitro, and that our material can therefore impose order onto fundamentally disordered substrates such as cell membranes. In contrast to previously characterized cell surface receptor binding assemblies such as antibodies and nanocages, which are rapidly endocytosed, we find that large arrays assembled at the cell surface suppress endocytosis in a tunable manner, with potential therapeutic relevance for extending receptor engagement and immune evasion. Our work provides a foundation for a synthetic cell biology in which multi-protein macroscale materials are designed to modulate cell responses and reshape synthetic and living systems.
Project description:The computational design of peptide binders towards a specific protein interface can aid diagnostic and therapeutic efforts. Here, we design peptide binders by combining the known structural space searched with Foldseek, the protein design method ESM-IF1, and AlphaFold2 (AF) in a joint framework. Foldseek generates backbone seeds for a modified version of ESM-IF1 adapted to protein complexes. The resulting sequences are evaluated with AF using an MSA representation for the receptor structure and a single sequence for the binder. We show that AF can accurately evaluate protein binders and that our bind score can select these (ROC AUC = 0.96 for the heterodimeric case). We find that designs created from seeds with more contacts per residue are more successful and tend to be short. There is a relationship between the sequence recovery in interface positions and the pLDDT of the designs, where designs with ≥80% recovery have an average pLDDT of 84 compared to 55 at 0%. Designed sequences have 60% higher median pLDDT values towards intended receptors than non-intended ones. Successful binders (predicted interface RMSD ≤ 2 Å) are designed towards 185 (6.5%) heteromeric and 42 (3.6%) homomeric protein interfaces with ESM-IF1 compared with 18 (1.5%) using ProteinMPNN from 100 samples.
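The ROC AUC used above to assess how well a score separates binders from non-binders can be computed directly from its pairwise definition. The sketch below is generic and does not reproduce the bind score itself; it assumes that a higher score indicates a more likely binder (negate the scores if the convention is the opposite), and the function name is hypothetical.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a randomly chosen positive outscores
    a randomly chosen negative, counting ties as half a win.

    scores: numeric scores, higher = more likely binder (assumption).
    labels: truthy for binders, falsy for non-binders.
    """
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of examples, which is fine for small benchmark sets; a rank-based formulation would be preferable for large ones.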
Project description:Lipid membrane permeation of drug molecules was investigated with Heterogeneous Dielectric Generalized Born (HDGB)-based models using solubility-diffusion theory and machine learning. Free energy profiles were obtained for neutral molecules with the standard HDGB and with Dynamic HDGB (DHDGB), which accounts for membrane deformation upon drug insertion. We also obtained hybrid free energy profiles in which the neutralization of charged molecules upon membrane insertion was taken into account. Predictions were evaluated against experimental permeability coefficients from Parallel Artificial Membrane Permeability Assays (PAMPA), and the effects of the partial charge sets (CGenFF, AM1-BCC, and OPLS) on prediction performance were discussed. (D)HDGB-based models improved the predictions over the two-state implicit membrane models, and partial charge sets seemed to have a strong impact on the predictions. Machine learning increased the accuracy of the predictions, although it could not outperform the physics-based approach in terms of correlations.
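How a free energy profile turns into a permeability coefficient under solubility-diffusion theory can be sketched as follows. The sketch assumes the standard inhomogeneous form, in which the membrane resistance 1/P is the integral of exp(ΔG(z)/kT)/D(z) across the bilayer, evaluated here with the trapezoidal rule. The function name, the argument layout, and the default kT (about 0.593 kcal/mol near 298 K) are assumptions for illustration, not details taken from the study.

```python
import math


def permeability(z, delta_g, diffusion, kT=0.593):
    """Permeability from the inhomogeneous solubility-diffusion model.

    z:         positions across the membrane (e.g. cm), strictly increasing
    delta_g:   free energy at each position relative to bulk water (kcal/mol)
    diffusion: position-dependent diffusion coefficient (e.g. cm^2/s)
    kT:        thermal energy in the same units as delta_g (assumed ~298 K)
    """
    if not (len(z) == len(delta_g) == len(diffusion)) or len(z) < 2:
        raise ValueError("need at least two points and equal-length inputs")
    # Local resistance density: exp(dG / kT) / D at each grid point.
    local = [math.exp(g / kT) / d for g, d in zip(delta_g, diffusion)]
    # Trapezoidal integration of the resistance across the membrane.
    resistance = sum(
        0.5 * (local[i] + local[i + 1]) * (z[i + 1] - z[i])
        for i in range(len(z) - 1)
    )
    return 1.0 / resistance
```

Because the free energy enters through an exponential, the highest barrier along the profile dominates the resistance, so small errors in the barrier height (for example from the choice of partial charges) can change the predicted permeability substantially.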