
Dataset Information


EvoEF2: accurate and fast energy function for computational protein design.


ABSTRACT:

Motivation

The accuracy and success rate of de novo protein design remain limited, mainly due to the parameter over-fitting of current energy functions and their inability to discriminate incorrect designs from correct designs.

Results

We developed an extended energy function, EvoEF2, for efficient de novo protein sequence design, based on a previously proposed physical energy function, EvoEF. EvoEF2 recovered 32.5%, 47.9% and 22.3% of all, core and surface residues, respectively, for 148 test monomers. It was also generally applicable to protein-protein interaction design: it recapitulated 30.9%, 42.4%, 31.3% and 21.4% of all, core, interface and surface residues, respectively, for 88 test dimers, significantly outperforming EvoEF on native sequence recapitulation. We further used I-TASSER to evaluate the foldability of the 148 designed monomer sequences. All of them were predicted to fold into structures with high fold- and atomic-level similarity to their corresponding native structures, and 87.8% of the predicted structures had a root-mean-square deviation (RMSD) of less than 2 Å from their native counterparts. The study also demonstrated that the usefulness of a physical energy function is highly correlated with how its parameters are optimized: EvoEF2, with parameters optimized using sequence recapitulation, is more suitable for computational protein sequence design than EvoEF, which was optimized on thermodynamic mutation data.
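The two headline metrics above, native sequence recovery and the share of predicted structures under an RMSD cutoff, can be illustrated with a minimal Python sketch. This is not the EvoEF2 code or the authors' evaluation script. The function names, the toy sequences and the way a subset of positions is passed in are assumptions made for illustration. This record does not state how core, surface or interface residues were defined, or whether recovery was averaged per protein or pooled.

```python
from typing import Iterable, Optional, Sequence


def sequence_recovery(native: str, designed: str,
                      positions: Optional[Iterable[int]] = None) -> float:
    """Percentage of compared positions where the designed residue
    is identical to the native residue.

    `positions` is a hypothetical way to restrict the comparison to a
    residue class (e.g. core or surface); None compares every position.
    """
    if len(native) != len(designed):
        raise ValueError("native and designed sequences must have equal length")
    idx = list(range(len(native))) if positions is None else list(positions)
    if not idx:
        raise ValueError("no positions to compare")
    matches = sum(1 for i in idx if native[i] == designed[i])
    return 100.0 * matches / len(idx)


def percent_below_cutoff(rmsds: Sequence[float], cutoff: float = 2.0) -> float:
    """Percentage of RMSD values (in angstroms) strictly below `cutoff`."""
    if not rmsds:
        raise ValueError("no RMSD values given")
    return 100.0 * sum(1 for r in rmsds if r < cutoff) / len(rmsds)


if __name__ == "__main__":
    # Toy data only; these are not sequences or values from the benchmark.
    native = "ACDEFG"
    designed = "ACDKFG"
    print(sequence_recovery(native, designed))          # all positions
    print(sequence_recovery(native, designed, [0, 3]))  # hypothetical subset
    print(percent_below_cutoff([1.0, 1.5, 2.5, 3.0]))
```

Passing an explicit index list keeps the residue-class definition outside the metric, so the same function covers the all, core, surface and interface cases once a classification is chosen.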

Availability and implementation

The source code of EvoEF2 and the benchmark datasets are freely available at https://zhanglab.ccmb.med.umich.edu/EvoEF.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Huang X 

PROVIDER: S-EPMC7144094 | biostudies-literature | 2020 Feb

REPOSITORIES: biostudies-literature


Publications

EvoEF2: accurate and fast energy function for computational protein design.

Huang X, Pearce R, Zhang Y

Bioinformatics (Oxford, England), 2020-02-01, issue 4



Similar Datasets

| S-EPMC2578799 | biostudies-literature
| S-EPMC4828276 | biostudies-literature
| S-EPMC8262746 | biostudies-literature
| S-EPMC7328376 | biostudies-literature
| S-EPMC2280065 | biostudies-literature
| S-EPMC6736313 | biostudies-literature
| S-EPMC8016488 | biostudies-literature
| S-EPMC3919130 | biostudies-literature
| S-EPMC3187653 | biostudies-literature
| S-EPMC3170394 | biostudies-other