Dataset Information

Unified rational protein engineering with sequence-based deep representation learning.


ABSTRACT: Rational protein engineering requires a holistic understanding of protein function. Here, we apply deep learning to unlabeled amino-acid sequences to distill the fundamental features of a protein into a statistical representation that is semantically rich and structurally, evolutionarily and biophysically grounded. We show that the simplest models built on top of this unified representation (UniRep) are broadly applicable and generalize to unseen regions of sequence space. Our data-driven approach predicts the stability of natural and de novo designed proteins, and the quantitative function of molecularly diverse mutants, competitively with the state-of-the-art methods. UniRep further enables two orders of magnitude efficiency improvement in a protein engineering task. UniRep is a versatile summary of fundamental protein features that can be applied across protein engineering informatics.

SUBMITTER: Alley EC 

PROVIDER: S-EPMC7067682 | biostudies-literature | 2019 Dec

REPOSITORIES: biostudies-literature

Publications

Unified rational protein engineering with sequence-based deep representation learning.

Alley EC, Khimulya G, Biswas S, AlQuraishi M, Church GM

Nature Methods, 2019 Oct 21; issue 12


