Proteomics

Dataset Information

Integrating Protein Language Models and Automatic Biofoundry for Enhanced Protein Evolution


ABSTRACT: Traditional protein engineering methods such as directed evolution are effective but often slow and labor-intensive. Advances in machine learning and automated biofoundries present new opportunities for optimizing these processes. This study devises a protein language model-enabled automatic evolution platform, a closed-loop system for automated protein engineering within the Design-Build-Test-Learn cycle. The protein language model ESM-2 makes zero-shot predictions of 96 variants to initiate the cycle. The biofoundry constructs and evaluates these variants, and the results are fed back to train a multi-layer perceptron as a fitness predictor, which then predicts a second round of 96 variants with improved fitness. With a tRNA synthetase as the model enzyme, four rounds of evolution carried out within 10 days yielded mutants with enzyme activity improved by up to 2.4-fold. Our system significantly enhances the speed and accuracy of protein evolution, driving faster advancements in protein engineering for industrial applications.
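
The closed loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the candidate pool is restricted to single-site substitutions, `zero_shot_score` is a hypothetical placeholder for an ESM-2 zero-shot score, `assay` is a hypothetical placeholder for the biofoundry measurement, and `fit_predictor` is a simple position/residue averaging model swapped in for the multi-layer perceptron so that the example needs only the standard library. Only the batch size (96) and the number of rounds (4) come from the abstract.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BATCH = 96  # variants per round, as stated in the abstract


def single_mutants(wild_type):
    """Enumerate every single-site substitution as (position, new_residue).

    Assumption: the real platform may explore a different variant space.
    """
    return [(i, aa) for i, wt in enumerate(wild_type)
            for aa in AMINO_ACIDS if aa != wt]


def fit_predictor(measured):
    """Stand-in learner (NOT an MLP): averages measured fitness per
    position and per residue, then combines the two additively."""
    overall = sum(measured.values()) / len(measured)
    by_pos, by_aa = {}, {}
    for (pos, aa), fitness in measured.items():
        by_pos.setdefault(pos, []).append(fitness)
        by_aa.setdefault(aa, []).append(fitness)

    def predict(variant):
        pos, aa = variant
        p, a = by_pos.get(pos), by_aa.get(aa)
        pos_mean = sum(p) / len(p) if p else overall
        aa_mean = sum(a) / len(a) if a else overall
        return pos_mean + aa_mean - overall

    return predict


def run_cycle(wild_type, zero_shot_score, assay, rounds=4, batch=BATCH):
    """Design-Build-Test-Learn loop: pick a batch, measure it, retrain."""
    pool = single_mutants(wild_type)
    measured = {}  # variant -> measured fitness
    for round_index in range(rounds):
        untested = [v for v in pool if v not in measured]
        if round_index == 0:
            # Design, round 1: no data yet, so rank by the zero-shot score.
            score = zero_shot_score
        else:
            # Learn: refit on all data so far, then design the next batch.
            score = fit_predictor(measured)
        ranked = sorted(untested, key=score, reverse=True)
        for variant in ranked[:batch]:
            measured[variant] = assay(variant)  # Build + Test (placeholder)
    return measured
```

Calling `run_cycle(sequence, zero_shot_score, assay)` with any two scoring callables returns a dictionary of every variant tested across the rounds; replacing `fit_predictor` with a trained neural network would bring the sketch closer to the setup the abstract describes.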

INSTRUMENT(S): 6520A Quadrupole Time-of-Flight LC/MS

ORGANISM(S): Escherichia coli

SUBMITTER: Haoran Yu  

LAB HEAD: Haoran Yu

PROVIDER: PXD058768 | Pride | 2024-12-25

REPOSITORIES: Pride

Dataset's files

File | Type
10-ncAAs.zip | Other
NCAA1.xml | XML
checksum.txt | TXT
ncAA10.xml | XML
ncAA2.xml | XML
(5 of 13 files shown)

Similar Datasets

2017-10-17 | GSE89746 | GEO
2021-06-30 | E-MTAB-10695 | biostudies-arrayexpress
2023-10-03 | PXD022732 | Pride
2024-03-13 | GSE261254 | GEO
2022-12-01 | ST002431 | MetabolomicsWorkbench
2009-11-12 | GSE18142 | GEO
2024-04-27 | GSE265942 | GEO
2015-04-15 | E-GEOD-67871 | biostudies-arrayexpress
| PRJNA1080297 | ENA
2017-11-15 | GSE102797 | GEO