Project description: Predicting the effects of mutations is a fundamental challenge in biotechnology and biomedicine. State-of-the-art computational methods have demonstrated the benefits of including semantically rich representations learned from protein sequences, but leave structural constraints out of reach. Here we developed Protein Mutational Effect Predictor (ProMEP), a general and multimodal deep representation learning method that simultaneously learns sequence context and structural constraints from proteins at the scale of evolution. ProMEP markedly outperforms current leading methods and enables accurate zero-shot prediction of mutational effects across a variety of deep mutational scanning experiments. Applying ProMEP to the engineering of the transposon-associated TnpB enzyme further demonstrates its capacity for high-throughput exploration of protein space. Without prior knowledge of TnpB, ProMEP accurately identifies, from millions of variants, multiple mutations that significantly improve editing efficiency.
Project description: Traditional protein engineering methods, such as directed evolution, are effective but often slow and labor-intensive. Advances in machine learning and automated biofoundries present new opportunities for optimizing these processes. This study devises a protein language model-enabled automatic evolution platform, a closed-loop system for automated protein engineering within the Design-Build-Test-Learn cycle. The protein language model ESM-2 makes zero-shot predictions of 96 variants to initiate the cycle. The biofoundry constructs and evaluates these variants, and the results are fed back to train a multi-layer perceptron as a fitness predictor, which then predicts a second round of 96 variants with improved fitness. With a tRNA synthetase as the model enzyme, four rounds of evolution carried out within 10 days lead to mutants with enzyme activity improved by up to 2.4-fold. Our system significantly enhances the speed and accuracy of protein evolution, driving faster advancements in protein engineering for industrial applications.
Project description: Transposon (de)repression and heterochromatin reorganization are dynamically regulated during cell fate determination and are hallmarks of cellular senescence. However, whether they are regulated in a sequence-specific manner remains unknown. Here, we find that the KCNQ1OT1 lncRNA, through sequence-specific Hoogsteen base-pairing of its repeat-rich region with double-stranded genomic DNA and through binding to heterochromatin protein HP1α, guides, induces, and maintains epigenetic silencing at specific repetitive DNA elements. Repressing KCNQ1OT1 or deleting its repeat-rich region reduces DNA methylation and H3K9me3 on KCNQ1OT1-targeted transposons. Engineering a fusion KCNQ1OT1 with an ectopically targeting guiding triplex sequence induces de novo DNA methylation at the target site. Phenotypically, repressing KCNQ1OT1 induces senescence-associated heterochromatin foci, transposon activation and retrotransposition, and cellular senescence, demonstrating an essential role of KCNQ1OT1 in safeguarding against genome instability and senescence.
Project description: Despite a wealth of molecular knowledge, quantitative laws for accurate prediction of biological phenomena remain rare. Alternative pre-mRNA splicing is an important regulated step in gene expression frequently perturbed in human disease. To understand the combined effects of mutations during evolution, we quantified the effects of all possible combinations of exonic mutations accumulated during the emergence of an alternatively spliced human exon. This revealed that mutation effects scale non-monotonically with the inclusion level of an exon, with each mutation having maximum effect at a predictable intermediate inclusion level. This scaling is observed genome-wide for cis and trans perturbations of splicing, including for natural and disease-associated variants. Mathematical modelling suggests that competition between alternative splice sites is sufficient to cause this non-linearity in the genotype-phenotype map. Combining the global scaling law with specific pairwise interactions between neighbouring mutations allows accurate prediction of the effects of complex genotype changes involving >10 mutations.
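The non-monotonic scaling described in this record can be illustrated with a toy competition model. This is only a minimal sketch under stated assumptions, not the model used in the study: it assumes the inclusion level is `k / (1 + k)`, where `k` is the efficiency of the exon's splice sites relative to the competing ones, and that a mutation multiplies `k` by a fixed factor `fold`. The function names, the value `fold = 4.0`, and the grid search are all hypothetical choices made for illustration.

```python
import math


def psi_from_ratio(k):
    """Inclusion level under the assumed two-site competition: k / (1 + k)."""
    return k / (1.0 + k)


def delta_psi(start_psi, fold):
    """Change in inclusion when a mutation scales the efficiency ratio by `fold`."""
    k = start_psi / (1.0 - start_psi)  # invert psi = k / (1 + k)
    return psi_from_ratio(fold * k) - start_psi


def peak_starting_psi(fold, steps=9999):
    """Grid-search the starting inclusion level at which the effect is largest."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]  # open interval (0, 1)
    return max(grid, key=lambda p: delta_psi(p, fold))


fold = 4.0  # hypothetical mutation strength (assumption)
peak = peak_starting_psi(fold)

# Under this toy model the maximum sits at 1 / (1 + sqrt(fold)),
# obtained by setting the derivative of delta_psi with respect to k to zero.
analytic = 1.0 / (1.0 + math.sqrt(fold))

print(f"numerical peak: {peak:.4f}, analytic peak: {analytic:.4f}")
```

In this sketch the same mutation has almost no effect when the exon is nearly always skipped or nearly always included, and its largest effect at an intermediate starting inclusion level, which is the qualitative behaviour the record reports.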
Project description: Strain engineering for industrial production requires improved tolerance to multiple unfavorable conditions. Here, we report the use of global regulator libraries based on the CRISPR-enabled trackable genome engineering (CREATE) method to engineer tolerance against multiple inhibitors in Escherichia coli. Deep mutagenesis libraries were rationally designed, constructed, and screened to target 34,340 mutations across 23 global regulators. A total of 69 specific mutations that conferred tolerance to acetate, NaCl, furfural, or high temperature were isolated, confirmed, and evaluated. Among them, 32 novel reconstructed mutations exhibited better tolerance to the corresponding inhibitors, and the most striking mutation, CRP-E182D, conferred high cross-tolerance to acetate, NaCl and isobutanol. To further investigate the effects of this CRP mutation on acetate tolerance, whole-transcriptome sequencing (RNA-Seq) analysis was performed.
Project description: We identified inactivating mutations in NEK10, a poorly characterized human protein kinase, in a novel human bronchiectasis syndrome. To understand the effects of its loss of function on the airway phosphoproteome, NEK10 was targeted by CRISPR/Cas9 with 2 independent guides in human airway air-liquid interface (ALI) cultures, which were then assayed by iron-enrichment phosphoproteomics.
Project description: We combine two experimental high-throughput sequencing methods to identify new 2'-O-methylation sites in human and assign snoRNA guides to sites with previously unknown guides.