Dataset Information

Optimization of C-to-G base editors with sequence context preference predictable by machine learning methods.

ABSTRACT: Efficient and precise base editors (BEs) for C-to-G transversion are highly desirable. However, the sequence context affecting editing outcome largely remains unclear. Here we report engineered C-to-G BEs of high efficiency and fidelity, with the sequence context predictable via machine-learning methods. By changing the species origin and relative position of uracil-DNA glycosylase and deaminase, together with codon optimization, we obtain optimized C-to-G BEs (OPTI-CGBEs) for efficient C-to-G transversion. The motif preference of OPTI-CGBEs for editing 100 endogenous sites is determined in HEK293T cells. Using a sgRNA library comprising 41,388 sequences, we develop a deep-learning model that accurately predicts the OPTI-CGBE editing outcome for targeted sites with specific sequence context. These OPTI-CGBEs are further shown to be capable of efficient base editing in mouse embryos for generating Tyr-edited offspring. Thus, these engineered CGBEs are useful for efficient and precise base editing, with outcome predictable based on sequence context of targeted sites.

SUBMITTER: Yuan T

PROVIDER: S-EPMC8361092 | biostudies-literature | 2021 Aug

REPOSITORIES: biostudies-literature

Publications

Optimization of C-to-G base editors with sequence context preference predictable by machine learning methods.

Yuan Tanglong, Yan Nana, Fei Tianyi, Zheng Jitan, Meng Juan, Li Nana, Liu Jing, Zhang Haihang, Xie Long, Ying Wenqin, Li Di, Shi Lei, Sun Yongsen, Li Yongyao, Li Yixue, Sun Yidi, Zuo Erwei

Nature Communications, 2021-08-12, issue 1

PMID: 34385461

Similar Datasets

Phage-assisted evolution of highly active cytosine base editors with enhanced selectivity and minimal sequence context preference.
| S-EPMC10894238 | biostudies-literature

Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction.
| S-EPMC6126947 | biostudies-literature

Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning.
| S-EPMC8985520 | biostudies-literature

Functional Optimization of Designer Cardiac Organoids Enabled by Machine Learning Techniques
2024-05-17 | GSE267438 | GEO

Learning What to Want: Context-Sensitive Preference Learning.
| S-EPMC4619741 | biostudies-literature

Machine Learning Methods for Histopathological Image Analysis.
| S-EPMC6158771 | biostudies-other

Systematic optimization of Cas12a base editors in wheat and maize using the ITER platform.
| S-EPMC9838060 | biostudies-literature

Quantifying Biopolymer Sequence Recognition using Biophysically Informed Machine Learning
2021-06-02 | GSE175942 | GEO

Systematic optimization of Cas12a base editors in wheat and maize using the ITER platform
2022-11-21 | GSE200450 | GEO

Developing mitochondrial base editors with diverse context compatibility and high fidelity via saturated spacer library.
| S-EPMC10587121 | biostudies-literature