
Dataset Information


An Interpretable Prediction Model for Identifying N7-Methylguanosine Sites Based on XGBoost and SHAP.


ABSTRACT: Recent studies have increasingly shown that chemical modification of mRNA plays an important role in regulating gene expression. N7-methylguanosine (m7G) is a positively charged mRNA modification that is essential for efficient gene expression and cell viability. However, m7G has received little research attention to date. Bioinformatics tools can serve as auxiliary methods for identifying m7G sites in transcriptomes. In this study, we develop a novel interpretable machine learning-based approach, termed XG-m7G, for the identification of m7G sites using the XGBoost algorithm and six different types of sequence-encoding schemes. Both 10-fold and jackknife cross-validation tests indicate that XG-m7G outperforms iRNA-m7G. Moreover, using the SHAP algorithm, the framework also provides interpretations of the model performance and highlights the most important features for identifying m7G sites. XG-m7G is anticipated to serve as a useful tool and guide for researchers in future studies of mRNA modification sites.
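
The abstract mentions sequence-encoding schemes and 10-fold and jackknife cross-validation without giving details. As a rough illustration only, the sketch below shows one common encoding (binary one-hot, which is an assumption and not confirmed to be among the six schemes used by XG-m7G) and a simple index splitter in which setting the number of folds equal to the number of samples yields the jackknife (leave-one-out) test. The names `one_hot_encode` and `k_fold_indices` are hypothetical and do not come from the paper.

```python
from typing import Iterator, List, Tuple

NUCLEOTIDES = "ACGU"  # assumed alphabet; DNA-style "T" is mapped to "U"


def one_hot_encode(seq: str) -> List[int]:
    """Flatten an RNA sequence into a binary vector, 4 bits per position.

    Characters outside ACGU (for example "N") become four zeros.
    """
    vec: List[int] = []
    for base in seq.upper().replace("T", "U"):
        vec.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return vec


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k-fold cross-validation.

    With k == n_samples each test fold holds exactly one sample,
    which corresponds to the jackknife (leave-one-out) test.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    for fold in range(k):
        test = list(range(fold, n_samples, k))  # every k-th sample
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, test
```

In a full pipeline, the encoded vectors would be passed to a classifier such as XGBoost once per fold, and the held-out predictions pooled to compute performance metrics; that part is omitted here because the record does not describe the model settings.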

SUBMITTER: Bi Y 

PROVIDER: S-EPMC7533297 | biostudies-literature | 2020 Dec

REPOSITORIES: biostudies-literature


Publications

An Interpretable Prediction Model for Identifying N7-Methylguanosine Sites Based on XGBoost and SHAP.

Bi Yue, Xiang Dongxu, Ge Zongyuan, Li Fuyi, Jia Cangzhi, Song Jiangning

Molecular Therapy. Nucleic Acids, 2020-08-25



Similar Datasets

| S-EPMC6664791 | biostudies-literature
| S-EPMC9541006 | biostudies-literature
| S-EPMC10216550 | biostudies-literature
| S-EPMC10435947 | biostudies-literature
| S-EPMC7992861 | biostudies-literature
| S-EPMC10120187 | biostudies-literature
| S-EPMC9523042 | biostudies-literature
| S-EPMC6861218 | biostudies-literature
| S-EPMC10249100 | biostudies-literature
| S-EPMC7900290 | biostudies-literature