Dataset Information

Identifying Circular RNA and Predicting Its Regulatory Interactions by Machine Learning.


ABSTRACT: Circular RNA (circRNA) is a long non-coding RNA (lncRNA) that forms a covalently closed loop through back-splicing. Emerging evidence indicates that circRNA can influence cellular physiology through various molecular mechanisms. Thus, accurate circRNA identification and prediction of its regulatory information are critical for understanding its biogenesis. Although several machine learning-based computational tools have been proposed for circRNA identification, their prediction accuracy leaves room for improvement. Here, we first present circLGB, a machine learning-based framework that discriminates circRNA from other lncRNAs. circLGB integrates commonly used sequence-derived features with three new features: adenosine-to-inosine (A-to-I) deamination, A-to-I density, and the internal ribosome entry site. circLGB classifies circRNAs using a LightGBM classifier with feature selection. Second, we introduce circMRT, an ensemble machine learning framework that systematically predicts the regulatory information of circRNA, including its interactions with microRNAs and RNA-binding proteins, and its transcriptional regulation. circMRT models feature sets comprising sequence-based features, graph features, genome context, and regulatory information features. Experiments on public datasets and on datasets we constructed show that the proposed algorithms outperform available state-of-the-art methods. circLGB is available at http://www.circlgb.com. Source code is available at https://github.com/Peppags/circLGB-circMRT.
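
As a rough illustration of what "sequence-derived features" can look like, the sketch below computes k-mer frequencies, GC content, and length for one RNA sequence. The choice of these features and the function names (`kmer_frequencies`, `gc_content`, `sequence_features`) are assumptions made for illustration only; the abstract does not list circLGB's actual feature set, and this is not the authors' implementation.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3):
    """Return the normalized count of every k-mer over the alphabet ACGU.

    Windows containing other symbols (for example N) still count toward
    the total, so the returned frequencies may then sum to less than 1.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    total = max(len(seq) - k + 1, 0)     # number of sliding windows
    counts = Counter(seq[i:i + k] for i in range(total))
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}


def gc_content(seq):
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sequence_features(seq, k=3):
    """Combine the illustrative features into one flat dictionary."""
    features = kmer_frequencies(seq, k)
    features["gc_content"] = gc_content(seq)
    features["length"] = len(seq)
    return features


if __name__ == "__main__":
    example = sequence_features("AUGGCUAUGG", k=2)
    print(example["AU"], example["gc_content"], example["length"])
```

A dictionary like this, computed per transcript, could be stacked into a feature matrix and passed to a gradient-boosting classifier such as LightGBM. The A-to-I and internal ribosome entry site features named in the abstract would need external annotations and are not sketched here.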

SUBMITTER: Zhang G 

PROVIDER: S-EPMC7396586 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature

Publications

Identifying Circular RNA and Predicting Its Regulatory Interactions by Machine Learning.

Zhang Guishan, Deng Yiyun, Liu Qingyu, Ye Bingxu, Dai Zhiming, Chen Yaowen, Dai Xianhua

Frontiers in Genetics, 2020-07-21


Similar Datasets

2023-06-01 | GSE193400 | GEO
| S-EPMC8268592 | biostudies-literature
| PRJNA796028 | ENA
| S-EPMC10805179 | biostudies-literature
| S-EPMC8413337 | biostudies-literature
2021-07-09 | GSE163896 | GEO
| S-EPMC9411552 | biostudies-literature
| S-EPMC2217580 | biostudies-literature
| S-EPMC8578642 | biostudies-literature
| S-EPMC9038712 | biostudies-literature