
Dataset Information


Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model.


ABSTRACT:

Motivation

In recent years, multiple circular RNA (circRNA) biogenesis mechanisms have been discovered. Although each reported mechanism has been experimentally verified in different circRNAs, no single mechanism has been proposed that can universally explain the biogenesis of all of the tens of thousands of discovered circRNAs. Under the hypothesis that human circRNAs can be categorized according to different biogenesis mechanisms, we designed a contextual regression model trained to predict the formation of circular RNA from a random genomic locus in the human genome, with potential biogenesis factors of circular RNA as the features of the training data.

Results

After achieving high prediction accuracy, we found through the feature extraction technique that the examined human circRNAs can be categorized into seven subgroups, according to the presence of the following sequence features within the flanking regions of the circular RNA back-spliced junction sites: RNA editing sites, simple repeat sequences, self-chains, RNA-binding protein binding sites and CpG islands. These results support all of the previously reported biogenesis mechanisms of circRNA and solidify the idea that multiple biogenesis mechanisms co-exist for different subsets of human circRNAs. Furthermore, we uncover a potential new link between circRNA biogenesis and flanking CpG islands. We have also identified RNA-binding proteins putatively correlated with circRNA biogenesis.
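
As a rough illustration of how such a categorization could work, the sketch below assigns each locus to the feature with the largest absolute contribution. It assumes the trained model exposes one linear weight per feature per locus, which is how contextual regression is generally described. This is not the authors' implementation: the function name, the shortened feature labels and the toy numbers are all hypothetical, and only the five feature types come from the abstract.

```python
# Feature types named in the abstract; everything else here is hypothetical.
FEATURES = [
    "RNA editing sites",
    "simple repeats",
    "self-chains",
    "RBP binding sites",
    "CpG islands",
]


def dominant_feature_groups(x, local_weights):
    """Assign each locus to the feature with the largest |weight * value|.

    x:             one list of feature values per locus
    local_weights: one list of per-locus linear weights, assumed to be
                   what a contextual regression model would emit
    """
    groups = []
    for values, weights in zip(x, local_weights):
        # Contribution of each feature to this locus's linear prediction.
        contributions = [abs(v * w) for v, w in zip(values, weights)]
        # Index of the largest contribution (first one wins on ties).
        top = max(range(len(contributions)), key=contributions.__getitem__)
        groups.append(FEATURES[top])
    return groups


# Toy, made-up inputs: three loci, five features each.
x = [
    [1.0, 0.0, 0.0, 2.0, 0.0],
    [0.0, 3.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 4.0],
]
w = [
    [0.5, 0.1, 0.0, 0.1, 0.0],
    [0.0, 0.2, 0.3, 0.0, 0.9],
    [0.1, 0.0, 0.0, 0.2, 0.3],
]
groups = dominant_feature_groups(x, w)
```

With these toy inputs, the first locus falls into the RNA editing group and the other two into the CpG island group. A real analysis would more likely cluster the full contribution vectors than take a single maximum.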

Availability and implementation

Scripts and a tutorial are available at http://wanglab.ucsd.edu/star/circRNA. The program is released under the GNU General Public License v3.0.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Liu C 

PROVIDER: S-EPMC6901070 | biostudies-literature | 2019 Dec

REPOSITORIES: biostudies-literature


Publications

Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model.

Liu Chengyu, Liu Yu-Chen, Huang Hsien-Da, Wang Wei

Bioinformatics (Oxford, England), 2019-12-01, issue 23



Similar Datasets

| S-EPMC10502833 | biostudies-literature
| S-EPMC4898734 | biostudies-literature
| S-EPMC4479058 | biostudies-literature
| S-EPMC7396586 | biostudies-literature
| S-EPMC7064494 | biostudies-literature
| S-EPMC9601423 | biostudies-literature
| S-EPMC7038676 | biostudies-literature
| S-EPMC8505021 | biostudies-literature
| S-EPMC7374320 | biostudies-literature
| S-EPMC11082072 | biostudies-literature