Dataset Information

Deep neural networks for inferring binding sites of RNA-binding proteins by using distributed representations of RNA primary sequence and secondary structure.


ABSTRACT:

Background

RNA-binding proteins (RBPs) play a vital role in post-transcriptional processes in all eukaryotes, such as splicing regulation, mRNA transport, and modulation of mRNA translation and decay. Identifying RBP binding sites is a crucial step in understanding the biological mechanisms of post-transcriptional gene regulation. However, determining RBP binding sites on a large scale is challenging because of the high cost of biochemical assays. A number of studies have used machine learning methods to predict binding sites. In particular, deep learning is increasingly used in bioinformatics because of its ability to learn generalized representations from DNA and protein sequences.

Results

In this paper, we implemented a novel deep neural network model, DeepRKE, which combines primary RNA sequence and secondary structure information to predict RBP binding sites. Specifically, we used a word embedding algorithm to extract features of RNA sequences and secondary structures, i.e., distributed representations of k-mer sequences rather than traditional one-hot encoding. The distributed representations are taken as input to convolutional neural networks (CNN) and bidirectional long short-term memory networks (BiLSTM) to identify RBP binding sites. Our results show that DeepRKE outperforms existing counterpart methods on two large-scale benchmark datasets.
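
As a rough illustration of the first step described above, the sketch below splits an RNA string into overlapping k-mers and maps each one to an integer index, which is the kind of input a word embedding layer consumes. It is not taken from the DeepRKE source: the function names, k = 3, stride = 1, and the example sequence are all assumptions chosen for illustration.

```python
from itertools import product


def kmers(seq, k=3, stride=1):
    """Split a sequence into overlapping k-mers (k and stride are illustrative)."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(alphabet, k):
    """Map every possible k-mer over `alphabet` to an integer index."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}


# Hypothetical example sequence, not from the benchmark datasets.
rna = "AUGGCUACG"
tokens = kmers(rna, k=3)          # overlapping 3-mers of the sequence
vocab = build_vocab("ACGU", k=3)  # 4**3 = 64 possible 3-mers
ids = [vocab[t] for t in tokens]  # integer ids an embedding layer could look up

print(tokens)
print(ids)
```

The same tokenization could in principle be applied to a secondary structure annotation string by passing its alphabet to `build_vocab`; which alphabet and k the authors actually use is not stated in this abstract.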

Conclusions

Our extensive experimental results show that DeepRKE is an effective tool for predicting RBP binding sites. The distributed representations of RNA sequences and secondary structures capture latent relationships and similarities between k-mers, and thus improve predictive performance. The source code of DeepRKE is available at https://github.com/youzhiliu/DeepRKE/ .
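
To make the point about similarity between k-mers concrete, the following minimal sketch contrasts one-hot vectors with dense vectors using cosine similarity. The dense vectors are invented toy values, not embeddings learned by DeepRKE; the only claim illustrated is that one-hot vectors of distinct k-mers are always orthogonal, whereas dense vectors can place some k-mers closer together than others.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def one_hot(index, size):
    """One-hot vector with a single 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# One-hot encoding: any two distinct k-mers have similarity 0.
sim_one_hot = cosine(one_hot(0, 4), one_hot(1, 4))

# Hypothetical 2-dimensional dense vectors (toy values for illustration only).
dense = {"AUG": [0.9, 0.1], "AUA": [0.8, 0.3], "CCC": [-0.7, 0.6]}
sim_close = cosine(dense["AUG"], dense["AUA"])
sim_far = cosine(dense["AUG"], dense["CCC"])

print(sim_one_hot, sim_close, sim_far)
```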

SUBMITTER: Deng L 

PROVIDER: S-EPMC7745412 | biostudies-literature | 2020 Dec

REPOSITORIES: biostudies-literature

Publications

Deep neural networks for inferring binding sites of RNA-binding proteins by using distributed representations of RNA primary sequence and secondary structure.

Deng Lei, Liu Youzhi, Shi Yechuan, Zhang Wenhao, Yang Chun, Liu Hui

BMC Genomics, 2020 Dec 17, Suppl 13


Similar Datasets

| S-EPMC7050519 | biostudies-literature
| S-EPMC5902551 | biostudies-literature
| S-EPMC6075788 | biostudies-literature
| S-EPMC6294148 | biostudies-literature
| S-EPMC6859861 | biostudies-literature
| S-EPMC6881452 | biostudies-literature
| S-EPMC7614754 | biostudies-literature
| S-EPMC8814036 | biostudies-literature
| S-EPMC6605414 | biostudies-literature
| S-EPMC10690204 | biostudies-literature