Dataset Information

Deep learning identifies genome-wide DNA binding sites of long noncoding RNAs.


ABSTRACT: Long noncoding RNAs (lncRNAs) can exert their function by interacting with DNA via triplex structure formation. Although this has been validated in a handful of experiments, a genome-wide analysis of lncRNA-DNA binding is still needed. In this paper, we develop and interpret deep learning models that predict the genome-wide binding sites, as determined by ChIRP-Seq experiments, of 12 different lncRNAs. Among the several deep learning architectures tested, a simple architecture consisting of two convolutional neural network layers performed best, suggesting that local sequence patterns are determinants of the interaction. Further interpretation of the kernels in the model revealed that these local sequence patterns form triplex structures with the corresponding lncRNAs. We uncovered several novel triplex-forming domains (TFDs) of these 12 lncRNAs and recovered previously experimentally verified TFDs of the lncRNAs HOTAIR and MEG3. We experimentally verified two such novel TFDs, of the lncRNAs HOTAIR and TUG1, predicted by our method (but previously unreported) using electrophoretic mobility shift assays. In conclusion, we show that a simple deep learning architecture can accurately predict genome-wide binding sites of lncRNAs, and that interpretation of the models suggests RNA:DNA:DNA triplex formation as a viable mechanism underlying lncRNA-DNA interactions at the genome-wide level.
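The abstract describes the best-performing model only as "two convolutional neural network layers" operating on DNA sequence. As a rough illustration of that idea, the sketch below runs a forward pass of a two-layer 1D convolutional scorer over a one-hot encoded sequence in plain Python. Everything specific here is an assumption made for illustration and is not taken from the paper: the filter counts and widths, the ReLU and global max-pooling steps, the sigmoid output, the random (untrained) weights, and the names `one_hot`, `conv1d_relu`, `random_kernels`, and `predict_binding`.

```python
import math
import random

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T).
    Any other character (e.g. N) becomes an all-zero row."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]


def conv1d_relu(x, kernels, biases):
    """'Valid' 1D convolution followed by ReLU.
    x: list of L rows with C_in channels each.
    kernels: list of filters, each K rows by C_in channels.
    Returns L - K + 1 rows with one value per filter."""
    width = len(kernels[0])
    out = []
    for start in range(len(x) - width + 1):
        row = []
        for kern, bias in zip(kernels, biases):
            total = bias
            for i in range(width):
                for c, weight in enumerate(kern[i]):
                    total += x[start + i][c] * weight
            row.append(max(0.0, total))
        out.append(row)
    return out


def random_kernels(n_filters, width, channels, rng):
    """Hypothetical helper: untrained filters drawn uniformly from [-0.5, 0.5]."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(channels)]
             for _ in range(width)]
            for _ in range(n_filters)]


def predict_binding(seq, params):
    """Two conv layers, global max pooling, then a sigmoid score in [0, 1]."""
    h = conv1d_relu(one_hot(seq), params["k1"], params["b1"])
    h = conv1d_relu(h, params["k2"], params["b2"])
    pooled = [max(column) for column in zip(*h)]  # one value per filter
    logit = sum(w * v for w, v in zip(params["w_out"], pooled)) + params["b_out"]
    return 1.0 / (1.0 + math.exp(-logit))


# Assumed toy sizes: 8 filters of width 8, then 8 filters of width 4.
rng = random.Random(0)
params = {
    "k1": random_kernels(8, 8, 4, rng),
    "b1": [0.0] * 8,
    "k2": random_kernels(8, 4, 8, rng),
    "b2": [0.0] * 8,
    "w_out": [rng.uniform(-0.5, 0.5) for _ in range(8)],
    "b_out": 0.0,
}

score = predict_binding("AGAGAGGGAGAGAAGGAGAGGGAGAAGGAG", params)
print(f"untrained binding score: {score:.3f}")
```

With trained weights, the first-layer filters play the role of the kernels the abstract says were interpreted as local sequence patterns; with the random weights used here the score carries no biological meaning.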

SUBMITTER: Wang F 

PROVIDER: S-EPMC6333433 | biostudies-literature | 2018

REPOSITORIES: biostudies-literature

Publications

Deep learning identifies genome-wide DNA binding sites of long noncoding RNAs.

Wang F, Chainani P, White T, Yang J, Liu Y, Soibam B

RNA Biology, 2018-11-28, issue 12


Similar Datasets

| S-EPMC6451187 | biostudies-literature
| S-EPMC8163536 | biostudies-literature
2019-01-15 | GSE119638 | GEO
| PRJNA489807 | ENA
| S-EPMC9490101 | biostudies-literature
| S-EPMC9849350 | biostudies-literature
| S-EPMC4428586 | biostudies-literature
| S-EPMC8774129 | biostudies-literature
| S-EPMC5669865 | biostudies-literature
| S-EPMC5566112 | biostudies-other