Dataset Information

A generative model for constructing nucleic acid sequences binding to a protein.


ABSTRACT: BACKGROUND: Interactions between protein and nucleic acid molecules are essential to a variety of cellular processes. The large amount of interaction data generated by high-throughput technologies has triggered the development of several computational methods, either to predict binding sites in a sequence or to determine whether a pair of sequences interacts. Most of these methods treat the interaction of nucleic acids with proteins as a classification problem rather than a generation problem. RESULTS: We developed a generative model for constructing single-stranded nucleic acids binding to a target protein using a long short-term memory (LSTM) neural network. Experimental results of the generative model are promising in the sense that DNA and RNA sequences generated by the model for several target proteins show high specificity and that motifs present in the generated sequences are similar to known protein-binding motifs. CONCLUSIONS: Although these are preliminary results of our ongoing research, our approach can be used to generate nucleic acid sequences binding to a target protein. In particular, it will help design efficient in vitro experiments by constructing an initial pool of potential aptamers that bind to a target protein with high affinity and specificity.

SUBMITTER: Im J 

PROVIDER: S-EPMC6933682 | biostudies-literature | 2019 Dec

REPOSITORIES: biostudies-literature

Publications

A generative model for constructing nucleic acid sequences binding to a protein.

Im Jinho, Park Byungkyu, Han Kyungsook

BMC Genomics, 2019-12-27, Suppl 13


Similar Datasets

| S-EPMC7797056 | biostudies-literature
| S-EPMC4551922 | biostudies-literature
2020-01-02 | GSE93053 | GEO
| S-EPMC4245949 | biostudies-literature
| S-EPMC4271565 | biostudies-literature
| S-EPMC6728137 | biostudies-literature
| S-EPMC3398023 | biostudies-literature
| S-EPMC3258132 | biostudies-literature
2020-01-02 | GSE93052 | GEO
2020-01-02 | GSE93051 | GEO