
Dataset Information


DNA Motif Recognition Modeling from Protein Sequences.


ABSTRACT: Although existing work on DNA motif discovery from DNA sequences is plentiful, mechanistic knowledge for inferring DNA motifs from protein sequences across multiple DNA-binding domain families, without conducting any wet-lab experiments, is still lacking. Therefore, k-spectrum recognition modeling is proposed to address this problem at the highest possible resolution. The k-spectrum model can capture DNA motif patterns from protein sequences at a resolution in which local sequence context and nucleotide dependency are fully taken into account. Multiple evaluation metrics are measured on millions of k-mer binding intensities from 92 proteins across 5 DNA-binding families (i.e., bHLH, bZIP, ETS, Forkhead, and Homeodomain), demonstrating its competitive edge. In addition, the model can not only contribute to DNA motif recognition modeling but also help prioritize the observed or even unobserved binding of single nucleotide variants on transcription factor binding sites in a genome-wide manner.

SUBMITTER: Wong KC 

PROVIDER: S-EPMC6153143 | biostudies-literature | 2018 Sep

REPOSITORIES: biostudies-literature


Publications

DNA Motif Recognition Modeling from Protein Sequences.

Wong Ka-Chun

iScience, 2018-09-10



Similar Datasets

| S-EPMC1963340 | biostudies-literature
| S-EPMC3740630 | biostudies-literature
| S-EPMC2194741 | biostudies-literature
| S-EPMC5555490 | biostudies-literature
| S-EPMC1502558 | biostudies-literature
| S-EPMC2777381 | biostudies-literature
| S-EPMC4301712 | biostudies-literature
| S-EPMC6774301 | biostudies-literature
| S-EPMC5390925 | biostudies-literature
| S-EPMC4243042 | biostudies-literature