ChIP-nexus data for Oct4, Sox2, Nanog and Klf4 in mouse embryonic stem cells
ABSTRACT: The goal of this study was to discover the transcription factor binding syntax for the key pluripotency TFs in mouse embryonic stem cells. Genes are regulated through enhancer sequences, in which transcription factor binding motifs and their specific arrangements (syntax) form a cis-regulatory code. To understand the relationship between motif syntax and transcription factor binding, we train a deep learning model that uses DNA sequence to predict base-resolution binding profiles of four pluripotency transcription factors: Oct4, Sox2, Nanog, and Klf4. We interpret the model to accurately map hundreds of thousands of motifs in the genome, learn novel motif representations, and identify rules by which motifs and syntax influence transcription factor binding. We find that instances of strict motif spacing are largely due to retrotransposons, but that soft motif syntax influences motif interactions at protein and nucleosome range. Most strikingly, Nanog binding is driven by motifs with a strong preference for ~10.5 bp spacings corresponding to helical periodicity. Interpreting deep learning models applied to high-resolution binding data is a powerful and versatile approach to uncover the motifs and syntax of cis-regulatory sequences.
ORGANISM(S): Mus musculus
PROVIDER: GSE137193 | GEO | 2020/11/21
REPOSITORIES: GEO