Determination and Inference of Eukaryotic Transcription Factor Sequence Specificity
ABSTRACT: The DNA sequence preferences of the vast majority of eukaryotic transcription factors (TFs) are unknown. Using an approach designed to broadly sample both DNA-binding domain types and eukaryotic clades, we have determined DNA-binding motifs for 1,033 TFs from 131 diverse eukaryotes, encompassing 54 domain types. Closely related orthologs and paralogs typically have very similar sequence preferences; this property allows inference of motifs for roughly one third of the 166,851 known or predicted eukaryotic TFs. While the origins of most motifs can be dated to hundreds of millions of years ago, we also characterize more recent TF expansions. Sequences matching the motifs are enriched upstream of transcription start sites (TSSs) in most eukaryotic lineages, and at informative eQTL SNPs in Arabidopsis promoters, demonstrating their utility in mapping transcriptional networks. The motifs are housed at http://cisbp.ccbr.utoronto.ca.
ORGANISM(S): synthetic construct
PROVIDER: GSE53348 | GEO | 2014/08/01
SECONDARY ACCESSION(S): PRJNA232033
REPOSITORIES: GEO