Sequence basis of transcription initiation in human genome
ABSTRACT: Transcription initiation is an essential process for ensuring proper function of any gene; however, we still lack a unified understanding of the sequence patterns and rules that explain most transcription initiation sites in the human genome. By explaining transcription initiation at basepair resolution from sequence with a deep learning-inspired explainable modeling approach, we show here that simple rules can explain the vast majority of human promoters. We identified key sequence patterns that contribute to human promoter function, each activating transcription with a distinct position-specific effect curve that likely reflects its mechanism of promoting transcription initiation. Most of these position-specific effects have not been previously characterized, and we verified them using experimental perturbations of transcription factor binding sequences. We revealed the sequence basis of bidirectional transcription at promoters and the links between promoter selectivity and gene expression variation across cell types. Additionally, by analyzing 241 mammalian genomes and mouse transcription initiation site data, we showed that the sequence determinants are conserved across mammalian species. Taken together, we provide a unified model for the sequence basis of transcription initiation at basepair resolution that is broadly applicable across mammalian species and sheds new light on fundamental questions related to promoter sequence and function.
ORGANISM(S): Homo sapiens
PROVIDER: GSE248771 | GEO | 2024/07/10
REPOSITORIES: GEO