High-throughput functional analysis of lncRNA core promoters elucidates rules governing tissue specificity
ABSTRACT: Bidirectional transcription initiates at both coding and non-coding genomic elements, including mRNA and long non-coding RNA (lncRNA) promoters and enhancer RNAs (eRNAs). However, each class has a different tissue-specific expression profile, with lncRNAs and eRNAs being the most tissue-specific. How these complex differences in expression profiles and tissue specificities are encoded in a single DNA sequence remains an open question. Here, we address this question using multiple computational and experimental approaches, including massively parallel reporter assays (MPRAs). As most transcription factors (TFs) are enriched near the transcription start sites (TSSs) of both promoters and enhancers, we focus our analyses on these core promoter regions. We find that divergent lncRNA and mRNA core promoters have higher capacities to drive transcription than non-divergent lncRNA and mRNA core promoters, respectively. Conversely, lincRNAs and eRNAs are more tissue-specific than divergent genes. This higher tissue specificity is strongly associated with having less complex TF motif profiles at the core promoter. We confirm these findings using single-nucleotide deletions in MPRAs, and we identify specific TFs regulating a set of disease-related lncRNAs. Finally, we assess the effects of genetic variation at core promoters and find that 22% of common single-nucleotide polymorphisms show significant regulatory effects. Collectively, our findings characterize the important role of core promoter sequences in determining expression levels across both coding and non-coding gene classes and highlight an unexpected role of TF motif architecture in explaining the more restricted expression patterns of lncRNAs and eRNAs.
ORGANISM(S): synthetic construct, Homo sapiens
PROVIDER: GSE117594 | GEO | 2019/01/08
REPOSITORIES: GEO