Learning the sequence code for mRNA and protein abundance in human immune cells
ABSTRACT: mRNA and protein abundance are defined by transcriptional and post-transcriptional regulatory mechanisms. Here, we develop a machine learning pipeline, termed SONAR, to decipher the endogenous sequence code that determines mRNA and protein abundance in human cells. SONAR models predict up to 62% of mRNA and 63% of protein abundance independent of promoter or enhancer information, and reveal a strong, yet dynamic, cell-type-specific sequence code. We also find that the effect of sequence features depends on their location within the mRNA transcript. Using SONAR, we design synthetic 3′ UTRs with which protein expression levels can be manipulated and tailored to a specific cell type. Beyond its fundamental findings, our work provides novel means to improve immunotherapies and biotechnology applications.
ORGANISM(S): Homo sapiens
PROVIDER: GSE240919 | GEO | 2023/09/20
REPOSITORIES: GEO