Proteomics

Dataset Information

Long noncoding RNA (lncRNA), smORF-encoded polypeptides (SEPs), NONCODE database, enrichment, mass spectrometry


ABSTRACT: Many small open reading frames (smORFs) embedded in lncRNA transcripts have been shown to encode biologically functional polypeptides (smORF-encoded polypeptides, SEPs) in different organisms. Despite significant advances in genomics, bioinformatics and proteomics that have largely enabled the discovery of novel SEPs, their identification across different biological samples is still hampered by their poor predictability, diminutive size and low relative abundance. Here, we take advantage of NONCODE, a repository containing the most complete collection and annotation of lncRNA transcripts from different species, to build a novel database that attempts to maximize the collection of SEPs from human and mouse lncRNA transcripts. To further improve SEP discovery, we implemented two effective and complementary polypeptide enrichment strategies: a 30 kDa MWCO filter and a C8 SPE column. These combined strategies enabled us to discover 357 SEPs from 8 human cell lines and 409 SEPs from 3 mouse cell lines and 8 mouse tissues. Importantly, nineteen of the identified SEPs were then verified through in vitro expression, immunoblotting, parallel reaction monitoring (PRM) and synthetic peptides. Subsequent bioinformatic analysis revealed that some of the physical and chemical properties of these novel SEPs, including amino acid composition and codon usage, differ from those commonly found in canonical proteins. Intriguingly, nearly 65% of the identified SEPs were found to be initiated with non-AUG start codons. Overall, the strategy presented in this study encompasses an efficient workflow that enabled us to identify 766 novel SEPs across multiple cell lines and tissues, which probably represents the largest number of SEPs detected by mass spectrometry reported to date. These novel SEPs might not only provide new clues for the annotation of noncoding elements in the genome but can also serve as a valuable resource for the functional characterization of individual SEPs.
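
To make the idea of a smORF-derived SEP database concrete, here is a minimal sketch of scanning one lncRNA transcript for candidate smORFs on the forward strand. It is an illustration only, not the pipeline used for this dataset. The function name `find_smorfs`, the near-cognate start set (CTG, GTG, TTG alongside ATG), the length bounds, and the choice to write the first residue as M even for non-AUG starts are all assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: ATG plus a few near-cognate codons, chosen only for illustration.
DEFAULT_STARTS = ("ATG", "CTG", "GTG", "TTG")


def find_smorfs(transcript, starts=DEFAULT_STARTS, min_aa=5, max_aa=100):
    """Return (start, end, start_codon, peptide) tuples for candidate smORFs.

    Coordinates are 0-based and half-open; `end` includes the stop codon.
    Only the longest ORF per stop codon is kept (leftmost in-frame start).
    """
    seq = transcript.upper().replace("U", "T")  # accept RNA or DNA letters
    seen_stops = set()
    hits = []
    for i in range(len(seq) - 2):
        codon = seq[i:i + 3]
        if codon not in starts:
            continue
        peptide = ["M"]  # assumption: initiation is written as Met for any start
        stop = None
        for j in range(i + 3, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[j:j + 3], "X")  # X for ambiguous bases
            if aa == "*":
                stop = j
                break
            peptide.append(aa)
        if stop is None or stop in seen_stops:
            continue  # no in-frame stop, or a longer ORF already uses this stop
        seen_stops.add(stop)
        if min_aa <= len(peptide) <= max_aa:
            hits.append((i, stop + 3, codon, "".join(peptide)))
    return hits


if __name__ == "__main__":
    # Toy transcript with a single CUG-initiated smORF.
    for hit in find_smorfs("GGCUGAAACCCGGGUUUUAAGG"):
        print(hit)
```

Peptides produced this way could be written to a FASTA file and used as a search database for MS/MS spectra. A real workflow would also need to handle the reverse strand where relevant, redundancy between transcripts, and false discovery control.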

INSTRUMENT(S): Q Exactive

ORGANISM(S): Homo sapiens (human), Mus musculus (mouse)

TISSUE(S): Spleen, Testis, Heart, Lung, Liver, Kidney

SUBMITTER: Qing Zhang  

LAB HEAD: Fuquan Yang

PROVIDER: PXD019486 | Pride | 2021-06-15

REPOSITORIES: Pride

Dataset's files

20161202_Hela_1_D.raw Raw
20161202_Hela_1_D_Merged.msf Msf
20161202_Hela_1_unDigestion.raw Raw
20161202_Hela_1_unDigestion_Merged.msf Msf
20161202_Hela_2_D.raw Raw
Showing files 1 - 5 of 314

Publications

Deeply Mining a Universe of Peptides Encoded by Long Noncoding RNAs.

Zhang Qing, Wu Erzhong, Tang Yiheng, Cai Tanxi, Zhang Lili, Wang Jifeng, Hao Yajing, Zhang Bao, Zhou Yue, Guo Xiaojing, Luo Jianjun, Chen Runsheng, Yang Fuquan

Molecular & Cellular Proteomics : MCP, 2021-06-12


Many small ORFs embedded in long noncoding RNA (lncRNA) transcripts have been shown to encode biologically functional polypeptides (small ORF-encoded polypeptides [SEPs]) in different organisms. Although some novel SEPs have been found, their identification is still hampered by their poor predictability, diminutive size, and low relative abundance. Here, we take advantage of NONCODE, a repository containing the most complete collection and annotation of lncRNA transcripts from different species, to ...

Similar Datasets

2023-05-06 | PXD016981 | Pride
2017-10-17 | PXD005643 | Pride
2015-03-05 | E-GEOD-59487 | biostudies-arrayexpress
2012-11-14 | E-GEOD-34740 | biostudies-arrayexpress
2015-03-05 | E-GEOD-66172 | biostudies-arrayexpress
2014-12-03 | E-GEOD-62032 | biostudies-arrayexpress
2021-01-11 | GSE164239 | GEO
2008-01-04 | E-TABM-404 | biostudies-arrayexpress
2021-07-16 | PXD024952 | Pride
2018-11-16 | E-MTAB-6203 | biostudies-arrayexpress