Unknown,Transcriptomics,Genomics,Proteomics

Dataset Information

K562 polyA RNA-Seq


ABSTRACT: RNA-Seq reads and TopHat (Trapnell et al. Bioinformatics 2009) alignments of the K562 cell-line transcriptome. These were used to validate the expression of short peptides identified by mass spectrometry in K562 cells. K562 polyA+ RNA (batch 1) and total RNA (batch 2) were purchased from Ambion. We used oligo(dT)-selected polyA+ RNA to construct libraries for RNA-Seq. We then profiled the polyadenylated transcriptome by RNA-Seq on Illumina sequencing platforms and used the sequenced reads to reconstruct the transcriptome with the Cufflinks de novo assembler (Trapnell et al. Nat. Biotechnol. 2010). Recent computational and ribosome profiling analyses suggest that many short open reading frames (sORFs) in eukaryotic genomes are translated. However, evidence that these sORFs produce stable polypeptides is lacking. Here we develop a strategy to discover and validate novel sORF-encoded polypeptides (SEPs) in human cells. In total, we detect 117 SEPs, 114 of which are novel, varying in length from 15 to 149 amino acids. Of these, 10 SEPs (0.5%) are derived from long intergenic non-coding RNAs (lincRNAs). We also observe the presence of polycistronic genes and the widespread use of non-AUG start codons, a phenomenon historically thought to be rare in the mammalian genome. Quantitative measurements reveal that SEPs can be found at concentrations of ~10-2000 copies per cell, which is within the range of typical cellular proteins. We confirm the translation of a number of these SEPs through heterologous expression of their encoding cDNAs. We also discover that several SEPs possess properties characteristic of functional proteins. These results demonstrate that human sORFs produce numerous stable polypeptides, revealing that the human proteome is larger and more diverse than previously appreciated.

ORGANISM(S): Homo sapiens

SUBMITTER: Nataly Cabili 

PROVIDER: E-GEOD-34740 | biostudies-arrayexpress

REPOSITORIES: biostudies-arrayexpress

Publications

Peptidomic discovery of short open reading frame-encoded peptides in human cells.

Slavoff SA, Mitchell AJ, Schwaid AG, Cabili MN, Ma J, Levin JZ, Karger AD, Budnik BA, Rinn JL, Saghatelian A

Nature Chemical Biology, 2012-11-18, issue 1


The complete extent to which the human genome is translated into polypeptides is of fundamental importance. We report a peptidomic strategy to detect short open reading frame (sORF)-encoded polypeptides (SEPs) in human cells. We identify 90 SEPs, 86 of which are previously uncharacterized, which is the largest number of human SEPs ever reported. SEP abundances range from 10-1,000 molecules per cell, identical to abundances of known proteins. SEPs arise from sORFs in noncoding RNAs as well as mul...

Similar Datasets

2012-11-14 | GSE34740 | GEO
| PRJNA150405 | ENA
| PRJNA607080 | ENA
2018-06-15 | PXD008586 | Pride
2023-11-21 | PXD041421 | Pride
2012-05-09 | E-GEOD-36026 | biostudies-arrayexpress
2011-06-27 | E-GEOD-22800 | biostudies-arrayexpress
2012-05-10 | GSE37909 | GEO
| PRJNA876576 | ENA
2023-02-14 | GSE206492 | GEO