
Dataset Information


N-terminal Proteomics Assisted Profiling of the Unexplored Translation Initiation Landscape in Arabidopsis thaliana.


ABSTRACT: Proteogenomics is an emerging research field that still lacks a uniform method of analysis. Proteogenomic studies that combine N-terminal proteomics and ribosome profiling suggest that a high number of protein start sites are currently missing from genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein-coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and in computational translations of sequences stored in public repositories. Most interestingly, complementary evidence from ribosome profiling was found for 23 protein N termini. Finally, an in silico analysis of protein N-terminal peptides demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly annotated genomes.

SUBMITTER: Willems P 

PROVIDER: S-EPMC5461538 | biostudies-literature | 2017 Jun

REPOSITORIES: biostudies-literature


Publications

N-terminal Proteomics Assisted Profiling of the Unexplored Translation Initiation Landscape in Arabidopsis thaliana.

Willems P, Ndah E, Jonckheere V, Stael S, Sticker A, Martens L, Van Breusegem F, Gevaert K, Van Damme P

Molecular & Cellular Proteomics (MCP), 2017-04-21, issue 6



Similar Datasets

2016-11-25 | GSE88790 | GEO
2017-04-24 | PXD004896 | Pride
| S-EPMC4014282 | biostudies-literature
| S-EPMC84964 | biostudies-literature
| S-EPMC2742825 | biostudies-literature
| S-EPMC6473545 | biostudies-literature
| S-EPMC3282757 | biostudies-literature
| S-EPMC10712242 | biostudies-literature
| S-EPMC193643 | biostudies-literature
| S-EPMC3313057 | biostudies-literature