Dataset Information

Experimental annotation of post-translational features and translated coding regions in the pathogen Salmonella Typhimurium.


ABSTRACT: Complete and accurate genome annotation is crucial for comprehensive and systematic studies of biological systems. However, determining protein-coding genes for most new genomes is almost completely performed by inference using computational predictions with significant documented error rates (>15%). Furthermore, gene prediction programs provide no information on biologically important post-translational processing events critical for protein function. We experimentally annotated the bacterial pathogen Salmonella Typhimurium 14028, using "shotgun" proteomics to accurately uncover the translational landscape and post-translational features. The data provide protein-level experimental validation for approximately half of the predicted protein-coding genes in Salmonella and suggest revisions to several genes that appear to have incorrectly assigned translational start sites, including a potential novel alternate start codon. Additionally, we uncovered 12 non-annotated genes missed by gene prediction programs, as well as evidence suggesting a role for one of these novel ORFs in Salmonella pathogenesis. We also characterized post-translational features in the Salmonella genome, including chemical modifications and proteolytic cleavages. We find that bacteria have a much larger and more complex repertoire of chemical modifications than previously thought, including several novel modifications. Our in vivo proteolysis data identified more than 130 signal peptide and N-terminal methionine cleavage events critical for protein function. This work highlights several ways in which application of proteomics data can improve the quality of genome annotations to facilitate novel biological insights and provides a comprehensive proteome map of Salmonella as a resource for systems analysis.

SUBMITTER: Ansong C 

PROVIDER: S-EPMC3174948 | biostudies-literature | 2011 Aug

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC2926782 | biostudies-literature
2010-07-07 | GSE22625 | GEO
2010-07-07 | E-GEOD-22625 | biostudies-arrayexpress
| S-EPMC3207942 | biostudies-literature
| S-EPMC5977906 | biostudies-literature
| S-EPMC7275070 | biostudies-literature
| S-EPMC4481266 | biostudies-literature
| S-EPMC98782 | biostudies-literature
2023-12-15 | GSE119965 | GEO
2023-12-14 | GSE119967 | GEO