Project description: Bottom-up proteomics database search algorithms used for peptide identification cannot comprehensively identify posttranslational modifications (PTMs) in a single pass because of high false discovery rates (FDRs). A new approach to database searching enables Global PTM (G-PTM) identification by exclusively looking for curated PTMs, thereby avoiding the FDR penalty experienced during conventional variable modification searches. We identified nearly 2500 unique, high-confidence modified peptides comprising 31 different PTM types in single-pass database searches.
Project description: Open (mass-tolerant) search of tandem mass spectra shows great potential for the comprehensive detection of post-translational modifications in shotgun proteomics. However, this search strategy has not been widely adopted by the community, and one bottleneck is the lack of appropriate algorithms for automated and reliable post-processing of the coarse and error-prone search results. Here we present PTMiner, a software tool for confident filtering and localization of modifications (mass shifts) identified by open search. After mass-shift-grouped FDR control of peptide-spectrum matches (PSMs), PTMiner uses an empirical Bayesian method to localize modifications through iterative learning of the prior probabilities of each type of modification occurring on different amino acids. In validation experiments on a large data set of simulated spectra, PTMiner effectively controlled the FDRs of individual modification groups and achieved a total spectral identification rate four times higher than the classic FDR estimation method. At a 1% real false localization rate (FLR), PTMiner localized 93.06% of the modifications, far more than two existing open search engines and the extended Ascore localization algorithm. We then used PTMiner to analyze the draft map of the human proteome, containing 25 million spectra from 30 tissues, and confidently identified over 1.7 million modified PSMs at 1% FDR and 1% FLR, which provided a system-wide view of both known and unknown modifications in the human proteome.
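The idea of mass-shift-grouped FDR control mentioned in the description above can be illustrated with a minimal sketch. This is not PTMiner's implementation: the function name `grouped_fdr_filter`, the PSM dictionary keys, the fixed-width mass-shift binning, and the simple decoys/targets FDR estimate are all assumptions made for illustration. The point it shows is that each mass-shift group receives its own score cutoff, so a group dominated by random matches cannot borrow confidence from a well-supported group.

```python
from collections import defaultdict


def grouped_fdr_filter(psms, fdr_threshold=0.01, bin_width=0.01):
    """Filter PSMs with a separate target-decoy FDR cutoff per mass-shift group.

    psms: list of dicts with hypothetical keys 'score' (higher is better),
    'mass_shift' (Da) and 'is_decoy' (bool).
    Returns the accepted target PSMs.
    """
    # Assumption: PSMs are grouped by rounding the mass shift to a fixed bin.
    groups = defaultdict(list)
    for psm in psms:
        groups[round(psm["mass_shift"] / bin_width)].append(psm)

    accepted = []
    for members in groups.values():
        members.sort(key=lambda p: p["score"], reverse=True)
        targets = decoys = 0
        cutoff = 0  # number of top-ranked PSMs kept in this group
        for rank, psm in enumerate(members, start=1):
            if psm["is_decoy"]:
                decoys += 1
            else:
                targets += 1
            # Estimated FDR at this rank: decoys / targets (an assumed estimator).
            if targets and decoys / targets <= fdr_threshold:
                cutoff = rank
        accepted.extend(p for p in members[:cutoff] if not p["is_decoy"])
    return accepted


# Toy data: one well-supported mass shift and one where a decoy outscores the target.
example_psms = [
    {"score": 10, "mass_shift": 15.99, "is_decoy": False},
    {"score": 9, "mass_shift": 15.99, "is_decoy": False},
    {"score": 8, "mass_shift": 15.99, "is_decoy": False},
    {"score": 1, "mass_shift": 15.99, "is_decoy": True},
    {"score": 10, "mass_shift": 42.01, "is_decoy": True},
    {"score": 5, "mass_shift": 42.01, "is_decoy": False},
]
kept = grouped_fdr_filter(example_psms)
```

In this toy input, the three target PSMs in the first group pass, while the second group is rejected entirely because its best-scoring match is a decoy. A pooled (ungrouped) FDR calculation would not make that distinction.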
Project description: Bottom-up proteomics database search algorithms used for peptide identification cannot comprehensively identify posttranslational modifications (PTMs) in a single pass because of high false discovery rates (FDRs). A new approach to database searching enables Global PTM (G-PTM) identification by exclusively looking for curated PTMs, thereby avoiding the FDR penalty experienced during conventional variable modification searches. We identified nearly 2500 unique, high-confidence modified peptides comprising 31 different PTM types in single-pass database searches. Male C57BL/6J (B6) and CAST/EiJ (CAST) mice were purchased from The Jackson Laboratories (Bar Harbor, Maine) and housed in an environmentally controlled vivarium at the University of Wisconsin Biochemistry Department. Mice were provided standard rodent chow (Purina no. 5008) and water ad libitum, and maintained on a 12-hour light/dark cycle (6 AM – 6 PM). At 10 weeks of age, mice were sacrificed by CO2 asphyxiation. All animal procedures were preapproved by the University of Wisconsin Animal Care and Use Committee.
Project description: Unraveling the complex structure and functioning of microbial communities is essential to accurately predict the impact of perturbations and/or environmental changes. Of all the molecular tools available today to resolve the dynamics of microbial communities, metaproteomics stands out, allowing the establishment of phenotype-genotype linkages. Despite its rapid development, this technology has faced many technical challenges that still hamper its potential power. How to maximize the number of protein identifications, improve the quality of protein annotation, and provide reliable ecological interpretation are questions of immediate urgency. In our study, we used a robust metaproteomic workflow combining two protein fractionation approaches (gel-based versus gel-free) and four protein search databases derived from the same metagenome to analyze the same seawater sample. The resulting eight metaproteomes provided different outcomes in terms of (i) total protein numbers, (ii) taxonomic structures, and (iii) protein functions. The characterization and/or representativeness of numerous proteins from ecologically relevant taxa such as Pelagibacterales, Rhodobacterales and Synechococcales, as well as crucial environmental processes such as nutrient uptake, nitrogen assimilation, light harvesting and oxidative stress response, were found to be particularly affected by the methodology. Our results provide clear evidence that the use of different protein search databases significantly alters the biological conclusions in both gel-free and gel-based approaches. Our findings emphasize the importance of diversifying the experimental workflow for a comprehensive metaproteomic study.
Project description: The small proteome has already been well explored in eukaryal and bacterial species, but so far, archaeal genomes have not been analysed broadly with a dedicated focus on small proteins. Here, we present a combinatorial approach, integrating experimental information from small protein-optimized mass spectrometry (MS) and ribosome profiling (Ribo-seq) to generate a high-confidence inventory of small proteins in the model archaeon Haloferax volcanii. Translation was demonstrated for 67% of the annotated small coding sequences by both methods. Annotation-independent data analysis allowed for the prediction of 47 sites of ribosomal engagement outside known coding regions by Ribo-seq, seven of which correspond to the eight unannotated small proteins identified by a similar independent analysis of proteomic data. We also present independent in vivo evidence for the translation of a subset of small proteins (comprising both previously annotated and newly identified ones), underlining the validity of our identification scheme. Moreover, several of these translated sORFs are conserved in Haloferax and might have important functions. Based on our findings, we conclude that the small proteome of H. volcanii is larger than previously expected and that the combined use of mass spectrometry to detect protein presence with Ribo-seq to inform on translation is a powerful tool for the discovery of new small protein-coding genes in diverse organisms. This dataset contains the search results obtained from an MSFragger search against a six-frame genome translation-derived database that were mapped to the genome by Stephan Fuchs' “Salt & Pepper” software suite for bacterial proteogenomics.
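For readers unfamiliar with the term, a six-frame translation of a genome, as used to build the search database mentioned above, can be sketched as follows. This is only an illustration and not the pipeline used in the project: the function names are hypothetical, the standard genetic code is assumed, and real proteogenomic databases typically also split each frame at stop codons and apply length filters, which is omitted here.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering (assumed here).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate three forward (+1..+3) and three reverse (-1..-3) frames."""
    dna = dna.upper()
    frames = {}
    for strand, strand_seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
```

Each of the six resulting strings is a candidate protein sequence source, with `*` marking stop codons. Searching spectra against such translations is what allows peptides from unannotated coding regions to be identified.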