Project description: Helicobacter pylori, a known pathogen in various gastric diseases, has many genome sequence variants. This diversity is part of the reason why the pathogenesis and infection mechanisms of H. pylori-driven gastric diseases have not yet been well clarified. Here we performed a large-scale proteome analysis to profile the heterogeneity of protein expression across 7 H. pylori strains, using an LC/MS/MS-based proteomics approach combined with a customized database of non-redundant tryptic peptide sequences derived from the full genome sequences of 52 H. pylori strains. Compared with a simply merged protein database, the non-redundant peptide database enabled us to identify more peptides in the database search of MS/MS data. Using this approach, we analyzed the proteomes of H. pylori strains with unknown genomes at as large a scale as those of strains with known genomes. Clustering of the H. pylori strains based on proteome profiles differed slightly from clustering based on genome profiles and more clearly divided the strains into two groups according to the area where they were isolated. Furthermore, we identified phosphorylated proteins and phosphorylation sites in the H. pylori strains and obtained phosphorylation motifs located in the N-terminus, which are commonly observed in bacteria.
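
The idea of a non-redundant tryptic peptide database can be illustrated with a minimal sketch. This is not the authors' pipeline: the cleavage rule (cut after K or R unless followed by P), the minimum peptide length, the missed-cleavage setting, and all function and strain names below are assumptions chosen for illustration. Each protein from each strain is digested in silico, and identical peptide sequences are collapsed into a single entry that records which strains contain it.

```python
from collections import defaultdict


def cleave(protein):
    """Split a protein after K or R, except when the next residue is P.

    Assumption: this is a commonly used simplified trypsin rule, not
    necessarily the rule used for the original database.
    """
    pieces, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        pieces.append(protein[start:])  # C-terminal piece without K/R
    return pieces


def tryptic_peptides(protein, missed_cleavages=0, min_len=6):
    """Return the set of peptides with up to `missed_cleavages` missed sites."""
    pieces = cleave(protein)
    peptides = set()
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if len(peptide) >= min_len:
                peptides.add(peptide)
    return peptides


def build_peptide_db(strain_proteomes, missed_cleavages=0, min_len=6):
    """Map each unique peptide to the set of strains in which it occurs.

    `strain_proteomes` is {strain_name: {protein_id: sequence}}.
    """
    db = defaultdict(set)
    for strain, proteins in strain_proteomes.items():
        for sequence in proteins.values():
            for peptide in tryptic_peptides(sequence, missed_cleavages, min_len):
                db[peptide].add(strain)
    return dict(db)


if __name__ == "__main__":
    # Hypothetical toy sequences, not real H. pylori proteins.
    toy = {
        "strainA": {"p1": "MKTAYIAKQRPLLVEK"},
        "strainB": {"p1": "MKTAYIAKQRPLLVEK", "p2": "MKTAYIAKQRALLVEK"},
    }
    for peptide, strains in sorted(build_peptide_db(toy).items()):
        print(peptide, sorted(strains))
```

Because a peptide shared by many strains is stored once, the search space grows with the number of distinct peptides rather than with the number of strains, which is one plausible reason such a database could outperform a simply merged protein database.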