Project description: This dataset presents both known and unknown extracellular proteins from 13 species of lactic acid bacteria (LAB) found in the honey crop of the honeybee Apis mellifera mellifera.

The tryptic peptides from the secreted proteins were separated on an Agilent HPLC using a C18 reverse-phase column (75 µm x 150 mm, particle size 3 µm). Total run time was 90 min at a flow rate of 300 nL/min. The gradient buffers were 0.1% formic acid in water (buffer A) and 0.1% formic acid in acetonitrile (buffer B). The gradient program was 5% buffer B for 5 min, followed by a linear gradient of 5%-45% buffer B over 50 min and a linear gradient of 45%-80% buffer B over 5 min. Buffer B was then held at 80% for 15 min and rapidly returned to 5% for the final 15 min. The HPLC fractions were loaded on an LCQ Deca XP Plus ion trap mass spectrometer (Thermo Scientific).

Genomic DNA was prepared from all 13 LAB strains described earlier and sequenced at MWG Eurofins Operon (Ebersberg, Germany) using GS FLX Titanium technology from Roche (Basel, Switzerland). For each genome, a shotgun library was constructed and sequenced in 2 x half segments of a full FLX+ run, with up to 700,000 reads per segment. An 8 kbp long paired-end library was also constructed for each genome. Approximately 300,000 true paired-end reads, sequence tags, and scaffolds were generated with GS FLX+ chemistry using 2 x half segments of a full run. Clonal amplification was performed by emPCR for both library types. Sequencing was continued until 15- to 20-fold coverage was reached. The reads obtained were assembled with Newbler 2.6 from Roche (Basel, Switzerland).

ORF prediction and automated annotation were performed at Integrated Genomics Assets Inc. (Mount Prospect, Illinois, USA). Three different programs were used for ORF prediction: GLIMMER, Critica, and Prokpeg. Automated annotation was performed with the ERGO™ algorithms (Integrated Genomics Assets Inc., Mount Prospect, Illinois, USA).

The mass spectra files obtained from the mass spectrometry analysis were searched using MASCOT against a local database containing the predicted proteome of the 13 LAB. An ion score cutoff of 38 was used as the threshold for considering a protein identified. Individual ion scores greater than 38 indicated identity or extensive homology (p<0.05) of the protein. Protein sequence similarity searches were performed with BLASTP from the BLAST 2.2.27+ software package against a non-redundant protein database at NCBI, and with Pfam (default database) and InterProScan (default databases). Expressed proteins identified by peptide mass fingerprinting were manually re-annotated.
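
The HPLC gradient program described above can be written down as a table of breakpoints, which also makes it easy to check that the segments add up to the stated 90 min run time. The following is a minimal Python sketch, not the instrument method itself: the names `GRADIENT` and `percent_b` are illustrative, and the "rapid" return to 5% buffer B is assumed to be an instantaneous step at 75 min because the description gives no duration for it.

```python
# Breakpoints as (time in min, % buffer B), taken from the description.
GRADIENT = [
    (0.0, 5.0),
    (5.0, 5.0),    # 5 min hold at 5% B
    (55.0, 45.0),  # 50 min linear ramp from 5% to 45% B
    (60.0, 80.0),  # 5 min linear ramp from 45% to 80% B
    (75.0, 80.0),  # 15 min hold at 80% B
    (75.0, 5.0),   # ASSUMPTION: return to 5% B modeled as an instant step
    (90.0, 5.0),   # final 15 min at 5% B
]


def percent_b(t_min: float) -> float:
    """Return % buffer B at time t_min by linear interpolation."""
    if not 0.0 <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-90 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t1 == t0:
            continue  # zero-width step, handled by the next segment
        if t0 <= t_min < t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return GRADIENT[-1][1]  # t_min equals the end of the run


if __name__ == "__main__":
    for t in (0, 30, 57.5, 70, 80, 90):
        print(f"{t:>5} min: {percent_b(t):.1f}% B")
```

Buffer A makes up the remainder at any time point, so its share is `100 - percent_b(t)`.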
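
The identification rule stated above (an individual ion score greater than 38) amounts to a simple filter over peptide matches. The sketch below illustrates that rule only; it does not read MASCOT output, and the `PeptideHit` record, the function name, and the example protein IDs, peptides, and scores are all hypothetical.

```python
from dataclasses import dataclass

ION_SCORE_CUTOFF = 38  # cutoff stated in the description


@dataclass
class PeptideHit:
    protein_id: str   # hypothetical identifier from the local database
    peptide: str      # matched tryptic peptide sequence
    ion_score: float  # individual MASCOT ion score


def identified_proteins(hits, cutoff=ION_SCORE_CUTOFF):
    """Return sorted IDs of proteins with at least one hit scoring above cutoff."""
    return sorted({h.protein_id for h in hits if h.ion_score > cutoff})


# Made-up example data, for illustration only.
example_hits = [
    PeptideHit("prot_A", "LSDGTVAEK", 52.3),
    PeptideHit("prot_A", "VNQELR", 21.0),
    PeptideHit("prot_B", "TPEELLK", 38.0),  # equal to the cutoff, so not counted
    PeptideHit("prot_C", "GFDAVLNR", 40.1),
]

if __name__ == "__main__":
    print(identified_proteins(example_hits))
```

The comparison is strict because the description refers to scores "greater than 38"; a score of exactly 38 is therefore treated as not identified in this sketch.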