Project description: The data set consists of three different sources. 1) All files named ecoli_* derive from a pure culture of Escherichia coli K-12 (MG1655). 2) All files named SIHUMI_standard_* derive from a mixed culture of 8 bacteria (SIHUMIx): Anaerostipes caccae (DSMZ 14662), Bacteroides thetaiotaomicron (DSMZ 2079), Bifidobacterium longum (NCC 2705), Blautia producta (DSMZ 2950), Clostridium butyricum (DSMZ 10702), Clostridium ramosum (DSMZ 1402), Escherichia coli K-12 (MG1655) and Lactobacillus plantarum (DSMZ 20174). A standard proteomic protocol was used for purification. 3) All files named SIHUMI_small_* derive from the same bacterial culture as the second source; in contrast, a variety of different proteomic protocols were used to enrich small proteins (<100 amino acids). The goal of the project was to design a workflow to quickly prioritize novel protein candidates. The workflow was designed to be robust in a meta-omics context and to facilitate the integration of transcriptomic and other information at the genomic level. The MS data from the first source were used to test the workflow under well-controlled conditions, namely a pure culture with a near-complete annotation. The workflow was then applied to data from the second source to test whether good results can also be obtained for a mixed culture. To increase the chances of finding novel proteins, we incorporated the data from the third source.
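The file naming convention above can be used to sort files by source. The following is a minimal sketch, assuming only the three prefixes given in the description; the function name, the source labels and the example file names are hypothetical and not part of the data set.

```python
from collections import defaultdict

# Prefixes come from the project description; the labels are illustrative only.
SOURCE_BY_PREFIX = {
    "ecoli_": "E. coli K-12 pure culture",
    "SIHUMI_standard_": "SIHUMIx, standard protocol",
    "SIHUMI_small_": "SIHUMIx, small-protein enrichment",
}


def group_by_source(filenames):
    """Group file names by the source implied by their prefix."""
    groups = defaultdict(list)
    for name in filenames:
        # Pick the first matching prefix; anything else is labeled "unknown".
        label = next(
            (lab for prefix, lab in SOURCE_BY_PREFIX.items() if name.startswith(prefix)),
            "unknown",
        )
        groups[label].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical file names, used only to demonstrate the grouping.
    example = ["ecoli_01.raw", "SIHUMI_standard_01.raw", "SIHUMI_small_01.raw", "notes.txt"]
    for source, files in group_by_source(example).items():
        print(source, files)
```

Files that match none of the three prefixes are kept under "unknown" rather than dropped, so unexpected names remain visible.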
Project description: Analyses of new genomic, transcriptomic or proteomic data commonly discard ("trash") much unidentified data that does not fit the ‘canonical’ DNA-RNA-protein scheme. Testing systematic exchanges of nucleotides over long stretches produces inversed RNA pieces (here named “swinger” RNA) that differ from their template DNA. These may explain some of the trashed data. Here, analyses of genomic, transcriptomic and proteomic data of the pathogen Tropheryma whipplei according to canonical genomic, transcriptomic and translational ‘rules’ resulted in trashing 58.9% of DNA, 37.7% of RNA and about 85% of mass spectra (corresponding to peptides). In the trash, we found numerous DNA/RNA fragments compatible with “swinger” polymerization. Genomic sequences covered by “swinger” DNA and RNA are 3 times more frequent than expected by chance and explain 12.4% and 20.8% of the rejected DNA and RNA sequences, respectively. As for peptides, several match “swinger” RNAs, including some chimeras translated from both regular and “swinger” transcripts, notably for ribosomal RNAs. The congruence of DNA, RNA and peptides resulting from the same swinging process suggests that systematic nucleotide exchanges increase coding potential and may add to the evolutionary diversification of bacterial populations.
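To illustrate what a systematic nucleotide exchange does to a sequence, the sketch below applies one pairwise exchange to a short stretch. It assumes a symmetric swap of two nucleotides as one possible exchange rule; the description does not state which exchange rules were tested, and the function name and example sequence are hypothetical.

```python
def swinger_exchange(seq, a, b):
    """Swap every occurrence of nucleotide `a` with `b`, and `b` with `a`.

    Assumption: a symmetric pairwise exchange applied uniformly over the
    whole stretch; other exchange rules are not covered by this sketch.
    """
    if len(a) != 1 or len(b) != 1 or a == b:
        raise ValueError("a and b must be two different single nucleotides")
    table = str.maketrans(a + b, b + a)
    return seq.translate(table)


if __name__ == "__main__":
    template = "ATGCCA"  # hypothetical template stretch
    print(swinger_exchange(template, "A", "C"))
```

Applying the same exchange twice returns the original sequence, so a candidate fragment can be mapped back to the canonical template by repeating the swap.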