
Dataset Information


A Meta-proteogenomic Approach to Peptide Identification Incorporating Assembly Uncertainty and Genomic Variation.


ABSTRACT: Matching metagenomic and/or metatranscriptomic data, currently often under-used, can be a useful reference for the analysis of metaproteomic tandem mass spectra (MS/MS) data. Here we developed a software pipeline that identifies peptides and proteins from metaproteomic MS/MS data, using proteins derived from matching metagenomic (and metatranscriptomic) data as the search database. The pipeline is based on two novel approaches, Graph2Pro (published) and Var2Pep (new). Graph2Pro retains and uses the uncertainties of metagenome assembly for reference-based MS/MS data analysis. Var2Pep considers the variations found in metagenomic/metatranscriptomic sequencing reads that are not retained in the assemblies (contigs). The new software pipeline provides a one-stop application of both tools, and it supports metagenome assemblies from commonly used assemblers, including MegaHit and metaSPAdes. When tested on two collections of multi-omic microbiome data sets, our pipeline significantly improved the identification rate of the metaproteomic MS/MS spectra, by about twofold compared to conventional contig- or read-based approaches (Var2Pep alone identified 5.6% to 24.1% more unique peptides, depending on the data set). We also showed that the identified variant peptides are important for functional profiling of microbiomes. All results suggest that it is important to take assembly uncertainties and genomic variants into consideration to facilitate metaproteomic MS/MS data interpretation.
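
To make the idea of variant peptides concrete, the following is a minimal sketch and not the Var2Pep implementation. It assumes a simplified setting: a single reference protein, single amino-acid substitutions supplied as 0-based positions, and an in-silico tryptic digest with no missed cleavages. The function names `tryptic_digest` and `variant_peptides`, the `min_len` cutoff, and the example sequence are all hypothetical and chosen only for illustration.

```python
import re


def tryptic_digest(protein, min_len=6):
    """Split a protein after K or R, except when the next residue is P.

    Simplified model: no missed cleavages, peptides shorter than
    min_len are discarded.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if len(p) >= min_len]


def variant_peptides(protein, variants, min_len=6):
    """Return peptides that appear only after applying a substitution.

    variants: iterable of (position, alt_residue) pairs, 0-based.
    Each substitution is applied on its own to the reference protein
    (an assumption of this sketch; real variants may co-occur).
    """
    reference = set(tryptic_digest(protein, min_len))
    novel = set()
    for pos, alt in variants:
        mutated = protein[:pos] + alt + protein[pos + 1:]
        novel.update(set(tryptic_digest(mutated, min_len)) - reference)
    return novel


if __name__ == "__main__":
    ref = "MKTAYIAKQRQISFVK"  # made-up sequence for illustration
    print(tryptic_digest(ref))
    print(variant_peptides(ref, [(4, "F")]))
```

In this toy setting, the peptides returned by `variant_peptides` would be appended to the search database alongside the reference peptides, so that spectra from variant forms have a candidate sequence to match.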

SUBMITTER: Li S 

PROVIDER: S-EPMC6692780 | biostudies-literature | 2019 Aug

REPOSITORIES: biostudies-literature


Publications

A Meta-proteogenomic Approach to Peptide Identification Incorporating Assembly Uncertainty and Genomic Variation.

Sujun Li, Haixu Tang, Yuzhen Ye

Molecular & cellular proteomics : MCP, 2019-05-29, 8 suppl 1



Similar Datasets

| S-EPMC5757074 | biostudies-literature
| S-EPMC5062161 | biostudies-literature
| S-EPMC3293365 | biostudies-other
| S-EPMC10494315 | biostudies-literature
| S-EPMC5148723 | biostudies-literature
| S-EPMC8519820 | biostudies-literature
| S-EPMC9694666 | biostudies-literature
2011-08-02 | GSE31072 | GEO
| S-EPMC3145493 | biostudies-literature
| S-EPMC2842268 | biostudies-literature