Sequence database choice significantly affects taxonomic and functional metaproteomic results in gut microbiota studies
ABSTRACT: Elucidating the role of gut microbiota in physiological and pathological processes has recently emerged as a key research aim in life sciences. In this respect, metaproteomics (the study of the whole protein complement of a microbial community) can provide a unique contribution by revealing which functions are actually being expressed by specific microbial taxa. However, its wide application to gut microbiota research has been hindered by challenges in data analysis, especially related to the choice of the proper sequence databases for protein identification. Here we present a systematic investigation of variables concerning database construction and annotation, and evaluate their impact on human and mouse gut metaproteomic results. We found that both publicly available and experimental metagenomic databases lead to the identification of unique peptide assortments, suggesting parallel database searches as a means to gain more complete information. Taxonomic and functional results were revealed to be strongly database-dependent, especially when dealing with mouse samples. As a striking example, in mouse samples the Firmicutes/Bacteroidetes ratio varied up to 10-fold depending on the database used. Finally, we provide recommendations regarding metagenomic sequence processing aimed at maximizing gut metaproteome characterization, and contribute to identifying an optimized pipeline for metaproteomic data analysis.
INSTRUMENT(S): LTQ Orbitrap Velos
ORGANISM(S): Human Gut Metagenome
TISSUE(S): Feces
SUBMITTER: Alessandro Tanca
LAB HEAD: Sergio Uzzau
PROVIDER: PXD004039 | Pride | 2016-10-10
REPOSITORIES: Pride