Bering Strait surface water and Chukchi Sea bottom water microbiome metaproteomics
ABSTRACT: Ocean microbiome dataset published by [May2016], together with the corresponding database search results. The LC-MS/MS spectra come from triplicate acquisitions of peptides: acquisitions 51-53 from the Bering Strait (BSt) and acquisitions 45-47 from the Chukchi Sea (CS). For each sampling location, there are two sets of spectrum identifications: one based on a metapeptide database specific to the location (metapeptides_BSt and metapeptides_CS) and one based on a non-redundant environmental database (env_nr). Spectrum identifications were obtained with Tide and Percolator as described in [Yilmaz2023]. Casanovo predictions for this dataset are provided in MSV000093980, alongside Casanovo predictions for other datasets.
________________________________
PUBLICATIONS:
[May2016] May, D. H. et al. "An Alignment-Free Metapeptide Strategy for Metaproteomic Characterization of Microbiome Samples Using Shotgun Metagenomic Sequencing." Journal of Proteome Research. 2016.
[Yilmaz2023] Yilmaz, Melih et al. "Sequence-to-sequence translation from mass spectra to peptides with a transformer model." Nature Communications. 2024.
________________________________
SPECTRUM FILES: The dataset contains six spectrum files: three from the Chukchi Sea (2016_Jan_12_QE2_45.mzXML, 2016_Jan_12_QE2_46.mzXML, 2016_Jan_12_QE2_47.mzXML) and three from the Bering Strait (2016_Jan_12_QE3_51.mzXML, 2016_Jan_12_QE3_52.mzXML, 2016_Jan_12_QE3_53.mzXML).
________________________________
FASTA FILES: The dataset contains three protein FASTA files: Bering Strait proteins in metapeptides_BSt.fasta, Chukchi Sea proteins in metapeptides_CS.fasta, and the environmental protein database in env_nr.fasta.
________________________________
SEARCH FILES: Associated with each FASTA file is a tide-index log file with a name of the form .tide-index.log.txt. The dataset contains Tide output files for 12 searches (six spectrum files, each searched against two databases). For each search, the primary tide-search output file has a name like ..tide-search.target.txt. Each search also has a log file and a parameter file, with names like ..tide-search.log.txt and ..tide-search.params.txt.
________________________________
PERCOLATOR FILES: The dataset contains four sets of Percolator output files. The Percolator PSM-level output files are named ..percolator.target.psms.txt, where the first name field is "BSt" for Bering Strait or "CS" for Chukchi Sea, and the second is "metapeptide_BSt", "metapeptide_CS" or "env_nr". The peptide-level output files are ..percolator.target.peptides.txt. The corresponding log files are ..percolator.log.txt. The lists of peptides accepted at 1% FDR are ..peptides.q01.txt.
________________________________
CASANOVO FILES: Casanovo peptide predictions for this dataset reside in MSV000093980. They are organized into six mzTab files, each named after the corresponding spectrum file (e.g., 2016_Jan_12_QE2_45.mztab).
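________________________________
EXAMPLE (illustrative, not part of the submission): The 12 searches described under SEARCH FILES can be enumerated with a short Python sketch. It assumes, following the abstract, that each spectrum file is searched against the metapeptide database of its own location and against env_nr. The names SPECTRUM_FILES, databases_for and enumerate_searches are hypothetical helpers introduced here. The sketch does not try to reconstruct the output file names, because their patterns are not fully given above.

```python
from itertools import product

# Spectrum files exactly as listed under SPECTRUM FILES, keyed by location.
SPECTRUM_FILES = {
    "CS": [f"2016_Jan_12_QE2_{n}.mzXML" for n in (45, 46, 47)],
    "BSt": [f"2016_Jan_12_QE3_{n}.mzXML" for n in (51, 52, 53)],
}


def databases_for(location):
    """Return the two FASTA files assumed to be searched for a location."""
    # Assumption: the location-specific metapeptide database plus env_nr.
    return [f"metapeptides_{location}.fasta", "env_nr.fasta"]


def enumerate_searches():
    """List every (location, spectrum file, FASTA file) combination."""
    searches = []
    for location, files in SPECTRUM_FILES.items():
        for spectrum_file, fasta in product(files, databases_for(location)):
            searches.append((location, spectrum_file, fasta))
    return searches


if __name__ == "__main__":
    for location, spectrum_file, fasta in enumerate_searches():
        print(location, spectrum_file, fasta)
```

Grouping the result by (location, FASTA file) gives four groups of three searches each, which matches the four sets of Percolator output files.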
INSTRUMENT(S): Q Exactive HF
ORGANISM(S): Various Species In Metaproteomics Samples
SUBMITTER: William Noble
PROVIDER: MSV000094709 | MassIVE | Tue May 07 21:30:00 BST 2024
REPOSITORIES: MassIVE