Project description:Ocean microbiome dataset published by May et al. [May2016] and the corresponding peptide identifications. The LC-MS/MS spectra are from triplicate acquisitions of peptides, acquisitions 51-53 from the Bering Strait (BSt) and acquisitions 45-47 from the Chukchi Sea (CS). For each sampling location, there are two sets of peptide identifications: one based on a metapeptide database specific to the location and one based on a non-redundant environmental database. Peptide identifications were obtained with Tide search and Percolator as described in Yilmaz et al. [Yilmaz2023].
[May2016] May, D. H. et al. "An Alignment-Free Metapeptide Strategy for Metaproteomic Characterization of Microbiome Samples Using Shotgun Metagenomic Sequencing", Journal of Proteome Research, 2016.
[Yilmaz2023] Yilmaz, Melih et al. "Sequence-to-sequence translation from mass spectra to peptides with a transformer model", bioRxiv, 2023.
Project description:Purpose: RNA-sequencing (RNA-seq) was used to identify changes in the blood gene expression profile in order to describe metabolic adaptation to endurance effort at the whole-transcriptome level. Samples from ten Arabian horses were taken before and after a 120 km endurance ride.
Project description:Purpose: Next-generation sequencing (NGS) was used to select genes potentially associated with exercise adaptation in Arabian horses. Methods: Whole transcriptome profiling of blood was performed for untrained horses (T0) and for horses sampled at three different periods of the training procedure (T1: during the intense training period, March; T2: before starts, May; T3: after the flat racing season, October). Blood transcriptome sequencing was performed for 37 samples using Illumina HiScan SQ in 75 single-end cycles. Transcript abundances were quantified using RSEM supported by the STAR aligner. The raw reads were aligned to the Equus caballus reference genome. Differentially expressed genes (DEGs) in blood were detected with DESeq2. The RNA-seq results were validated by qPCR. Results: The number of DEGs increased between subsequent training periods, and the highest number of DEGs (440) was detected between untrained horses (T0) and horses at the end of the racing season (T3). The T2 vs T3 and T0 vs T3 transcriptome comparisons showed a clear predominance of up-regulated genes during long-term exercise (266 and 389 DEGs up-regulated in the T3 period compared with T2 and T0, respectively). Our results showed that the largest number of identified genes encoded transcription factors, nucleic acid binding proteins and G-protein modulators, which were mainly transcriptionally activated at the last training phase (T3). Moreover, in the T3 period the identified DEGs included genes coding for cytoskeletal proteins (including actin cytoskeletal proteins) and kinases. The most abundant exercise-upregulated genes were involved in pathways important in regulating the cell cycle (PI3K-Akt signaling pathway), cell communication (cAMP-dependent pathway), proliferation, differentiation and apoptosis, as well as immunity processes (Jak-STAT signaling pathway).
We also observed exercise-induced expression of genes involved in regulation of the actin cytoskeleton, gluconeogenesis (FoxO signaling pathway; insulin signaling pathway), glycerophospholipid metabolism and calcium signaling. Conclusions: Our results identify changes in the gene expression profile following the training schedule in Arabian horses. Based on comparative analysis of blood transcriptomes, several exercise-regulated pathways and the genes most affected by exercise were detected. We pinpointed overrepresented molecular pathways and genes essential for the exercise adaptive response through maintenance of body homeostasis. The observed transcriptional activation of genes such as LPGAT1, AGPAT5, PIK3CG, GPD2, FOXN2, FOXO3, ACVR1B and ACVR2A can serve as a basis for further research to identify genes potentially associated with race performance in Arabian horses. Such markers will be essential for choosing the training type, and could result in differences in racing performance specific to various breeds. The blood transcriptome sequencing was performed for 37 samples collected from Arabian horses using Illumina HiScan SQ in 75 single-end cycles and in 3-4 technical repetitions.
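As an illustration of how up- and down-regulated DEG counts like those reported above could be tallied, here is a minimal sketch. It assumes a DESeq2 results table exported as CSV with DESeq2's standard `log2FoldChange` and `padj` columns. The 0.05 adjusted p-value cutoff, the function name `count_degs`, and the toy rows are hypothetical and are not taken from the study.

```python
import csv
import io


def count_degs(rows, padj_cutoff=0.05):
    """Count up- and down-regulated genes in a DESeq2-style results table.

    The cutoff is an illustrative assumption, not the study's threshold.
    """
    up = 0
    down = 0
    for row in rows:
        padj_text = row["padj"]
        # DESeq2 reports NA when no adjusted p-value could be computed.
        if padj_text in ("", "NA"):
            continue
        if float(padj_text) >= padj_cutoff:
            continue
        log2_fold_change = float(row["log2FoldChange"])
        if log2_fold_change > 0:
            up += 1
        elif log2_fold_change < 0:
            down += 1
    return up, down


# Toy table with placeholder gene names (hypothetical values).
TOY_TABLE = """gene,log2FoldChange,padj
geneA,1.8,0.001
geneB,-2.1,0.020
geneC,0.4,0.600
geneD,2.5,NA
geneE,1.1,0.049
"""

if __name__ == "__main__":
    rows = csv.DictReader(io.StringIO(TOY_TABLE))
    up, down = count_degs(rows)
    print(f"up-regulated: {up}, down-regulated: {down}")
```

Applied to a real comparison such as T0 vs T3, the same tally would separate the DEGs by the sign of the fold change; the actual thresholds used in the study are not stated in this record.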
Project description:Ocean microbiome dataset published by [May2016] and the corresponding database search results. The LC-MS/MS spectra are from triplicate acquisitions of peptides, acquisitions 51-53 from the Bering Strait (BSt) and acquisitions 45-47 from the Chukchi Sea (CS). For each sampling location, there are two sets of spectrum identifications: one based on a metapeptide database specific to the location (metapeptides_BSt and metapeptides_CS) and one based on a non-redundant environmental database (env_nr). Spectrum identifications were obtained with Tide and Percolator as described in [Yilmaz2023]. Casanovo predictions for this dataset are provided in MSV000093980, alongside Casanovo predictions for other datasets.
________________________________
PUBLICATIONS:
[May2016] May, D. H. et al. "An Alignment-Free Metapeptide Strategy for Metaproteomic Characterization of Microbiome Samples Using Shotgun Metagenomic Sequencing." Journal of Proteome Research. 2016.
[Yilmaz2023] Yilmaz, Melih et al. "Sequence-to-sequence translation from mass spectra to peptides with a transformer model." Nature Communications. 2024.
________________________________
SPECTRUM FILES: The dataset contains six spectrum files: three from the Chukchi Sea (2016_Jan_12_QE2_45.mzXML, 2016_Jan_12_QE2_46.mzXML, 2016_Jan_12_QE2_47.mzXML) and three from the Bering Strait (2016_Jan_12_QE3_51.mzXML, 2016_Jan_12_QE3_52.mzXML, 2016_Jan_12_QE3_53.mzXML).
________________________________
FASTA FILES: The dataset contains three protein FASTA files: Bering Strait proteins in metapeptides_BSt.fasta, Chukchi Sea proteins in metapeptides_CS.fasta, and the environmental protein database in env_nr.fasta.
________________________________
SEARCH FILES: Associated with each FASTA file is a tide-index log file with a name of the form <database>.tide-index.log.txt. The dataset contains Tide output files for 12 searches (six spectrum files, each searched against two databases). For each search, the corresponding tide-search primary output file has a name like <sample>.<database>.tide-search.target.txt. There are also corresponding log files and parameter files with names like <sample>.<database>.tide-search.log.txt and <sample>.<database>.tide-search.params.txt.
________________________________
PERCOLATOR FILES: The dataset contains four sets of Percolator output files. The Percolator PSM-level output files are named <location>.<database>.percolator.target.psms.txt, where <location> is "BSt" for Bering Strait and "CS" for Chukchi Sea, and <database> is "metapeptide_BSt", "metapeptide_CS" or "env_nr". The peptide-level output files are <location>.<database>.percolator.target.peptides.txt. The corresponding log files are <location>.<database>.percolator.log.txt. The lists of peptides accepted at 1% FDR are <location>.<database>.peptides.q01.txt.
________________________________
CASANOVO FILES: Casanovo peptide predictions for this dataset reside in MSV000093980. They are organized into six mzTab files, each named after the corresponding spectrum file (e.g. 2016_Jan_12_QE2_45.mztab).
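The tide-search naming convention described above can be expanded into the full list of 12 expected primary output files. The following is a minimal sketch and not part of the dataset. It assumes that <sample> is the spectrum file name without its .mzXML extension, that each location is searched against its own metapeptide database plus env_nr, and that <database> uses the same spelling as the FASTA file names. The function name is hypothetical; check the output against the actual file listing.

```python
# Spectrum files listed in the dataset description, grouped by location.
SPECTRUM_FILES = {
    "CS": [
        "2016_Jan_12_QE2_45.mzXML",
        "2016_Jan_12_QE2_46.mzXML",
        "2016_Jan_12_QE2_47.mzXML",
    ],
    "BSt": [
        "2016_Jan_12_QE3_51.mzXML",
        "2016_Jan_12_QE3_52.mzXML",
        "2016_Jan_12_QE3_53.mzXML",
    ],
}

# Assumption: each location is paired with its own metapeptide database
# and with the shared environmental database.
DATABASES = {
    "CS": ["metapeptides_CS", "env_nr"],
    "BSt": ["metapeptides_BSt", "env_nr"],
}


def expected_tide_search_files():
    """Return the expected tide-search target file names, one per search."""
    names = []
    for location, spectrum_files in SPECTRUM_FILES.items():
        for spectrum_file in spectrum_files:
            # Assumption: <sample> is the file name without its extension.
            sample = spectrum_file.rsplit(".", 1)[0]
            for database in DATABASES[location]:
                names.append(f"{sample}.{database}.tide-search.target.txt")
    return names


if __name__ == "__main__":
    for name in expected_tide_search_files():
        print(name)
```

Swapping the suffix for tide-search.log.txt or tide-search.params.txt would give the matching log and parameter file names under the same assumptions.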