Project description:Gut microbiota were assessed in 540 colonoscopy-screened adults by 16S rRNA gene sequencing of stool samples. Investigators compared gut microbiota diversity, overall composition, and normalized taxon abundance among these groups.
Project description:This dataset contains data from Xdrop followed by Oxford Nanopore long-read sequencing (Xdrop-LRS) performed on target tRNA gene deletion clones in HAP1 (t72) and HepG2 (t15) cells. By applying a de novo assembly-based approach to the Xdrop-LRS data, we identified Cas9-induced on-target genomic alterations.
Project description:The impact of mono-chronic S. stercoralis infection on the gut microbiome and microbial activities in infected participants was explored. Two sets of human fecal samples from a longitudinal study were investigated by 16S rRNA gene sequencing. Set A comprised 42 matched samples, divided equally into those positive (Pos) and negative (Neg) for S. stercoralis diagnosis. Set B comprised 20 samples from the same participants before (Ss+PreT) and after (Ss+PostT) treatment, which were subjected to 16S rRNA sequencing and LC-MS/MS to explore the effect of anti-helminthic treatment on microbiome proteomes.
Project description:This dataset contains data from Xdrop followed by Oxford Nanopore long-read sequencing (Xdrop-LRS) performed on target tRNA gene deletion (t8) and intergenic region deletion (i50) clones in HepG2 cells. By applying a de novo assembly-based approach to the Xdrop-LRS data, we identified Cas9-induced on-target genomic alterations.
Project description:In this study, we developed metaproteomics-based methods for quantifying the taxonomic composition of microbiomes (microbial communities). We also compared metaproteomics-based quantification to other quantification methods, namely metagenomics and 16S rRNA gene amplicon sequencing. The metagenomic and 16S rRNA data can be found in the European Nucleotide Archive (Study number: PRJEB19901). For method development and comparison, we analyzed three types of mock communities with all three methods. The communities contain between 28 and 32 species and strains of bacteria, archaea, eukaryotes and bacteriophages. For each community type, 4 biological replicate communities were generated. All four replicates were analyzed by 16S rRNA sequencing and metaproteomics. Three replicates of each community type were analyzed with metagenomics. The "C" type communities have the same cell/phage particle number for all community members (C1 to C4). The "P" type communities have the same protein content for all community members (P1 to P4). The "U" (UNEVEN) type communities cover a large range of protein amounts and cell numbers (U1 to U4). We also generated proteomic data for four pure cultures to test the specificity of the protein inference method. These data are also included in this submission.
Project description:We generated a protein database directly from soil metaproteomic data by identifying the microbial composition using the Kaiko model's de novo sequencing methods. We first analyzed the mass spectra de novo (without a database), identifying species from the observed peptides. We next gathered full proteomic databases for the identified species and searched the mass spec data using MS-GF+ and this custom-assembled protein sequence database.
Project description:In this study, we performed a comparative analysis of gut microbiota composition and gut microbiome-derived bacterial extracellular vesicles (bEVs) isolated from patients with solid tumours and healthy controls. After isolating bEVs from the faeces of solid tumour patients and healthy controls, we performed mass spectrometry analysis of their proteomes and next-generation sequencing (NGS) of the 16S rRNA gene. We also investigated the gut microbiomes of faeces from patients and controls using 16S rRNA sequencing. Machine learning was used to classify the samples into patients and controls based on their bEVs and faecal microbiomes.
Project description:Because they depend on concise, pre-defined protein sequence databases, traditional search algorithms perform poorly when analyzing mass spectra derived from wholly uncharacterized protein products. Conversely, de novo peptide sequencing algorithms can interpret mass spectra without relying on reference databases. However, such algorithms have been difficult to apply to complex protein mixtures, in part due to a lack of methods for automatically validating de novo sequencing results. Here, we present novel metrics for benchmarking de novo sequencing algorithm performance on large-scale proteomics datasets, and a method for accurately calibrating false discovery rates on de novo results. We also present a novel algorithm (LADS) that leverages experimentally disambiguated fragmentation spectra to boost sequencing accuracy and sensitivity. LADS improves sequencing accuracy on longer peptides relative to other algorithms and improves discriminability of correct and incorrect sequences. Using these advancements, we demonstrate accurate de novo identification of peptide sequences not identifiable using database search-based approaches.
Project description:De novo peptide sequencing is a fundamental research area in mass spectrometry (MS)-based proteomics. However, de novo sequencing methods have often been evaluated using a couple of simple metrics that do not fully reflect their overall performance. Moreover, there has not been an established method to estimate the false discovery rate (FDR) and the significance of de novo peptide-spectrum matches (PSMs). Here we propose NovoBoard, a comprehensive framework to evaluate the performance of de novo peptide sequencing methods. The framework consists of diverse benchmark datasets (including tryptic, nontryptic, immunopeptidomics, and different species), and a standard set of accuracy metrics to evaluate the fragment ions, amino acids, and peptides of the de novo results. More importantly, a new approach is designed to evaluate de novo peptide sequencing methods on target-decoy spectra and to estimate their FDRs. Our results thoroughly reveal the strengths and weaknesses of different de novo peptide sequencing methods, and how their performance depends on specific applications and the types of data. Our FDR estimation also shows that some tools may perform better than others in distinguishing between de novo PSMs and random matches, and that it can be used to assess the significance of de novo PSMs.
Project description:DNA methylation plays a critical role in development, particularly in repressing retrotransposons. The mammalian methylation landscape is dependent on the combined activities of the canonical maintenance enzyme Dnmt1 and the de novo Dnmts, 3a and 3b. Here we demonstrate that Dnmt1 displays de novo methylation activity in vitro and in vivo with specific retrotransposon targeting. We used whole-genome bisulfite and long-read Nanopore sequencing in genetically engineered, methylation-depleted embryonic stem cells to provide an in-depth assessment and quantification of this activity. Utilizing additional knockout lines and molecular characterization, we show that Dnmt1's de novo methylation activity depends on Uhrf1 and that its genomic recruitment overlaps with targets that enrich for Trim28 and H3K9 trimethylation. Our data demonstrate that Dnmt1 can add DNA methylation de novo and maintain it, especially at retrotransposons, and that this mechanism may provide additional stability for long-term repression and epigenetic propagation throughout development.