Project description:DNA methylation is an important regulator of genome function in eukaryotes, but it is currently unclear if the same is true in prokaryotes. While regulatory functions have been demonstrated for a small number of bacteria, there have been no large-scale studies of prokaryotic methylomes, and the full repertoire of targets and biological functions of DNA methylation remains unclear. Here we applied single-molecule, real-time sequencing to directly study the methylomes of 232 phylogenetically diverse prokaryotes. Collectively, we identified 834 methylated motifs, enabling the specific annotation of 415 DNA methyltransferases (MTases) and adding substantially to existing databases of MTase specificities. While the majority of MTases function as components of restriction-modification systems, 139 MTases have no cognate restriction enzyme in the genome, suggesting some other functional role. Several of these 'orphan' MTases are conserved across species and exhibit patterns of DNA methylation consistent with known regulatory MTases. Based on these patterns of methylation, we identify candidate novel regulators of gene expression in several phyla of bacteria, and candidate regulators of DNA replication in Haloarchaea. Together these data substantially advance our knowledge of DNA restriction-modification systems, and hint at a wider role for methylation in prokaryotic genome regulation. Single-molecule, real-time sequencing of DNA modifications across 232 diverse prokaryotic genomes.
Project description:Reanalysis of the human testis tissue dataset acquired for the Chromosome-centric Human Proteome Project (C-HPP; PRIDE project PXD002179), with a focus on identifying novel peptides for genome annotation.
Project description:Total RNA was isolated from wild-type and Pax5-/- pro-B cells. Samples were sequenced on an Illumina HiSeq 2000 at the Beijing Genomics Institute to produce 90 bp paired-end reads. Reads were aligned to the mm10 genome using Subread, keeping only uniquely aligned reads. The number of read pairs mapped to each gene in the NCBI RefSeq annotation was counted using the featureCounts function of the Rsubread package.
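The counting step described above can be illustrated with a minimal Python sketch. This is not the Rsubread implementation: the function name `count_fragments`, the gene names, and the toy coordinates are hypothetical, and the rule that a read pair overlapping more than one gene is left uncounted is an assumption made for this example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# (chromosome, start, end), 0-based and end-exclusive
Interval = Tuple[str, int, int]


def count_fragments(
    genes: Dict[str, List[Interval]],
    fragments: Iterable[Interval],
) -> Counter:
    """Count fragments (read pairs) per gene.

    A fragment is assigned to a gene if it overlaps at least one of the
    gene's exons. Fragments overlapping no gene, or more than one gene,
    are skipped (an assumption of this sketch).
    """
    counts = Counter({gene: 0 for gene in genes})
    for chrom, f_start, f_end in fragments:
        hits = {
            gene
            for gene, exons in genes.items()
            for e_chrom, e_start, e_end in exons
            if e_chrom == chrom and f_start < e_end and e_start < f_end
        }
        if len(hits) == 1:
            counts[hits.pop()] += 1
    return counts


# Hypothetical toy annotation and alignments
toy_genes = {
    "geneA": [("chr4", 100, 200)],
    "geneB": [("chr7", 50, 150), ("chr7", 300, 400)],
    "geneC": [("chr4", 180, 260)],
}
toy_fragments = [
    ("chr4", 110, 170),  # overlaps geneA only
    ("chr4", 150, 240),  # overlaps geneA and geneC: ambiguous, skipped
    ("chr4", 210, 250),  # overlaps geneC only
    ("chr7", 140, 310),  # spans two exons of geneB: counted once
    ("chr1", 0, 50),     # overlaps no gene
]
toy_counts = count_fragments(toy_genes, toy_fragments)
```

The nested scan over every exon is quadratic and only suitable for toy inputs; a real counter would index the annotation by chromosome and position.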
Project description:With a view to re-annotating the genome sequence of the nitrogen-fixing bacterium Sinorhizobium meliloti, we generated oriented sequences of transcripts. To cover a large number of expressed genes, we prepared RNA from bacteria in three very different physiological conditions: liquid cultures in exponential growth phase, liquid cultures in stationary growth phase, and 10-day-old nodules in which bacteria had differentiated into nitrogen-fixing bacteroids. The transcript sequences were then integrated into EuGene-P, a new prokaryotic genome annotation tool able to integrate high-throughput data, including oriented RNA-Seq data, directly into the prediction process, which led to the production of an accurate and complete annotation of the genome of S. meliloti strain 2011.
Project description:To optimize the genome annotation, two tissue RNA libraries (liver and muscle) were constructed using the Illumina mRNA-Seq Prep Kit. This study is part of the Pseudopodoces humilis WGS project (BioProject ID: PRJNA179234) and was used to improve gene annotation.
Project description:Prokaryotic genome annotation is highly dependent on automated methods, as manual curation cannot keep up with the exponential growth of sequenced genomes. Current automated techniques depend heavily on sequence context and often underestimate the complexity of the proteome. We developed REPARATION (RibosomeE Profiling Assisted (Re-)AnnotaTION), a de novo algorithm that takes advantage of experimental evidence from ribosome profiling (Ribo-seq) to delineate translated open reading frames (ORFs) in bacteria, independent of genome annotation. Ribo-seq is a next-generation sequencing technique that provides a genome-wide snapshot of the positions of translating ribosomes along mRNAs at the time of the experiment. REPARATION evaluates all possible ORFs in the genome and estimates minimum thresholds to screen for spurious ORFs based on a growth curve model. We applied REPARATION to three annotated bacterial species to obtain a more comprehensive, experimentally supported map of their translation landscape. In all cases, we identified hundreds of novel ORFs, including variants of previously annotated ORFs and novel small ORFs (<71 codons). Our predictions were supported by matching mass spectrometry (MS) proteomics data and sequence conservation analysis. REPARATION is unique in that it makes use of experimental Ribo-seq data to perform de novo ORF delineation in bacterial genomes, and thus can identify putative coding ORFs irrespective of the sequence context of the reading frame.
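The first step named above, evaluating all possible ORFs in a genome, can be illustrated with a minimal Python sketch. This is not the REPARATION code: the start-codon set (ATG, GTG, TTG), the function names, and the `min_codons` parameter are assumptions made for this example, and the Ribo-seq-based thresholding is not shown.

```python
from typing import List, Tuple

# Assumed codon sets for this sketch; the tool's actual choices are not stated here.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 1) -> List[Tuple[str, int, int]]:
    """Enumerate candidate ORFs in all six reading frames.

    Returns (strand, start, end) tuples. Coordinates are 0-based and
    end-exclusive, relative to the strand being scanned, and `end`
    includes the stop codon. Every in-frame start codon upstream of a
    stop yields its own candidate, so start-site variants are kept.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            starts = []  # in-frame start positions seen since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if codon in START_CODONS:
                    starts.append(i)
                elif codon in STOP_CODONS:
                    for st in starts:
                        # length in codons, excluding the stop codon
                        if (i - st) // 3 >= min_codons:
                            orfs.append((strand, st, i + 3))
                    starts = []
    return orfs


orfs_all = find_orfs("ATGGTGAAATAA")
orfs_long = find_orfs("ATGGTGAAATAA", min_codons=3)
```

In this toy sequence the two in-frame starts (ATG and GTG) share one stop codon, so both candidates are reported; raising `min_codons` drops the shorter one. Candidates like these are what an evidence-based filter would then score.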