Project description:With whole-genome sequencing on next-generation sequencers now routine, structural and functional gene annotation is based only on automated prediction. However, errors in gene structure are still frequently reported, especially in the determination of translation initiation codons. Here, we propose a strategy to enrich and detect protein N-termini by mass spectrometry in order to refine genome annotation. After selective derivatization of protein N-termini with (N-Succinimidyloxycarbonylmethyl)tris(2,4,6-trimethoxyphenyl)phosphonium bromide (TMPPAc-OSu) as the labeling reagent, proteins were digested with three proteases in parallel. TMPP-labeled N-terminal-most peptides were then resolved from internal peptides by the COmbined FRActional DIagonal Chromatography (COFRADIC) sorting methodology before analysis by tandem mass spectrometry. Using this approach, we refined the genome annotation of a model marine bacterium, Roseobacter denitrificans.
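The way a detected N-terminal peptide can correct an annotated start codon may be easier to see in a small sketch. The code below is illustrative only and is not the authors' pipeline: the description does not name the three proteases, so a trypsin-like cleavage rule (cut after K or R unless followed by P) is assumed, the function names `n_terminal_peptide` and `start_offset` are hypothetical, and the sequences are invented toy data. Removal of the initiator methionine, which a real analysis would have to consider, is ignored here.

```python
def n_terminal_peptide(protein, cleave_after="KR", blocked_by="P"):
    """Return the N-terminal-most peptide of an in silico digest.

    Assumption: a trypsin-like rule, cutting after K or R unless the
    next residue is P. Other proteases would need other rules.
    """
    for i in range(len(protein) - 1):
        if protein[i] in cleave_after and protein[i + 1] not in blocked_by:
            return protein[: i + 1]
    # No cleavage site found: the whole protein is the peptide.
    return protein


def start_offset(annotated_protein, observed_peptide):
    """Locate an observed N-terminal peptide within the annotated protein.

    Returns the offset in residues from the annotated start (0 means the
    annotation is confirmed), or None if the peptide is not found, for
    example when the true start lies upstream of the annotated one.
    """
    index = annotated_protein.find(observed_peptide)
    return index if index >= 0 else None


# Toy example: the annotated protein begins five residues too early.
annotated = "MLLVAMSTPLKAGR"
true_protein = "MSTPLKAGR"

observed = n_terminal_peptide(true_protein)
offset = start_offset(annotated, observed)
if offset is not None:
    # Each residue corresponds to three nucleotides in the coding sequence.
    print(f"peptide {observed}: start codon lies {offset * 3} nt downstream")
```

An offset of zero would confirm the annotated start, while a positive offset suggests that the true initiation codon lies downstream of the predicted one.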
Project description:Whole-genome sequencing is an important way to understand the genetic information, gene function, biological characteristics, and living mechanisms of organisms. Sequencing megabase-scale genomes is no longer difficult. However, we encountered a hard-to-sequence genome, that of Pseudomonas aeruginosa phage PaP1, which the shotgun sequencing method failed to resolve. After 10 years of effort spanning three generations of sequencing technology, we determined the complete PaP1 genome, which is 91,715 bp in length. Single-molecule sequencing revealed that this genome contains many modified bases, including 51 N6-methyladenines (m6A) and 152 N4-methylcytosines (m4C). Further investigation revealed a novel bacterial immune mechanism by which the host bacteria recognize and reject, on a large scale, inserts containing modified bases; this caused the shotgun method to fail for the PaP1 genome. The problem can be resolved by using library-independent sequencing techniques or by using the nfi- mutant of E. coli DH5α as the host for constructing the shotgun library. In conclusion, this study explains why the phage PaP1 genome was hard to sequence and reports a new mechanism of bacterial immunity. Methylation profiling of Pseudomonas aeruginosa phage PaP1 using kinetic data generated by single-molecule, real-time (SMRT) sequencing on the PacBio RS.