Project description:Bacteria belonging to the phylum Gemmatimonadetes are found in a wide variety of environments and are particularly abundant in soils. To date, only two Gemmatimonadetes strains have been characterized. Here we report the complete genome sequence and methylation pattern of Gemmatirosa kalamazoonensis KBS708 (ATCC BAA-2150; NCCB 100411), the first characterized Gemmatimonadetes strain isolated from soil. Examination of the methylome of Gemmatirosa kalamazoonensis KBS708 using kinetic data from single-molecule, real-time (SMRT) sequencing on the PacBio RS
Project description:Propionibacterium freudenreichii is an important starter culture used in the manufacture of Swiss-type cheeses. We have generated the complete genome sequence of Propionibacterium freudenreichii ssp. shermanii strain JS at the Institute of Biotechnology, University of Helsinki, by using a combination of pyrosequencing with GS FLX and GS FLX Titanium series reagents (Roche) and SOLiD 4 (Life Technologies), ABI 3130xl Genetic Analyzer (Life Technologies), and PacBio RS II (Pacific Biosciences) instruments. Initial genome annotation was carried out using RAST, and additional functional annotation information for each CDS was obtained from BLANNOTATOR, CDD, and KAAS. The accession number for the genome sequence is PRJEB12148. This submission is for the transcriptome analysis of Propionibacterium freudenreichii in cheese ripening under warm and cold conditions. The RNA reads were mapped to the reference genome PRJEB12148.
Project description:Whole-genome sequencing is an important way to understand the genetic information, gene function, biological characteristics, and living mechanisms of organisms. Sequencing megabase-scale genomes poses no difficulty at present. However, we encountered a hard-to-sequence genome, that of Pseudomonas aeruginosa phage PaP1. The shotgun sequencing method failed to resolve this genome. After persisting for 10 years and working through 3 generations of sequencing techniques, we successfully resolved the PaP1 genome, which is 91,715 bp in length. Single-molecule sequencing revealed that this genome contains many modified bases, including 51 N6-methyladenines (m6A) and 152 N4-methylcytosines (m4C). Further investigation revealed a novel immune mechanism of bacteria, by which the host bacteria can recognize and reject inserts containing modified bases on a large scale; this led to the failure of the shotgun method in PaP1 genome sequencing. This problem can be resolved by using non-library-dependent sequencing techniques or by using the nfi- mutant of E. coli DH5α as the host bacteria to construct the shotgun library. In conclusion, the present study explains why the phage PaP1 genome was hard to sequence and identifies a new mechanism of bacterial immunity. Methylation profiling of Pseudomonas aeruginosa phage PaP1 using kinetic data generated by single-molecule, real-time (SMRT) sequencing on the PacBio RS.
Project description:These data correspond to one SMRT cell sequencing run (performed on Sequel II, PacBio) of full length cDNAs from 3 pooled glioma stem cell line libraries. No tag was added to distinguish the 3 different samples
Project description:Six bacterial genomes, Geobacter metallireducens GS-15, Chromohalobacter salexigens, Vibrio breoganii 1C-10, Bacillus cereus ATCC 10987, Campylobacter jejuni subsp. jejuni 81-176 and Campylobacter jejuni NCTC 11168, all of which had previously been sequenced using other platforms, were re-sequenced using single-molecule, real-time (SMRT) sequencing specifically to analyze their methylomes. In every case a number of new N6-methyladenine (m6A) and N4-methylcytosine (m4C) methylation patterns were discovered and the DNA methyltransferases (MTases) responsible for those methylation patterns were assigned. In 15 cases it was possible to match MTase genes with MTase recognition sequences without further sub-cloning. Two Type I restriction systems required sub-cloning to differentiate their recognition sequences, while four MTase genes that were not expressed in the native organism were sub-cloned to test for viability and recognition sequences. No attempt was made to detect 5-methylcytosine (m5C) recognition motifs from the SMRT sequencing data because this modification produces weaker signals using current methods. However, all predicted m6A and m4C MTases were detected unambiguously. This study shows that the addition of SMRT sequencing to traditional sequencing approaches gives a wealth of useful functional information about a genome, showing not only which MTase genes are active but also revealing their recognition sequences. Examination of the methylomes of six different strains of bacteria using kinetic data from single-molecule, real-time (SMRT) sequencing on the PacBio RS.
Project description:Here we describe CapTrap-Seq, an experimental workflow designed to address the problem of reduced transcript end detection by long-read RNA sequencing methods, especially at the 5' ends. We apply CapTrap-Seq to profile transcriptomes of the human heart and brain and compare the results with those of other library preparation approaches. CapTrap-Seq is a platform-agnostic method, and here we tested it using 3 different long-read sequencing platforms: MinION (ONT), Sequel (PacBio) and Sequel II (PacBio).
Project description:Zea mays is a leading model for elucidating transcriptional networks in plants, aided by increasingly refined studies of the transcriptome atlas across spatio-temporal, developmental, and environmental dimensions. Limiting this progress are uncertainties about the complete structure of mRNA transcripts, particularly with respect to alternatively spliced isoforms. Although second-generation RNA-seq provides a quantitative assay for transcriptional and posttranscriptional events, the accurate reconstruction of full-length mRNA isoforms is challenging with short-read technologies. By producing much longer reads, third-generation sequencing offers to solve the assembly problem, but can suffer from lower read accuracy and throughput. Here, we combine these complementary technologies to define and quantify high-confidence transcript isoforms in maize. Six tissues (root, pollen, embryo, endosperm, immature ear, and immature tassel) of the B73 inbred line were used for mRNA sequencing with the Illumina Hiseq2000 PE101 platform to comprehensively quantitate gene/isoform expression. In parallel, intact cDNAs from the same samples were sequenced using the PacBio RS II platform. The latter used six size-fractionated libraries (<1kb, 1-2kb, 2-3kb, 3kb-5kb, 4-6kb, >5kb) to generate more than 2 million full-length reads. Preliminary findings suggest that mechanisms of alternative splicing are differentially employed between different tissues. In addition, these data show promise to dramatically improve the status of maize genome annotation, with the detection of previously unidentified transcript isoforms and the uncovering of previously unrecognized genes. This submission contains the Illumina Hiseq2000 PE101 reads.