Project description: Aim: We aim to compare current (MeDIP-seq), new (Illumina Infinium 450K BeadChip) and future (PacBio) methods for whole genome DNA methylation analysis. As interest in determining disease methylation profiles increases, the scope, advantages and limitations of these methods require assessment. There are key questions to answer and specific challenges to overcome. For example, how much detail/resolution is sufficient to identify regions of differential methylation and regions of biological/medical significance within a sample? How much coverage of the genome is required for accurate methylation analysis? Is it important to confirm which regions of the genome are unmethylated in addition to focusing on those that are methylated? Loss of methylation may be of equal importance within the cell, since this may also contribute to disease pathogenesis. A multi-method (affinity enrichment/bisulphite-conversion based/direct sequencing of methyl-cytosine) and multi-platform (Illumina HiSeq/PacBio/Illumina Infinium BeadChip) comparison will enable us to determine the strengths and weaknesses of each method. We propose to compare four methods using two DNA samples from the Coriell Institute for Cell Repository to assess both current and future capabilities for whole genome methylation analysis in parallel: A) MeDIP-seq using Illumina HiSeq, B) Illumina Infinium HumanMethylation 450K BeadChip and C) whole genome methylation sequencing using PacBio. Existing single molecule deep bisulphite sequencing data, generated previously from these same samples at the WTSI for targeted regions (30-40 genes) on the human X chromosome, will be used to assess the performance of each method.
The methods selected for this study will generate data covering a range of resolutions, from a whole genome scan to array (target defined) resolution and up to single base pair, single molecule resolution, the highest level of detail possible with methods currently available. Samples: DNA from sibling pair GM01240 (female) and GM01240 (male). Requirements: Both samples will be analysed using: A. MeDIP-seq using Illumina HiSeq (one HiSeq lane, 75bp paired end, per sample); B. Illumina Infinium HumanMethylation 450K BeadChip. We expect that one HiSeq lane per sample may give unnecessarily high coverage. However, we do not have a multiplexing procedure in place for MeDIP. Our requirements for PacBio sequencing have been discussed with, and will be supported by, the Sequencing Technology Development group.
Project description: We used PacBio data to identify transcripts from hESC more reliably, which in turn allows better estimation of gene/transcript abundance from Illumina data. PacBio long reads and Illumina short reads were generated from the same hESC cell line, H1. PacBio reads were error-corrected with Illumina reads and then used to identify transcripts. rSeq was used to estimate gene/transcript abundance for the identified transcriptome.
Project description: Microorganisms were isolated from water, sediment and animal samples collected from the Challenger Deep of the Mariana Trench. Microorganisms with functions of interest or special enzymes were preliminarily selected using selective media. Their whole genomes were then sequenced on the Illumina, Nanopore MinION and PacBio SMRT platforms.
Project description: The LRGASP challenge encompasses different human, mouse, and manatee samples sequenced using multiple combinations of protocols and platforms. Different challenges will use distinct subsets of the samples for evaluation. The long-read sequencing platforms used in these challenges are the Pacific Biosciences (PacBio) Sequel II, Oxford Nanopore (ONT) MinION and PromethION. Samples will also be sequenced on the Illumina HiSeq 2500. The primary LRGASP library prep protocols are “standard” cDNA sequencing, direct RNA sequencing, R2C2, and CapTrap. Each sample will also include Lexogen SIRV-Set 4 spike-ins. We will also provide simulated PacBio and ONT data as part of the evaluations. This particular study focuses on single strand CAGE sequencing of human iPSCs, defining CAGE peaks from Illumina HiSeq 2500 (SR: 150 cycles) of two biological replicates for use in the LRGASP challenge.
Project description: To study where Tn5 inserts within repeat sequences of the genome, ATAC-seq was performed using a custom insert. The resulting DNA was then mechanically sheared and sequenced using PacBio.
Project description: Deregulated gene expression is a hallmark of cancer; however, most studies to date have analyzed short-read RNA-sequencing data, which has inherent limitations. Here, we combine PacBio long-read isoform sequencing (Iso-Seq) and Illumina paired-end short-read RNA sequencing to comprehensively survey the transcriptome of gastric cancer (GC), a leading cause of global cancer mortality. We performed full-length transcriptome analysis across 10 GC cell lines covering four major GC molecular subtypes (chromosomal unstable, Epstein-Barr positive, genome stable and microsatellite unstable). We identify 60,239 non-redundant full-length transcripts, of which >66% are novel compared to current transcriptome databases. Novel isoforms are more likely to be cell-line and subtype specific, and to be expressed at lower levels, with a larger number of exons and longer isoform/coding sequence lengths. Most novel isoforms utilize an alternate first exon and, compared to other alternative splicing categories, are expressed at higher levels and exhibit higher variability. Collectively, we observe alternate promoter usage in 25% of detected genes, with the majority (84.2%) of known/novel promoter pairs exhibiting potential changes in their coding sequences. Mapping these alternate promoters to TCGA GC samples, we identify several cancer-associated isoforms, including novel variants of oncogenes. Tumor-specific transcript isoforms tend to alter protein coding sequences to a larger extent than other isoforms. Analysis of outcome data suggests that novel isoforms may impart additional prognostic information. Our results provide a rich resource of full-length transcriptome data for deeper studies of GC and other gastrointestinal malignancies.