Project description:Long-read CAGE was designed to identify full-length capped transcripts across 10 specific loci in cortical neurons. The method is based on the Cap-Trapper method combined with full-length cDNA sequencing on the ONT MinION sequencer. After RNA extraction, 10 µg of total RNA from human iPS (WTC-11) cells, differentiated neural stem cells and differentiated cortical neuron cells was polyadenylated with E. coli Poly(A) Polymerase (PAP) (NEB M0276) at 37°C for 15 min and purified with AMPure RNA Clean XP beads. Then 5 µg of the PAP-treated RNA was reverse transcribed with oligodT_16VN_UMI25_primer (GAGATGTCTCGTGGGCTCGGNNNNNNNNNNNNNNNNNNNNNNNNNCTACGTTTTTTTTTTTTTTTTVN) and PrimeScript II Reverse Transcriptase (Takara Bio) at 42°C for 60 min and purified with RNAClean XP beads. Cap-trapping of the RNA/cDNA hybrids was performed according to the published protocol (Takahashi et al., Nature Protocols, 2012 (https://doi.org/10.1038/nprot.2012.005)); RNA was then digested with RNase H (Takara Bio) at 37°C for 30 min, and the product was purified with AMPure XP beads. The 5’ linker (N6 up GTGGTATCAACGCAGAGTACNNNNNN-Phos, GN5 up GTGGTATCAACGCAGAGTACGNNNNN-Phos, down Phos-GTACTCTGCGTTGATACCAC-Phos) was ligated to the cDNA with Mighty Mix (Takara Bio) overnight, and the ligated cDNA was purified with AMPure XP beads. Shrimp Alkaline Phosphatase (Takara Bio) was used to remove phosphates from the ligated linker, and the product was purified with AMPure XP beads. The second strand of the 5’ linker-ligated cDNA was then synthesized with KAPA HiFi mix (Roche) and 2nd synthesis primer_UMI15 at 95°C for 5 min, 55°C for 5 min and 72°C for 30 min. Exonuclease I (Takara Bio) was added to digest the primer at 37°C for 30 min, and the cDNA/DNA hybrid was purified with AMPure XP beads and amplified with PrimeSTAR GXL DNA Polymerase (Takara Bio) and PCR primers (fwd_CTACACTCGTCGGCAGCGTC, rev_GAGATGTCTCGTGGGCTCGG) for 7 cycles.
The library was then prepared with the SQK-LSK110 kit (Oxford Nanopore Technologies) according to the manufacturer’s protocol and sequenced on an R9.4 flow cell (FLO-MIN106) in a MinION sequencer. Basecalling was performed with the Guppy v5.0.14 basecaller software provided by Oxford Nanopore Technologies to generate FASTQ files from FAST5 files. To obtain clean reads from the FASTQ files, adapter sequences were trimmed with pychopper (https://github.com/nanoporetech/pychopper) using VNP_GAGATGTCTCGTGGGCTCGGNNNNNNNNNNNNNNNCTACG and SSP_CTACACTCGTCGGCAGCGTCNNNNNNNNNNNNNNNNNNNNNNNNNGTGGTATCAACGCAGAGTAC, and the reads were mapped to our target genes.
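The reverse-transcription primer quoted above has a fixed layout: a constant PCR handle, a run of N bases that serves as the UMI, a short spacer, and the oligo(dT) stretch ending in VN. As a rough illustration of that layout only, the sketch below converts the primer into a regular expression and pulls the UMI out of a read. It is not pychopper's algorithm and not the authors' pipeline. The function names (`primer_to_regex`, `extract_umi`) are hypothetical, only one read orientation is checked, and exact matching is assumed, which real nanopore reads with sequencing errors would rarely satisfy.

```python
import re

# IUPAC codes that occur in the primer quoted in the description.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "V": "[ACG]"}

# Reverse-transcription primer as given in the description.
RT_PRIMER = (
    "GAGATGTCTCGTGGGCTCGG"
    "NNNNNNNNNNNNNNNNNNNNNNNNN"
    "CTACGTTTTTTTTTTTTTTTTVN"
)


def primer_to_regex(primer: str) -> re.Pattern:
    """Build an exact-match regex from a primer with IUPAC codes.

    Assumption for this sketch: the first run of two or more N bases
    is the UMI, and it becomes the named capture group 'umi'.
    """
    umi_run = re.search(r"N{2,}", primer)
    parts = []
    for i, base in enumerate(primer):
        if umi_run and i == umi_run.start():
            parts.append("(?P<umi>")
        parts.append(IUPAC[base])
        if umi_run and i == umi_run.end() - 1:
            parts.append(")")
    return re.compile("".join(parts))


def extract_umi(read: str, pattern: re.Pattern):
    """Return the UMI if the primer is found in the read, else None."""
    match = pattern.search(read.upper())
    return match.group("umi") if match else None


RT_PATTERN = primer_to_regex(RT_PRIMER)
```

A tolerant matcher (for example one based on edit distance) would be needed for real data; the exact regex here only makes the primer structure explicit.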
Project description:The methylation landscape of the cattle Y-chromosome was characterized using methylated cytosine data produced by the PacBio and ONT long-read sequencing platforms.
Project description:Third-generation sequencing platforms [Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT)] are increasingly used across many research areas. Among these applications, full-length single-molecule transcriptome studies in plants have mostly relied on PacBio, whereas ONT has rarely been used. Therefore, in this study, we developed ONT RNA-sequencing methods for plants. We performed a detailed evaluation of reads from PacBio and Nanopore PCR cDNA (ONT Pc) sequencing in Arabidopsis, including the characteristics of the raw data and the identification of transcripts. We aim to provide a valuable reference for applications of ONT in plant transcriptome analysis.
Project description:The methylation landscape of the sheep Y-chromosome was characterized using methylated cytosine data produced by the PacBio and ONT long-read sequencing platforms. The study aimed to corroborate the presumptive locus of the sheep Y-chromosome centromere.
Project description:Alternative splicing is widely acknowledged to be a crucial regulator of gene expression and is a key contributor to both normal developmental processes and disease states. While cost-effective and accurate for quantification, short-read RNA-seq lacks the ability to resolve full-length transcript isoforms despite increasingly sophisticated computational methods. Long-read sequencing platforms such as Pacific Biosciences (PacBio) and Oxford Nanopore (ONT) bypass the transcript reconstruction challenges of short reads. Here we describe TALON, the ENCODE4 pipeline for analyzing PacBio cDNA and ONT direct-RNA transcriptomes. We apply TALON to three human ENCODE Tier 1 cell lines and show that while both technologies perform well at full-transcript discovery and quantification, each technology has its own distinct artifacts. We further apply TALON to mouse cortical and hippocampal transcriptomes and find that a substantial proportion of neuronal genes have more reads associated with novel isoforms than with annotated ones. The TALON pipeline for technology-agnostic, long-read transcriptome discovery and quantification tracks both known and novel transcript models, as well as their expression levels, across datasets, both for simple studies and for larger projects such as ENCODE that seek to decode transcriptional regulation in the human and mouse genomes and to predict more accurate expression levels of genes and transcripts than is possible with short reads alone.
Project description:Alternative splicing is widely acknowledged to be a crucial regulator of gene expression and is a key contributor to both normal developmental processes and disease states. While cost-effective and accurate for quantification, short-read RNA-seq lacks the ability to resolve full-length transcript isoforms despite increasingly sophisticated computational methods. Long-read sequencing platforms such as Pacific Biosciences (PacBio) and Oxford Nanopore (ONT) bypass the transcript reconstruction challenges of short reads. Here we describe TALON, the ENCODE4 pipeline for analyzing PacBio cDNA and ONT direct-RNA transcriptomes. We apply TALON to three human ENCODE Tier 1 cell lines and show that while both technologies perform well at full-transcript discovery and quantification, each displays distinct artifacts. We further apply TALON to mouse cortical and hippocampal transcriptomes and find that a substantial proportion of neuronal genes have more reads associated with novel isoforms than with annotated ones. These data show that TALON is a technology-agnostic long-read transcriptome discovery and quantification pipeline capable of tracking both known and novel transcript models, as well as their expression levels, across datasets in both simple studies and larger projects. These properties will enable TALON users to move beyond the limitations of short-read data to perform isoform discovery and quantification in a uniform manner on existing and future long-read platforms.
Project description:We performed whole-genome sequencing of 8 trios with intellectual disability using PacBio HiFi long reads. We then called variants with pbsv and DeepVariant. We filtered for high-quality de novo variants and validated them.
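The description does not state which criteria were used to select high-quality de novo variants. Purely as an illustration of the general idea behind trio-based filtering, the sketch below flags a site as a candidate when the child carries a non-reference allele, both parents are homozygous reference, and all three calls pass simple quality cut-offs. The `SampleCall` type, the `is_candidate_de_novo` function, and the thresholds (`min_depth=10`, `min_gq=20`) are hypothetical choices for this example and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SampleCall:
    """One sample's call at one site (hypothetical minimal record)."""
    genotype: Tuple[int, ...]  # allele indices; 0 means reference
    depth: int                 # read depth at the site
    gq: int                    # genotype quality


def is_candidate_de_novo(child: SampleCall, father: SampleCall,
                         mother: SampleCall,
                         min_depth: int = 10, min_gq: int = 20) -> bool:
    """Return True if the site looks like a de novo candidate.

    Assumed rule for this sketch: every trio member passes the depth
    and GQ cut-offs, the child has at least one non-reference allele,
    and both parents are homozygous reference.
    """
    trio = (child, father, mother)
    if any(c.depth < min_depth or c.gq < min_gq for c in trio):
        return False
    child_has_alt = any(allele > 0 for allele in child.genotype)
    parents_hom_ref = all(
        allele == 0 for parent in (father, mother) for allele in parent.genotype
    )
    return child_has_alt and parents_hom_ref
```

Candidates that pass such a filter would still need orthogonal validation, as the description notes was done.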
Project description:Purpose: To generate a reference long-read transcriptomic data set for use in developing new analysis pipelines and comparing their performance with existing methods. Synthetic “sequin” RNA standards (Hardwick et al. 2016) were sequenced using the Oxford Nanopore Technologies (ONT) GridION platform.