Project description:We used PacBio data to identify more reliable transcripts in hESCs, which in turn allows better estimation of gene/transcript abundance from Illumina data. PacBio long reads and Illumina short reads were generated from the same hESC line, H1. PacBio reads were error-corrected with Illumina reads to identify transcripts. rSeq was used to estimate gene/transcript abundance for the identified transcriptome.
Project description:PacBio SMRTseq long reads and Illumina short reads of pig testis, epididymis, vesicular gland, prostate gland, and bulbourethral gland
Project description:The methylation landscape of the cattle Y-chromosome was characterized using methylated cytosine data produced from PacBio and ONT long-read sequencing platforms.
Project description:The methylation landscape of the sheep Y-chromosome was characterized using methylated cytosine data produced from PacBio and ONT long-read sequencing platforms. The study aimed to corroborate the presumptive locus of the sheep Y-chromosome centromere.
Project description:Chromatin immunoprecipitation analysis of CENH3 in the Arabidopsis thaliana accessions Col-0, Ler-0, Cvi-0 and Tanz-1 was performed to align reads to PacBio HiFi genome assemblies that contain complete centromere repeat arrays.
Project description:The LRGASP challenge encompasses different human, mouse, and manatee samples sequenced using multiple combinations of protocols and platforms. Different challenges will use distinct subsets of the samples for evaluation. The long-read sequencing platforms used in these challenges are the Pacific Biosciences (PacBio) Sequel II, Oxford Nanopore (ONT) MinION and PromethION. Samples will also be sequenced on the Illumina HiSeq 2500. The primary LRGASP library prep protocols are “standard” cDNA sequencing, direct RNA sequencing, R2C2, and CapTrap. Each sample will also include Lexogen SIRV-Set 4 spike-ins. We will also provide simulated PacBio and ONT data as part of the evaluations. This particular study focuses on single strand CAGE sequencing of human iPSCs, defining CAGE peaks from Illumina HiSeq 2500 (SR: 150 cycles) of two biological replicates for use in the LRGASP challenge.
Project description:Since short reads from Illumina RNA-seq data are challenging to map to repetitive elements, we wanted to confirm the bulk RNA-seq findings using an orthogonal method, namely the long-read technology of Pacific Biosciences (PacBio) full-length transcriptome sequencing. This dataset provided around 1.1 million (WT) and 1.3 million (RBM4 KO) sequence reads, with an average length of 2.6 kb, mapping to the human genome.