Project description:We present a draft genome assembly that includes 200 Gb of Illumina reads, 4 Gb of Moleculo synthetic long reads and 108 Gb of Chicago libraries, with a final size matching the estimated genome size of 2.7 Gb and a scaffold N50 of 4.8 Mb. We also present an alternative assembly including 27 Gb of raw reads generated on the Pacific Biosciences platform. In addition, we sequenced the proteome of the same individual and RNA from three different tissue types from three other squid species (Onychoteuthis banksii, Dosidicus gigas, and Sthenoteuthis oualaniensis) to assist genome annotation. We annotated 33,406 protein-coding genes supported by evidence, and the genome completeness estimated by BUSCO reached 92%. Repetitive regions cover 49.17% of the genome.
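The scaffold N50 quoted above is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The following is a minimal sketch of that standard definition, not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are sorted from longest to shortest and their lengths are
    accumulated; the N50 is the length at which the running total first
    reaches half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (hypothetical, arbitrary units): total 20, half 10.
# 8 alone is below 10; 8 + 5 = 13 reaches it, so the N50 is 5.
toy_scaffolds = [8, 5, 4, 2, 1]
toy_n50 = n50(toy_scaffolds)
```

In practice the lengths would be read from the assembly FASTA rather than typed in by hand.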
Project description:In this study, we present a global transcriptome analysis of the medicinal plant Catharanthus roseus. We generated about 343 million high-quality reads from three tissues (leaf, root and flower) using the Illumina platform. We performed an optimized de novo assembly of the reads and estimated transcript abundance in the different tissue samples. Transcriptome dynamics were studied by differential gene expression analyses among tissue samples. We collected tissue samples from mature plants. Total RNA isolated from these samples was subjected to Illumina sequencing. The sequence data were filtered using NGS QC Toolkit to obtain high-quality reads. The filtered reads were used for de novo assembly optimization. The reads were then mapped to the Catharanthus transcripts with CLC Genomics Workbench, and differential gene expression analysis was performed using DESeq.
Project description:In this study, we present a global view of transcriptome dynamics in three rice cultivars (IR64, Nagina 22 and Pokkali) under control and stress conditions. More than 50 million high-quality reads were obtained for each tissue sample using the Illumina platform. Reference-based assembly was performed for each rice cultivar. Transcriptome dynamics were studied by differential gene expression analyses between each stress treatment and the control sample. We collected seedlings of the three rice cultivars subjected to control (kept in water), desiccation (transferred onto folds of tissue paper) and salinity (transferred to a beaker containing 200 mM NaCl solution) treatments. Total RNA isolated from these tissue samples was subjected to Illumina sequencing. The sequence data were filtered using NGS QC Toolkit to obtain high-quality reads. The filtered reads were mapped to the japonica reference genome using TopHat. Cufflinks was used for reference-based assembly, and differential gene expression was analysed using Cuffdiff. Genes differentially expressed under the abiotic stress conditions were identified.
Project description:In this study, we sequenced the small RNA content of seven major tissues/organs using Illumina technology. More than 154 million reads were generated on the Illumina GAII high-throughput sequencing platform, representing more than 20 million distinct small RNA sequences. After pre-processing, several conserved and novel miRNAs were identified in chickpea. Further, the putative targets of chickpea miRNAs were identified and their functional categorization was analyzed. In addition, we identified miRNAs exhibiting differential and specific expression in various tissues/organs. We collected the tissue samples used in this study, and the isolated total RNA was subjected to Illumina sequencing. The sequence data were filtered using NGS QC Toolkit to obtain high-quality reads. The filtered reads were pre-processed using a modified Perl script provided with the miRTools software. After quality control, identical reads were collapsed into a single unique read and the read count for each sequence was recorded. All filtered unique reads from each sample were screened stepwise against annotated non-coding RNA sequences, including plant snoRNA, tRNA and rRNA. The remaining reads were screened against repeat sequences from RepBase and the chickpea chloroplast sequence. Conserved miRNAs were identified based on similarity to entries in the miRBase database, and novel miRNAs were identified using the miRDeep-P pipeline. For differential expression analysis, the read count for each miRNA was normalized using DESeq. The miRNAs preferentially and specifically expressed in various tissues/organs were identified.
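The read-collapsing step described above (identical reads merged into one unique sequence with its count recorded) can be sketched as follows. This is an illustrative stand-in, not the modified miRTools Perl script used in the study; the function name `collapse_reads` and the toy sequences are assumptions.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse identical reads into unique sequences with their counts.

    Returns a list of (sequence, count) tuples ordered from the most to
    the least abundant sequence.
    """
    counts = Counter(reads)
    return counts.most_common()


# Toy input (hypothetical sequences): three reads, two of them identical.
toy_reads = ["ACGT", "TTGA", "ACGT"]
collapsed = collapse_reads(toy_reads)
# collapsed == [("ACGT", 2), ("TTGA", 1)]
```

Working with unique sequences and their counts, rather than with every raw read, keeps the later screening against non-coding RNA and repeat databases much smaller.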
Project description:We found that the thylakoid-anchored protein PBF8 is a key regulator of Photosystem I (PSI) biogenesis. To explore the role of PBF8 in regulating chloroplast gene expression, we performed RNA-seq to compare the transcript levels of chloroplast-encoded genes between wild type (Col-0) and pbf8 mutants. To this end, we isolated total RNA from 12-day-old wild-type and pbf8 seedlings grown on MS medium under long-day conditions (14 h light, 10 h dark) at 22 °C with a light intensity of 80 µmol m⁻² s⁻¹. rRNAs were depleted using the Ribo-Zero Kit (Epicentre). The resulting rRNA-depleted RNA was used to prepare sequencing libraries with the NEBNext Single Cell/Low Input Library Prep Kit. The libraries were pooled and sequenced on an Illumina NovaSeq 6000 system with 150-bp paired-end reads. Our results show that transcript accumulation of chloroplast-encoded PSI subunit and assembly factor genes was comparable between the wild type (Col-0) and pbf8 samples, suggesting that PBF8 may not affect the transcript levels of chloroplast-encoded PSI subunits and assembly factors.
2024-01-24 | GSE239827 | GEO
Project description:Re-sequencing data for chloroplast genome assembly
Project description:Purpose: miR-Seq was used to identify miRNAs that are altered during the course of KSHV lytic replication at 0, 16 and 24 hours post reactivation in TREx-BCBL1-RTA cells. Methods: Virus lytic replication was induced by addition of 2 µg/mL doxycycline hyclate (Sigma-Aldrich). Total RNA was extracted from TREx-BCBL1-RTA cells at 0, 16 and 24 hours post lytic induction. Small RNA libraries were prepared using the TruSeq Small RNA Library Prep Kit (Illumina). Reads were quality filtered (Q < 20) and adapter trimmed with Trimmomatic v0.39 [59], then aligned to the GRCh38/hg38 assembly of the human genome using Bowtie2 (v2.4.2).
Project description:The goal of this study is to identify the pathway alterations driving adaptive resistance to PI3K inhibition in GBM. We generated the resistant cell lines through a patient-derived in vivo glioma sphere-forming cell (GSC) model. We performed RNA-seq on the paired GSC samples, comprising the parental and resistant groups. Libraries were sequenced to an average coverage of 50x per tumor on the Illumina HiSeq 4000 platform, using 76-nt paired-end reads. RNA-seq raw data were pre-processed using PRADA, which aligns RNA-seq reads to a composite reference database composed of whole-genome and transcriptome sequences; we used the hg19 human genome assembly together with the Ensembl64 transcriptome version. Transcripts were filtered by size and restricted to protein-coding genes. Expression data were normalized to reads per kilobase per million reads (RPKM), and these values were log2-transformed for further analyses.
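The normalization named above, reads per kilobase per million reads followed by a log2 transform, can be sketched as below. This is a minimal illustration of the standard RPKM formula and not PRADA's own implementation. The function names and the pseudocount of 1 (added so that genes with zero reads stay defined under log2) are assumptions; the record does not state whether a pseudocount was used.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_rpkm(read_count, gene_length_bp, total_mapped_reads, pseudocount=1.0):
    """log2-transformed RPKM; the pseudocount is an assumption of this sketch."""
    value = rpkm(read_count, gene_length_bp, total_mapped_reads)
    return math.log2(value + pseudocount)


# Toy numbers (hypothetical): a 2 kb gene with 500 reads in a library of
# 10 million mapped reads gives 500 * 1e9 / (2000 * 1e7) = 25 RPKM.
example_rpkm = rpkm(500, 2000, 10_000_000)
```

Dividing by gene length and library size makes values comparable across genes and samples, and the log2 scale compresses the wide dynamic range of expression values before downstream comparisons.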
Project description:<p>The section <em>Oleifera</em> (Theaceae) has attracted attention for the high levels of unsaturated fatty acids found in its seeds. Here, we report the chromosome-scale genome of sect. <em>Oleifera</em> using diploid wild <em>Camellia lanceoleosa</em>, with a final size of 3.00 Gb and a scaffold N50 of 186.43 Mb. Repetitive sequences accounted for 80.63% of the genome and were distributed unevenly across it. <em>Camellia lanceoleosa</em> underwent a whole-genome duplication event approximately 65 million years ago (65 Mya), prior to the divergence of <em>C</em>. <em>lanceoleosa</em> and <em>Camellia sinensis</em> (approx. 6-7 Mya). Syntenic comparisons of these two species revealed genomic rearrangements that appear to be driven in part by the activity of transposable elements. The expanded and positively selected genes in <em>C</em>. <em>lanceoleosa</em> were significantly enriched in oil biosynthesis, and the expansion of homomeric <em>acetyl-coenzyme A carboxylase</em> (<em>ACCase</em>) genes and the seed-biased expression of genes encoding heteromeric ACCase, diacylglycerol acyltransferase, glyceraldehyde-3-phosphate dehydrogenase and stearoyl-ACP desaturase could be of primary importance for the high oil and oleic acid content found in <em>C. lanceoleosa</em>. Theanine and catechins were present in the leaves of <em>C</em>. <em>lanceoleosa</em>. However, caffeine could not be detected in the leaves but was abundant in the seeds and roots. The functional and transcriptional divergence of genes encoding SAM-dependent <em>N</em>-methyltransferases may be associated with caffeine accumulation and distribution. Gene expression profiles, structural composition and chromosomal location suggest that the late-acting self-incompatibility of <em>C. lanceoleosa</em> is likely to have favoured a novel mechanism co-occurring with gametophytic self-incompatibility. 
This study provides valuable resources for quantitative and qualitative improvements and genome assembly of polyploid plants in sect. <em>Oleifera</em>.</p>
Project description:Amplicon-based targeted re-sequencing analysis was performed on patient-derived glioblastoma cell culture samples. For this purpose, genomic DNA (gDNA) was isolated and DNA libraries were prepared using the TruSeq Custom Amplicon Low Input (Illumina, Inc.) technology. In this way, a pool of 375 amplicons was generated for each sample to enrich for the target genes ATRX1, EGFR, IDH1, NF1, PDGFRA, PIK3CG, PIK3R1, PTEN, RB1 and TP53. Sequencing was performed on the Illumina MiSeq® next-generation sequencing system (Illumina Inc.) using its 2 x 250 bp paired-end v2 read chemistry. The resulting reads were quality controlled and mapped against the human reference genome (hg19). For all samples, sequence variations in the amplified regions of interest relative to the human reference sequence were identified and filtered based on reliability.