Project description:Whole exome sequencing of 5 HCLc tumor-germline pairs. Genomic DNA from HCLc tumor cells and from T-cells (as the germline source) was used. Whole exome enrichment was performed with either Agilent SureSelect (50Mb; samples S3G/T, S5G/T, S9G/T) or Roche Nimblegen (44.1Mb; samples S4G/T and S6G/T). The resulting exome libraries were sequenced on the Illumina HiSeq platform with paired-end 100bp reads to an average depth of 120-134x. BAM files were generated using NovoalignMPI (v3.0) to align the raw FASTQ files to the reference genome sequence (hg19) and Picard tools (v1.34) to flag duplicate reads (optical or PCR), unmapped reads, reads mapping to more than one location, and reads failing vendor QC.
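The read flagging described above can be illustrated with the FLAG field of a BAM/SAM record. This is a minimal sketch, not the submitters' pipeline: the bit values come from the SAM specification, while the function names are hypothetical and the use of the secondary-alignment bit to represent "reads mapping to more than one location" is an assumption, since the description does not say how such reads were marked.

```python
# Standard SAM FLAG bits (values defined by the SAM specification).
# Assumption: multi-mapping reads are represented here by the
# secondary-alignment bit; the source does not state how they were marked.
SAM_FLAGS = {
    "unmapped": 0x4,
    "secondary_alignment": 0x100,
    "fails_vendor_qc": 0x200,
    "duplicate": 0x400,
}


def decode_flag(flag: int) -> list[str]:
    """Return the names of the filter-relevant bits set in a SAM FLAG."""
    return [name for name, bit in SAM_FLAGS.items() if flag & bit]


def passes_filters(flag: int) -> bool:
    """True if none of the filter-relevant bits are set."""
    return not decode_flag(flag)


if __name__ == "__main__":
    # 99 is a properly paired, mapped first-in-pair read; adding 1024 marks it
    # as a duplicate.
    for flag in (99, 99 + 1024, 4 + 512):
        print(flag, decode_flag(flag), passes_filters(flag))
```

A downstream variant caller would typically skip any record for which `passes_filters` returns `False`.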
Project description:We attempted to characterize the transcriptome of the chicken embryo during Newcastle disease virus (NDV) infection using RNA-sequencing analysis. The cDNAs derived from total RNA of pooled visceral tissues infected with F48E9 or La Sota were sequenced and analysed. The collected clean reads covered about 4.02% (2,341,868 reads) of the entire F48E9 reference sequence, while only 0.02% of reads (13,886) were mapped to the La Sota genome. RNA-Seq datasets from the La Sota, F48E9 and control groups were mapped to 71.76%, 68.55% and 70.05% of the reference genome Galgal 4.73, respectively. Compared with the control, 2,035 and 1,604 differentially expressed host genes were found to respond to F48E9 and La Sota infection, respectively. GO and KEGG pathway enrichment analyses highlighted various signalling pathways whose elements play roles in enhancing or preventing viral infection, such as IFP35, NMI, Mx, OAS*A, IFITM5, STAT1 and IFNβ. These results indicate that velogenic NDV produced far more transcripts during infection and had a significant impact on the host, with a large number of genes in various pathways expressed at high levels.
Project description:Unmapped reads are usually considered useless and are discarded or ignored. Here, we develop a strategy to mine full-length sequences from unmapped reads by combining specific reverse transcription primer design with high-throughput sequencing. In this study, we salvage 36 unmapped reads from standard RNA-Seq data (GSM3188619) and randomly select one 149 bp read as a model (CTGGTGCCATAATTCAGGGAACTGTGTTCTTGATGTACTATCTGAGACATTTGTGCTTCCCCCCATCCAGCTATCAGGCTGTTAGGCAATGCACTTCTAGGAATTAGAATTCTATAAGGAATCTCATGCTGGAAGAACAAAAAGACCCA). Specific reverse transcription primers (5' end: CTGGTGCCATAATTCAGGGA, 3' end: GGATCTTCACGTAACGGATTGT) are designed to amplify both of its ends, followed by next-generation sequencing. We then use a statistical model based on a power-law distribution to estimate its integrity and significance, and validate the result by Sanger sequencing. The results show that the full length is 1,556 bp, with an InDel mutation in a microsatellite structure. This would be a useful strategy for extracting sequence information from unmapped RNA-seq data.
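The relationship between the model read and the primers quoted above can be checked with a few lines of Python. This is an illustrative sketch only: the sequences are copied from the description, while the helper names `reverse_complement` and `locate` are hypothetical and not part of the submitters' method.

```python
from typing import Optional, Tuple

# Sequences copied verbatim from the project description.
READ = (
    "CTGGTGCCATAATTCAGGGAACTGTGTTCTTGATGTACTATCTGAGACATTTGTGCTTCCCCCCATCCAG"
    "CTATCAGGCTGTTAGGCAATGCACTTCTAGGAATTAGAATTCTATAAGGAATCTCATGCTGGAAGAACAA"
    "AAAGACCCA"
)
PRIMER_5 = "CTGGTGCCATAATTCAGGGA"
PRIMER_3 = "GGATCTTCACGTAACGGATTGT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate(template: str, primer: str) -> Optional[Tuple[int, str]]:
    """Find a primer on either strand; return (0-based position, strand)."""
    pos = template.find(primer)
    if pos != -1:
        return pos, "+"
    pos = template.find(reverse_complement(primer))
    if pos != -1:
        return pos, "-"
    return None


if __name__ == "__main__":
    print("5' primer:", locate(READ, PRIMER_5))
    print("3' primer:", locate(READ, PRIMER_3))
```

The 5' primer matches the first 20 bases of the read. The 3' primer is not found on either strand of the read itself.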
Project description:Evaluation of the modulation of the innate immune response during H1N1 infection. The modulatory effect of single-stranded oligonucleotides (ssON) on monocyte-derived dendritic cells (MoDCs) is evaluated. RNAseq data are used to study the effect on the transcriptome of MoDCs during infection with simultaneous addition of ssON. Further mechanistic information is added via RNAseq data on poly I:C-stimulated MoDCs (poly I:C is a Toll-like receptor 3 agonist). Control samples are included to allow differential expression analysis. Provided are the fastq files, obtained in the following manner: RNA sequencing was performed with the TruSeq RiboZero kit from Illumina, with 25 M reads per sample and 2x125bp reads. Read quality was assessed using FastQC (version 0.11.5). Trim Galore (version 0.3.6) was used for adapter removal and quality trimming with a quality threshold of 20 on the Phred scale. Count files were created from the trimmed fastq files by mapping high-quality reads to the Homo sapiens UCSC hg38 (GRCh38.77) reference genome using the STAR aligner (version 2.5) with default values and the parameter --outReadsUnmapped set to Fastx in order to extract the unmapped reads. After STAR alignment, the count data for the aligned reads were generated with HTSeq-count (version 0.6.1). The -m parameter was set to union.
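To make the Phred threshold of 20 concrete, the sketch below converts Phred scores to error probabilities and trims low-quality bases from the 3' end of a read. It assumes Phred+33 quality encoding and uses a deliberately simplified rule (drop trailing bases below the threshold); Trim Galore relies on Cutadapt, whose quality-trimming algorithm differs, so this is not a reimplementation of that tool. All function names are hypothetical.

```python
from typing import List, Tuple


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def decode_phred33(quality: str) -> List[int]:
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(char) - 33 for char in quality]


def trim_low_quality_tail(
    seq: str, quality: str, threshold: int = 20
) -> Tuple[str, str]:
    """Drop trailing bases whose quality is below the threshold.

    Simplified illustration only; not the algorithm used by Cutadapt.
    """
    scores = decode_phred33(quality)
    end = len(scores)
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return seq[:end], quality[:end]


if __name__ == "__main__":
    # A Phred score of 20 corresponds to a 1-in-100 chance of a wrong base call.
    print(phred_to_error_prob(20))
    print(trim_low_quality_tail("ACGT", "II5#"))
```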
Project description:LiBis is a novel method for low-input WGBS data alignment. By dynamically clipping initially unmapped reads and remapping clipped fragments, we judiciously rescued those reads and uniquely aligned them to the genome. By substantially increasing the mapping ratio by up to 88%, LiBis improves the number of informative CpGs and the precision to quantify the methylation status of individual CpG sites. The high sensitivity and cost effectiveness afforded by LiBis for low-input samples will allow the discovery of genetic and epigenetic features suitable for downstream analysis and biomarker identification using liquid biopsy.