Project description:We evaluated linked-read whole genome sequencing (WGS) for detection of structural chromosomal rearrangements in primary samples of varying DNA quality from 12 patients diagnosed with ALL. Linked-read WGS enabled precise, allele-specific, digital karyotyping at base-pair resolution for a wide range of structural variants, including complex rearrangements, aneuploidy assessment and gene deletions. Additional RNA-sequencing data and copy number aberration (CNA) data from Illumina Infinium arrays were also generated and assessed against the linked-read WGS data. RNA-sequencing data were used to support structural chromosomal rearrangements detected in the linked-read WGS data by detecting fusion genes expressed as a consequence of the rearrangements. Illumina Infinium arrays (450k array and/or SNP array) were used to assess CNA status to further support the findings in the linked-read WGS data. The processed CNA data from the primary ALL patient samples have been deposited in GEO. RNA-sequencing data, linked-read WGS data, and raw SNP array data from the primary ALL patient samples will not be deposited because the patient/parent consent does not cover depositing, in a repository, data that may be used for large-scale determination of germline variants. The ALL samples were collected 10-20 years ago from pediatric patients aged 2-15 years, some of whom are deceased. The linked-read WGS and RNA-sequencing data sets generated in the study are available upon reasonable request from the corresponding author (Jessica.Nordlund@medsci.uu.se).
Project description:This data set contains ChEC-seq binding profiles of various TFs in yeast strains in which other TFs were deleted. Each sample has a paired-end sequencing file and a processed file (.out), which is a genomic signal track produced after alignment to the S. cerevisiae (R64) reference genome. Mapping was done using the read end. This data set also contains raw and processed MNase-seq data files for nucleosome occupancy. Data related to the manuscript: The architecture of binding cooperativity between densely bound transcription factors.
Project description:Bru-seq nascent RNA sequencing (PubMed ID 23973811) was performed on two primary human fibroblast cell lines, mouse embryonic stem cells, and GM12878 human lymphoblastoid cells. Read data, which include both exon and intron signals, were used to identify transcription unit spans genome-wide, where a transcription unit corresponds roughly to the longest expressed isoform of a gene. However, because the algorithms were not constrained by annotated genes, transcription units need not and often do not correspond precisely to gene boundaries, and they include extragenic transcription. Transcription units were then compared to separate data sets that comprised induced copy number variants, common fragile sites, and Repli-seq replication timing. The objective was to discover the relationships between transcription unit span and size, local genomic instability, and replication timing. This GEO sample series provides the span and intensity of transcription units called genome-wide in the various samples. Correlations to genome stability and replication timing are provided in the associated manuscript. In addition, one human fibroblast line and the mouse embryonic stem cells had paired samples treated and untreated with low-dose aphidicolin. Gene RPKM signal intensities are provided for these samples, although comparing them was not the principal objective of the study. Bru-seq single-read nascent RNA sequencing on human 090 fibroblasts +/- aphidicolin treatment, human UMHF1 fibroblasts (3 replicates), human GM12878 lymphoblastoid cells, and mouse embryonic stem cells +/- aphidicolin treatment.
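The gene RPKM intensities mentioned above follow the general reads-per-kilobase-per-million definition. The sketch below shows only that standard formula; the function and argument names are hypothetical, and the description does not state which feature length (for example, the full gene span including introns) or which read total was used in the study.

```python
def rpkm(read_count: int, feature_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of feature per Million mapped reads.

    Standard definition: read_count / (length in kb) / (mapped reads in millions),
    which simplifies to read_count * 1e9 / (length in bp * mapped reads).
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("feature length and total mapped reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Illustrative numbers only (not taken from the data set):
# 500 reads on a 2,000 bp feature in a library of 10 million mapped reads.
example_value = rpkm(500, 2_000, 10_000_000)
```

Because Bru-seq reads cover introns as well as exons, the choice of feature length strongly affects the value, so any comparison should use the same length definition for all samples.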
Project description:Chromatin immunoprecipitation (ChIP) experiments were conducted as previously described (Ito et al., 2013) using anti-H3K4me3 (Millipore, #07-473), anti-H3K4me1 (Abcam, #ab8895), or anti-Kdm5C (Iwase et al., 2016). Hippocampi derived from two different animals were pooled for each sample, and two independent biological replicates per condition were sequenced according to the manufacturer's instructions on a HiSeq2500 instrument (Illumina, Inc). Information on the library preparation method, the size of the libraries, and mapping to the reference genome can be found in the Supplementary Material accompanying the manuscript. ChIP-seq reads were aligned to the mouse genome (Mus_musculus.GRCm.38.83) using bowtie2 (v2.2.9) (Langmead and Salzberg, 2012) and further processed using samtools (v1.3.1) (Li et al., 2009). Peak calling was performed using MACS2 (v2.1.0) (Zhang et al., 2008) with default parameters, except for Kdm5c, for which the parameters were as follows: -q 0.01 --nomodel --extsize 131 --broad --broad-cutoff 0.1. Read counting on aligned BAM files was performed using Rsubread (v1.22.3) (Liao et al., 2014). Differential peak methylation analysis for the H3K4me3 chromatin mark was performed using DESeq2 (v1.10.0) (Love et al., 2014) of the Bioconductor suite (Huber et al., 2015) in the R (v3.3) statistical computing platform. To identify differentially methylated regions between conditions, we used an adjusted p-value < 0.05, as indicated in the manuscript.
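As a rough illustration of the peak-calling step, the sketch below assembles a MACS2 `callpeak` command line in Python without running it. Only the Kdm5c options (-q 0.01 --nomodel --extsize 131 --broad --broad-cutoff 0.1) come from the description; the helper name, the file names, the use of `-f BAM` and `-g mm`, and the optional control sample are assumptions made for the example.

```python
from typing import List, Optional

# Non-default MACS2 options quoted in the description for Kdm5c.
KDM5C_OPTIONS = ["-q", "0.01", "--nomodel", "--extsize", "131",
                 "--broad", "--broad-cutoff", "0.1"]


def macs2_callpeak_command(treatment_bam: str,
                           name: str,
                           control_bam: Optional[str] = None,
                           extra_options: Optional[List[str]] = None) -> List[str]:
    """Build (but do not run) a `macs2 callpeak` argument list.

    Assumptions for illustration: BAM input and the mouse genome-size
    shortcut `-g mm`; the control sample is optional.
    """
    cmd = ["macs2", "callpeak", "-t", treatment_bam,
           "-f", "BAM", "-g", "mm", "-n", name]
    if control_bam is not None:
        cmd += ["-c", control_bam]
    # Append any non-default options last, e.g. the Kdm5c settings.
    cmd += list(extra_options or [])
    return cmd


# Hypothetical file names: histone marks with defaults, Kdm5c with the listed options.
h3k4me3_cmd = macs2_callpeak_command("h3k4me3_rep1.bam", "h3k4me3_rep1")
kdm5c_cmd = macs2_callpeak_command("kdm5c_rep1.bam", "kdm5c_rep1",
                                   extra_options=KDM5C_OPTIONS)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where MACS2 is installed; keeping the Kdm5c options in one constant makes the difference from the default histone-mark runs explicit.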
Project description:Cells were treated as described in the Materials and Methods section of the manuscript. Briefly, RNA-seq experiments were performed with sorted populations of about 10_E6 Corynebacterium glutamicum cells. As a reference data set, RNA-seq analysis of unsorted cells was conducted (n = 3, induced vs uninduced = 2, 2 x 3 = 6 data sets). For the sorting and the counter-silencer-based prophage induction, RNAprotect and RNAlater were used, and for each agent, data sets with two biological replicates were generated (n = 2, RNAprotect + RNAlater = 2, induced vs uninduced = 2, 2 x 2 x 2 = 8 data sets). One data set was generated using 10_E5 cells (n = 1, induced vs uninduced = 2, 2 x 1 = 2 data sets). For the subpopulations caused by iron-shift experiments, RNAprotect was used (n = 3, positive vs negative, 3 x 2 = 6 data sets). In total, 26 data sets are uploaded. Each consists of two fastq files (due to paired-end sequencing).