Project description: We evaluated linked-read whole genome sequencing (WGS) for detection of structural chromosomal rearrangements in primary samples of varying DNA quality from 12 patients diagnosed with acute lymphoblastic leukemia (ALL). Linked-read WGS enabled precise, allele-specific digital karyotyping at base-pair resolution for a wide range of structural variants, including complex rearrangements, aneuploidy assessment, and gene deletions. RNA-sequencing data and copy number aberration (CNA) data from Illumina Infinium arrays were also generated and compared against the linked-read WGS data. The RNA-sequencing data were used to support structural chromosomal rearrangements detected in the linked-read WGS data by detecting the expressed fusion genes that result from the rearrangements. Illumina Infinium arrays (450k array and/or SNP array) were used to assess CNA status to further support the findings in the linked-read WGS data. The processed CNA data from the primary ALL patient samples have been deposited in GEO. RNA-sequencing data, linked-read WGS data, and raw SNP array data from the primary ALL patient samples will not be deposited in a repository because the patient/parent consent does not cover depositing data that may be used for large-scale determination of germline variants. The ALL samples were collected 10-20 years ago from pediatric patients aged 2-15 years, some of whom are deceased. The linked-read WGS and RNA-sequencing data sets generated in the study are available upon reasonable request from the corresponding author (Jessica.Nordlund@medsci.uu.se).
Project description: This data set contains ChEC-seq binding profiles of various transcription factors (TFs) in yeast strains in which other TFs have been deleted. Each sample has a paired-end sequencing file and a processed file (.out) containing the genomic signal track after alignment to the S. cerevisiae (R64) reference genome. Mapping was done using the read end. This data set also contains raw and processed MNase-seq data files for nucleosome occupancy. Data related to the manuscript: The architecture of binding cooperativity between densely bound transcription factors.
Project description: Bru-seq nascent RNA sequencing (PubMed ID 23973811) was performed on two primary human fibroblast cell lines, mouse embryonic stem cells, and GM12878 human lymphoblastoid cells. Read data, which include both exon and intron signals, were used to identify transcription unit spans genome-wide, where a transcription unit corresponds roughly to the longest expressed isoform of a gene. However, because the algorithms were not constrained by annotated genes, transcription units need not, and often do not, correspond precisely to gene boundaries, and they include extragenic transcription. Transcription units were then compared to separate data sets comprising induced copy number variants, common fragile sites, and Repli-seq replication timing. The objective was to discover the relationships between transcription unit span and size, local genomic instability, and replication timing. This GEO sample series provides the span and intensity of transcription units called genome-wide in the various samples. Correlations to genome stability and replication timing are provided in the associated manuscript. In addition, one human fibroblast line and the mouse embryonic stem cells had paired samples with and without low-dose aphidicolin treatment. Gene RPKM signal intensities are provided for these samples, although comparing them was not the principal objective of the study. Overall design: Bru-seq single-read nascent RNA sequencing on human 090 fibroblasts +/- aphidicolin treatment, human UMHF1 fibroblasts (3 replicates), human GM12878 lymphoblastoid cells, and mouse embryonic stem cells +/- aphidicolin treatment.
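For readers unfamiliar with the RPKM unit used for the gene signal intensities above, the following is a minimal sketch of the standard definition (reads per kilobase of span per million mapped reads). It is not taken from the study's pipeline: the function name `rpkm`, its parameter names, and the numbers in the usage lines are hypothetical, and the choice of which span to normalize by (full gene or transcription unit span, including introns) is an assumption based on the description that Bru-seq reads cover both exons and introns.

```python
def rpkm(read_count: int, span_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM: reads per kilobase of span per million mapped reads.

    read_count          reads assigned to the gene or transcription unit
    span_bp             length of that span in base pairs (assumed here to
                        include introns, since nascent RNA covers them)
    total_mapped_reads  all mapped reads in the sample
    """
    if span_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("span_bp and total_mapped_reads must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (span_bp * total_mapped_reads)


# Hypothetical numbers, for illustration only.
example = rpkm(read_count=500, span_bp=10_000, total_mapped_reads=20_000_000)
print(example)
```

Because the value is normalized by both span length and sequencing depth, it can be compared between long and short transcription units and between samples sequenced to different depths, which is what a treated versus untreated comparison would need.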