Project description:Gene fusions and chimeric transcripts occur frequently in cancers and in some cases drive the development of the disease. Accurate detection of these events is crucial for cancer research and, in the long term, could be applied to personalized therapy. RNA-seq technology has been established as an efficient approach to investigate transcriptomes and search for gene fusions and chimeric transcripts on a genome-wide scale. A number of computational methods for the detection of gene fusions from RNA-seq data have been developed. However, recent studies demonstrate differences between commonly used approaches in terms of specificity and sensitivity. Moreover, their ability to detect gene fusions at the isoform level has not been studied carefully so far. Here we propose a novel computational approach called InFusion for fusion gene detection from deep RNA sequencing data. Validation of InFusion on simulated and on several public RNA-seq datasets demonstrated better detection accuracy compared to other tools. We also performed deep RNA sequencing of two well-established prostate cancer cell lines. Using these data we showed that InFusion is capable of discovering alternatively spliced gene fusion isoforms as well as chimeric transcripts that include non-exonic regions. In addition, our method can detect anti-sense transcription in the fusions by incorporating strand specificity of the sequencing library.
Project description:Advantages of RNA-Seq over array-based platforms are quantitative gene expression and discovery of expressed single nucleotide variants (eSNVs) and fusion transcripts from a single platform, but the sensitivity for each of these characteristics is unknown. We measured gene expression in a set of manually degraded RNAs and in nine pairs of matched fresh-frozen and FFPE RNA isolated from breast tumors, using the hybridization-based NanoString nCounter (226-gene panel) and whole transcriptome RNA-Seq with RiboZeroGold ScriptSeq V2 library preparation kits. We performed correlation analyses of gene expression between samples and across platforms. We then specifically assessed whole transcriptome expression of lincRNA and discovery of eSNVs and fusion transcripts in the FFPE RNA-Seq data. For gene expression in the manually degraded samples, we observed Pearson correlations of >0.94 and >0.80 with the NanoString and ScriptSeq protocols, respectively. Gene expression data for matched fresh-frozen and FFPE samples yielded mean Pearson correlations of 0.874 and 0.783 for the NanoString (226 genes) and ScriptSeq whole transcriptome protocols, respectively. Specifically for lincRNAs, we observed a very high Pearson correlation (0.988) between matched fresh-frozen and FFPE pairs. FFPE samples across NanoString and RNA-Seq platforms gave a mean Pearson correlation of 0.838. In FFPE libraries, we detected 53.4% of high confidence SNVs and 24% of high confidence fusion transcripts. The low sensitivity of fusion transcript detection was not overcome by increasing the sequencing depth up to 3-fold (from ~56 to ~159 million reads). Both NanoString and ScriptSeq RNA-Seq technologies yield reliable gene expression data for degraded and FFPE material. The high degree of correlation between NanoString and RNA-Seq platforms suggests that discovery-based whole transcriptome studies from FFPE material will produce reliable expression data.
The RiboZeroGold ScriptSeq protocol performed particularly well for lincRNA expression from FFPE libraries, but detection of eSNVs and fusion transcripts was less sensitive. We performed RNA-Seq on RNA from nine matched pairs of fresh-frozen and FFPE tissues from breast cancer patients. The goal was to test the RiboZeroGold ScriptSeq complete low input library preparation kit for degraded RNA samples.
Project description:We used our custom array GPL11054 to screen a set of sarcoma patient samples for fusion genes. One negative control was included in the study. We wanted to investigate the power of our fusion-gene array GPL11054 as a means of fusion gene detection in a series of sarcoma patient samples from two diagnostic laboratories with expertise in fusion gene detection.
Project description:Pediatric AML is an aggressive hematological malignancy associated with distinctive genomic features. We employed RNA-seq to study fusion genes and clinically relevant gene expression patterns in pediatric AML patients.
Project description:We used massively parallel sequencing technology to profile the genomic DNA and RNA of tumor cells (leukemic bone marrow) and normal cells (skin biopsy) obtained from a single patient with Acute Lymphoblastic Leukemia (ALL), referred to throughout this study as 'ALL1'. Included in this study are samples obtained from a primary tumor, first relapse, second relapse and several intermediate timepoints. We identified somatic mutations present in each tumor by analysis of whole genome (WGS) and exome sequence data. Single nucleotide variants (SNVs) and small insertions and deletions were identified in both the exome and WGS data. Large copy number variations (CNVs) and structural variants (SVs) were identified in the WGS data. A custom capture reagent was designed to target most variants and used to generate deep validation sequence data. The expression status of all somatic variants was assessed by RNA-seq. The RNA-seq data was also used for gene expression analysis and gene fusion detection.
Project description:Despite the ever-increasing speed of detecting fusion transcripts in cancer, it remains formidable to predict what unreported RNA pairs can form new fusion transcripts. By systematic mapping of chromatin-associated RNAs (caRNAs) and their respective genomic interaction loci, we obtained genome-wide RNA-DNA interaction maps from two non-cancerous cell types. The gene pairs involved in RNA-DNA interactions in these normal cells exhibited strong overlap with those with cancer-derived fusion transcripts. These data suggest an RNA-poise model, where the spatial proximity of one gene’s transcripts and the other gene’s genomic sequence poises for the creation of fusion transcripts. We validated this model with 96 additional lung cancer samples. One of these additional samples exhibited fusion transcripts without a corresponding fusion gene, suggesting that genome-recombination is not a required step of the RNA-poise model.
Project description:Cell lines have been essential for major discoveries in cancer including Ewing sarcoma (EwS). EwS is a highly aggressive pediatric bone or soft-tissue cancer characterized by oncogenic EWSR1-ETS fusion transcription factors converting polymorphic GGAA-microsatellites (mSats) into neo-enhancers. However, further detailed mechanistic evaluation of gene regulation in EwS has been hindered by the limited number of well-characterized cell line models. Here, we present the Ewing Sarcoma Cell Line Atlas (ESCLA) comprising 18 EwS cell lines with inducible EWSR1-ETS knockdown that were profiled by whole-genome sequencing, DNA methylation arrays, gene expression and splicing arrays, mass spectrometry, and ChIP-seq for EWSR1-ETS and histone marks. Systematic analysis of these multi-dimensional data identified GATA2 and E2F2 as EWSR1-ETS-driven putative co-regulatory transcription factors. To evaluate the relevance of these transcription factors, RNA interference in a Ewing sarcoma cell line was employed with subsequent Affymetrix Exon array profiling. The expression data were analysed in synopsis with the effects of EWSR1-FLI1 in the same cell line.
Project description:We report the gene expression profile of 8 metastatic castration-resistant prostate cancer samples analyzed by paired-end RNA-seq. We found evidence of extensive abnormal splicing as well as several novel fusion genes. Finally, we also observed several recurrent high-confidence somatic mutations. Paired-end RNA-seq by rRNA depletion.