Dataset Information

PRAM: a novel pooling approach for discovering intergenic transcripts from large-scale RNA sequencing experiments.

ABSTRACT: Publicly available RNA-seq data is routinely used for retrospective analysis to elucidate new biology. Novel transcript discovery enabled by joint analysis of large collections of RNA-seq data sets has emerged as one such analysis. Current methods for transcript discovery rely on a '2-Step' approach where the first step encompasses building transcripts from individual data sets, followed by the second step that merges predicted transcripts across data sets. To increase the power of transcript discovery from large collections of RNA-seq data sets, we developed a novel '1-Step' approach named Pooling RNA-seq and Assembling Models (PRAM) that builds transcript models from pooled RNA-seq data sets. We demonstrate in a computational benchmark that 1-Step outperforms 2-Step approaches in predicting overall transcript structures and individual splice junctions, while performing competitively in detecting exonic nucleotides. Applying PRAM to 30 human ENCODE RNA-seq data sets identified unannotated transcripts with epigenetic and RAMPAGE signatures similar to those of recently annotated transcripts. In a case study, we discovered and experimentally validated new transcripts through the application of PRAM to mouse hematopoietic RNA-seq data sets. We uncovered new transcripts that share a differential expression pattern with a neighboring gene Pik3cg implicated in human hematopoietic phenotypes, and we provided evidence for the conservation of this relationship in human. PRAM is implemented as an R/Bioconductor package.

SUBMITTER: Liu P

PROVIDER: S-EPMC7605252 | biostudies-literature |

REPOSITORIES: biostudies-literature

ACCESS DATA

Similar Datasets

Project description:We systematically compared the adverse effects of cancer drugs to detect event outliers across different clinical trials using a data-driven approach. Because many cancer drugs are toxic to patients, better understanding of adverse events of cancer drugs is critical for developing therapies that could minimize the toxic effects. However, due to the large variabilities of adverse events across different cancer drugs, methods to efficiently compare adverse effects across different cancer drugs are lacking. To address this challenge, we present an exploration study that integrates multiple adverse event reports from clinical trials in order to systematically compare adverse events across different cancer drugs. To demonstrate our methods, we first collected data on 186,339 clinical trials from ClinicalTrials.gov and selected 30 common cancer drugs. We identified 1602 cancer trials that studied the selected cancer drugs. Our methods effectively extracted 12,922 distinct adverse events from the clinical trial reports. Using the extracted data, we ranked all 12,922 adverse events based on their prevalence in the clinical trials, such as nausea 82%, fatigue 77%, and vomiting 75.97%. To detect the significant drug outliers that could have a statistically high possibility of causing an event, we used the boxplot method to visualize adverse event outliers across different drugs and applied Grubbs' test to evaluate the significance. Analyses showed that by systematically integrating cross-trial data from multiple clinical trial reports, adverse event outliers associated with cancer drugs can be detected. The method was demonstrated by detecting the following four statistically significant adverse event cases: the association of the drug axitinib with hypertension (Grubbs' test, P < 0.001), the association of the drug imatinib with muscle spasm (P < 0.001), the association of the drug vorinostat with deep vein thrombosis (P < 0.001), and the association of the drug afatinib with paronychia (P < 0.01).

Dataset Information

PRAM: a novel pooling approach for discovering intergenic transcripts from large-scale RNA sequencing experiments.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets