Dataset Information

Comprehensive Analysis of Large-Scale Transcriptomes from Multiple Cancer Types.

ABSTRACT: Various abnormalities of transcriptional regulation revealed by RNA sequencing (RNA-seq) have been reported in cancers. However, strategies to integrate multi-modal information from RNA-seq, which would help uncover more disease mechanisms, are still limited. Here, we present PipeOne, a cross-platform one-stop analysis workflow for large-scale transcriptome data. It was developed based on Nextflow, a reproducible workflow management system. PipeOne is composed of three modules: data processing and feature matrix construction, disease feature prioritization, and disease subtyping. It first integrates eight different tools to extract different information from RNA-seq data, and then uses a random forest algorithm to study and stratify patients according to evidence from multi-modal information. Its application in five cancers (colon, liver, kidney, stomach, and thyroid; total samples n = 2024) identified various dysregulated key features (such as PVT1 expression and ABI3BP alternative splicing) and pathways (especially liver and kidney dysfunction) shared by multiple cancers. Furthermore, we demonstrated clinically relevant patient subtypes in four of the five cancers, with most subtypes characterized by distinct driver somatic mutations, such as TP53, TTN, BRAF, HRAS, MET, KMT2D, and KMT2C mutations. Importantly, these subtyping results were frequently driven by dysregulated biological processes, such as ribosome biogenesis, RNA binding, and mitochondrial functions. PipeOne is efficient and accurate in studying different cancer types to reveal the specificity and cross-cancer contributing factors of each cancer. It can be easily applied to other diseases and is available on GitHub.

SUBMITTER: Nong B

PROVIDER: S-EPMC8701385 | biostudies-literature | 2021 Nov

REPOSITORIES: biostudies-literature

Publications

Comprehensive Analysis of Large-Scale Transcriptomes from Multiple Cancer Types.

Nong Baoting, Guo Mengbiao, Wang Weiwen, Songyang Zhou, Xiong Yuanyan

Genes, 2021-11-24, 12

PMID: 34946814

Similar Datasets

Project description: Cilia are dynamic subcellular systems, with core structural and functional components operating in a highly coordinated manner. Since many environmental stimuli sensed by cilia are circadian in nature, it is reasonable to speculate that genes encoding cilia structural and functional components follow rhythmic circadian patterns of expression. Using computational methods and the largest spatiotemporal gene expression atlas of primates, we identified and analyzed the circadian rhythmic expression of cilia genes across 22 primate brain areas. We found that around 73% of cilia transcripts exhibited circadian rhythmicity across at least one of 22 brain regions. In 12 brain regions, cilia transcriptomes were significantly enriched with circadian oscillating transcripts, as compared to the rest of the transcriptome. The phase of the cilia circadian transcripts deviated from the phase of the majority of the background circadian transcripts, and transcripts coding for cilia basal body components accounted for the majority of cilia circadian transcripts. In addition, adjacent or functionally connected brain nuclei had large overlapping complements of circadian cilia genes. Most remarkably, cilia circadian transcripts shared across the basal ganglia nuclei and the prefrontal cortex peaked in these structures in a sequential fashion similar to the sequential order of activation of the basal ganglia-cortical circuitry in connection with movement coordination, albeit on completely different timescales. These findings support a role for the circadian spatiotemporal orchestration of cilia gene expression in the normal physiology of the basal ganglia-cortical circuit and motor control.
Studying orchestrated cilia rhythmicity in the basal ganglia-cortical circuits and other brain circuits may help develop better functional models, and shed light on the causal effects cilia functions have on these circuits and on the regulation of movement and other behaviors.

