Dataset Information

GeneSetCluster: a tool for summarizing and integrating gene-set analysis results.


ABSTRACT:

Background

Gene-set analysis tools, which use curated sets of molecules grouped by shared function, aim to identify which gene-sets are over-represented among the features associated with a given trait of interest. Such tools, for example Ingenuity or GSEA, are frequently used in gene-centric analyses of RNA-sequencing or microarray data, but they have also been adapted for interval-based analyses of DNA methylation or ChIP/ATAC-sequencing data. Gene-set analysis tools return a list of significant gene-sets. Although these results help the researcher identify major biological insights, they can be hard to interpret because many gene-sets have largely overlapping gene content. Additionally, the result often consists of a large number of gene-sets, which makes the major biological insights harder still to identify.

Results

We present GeneSetCluster, a novel approach that clusters the gene-sets identified by one or multiple experiments and/or tools, based on shared genes. GeneSetCluster calculates a distance score between gene-sets from their overlapping gene content and uses this score to cluster them. As a result, GeneSetCluster identifies groups of gene-sets with similar gene-set definitions (i.e. gene content). Researchers can then focus on these groups for biological interpretation.
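
The idea of scoring gene-sets by overlapping gene content and then grouping them can be illustrated with a minimal Python sketch. This is not the GeneSetCluster R package or its API. The abstract does not name the distance metric or the clustering method, so the Jaccard distance, the single-linkage threshold grouping, the `max_distance` parameter, the function names and the example gene-sets below are all assumptions chosen for illustration.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |intersection| / |union| for two sets of gene symbols.

    Assumption: Jaccard distance stands in for the overlap-based
    distance score; the abstract does not specify the actual metric.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_gene_sets(gene_sets, max_distance=0.5):
    """Group gene-sets whose pairwise distance is at most max_distance.

    Uses a simple union-find, which is equivalent to cutting a
    single-linkage tree at max_distance (a hypothetical threshold).
    """
    names = list(gene_sets)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if jaccard_distance(gene_sets[a], gene_sets[b]) <= max_distance:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Illustrative gene-sets only; not taken from the publication.
example = {
    "interferon_signaling": {"STAT1", "IRF7", "ISG15", "MX1"},
    "antiviral_response": {"STAT1", "IRF7", "ISG15", "OAS1"},
    "cell_cycle": {"CDK1", "CCNB1", "PLK1"},
    "mitotic_division": {"CDK1", "CCNB1", "PLK1", "AURKA"},
}

print(cluster_gene_sets(example))
```

With these example inputs, the two immune-related gene-sets share three of five genes and the two cell-division gene-sets share three of four, so each pair falls into one group while the two groups stay separate.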

Conclusions

GeneSetCluster is a novel approach for grouping the results of gene-set analysis based on overlapping gene content. GeneSetCluster is implemented as a package in R. The package and the vignette can be downloaded at https://github.com/TranslationalBioinformaticsUnit.

SUBMITTER: Ewing E 

PROVIDER: S-EPMC7542881 | biostudies-literature | 2020 Oct

REPOSITORIES: biostudies-literature

Publications

GeneSetCluster: a tool for summarizing and integrating gene-set analysis results.

Ewing Ewoud E, Planell-Picola Nuria N, Jagodic Maja M, Gomez-Cabrero David D

BMC Bioinformatics, 2020 Oct 7, issue 1


Similar Datasets

| S-EPMC8059024 | biostudies-literature
| S-EPMC3834824 | biostudies-other
| S-EPMC2852214 | biostudies-literature
| S-EPMC3622641 | biostudies-literature
| S-EPMC8121311 | biostudies-literature
| S-EPMC6692773 | biostudies-literature
| S-EPMC5351668 | biostudies-literature
| S-EPMC9189359 | biostudies-literature
| S-EPMC6876092 | biostudies-literature
| S-EPMC6031048 | biostudies-literature