Dataset Information

A scaling-free minimum enclosing ball method to detect differentially expressed genes for RNA-seq data.

ABSTRACT:

Background

Identifying differentially expressed genes within or between species is a pressing need in biological and medical research. RNA-seq experiments usually involve systematic technical effects and differing sequencing depths, so normalization is regarded as an essential step in discovering biologically important changes in expression. Existing methods usually normalize the data with a scaling factor and then detect significant genes. However, because real data are complex, more than one scaling factor may exist. Consequently, methods that normalize data by a single scaling factor may perform suboptimally or fail altogether. Modern machine learning techniques offer a new perspective on discriminating between differentially expressed (DE) and non-DE genes. In practice, however, the non-DE genes comprise only a small set, which may contain housekeeping genes (within the same species) or conserved orthologous genes (across different species). The detection of DE genes can therefore be formulated as a one-class classification problem, in which only non-DE genes are observed and DE genes are completely absent from the training data.

Results

In this study, we transform the problem to an outlier detection problem by treating DE genes as outliers, and we propose a scaling-free minimum enclosing ball (SFMEB) method to construct a smallest possible ball to contain the known non-DE genes in a feature space. The genes outside the minimum enclosing ball can then be naturally considered to be DE genes. Compared with the existing methods, the proposed SFMEB method does not require data normalization, which is particularly attractive when the RNA-seq data include more than one scaling factor. Furthermore, the SFMEB method could be easily extended to different species without normalization.
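The enclosing-ball idea described above can be illustrated with a minimal Python sketch. This is not the authors' SFMEB implementation or the API of the MEB package: it assumes a plain Euclidean space instead of the feature space mentioned in the abstract, uses a hard ball with no slack, and approximates the ball with the Badoiu-Clarkson iteration. The function names and the toy gene values are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def enclosing_ball(points, iterations=1000):
    """Approximate the smallest ball containing all points.

    Badoiu-Clarkson iteration: repeatedly move the center a shrinking
    step toward the point that is currently farthest away.
    """
    center = list(points[0])
    for i in range(1, iterations + 1):
        farthest = max(points, key=lambda p: distance(center, p))
        step = 1.0 / (i + 1)
        center = [c + step * (f - c) for c, f in zip(center, farthest)]
    # Use the largest training distance so every training point is enclosed.
    radius = max(distance(center, p) for p in points)
    return center, radius


def flag_outliers(center, radius, candidates):
    """Return True for each candidate that falls outside the ball."""
    return [distance(center, g) > radius for g in candidates]


# Hypothetical toy data: one (expression in condition A, expression in
# condition B) pair per gene, on an arbitrary log scale.
non_de = [(1.0, 1.1), (2.0, 2.0), (3.0, 2.9), (2.5, 2.6), (1.5, 1.4)]
candidates = [(2.0, 2.1), (1.0, 6.0), (5.5, 0.5)]

center, radius = enclosing_ball(non_de)
flags = flag_outliers(center, radius, candidates)
print(flags)  # genes flagged True lie outside the ball (treated as DE)
```

In this toy setting the ball is fitted only to the assumed non-DE genes, and any candidate gene outside it is treated as an outlier, mirroring the one-class formulation. A kernel-based variant with slack variables would be needed to approach what the abstract describes.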

Conclusions

Simulation studies demonstrate that the SFMEB method works well in a wide range of settings, especially when the data are heterogeneous or include biological replicates. Analysis of real data also supports the conclusion that the SFMEB method outperforms existing competitors. The R package for the proposed method is available at https://bioconductor.org/packages/MEB.

SUBMITTER: Zhou Y

PROVIDER: S-EPMC8234728 | biostudies-literature | 2021 Jun

REPOSITORIES: biostudies-literature

Publications

A scaling-free minimum enclosing ball method to detect differentially expressed genes for RNA-seq data.

Zhou Yan, Yang Bin, Wang Junhui, Zhu Jiadi, Tian Guoliang

BMC Genomics, 2021-06-26, issue 1

PMID: 34174824

Similar Datasets

A parsimonious statistical method to detect groupwise differentially expressed functional connectivity networks.
| S-EPMC4849893 | biostudies-literature

DREAMSeq: An Improved Method for Analyzing Differentially Expressed Genes in RNA-seq Data.
| S-EPMC6284200 | biostudies-literature

A statistical method for detecting differentially expressed SNVs based on next-generation RNA-seq data.
| S-EPMC5151178 | biostudies-literature

DEBKS: A Tool to Detect Differentially Expressed Circular RNAs.
| S-EPMC9801035 | biostudies-literature

Finding differentially expressed sRNA-Seq regions with srnadiff.
| S-EPMC8378736 | biostudies-literature

Adaptive thresholds to detect differentially expressed genes in microarray data.
| S-EPMC3163930 | biostudies-literature

scMEB: a fast and clustering-independent method for detecting differentially expressed genes in single-cell RNA-seq data.
| S-EPMC10210493 | biostudies-literature

MarcoPolo: a method to discover differentially expressed genes in single-cell RNA-seq data without depending on prior clustering.
| S-EPMC9262626 | biostudies-literature

Comparison of methods to detect differentially expressed genes between single-cell populations.
| S-EPMC5862313 | biostudies-literature

A Method Based on Differential Entropy-Like Function for Detecting Differentially Expressed Genes Across Multiple Conditions in RNA-Seq Studies.
| S-EPMC7514722 | biostudies-literature