DiPhiSeq: robust comparison of expression levels on RNA-Seq data with large sample sizes.
ABSTRACT: MOTIVATION: In the analysis of RNA-Seq data, detecting differentially expressed (DE) genes has been a hot research area in recent years and many methods have been proposed. DE genes show different average expression levels in different sample groups, and thus can be important biological markers. While generally very successful, these methods need to be further tailored and improved for cancerous data, which often features quite diverse expression in the samples from the cancer group, and this diversity is much larger than that in the control group. RESULTS: We propose a statistical method that can detect not only genes that show different average expressions, but also genes that show different diversities of expressions in different groups. These 'differentially dispersed' genes can be important clinical markers. Our method uses a redescending penalty on the quasi-likelihood function, and thus has superior robustness against outliers and other noise. Simulations and real data analysis demonstrate that DiPhiSeq outperforms existing methods in the presence of outliers, and identifies unique sets of genes. AVAILABILITY AND IMPLEMENTATION: DiPhiSeq is publicly available as an R package on CRAN: https://cran.r-project.org/package=DiPhiSeq. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
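ILLUSTRATION: The abstract credits the method's robustness to a redescending penalty. As a rough intuition only, the sketch below shows what "redescending" means, using Tukey's biweight, a standard redescending function whose weight falls to exactly zero for large residuals. This is not DiPhiSeq's estimator: the abstract does not state which penalty is used or how it enters the quasi-likelihood, so the function names, the tuning constant c = 4.685, the MAD-based scale, and the toy values are all assumptions made for illustration.

```python
import statistics


def tukey_weight(r, c=4.685):
    """Tukey biweight weight for a standardized residual r.

    Weights shrink smoothly as |r| grows and are exactly 0 once |r| > c,
    which is the 'redescending' property. c = 4.685 is a conventional
    default, assumed here for illustration.
    """
    if abs(r) > c:
        return 0.0
    return (1.0 - (r / c) ** 2) ** 2


def robust_mean(values, c=4.685, max_iter=50, tol=1e-9):
    """Iteratively reweighted mean using Tukey biweight weights.

    Hypothetical helper, not part of the DiPhiSeq package.
    """
    mu = statistics.median(values)
    # Median absolute deviation, scaled to match the standard deviation
    # under normality; used as a fixed robust scale.
    mad = statistics.median([abs(v - mu) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        return mu
    for _ in range(max_iter):
        weights = [tukey_weight((v - mu) / scale, c) for v in values]
        total = sum(weights)
        if total == 0:
            break  # every point rejected; keep the current estimate
        new_mu = sum(w * v for w, v in zip(weights, values)) / total
        if abs(new_mu - mu) < tol:
            mu = new_mu
            break
        mu = new_mu
    return mu


# Toy log-scale expression values for one gene, with one gross outlier.
toy_values = [5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 12.0]
plain = statistics.mean(toy_values)
robust = robust_mean(toy_values)
print(f"plain mean: {plain:.3f}, robust mean: {robust:.3f}")
```

On these toy values the outlier receives zero weight, so the robust estimate stays near 5.0 while the plain mean is pulled to about 6.0. A non-redescending loss such as Huber's would still let the outlier exert bounded influence rather than none.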
SUBMITTER: Li J
PROVIDER: S-EPMC6596898 | biostudies-literature | 2019 Jul
REPOSITORIES: biostudies-literature