Dataset Information

Goodness-of-fit tests and model diagnostics for negative binomial regression of RNA sequencing data.

ABSTRACT: This work is about assessing model adequacy for negative binomial (NB) regression, particularly (1) assessing the adequacy of the NB assumption, and (2) assessing the appropriateness of models for NB dispersion parameters. Tools for the first are appropriate for NB regression generally; those for the second are primarily intended for RNA sequencing (RNA-Seq) data analysis. The typically small number of biological samples and large number of genes in RNA-Seq analysis motivate us to address the trade-offs between robustness and statistical power using NB regression models. One widely-used power-saving strategy, for example, is to assume some commonalities of NB dispersion parameters across genes via simple models relating them to mean expression rates, and many such models have been proposed. As RNA-Seq analysis is becoming ever more popular, it is appropriate to make more thorough investigations into power and robustness of the resulting methods, and into practical tools for model assessment. In this article, we propose simulation-based statistical tests and diagnostic graphics to address model adequacy. We provide simulated and real data examples to illustrate that our proposed methods are effective for detecting the misspecification of the NB mean-variance relationship as well as judging the adequacy of fit of several NB dispersion models.
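
The idea of a simulation-based adequacy test can be sketched in a few lines. The following is only an illustrative simplification, not the authors' procedure: it assumes a single mean `mu` and dispersion `phi` that are known and shared by all observations (the article instead works with fitted regression models), uses the NB variance function Var(Y) = mu + phi * mu^2, and compares an observed Pearson statistic with statistics from counts simulated under the assumed NB model. All function names here are hypothetical.

```python
import math
import random


def nb_variance(mu, phi):
    """NB mean-variance relationship: Var(Y) = mu + phi * mu**2."""
    return mu + phi * mu ** 2


def sample_nb(mu, phi, rng):
    """Draw one NB count with mean mu and dispersion phi (gamma-Poisson mixture)."""
    # Gamma(shape=1/phi, scale=phi*mu) has mean mu and variance phi * mu**2.
    lam = rng.gammavariate(1.0 / phi, phi * mu)
    # Knuth's multiplication method for Poisson(lam); adequate for modest lam.
    limit, count, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def pearson_stat(counts, mu, phi):
    """Sum of squared Pearson residuals under the assumed NB model."""
    return sum((y - mu) ** 2 / nb_variance(mu, phi) for y in counts)


def mc_pvalue(counts, mu, phi, n_sim=199, seed=0):
    """Monte Carlo p-value: share of simulated statistics >= the observed one."""
    rng = random.Random(seed)
    observed = pearson_stat(counts, mu, phi)
    exceed = 0
    for _ in range(n_sim):
        simulated = [sample_nb(mu, phi, rng) for _ in counts]
        if pearson_stat(simulated, mu, phi) >= observed:
            exceed += 1
    # The +1 terms count the observed data set as one of the replicates.
    return (1 + exceed) / (1 + n_sim)
```

A small p-value from `mc_pvalue` suggests the counts are more variable than the assumed NB mean-variance relationship allows. In a realistic analysis the mean and dispersion would be estimated, and the simulation would need to account for that estimation step.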

SUBMITTER: Mi G

PROVIDER: S-EPMC4365073 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

Goodness-of-fit tests and model diagnostics for negative binomial regression of RNA sequencing data.

Mi G, Di Y, Schafer DW

PLoS One, 2015-03-18, issue 3

PMID: 25787144

Similar Datasets

Higher order asymptotics for negative binomial regression inferences from RNA-sequencing data. | S-EPMC3683603 | biostudies-literature

Goodness-of-fit diagnostics for Bayesian hierarchical models. | S-EPMC3276744 | biostudies-literature

Bayesian gamma-negative binomial modeling of single-cell RNA sequencing data. | S-EPMC7487589 | biostudies-literature

Goodness of fit tests for random multigraph models. | S-EPMC10631392 | biostudies-literature

Pearson's goodness-of-fit tests for sparse distributions. | S-EPMC10062227 | biostudies-literature

Goodness of fit tests for linear mixed models. | S-EPMC5426279 | biostudies-literature

A goodness-of-fit association test for whole genome sequencing data. | S-EPMC4143767 | biostudies-literature

Sequence count data are poorly fit by the negative binomial distribution. | S-EPMC7192467 | biostudies-literature

Nonparametric goodness-of-fit tests for uniform stochastic ordering. | S-EPMC5771311 | biostudies-literature

Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. | S-EPMC6927181 | biostudies-literature