Dataset Information

Shrinkage estimation of dispersion in Negative Binomial models for RNA-seq experiments with small sample size.


ABSTRACT:

Motivation

RNA-seq experiments produce digital counts of reads that are affected by both biological and technical variation. To distinguish the systematic changes in expression between conditions from noise, the counts are frequently modeled by the Negative Binomial distribution. However, in experiments with small sample size, the per-gene estimates of the dispersion parameter are unreliable.

Method

We propose a simple and effective approach for estimating the dispersions. First, we obtain the initial estimates for each gene using the method of moments. Second, the estimates are regularized, i.e. shrunk towards a common value that minimizes the average squared difference between the initial estimates and the shrinkage estimates. The approach does not require extra modeling assumptions, is easy to compute and is compatible with the exact test of differential expression.
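The two steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the sSeq implementation. It assumes the common Negative Binomial parameterization with variance mu + phi * mu^2, counts from a single condition that are already normalized for library size, and a linear shrinkage rule. The function names, the shrinkage weight and the use of the mean initial estimate as the target are hypothetical choices made for illustration; the abstract does not state how the weight or the common value is computed.

```python
import numpy as np


def mom_dispersion(counts):
    """Initial per-gene dispersion estimates by the method of moments.

    counts: genes x replicates array for ONE condition, assumed to be
    already normalized for library size (an assumption of this sketch).
    Under Var = mu + phi * mu^2, solving for phi gives (s^2 - mean) / mean^2.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean(axis=1)
    var = counts.var(axis=1, ddof=1)  # unbiased sample variance
    phi = np.zeros_like(mean)
    expressed = mean > 0  # avoid division by zero for all-zero genes
    phi[expressed] = (var[expressed] - mean[expressed]) / mean[expressed] ** 2
    # Negative values mean no overdispersion was detected; clip to zero.
    return np.maximum(phi, 0.0)


def shrink_dispersion(phi, target, weight):
    """Linear shrinkage of each initial estimate towards a common target.

    weight is a hypothetical tuning parameter in [0, 1]:
    0 keeps the initial estimates, 1 replaces them all with the target.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    phi = np.asarray(phi, dtype=float)
    return (1.0 - weight) * phi + weight * target


if __name__ == "__main__":
    toy_counts = [[1, 5, 9], [10, 10, 10], [0, 0, 0]]  # made-up toy data
    phi_init = mom_dispersion(toy_counts)
    # Illustrative target: the mean of the initial estimates.
    phi_shrunk = shrink_dispersion(phi_init, target=phi_init.mean(), weight=0.5)
    print(phi_init, phi_shrunk)
```

With only a few replicates per gene, the initial estimates are noisy, so pulling them towards a shared value trades a small bias for a large reduction in variance. That trade-off is the motivation given in the abstract for regularizing the per-gene dispersions.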

Results

We evaluated the proposed approach (sSeq) using 10 simulated and experimental datasets and compared its performance with that of the currently popular packages edgeR, DESeq, baySeq, BBSeq and SAMseq. For these datasets, sSeq performed favorably for experiments with small sample size in terms of sensitivity, specificity and computational time.

Availability

http://www.stat.purdue.edu/~ovitek/Software.html and Bioconductor.

Contact

ovitek@purdue.edu

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Yu D 

PROVIDER: S-EPMC3654711 | biostudies-literature | 2013 May

REPOSITORIES: biostudies-literature

Publications

Shrinkage estimation of dispersion in Negative Binomial models for RNA-seq experiments with small sample size.

Danni Yu, Wolfgang Huber, Olga Vitek

Bioinformatics (Oxford, England), 2013-04-14, issue 10


Similar Datasets

| S-EPMC3310835 | biostudies-literature
| S-EPMC7974632 | biostudies-literature
| S-EPMC9314673 | biostudies-literature
| S-EPMC8052637 | biostudies-literature
| S-EPMC4201821 | biostudies-literature
| S-EPMC3590028 | biostudies-literature
| S-EPMC8026952 | biostudies-literature
| S-EPMC7195715 | biostudies-literature
| S-EPMC5022247 | biostudies-literature
| S-EPMC3590927 | biostudies-literature