
Dataset Information


RnaSeqSampleSize: real data based sample size estimation for RNA sequencing.


ABSTRACT:

Background

Sample size estimation is one of the most important, and most often neglected, components of a successful RNA sequencing (RNA-Seq) experiment. A few negative binomial model-based methods have been developed to estimate sample size based on the parameters of a single gene. However, thousands of genes are quantified and tested for differential expression simultaneously in RNA-Seq experiments. Thus, additional issues should be carefully addressed, including the false discovery rate for multiple statistical tests and the wide distributions of read counts and dispersions across genes.

Results

To solve these issues, we developed a sample size and power estimation method named RnaSeqSampleSize, based on the distributions of gene average read counts and dispersions estimated from real RNA-Seq data. Datasets from previous, similar experiments, such as those from The Cancer Genome Atlas (TCGA), can be used as a point of reference. Read counts and their dispersions were estimated from the reference's distribution; using that information, we estimated and summarized the power and sample size. RnaSeqSampleSize is implemented in R and can be installed from the Bioconductor website. A user-friendly graphical web interface is provided at http://cqs.mc.vanderbilt.edu/shiny/RnaSeqSampleSize/ .
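The idea of averaging power over a reference distribution of read counts and dispersions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the RnaSeqSampleSize implementation: per-gene power uses a normal (Wald) approximation on the log fold change rather than the package's own test, the conversion from a target false discovery rate to a per-gene significance level is a commonly used approximation that the abstract does not confirm, and the function names and the values in `REFERENCE_GENES` are invented placeholders rather than estimates from TCGA or any real dataset.

```python
from math import log, sqrt
from statistics import NormalDist, mean

_STD_NORMAL = NormalDist()

# Placeholder (mean read count, dispersion) pairs standing in for values
# estimated from a reference dataset. These numbers are made up.
REFERENCE_GENES = [(5.0, 0.5), (50.0, 0.2), (500.0, 0.1), (2000.0, 0.05)]


def per_test_alpha(fdr, n_genes, n_de_genes, target_power):
    """Per-gene significance level roughly matching a target FDR (assumed formula)."""
    expected_true_hits = n_de_genes * target_power
    n_null_genes = n_genes - n_de_genes
    return expected_true_hits * fdr / (n_null_genes * (1.0 - fdr))


def gene_power(n, mean_count, dispersion, fold_change, alpha):
    """Approximate power of a two-sided Wald test on the log fold change.

    Delta-method variance of a log mean under a negative binomial model
    with n samples per group: (1 / mean + dispersion) / n.
    """
    var_control = (1.0 / mean_count + dispersion) / n
    var_treated = (1.0 / (mean_count * fold_change) + dispersion) / n
    std_error = sqrt(var_control + var_treated)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # The negligible contribution of the opposite tail is ignored.
    return _STD_NORMAL.cdf(abs(log(fold_change)) / std_error - z_crit)


def average_power(n, reference_genes, fold_change, alpha):
    """Mean per-gene power across the reference (mean, dispersion) pairs."""
    return mean(
        gene_power(n, mu, phi, fold_change, alpha) for mu, phi in reference_genes
    )


def sample_size(reference_genes, fold_change, alpha, target_power, max_n=1000):
    """Smallest per-group n whose average power reaches the target."""
    for n in range(2, max_n + 1):
        if average_power(n, reference_genes, fold_change, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within max_n samples per group")


if __name__ == "__main__":
    alpha = per_test_alpha(fdr=0.05, n_genes=10000, n_de_genes=100, target_power=0.8)
    n = sample_size(REFERENCE_GENES, fold_change=2.0, alpha=alpha, target_power=0.8)
    print(n, average_power(n, REFERENCE_GENES, 2.0, alpha))
```

Averaging over many (mean, dispersion) pairs, instead of plugging in a single conservative pair, is what lets the estimate reflect the spread of counts and dispersions across genes; in practice the pairs would be drawn from a real reference dataset.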

Conclusions

RnaSeqSampleSize provides a convenient and powerful way to estimate power and sample size for an RNA-Seq experiment. It also offers several unique features, including estimation for genes or pathways of interest, power curve visualization, and parameter optimization.

SUBMITTER: Zhao S 

PROVIDER: S-EPMC5975570 | biostudies-literature | 2018 May

REPOSITORIES: biostudies-literature


Publications

RnaSeqSampleSize: real data based sample size estimation for RNA sequencing.

Zhao S, Li CI, Guo Y, Sheng Q, Shyr Y

BMC Bioinformatics, 2018-05-30, issue 1



Similar Datasets

S-EPMC3842884 | biostudies-other
S-EPMC5618549 | biostudies-literature
S-EPMC4201821 | biostudies-literature
S-EPMC6047306 | biostudies-other
S-EPMC4133582 | biostudies-literature
S-EPMC7925405 | biostudies-literature
S-EPMC4498680 | biostudies-literature
S-EPMC2837028 | biostudies-literature
S-EPMC3654711 | biostudies-literature
S-EPMC5117187 | biostudies-literature