Ontology highlight
ABSTRACT: Motivation
RNA-seq has become a routine technique in differential expression (DE) identification. Scientists face a number of experimental design decisions, including the sample size. The power for detecting differential expression is affected by several factors, including the fraction of DE genes, distribution of the magnitude of DE, distribution of gene expression level, sequencing coverage and the choice of type I error control. The complexity and flexibility of RNA-seq experiments, the high-throughput nature of transcriptome-wide expression measurements and the unique characteristics of RNA-seq data make the power assessment particularly challenging.Results
We propose prospective power assessment instead of a direct sample size calculation by making assumptions on all of these factors. Our power assessment tool includes two components: (i) a semi-parametric simulation that generates data based on actual RNA-seq experiments with flexible choices on baseline expressions, biological variations and patterns of DE; and (ii) a power assessment component that provides a comprehensive view of power. We introduce the concepts of stratified power and false discovery cost, and demonstrate the usefulness of our method in experimental design (such as sample size and sequencing depth), as well as analysis plan (gene filtering).Availability
The proposed method is implemented in a freely available R software package PROPER.Supplementary information
Supplementary data are available at Bioinformatics online.
SUBMITTER: Wu H
PROVIDER: S-EPMC4287952 | biostudies-literature |
REPOSITORIES: biostudies-literature