Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data
ABSTRACT: A large number of computational methods have recently been developed for analyzing differential gene expression (DE) in RNA-seq data. We report a comprehensive evaluation of the commonly used DE methods using the SEQC benchmark data set and data from the ENCODE project. We evaluated a number of key features, including normalization, accuracy of DE detection, and DE analysis when one condition has no detectable expression. We found significant differences among the methods. Furthermore, computational methods designed for DE detection from expression array data perform comparably to methods customized for RNA-seq. Most importantly, our results demonstrate that increasing the number of replicate samples improves detection power significantly more than increasing sequencing depth. The Sequencing Quality Control Consortium (SEQC) generated two datasets from two reference RNA samples in order to evaluate transcriptome profiling by next-generation sequencing technology. Each sample contains one of the reference RNA sources and a set of synthetic RNAs from the External RNA Control Consortium (ERCC) at known concentrations. Group A contains 5 replicates of the Stratagene Universal Human Reference RNA (UHRR), which is composed of total RNA from 10 human cell lines, with 2% by volume of ERCC mix 1. Group B includes 5 replicate samples of the Ambion Human Brain Reference RNA (HBRR) with 2% by volume of ERCC mix 2. The ERCC spike-in control is a mixture of 92 synthetic polyadenylated oligonucleotides, 250-2000 nucleotides long, that are meant to resemble human transcripts.
ORGANISM(S): Homo sapiens
SUBMITTER: Doron Betel
PROVIDER: E-GEOD-49712 | biostudies-arrayexpress
REPOSITORIES: biostudies-arrayexpress