Dataset Information

Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data


ABSTRACT: A large number of computational methods have recently been developed for analyzing differential gene expression (DE) in RNA-seq data. We report on a comprehensive evaluation of the commonly used DE methods using the SEQC benchmark data set and data from the ENCODE project. We evaluated a number of key features, including normalization, accuracy of DE detection, and DE analysis when one condition has no detectable expression. We found significant differences among the methods. Furthermore, computational methods designed for DE detection from expression array data perform comparably to methods customized for RNA-seq. Most importantly, our results demonstrate that increasing the number of replicate samples improves detection power significantly more than increasing sequencing depth does.

The Sequencing Quality Control Consortium generated two datasets from two reference RNA samples in order to evaluate transcriptome profiling by next-generation sequencing technology. Each sample contains one of the reference RNA sources and a set of synthetic RNAs from the External RNA Control Consortium (ERCC) at known concentrations. Group A contains 5 replicates of the Stratagene Universal Human Reference RNA (UHRR), which is composed of total RNA from 10 human cell lines, with 2% by volume of ERCC mix 1. Group B includes 5 replicate samples of the Ambion Human Brain Reference RNA (HBRR) with 2% by volume of ERCC mix 2. The ERCC spike-in control is a mixture of 92 synthetic polyadenylated oligonucleotides, 250-2000 nucleotides long, that are meant to resemble human transcripts.
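
The two-group design described above can be written down as a small sample sheet, which is often the first step before running any DE method. The following is a minimal sketch in Python. The sample IDs (`A_1`, `B_1`, ...) and the field names are hypothetical labels chosen for illustration, not identifiers from the deposited data. The group sizes, RNA sources, ERCC mixes and the 2% spike-in fraction are taken from the abstract.

```python
from collections import Counter

# Group definitions taken from the dataset description.
GROUPS = {
    "A": {"rna_source": "UHRR", "ercc_mix": 1},  # Universal Human Reference RNA
    "B": {"rna_source": "HBRR", "ercc_mix": 2},  # Human Brain Reference RNA
}
REPLICATES_PER_GROUP = 5
ERCC_FRACTION = 0.02  # 2% by volume, as stated in the abstract


def build_design():
    """Return one row per sample; sample_id values are hypothetical labels."""
    rows = []
    for group, info in GROUPS.items():
        for rep in range(1, REPLICATES_PER_GROUP + 1):
            rows.append({
                "sample_id": f"{group}_{rep}",  # assumed naming scheme
                "group": group,
                "replicate": rep,
                "rna_source": info["rna_source"],
                "ercc_mix": info["ercc_mix"],
                "ercc_fraction": ERCC_FRACTION,
            })
    return rows


design = build_design()
counts = Counter(row["group"] for row in design)

for row in design:
    print(row["sample_id"], row["rna_source"], f"ERCC mix {row['ercc_mix']}")
```

A table of this shape can be passed as the sample metadata to most DE tools, with `group` as the condition of interest.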

ORGANISM(S): Homo sapiens

SUBMITTER: Doron Betel 

PROVIDER: E-GEOD-49712 | biostudies-arrayexpress

REPOSITORIES: biostudies-arrayexpress

Publications

A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium.

Nature Biotechnology, 2014-08-24, issue 9


We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the US Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequenci...

Similar Datasets

2017-03-08 | E-MTAB-5480 | biostudies-arrayexpress
2020-05-11 | GSE99081 | GEO
2010-04-19 | GSE20579 | GEO
2010-04-19 | GSE20555 | GEO
2022-04-05 | GSE190614 | GEO
2016-05-13 | E-GEOD-75823 | biostudies-arrayexpress
2010-04-19 | E-GEOD-20579 | biostudies-arrayexpress
2021-02-07 | E-MTAB-8426 | biostudies-arrayexpress
2016-05-13 | GSE75823 | GEO
2013-08-20 | GSE49712 | GEO