Dataset Information

Should we abandon the t-test in the analysis of gene expression microarray data: a comparison of variance modeling strategies.


ABSTRACT: High-throughput post-genomic studies are now routine in biological and biomedical research. The main statistical approach to selecting genes differentially expressed between two groups is the t-test, which has been criticized in the literature. Numerous alternatives have been developed, based on different variance modeling strategies. A critical issue, however, is that choosing a different test usually leads to a different gene list. In this context, and given the current tendency to apply the t-test, identifying the most efficient approach in practice remains crucial. To help answer this question, we compare eight tests representative of variance modeling strategies in gene expression data: Welch's t-test, ANOVA [1], Wilcoxon's test, SAM [2], RVM [3], limma [4], VarMixt [5] and SMVar [6]. Our comparison relies on four steps (gene list analysis, simulations, spike-in data and re-sampling) to reach comprehensive and robust conclusions about test performance in terms of statistical power, false-positive rate, execution time and ease of use. Our results raise concerns about the ability of some methods to control the expected number of false positives at a desirable level. Two tests (limma and VarMixt) show significant improvement over the t-test, particularly for small sample sizes. In addition, limma presents several practical advantages, so we advocate its application to the analysis of gene expression data.
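
For readers unfamiliar with the baseline test named in the abstract, the following is a minimal sketch of Welch's t statistic for a single gene, using only the Python standard library. It is an illustration, not the authors' code: the function name `welch_t` and the expression values are hypothetical. It shows that the statistic depends on a variance estimated separately for each gene and each group, which is the quantity the alternative methods model differently.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard errors, from per-group sample variances of this gene.
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical log-expression values for one gene, three samples per group.
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]
t_stat, dof = welch_t(control, treated)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

With only three samples per group, each variance is estimated from two degrees of freedom, so the denominator can vary widely from gene to gene. This is the small-sample setting in which the abstract reports that limma and VarMixt improve on the t-test.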

SUBMITTER: Jeanmougin M 

PROVIDER: S-EPMC2933223 | biostudies-literature | 2010 Sep

REPOSITORIES: biostudies-literature

Publications

Should we abandon the t-test in the analysis of gene expression microarray data: a comparison of variance modeling strategies.

Marine Jeanmougin, Aurelien de Reynies, Laetitia Marisa, Caroline Paccard, Gregory Nuel, Mickael Guedj

PLoS ONE, 2010-09-03, issue 9


Similar Datasets

| S-EPMC2367561 | biostudies-literature
| S-EPMC3268235 | biostudies-literature
| S-EPMC2583934 | biostudies-literature
| S-EPMC4143561 | biostudies-literature
| S-EPMC3230912 | biostudies-literature
| S-EPMC2740938 | biostudies-literature
| S-EPMC6004421 | biostudies-literature
| S-EPMC2241869 | biostudies-literature
| S-EPMC8128240 | biostudies-literature
| S-EPMC4143638 | biostudies-literature