Dataset Information

A comparison of two classes of methods for estimating false discovery rates in microarray studies.

ABSTRACT: The goal of many microarray studies is to identify genes that are differentially expressed between two classes or populations. Many data analysts choose to estimate the false discovery rate (FDR) associated with the list of genes declared differentially expressed. Estimating an FDR largely reduces to estimating π₁, the proportion of differentially expressed genes among all analyzed genes. Estimating π₁ is usually done through P-values, but computing P-values can be viewed as a nuisance and potentially problematic step. We evaluated methods for estimating π₁ directly from test statistics, circumventing the need to compute P-values. We adapted existing methodology for estimating π₁ from t- and z-statistics so that π₁ could be estimated from other statistics. We compared the quality of these estimates to estimates generated by two established methods for estimating π₁ from P-values. Overall, methods varied widely in bias and variability. The least biased and least variable estimates of π₁, the proportion of differentially expressed genes, were produced by applying the "convest" mixture model method to P-values computed from a pooled permutation null distribution. Estimates computed directly from test statistics rather than P-values did not reliably perform well.
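
To illustrate why estimating an FDR largely reduces to estimating the proportion of differentially expressed genes, here is a minimal Python sketch. It is not the "convest" method or any other method evaluated in the paper; it swaps in a simple Storey-type estimator with a fixed tuning value, and the function names, the tuning value lam = 0.5, and the simulated P-values are all assumptions made for illustration.

```python
import random


def estimate_pi1(p_values, lam=0.5):
    """Storey-type estimate of pi1 = 1 - pi0 (illustrative, not "convest")."""
    m = len(p_values)
    # Null P-values are uniform on [0, 1], so the share of P-values above
    # lam, rescaled by 1 / (1 - lam), estimates pi0 (the null proportion).
    pi0 = sum(p > lam for p in p_values) / (m * (1.0 - lam))
    pi0 = min(1.0, pi0)  # a proportion cannot exceed 1
    return 1.0 - pi0


def estimate_fdr(p_values, threshold, lam=0.5):
    """Estimated FDR for the genes with P-value <= threshold."""
    m = len(p_values)
    pi0 = 1.0 - estimate_pi1(p_values, lam)
    n_called = sum(p <= threshold for p in p_values)
    if n_called == 0:
        return 0.0  # empty gene list: report 0 by convention
    # Expected number of false calls is about pi0 * m * threshold.
    return min(1.0, pi0 * m * threshold / n_called)


# Hypothetical toy data: 800 null genes and 200 differentially expressed genes.
rng = random.Random(0)
null_p = [rng.random() for _ in range(800)]
alt_p = [rng.random() * 0.01 for _ in range(200)]
p_values = null_p + alt_p

pi1_hat = estimate_pi1(p_values)
fdr_hat = estimate_fdr(p_values, threshold=0.01)
print(f"estimated pi1: {pi1_hat:.3f}")
print(f"estimated FDR at P <= 0.01: {fdr_hat:.3f}")
```

In this sketch the FDR estimate depends on the data only through the estimated null proportion and the number of genes called, which is why the quality of the π₁ estimate matters so much.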

SUBMITTER: Hansen E

PROVIDER: S-EPMC3820438 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

A comparison of two classes of methods for estimating false discovery rates in microarray studies.

Hansen E, Kerr KF

Scientifica, 2012-07-08

PMID: 24278709

Similar Datasets

Project description:

Purpose: A number of recent publications have proposed that a family of image-derived indices, called texture features, can predict clinical outcome in patients with cancer. However, the investigation of multiple indices on a single data set can lead to significant inflation of type-I errors. We report a systematic review of the type-I error inflation in such studies and review the evidence regarding associations between patient outcome and texture features derived from positron emission tomography (PET) or computed tomography (CT) images.

Methods: For study identification, PubMed and Scopus were searched (1/2000-9/2013) using combinations of the keywords texture, prognostic, predictive and cancer. Studies were divided into three categories according to the sources of the type-I error inflation and the use or not of an independent validation dataset. For each study, the true type-I error probability and the adjusted level of significance were estimated using the optimum cut-off approach correction and the Benjamini-Hochberg method. To demonstrate explicitly the variable selection bias in these studies, we re-analyzed data from one of the published studies, but using 100 random variables substituted for the original image-derived indices. The significance of the random variables as potential predictors of outcome was examined using the analysis methods used in the identified studies.

Results: Fifteen studies were identified. After applying appropriate statistical corrections, an average type-I error probability of 76% (range: 34-99%) was estimated, with the majority of published results not reaching statistical significance. Only 3/15 studies used a validation dataset. For the 100 random variables examined, 10% proved to be significant predictors of survival when subjected to ROC and multiple hypothesis testing analysis.

Conclusions: We found insufficient evidence to support a relationship between PET or CT texture features and patient survival. Further fit-for-purpose validation of these image-derived biomarkers should be supported by appropriate biological and statistical evidence before their association with patient outcome is investigated in prospective studies.
