Dataset Information

A simple but highly effective approach to evaluate the prognostic performance of gene expression signatures.

ABSTRACT: BACKGROUND: Highly parallel analysis of gene expression has recently been used to identify gene sets or 'signatures' to improve patient diagnosis and risk stratification. Once a signature is generated, traditional statistical testing is used to evaluate its prognostic performance. However, due to the dimensionality of microarrays, this can lead to false interpretation of these signatures. PRINCIPAL FINDINGS: A method was developed to test batches of a user-specified number of randomly chosen signatures in patient microarray datasets. The percentage of randomly generated signatures yielding prognostic value was assessed using ROC analysis by calculating the area under the curve (AUC) in six publicly available cancer patient microarray datasets. We found that a signature consisting of randomly selected genes has an average 10% chance of reaching significance when assessed in a single dataset, but this chance can range from 1% to ~40% depending on the dataset in question. Increasing the number of validation datasets markedly reduces this number. CONCLUSIONS: We have shown that an arbitrary cut-off value for evaluating signature significance is not suitable for this type of research; instead, the cut-off should be defined for each dataset separately. Our method can be used to establish and evaluate the performance of any derived gene signature in a dataset by comparing its performance to thousands of randomly generated signatures. It will be of most interest for cases where few data are available and testing in multiple datasets is limited.
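The comparison described in the abstract, scoring a signature by AUC and placing it against a null distribution built from random gene sets of the same size, can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the mean-expression scoring rule, the direction-agnostic AUC, and the synthetic data are all assumptions introduced here.

```python
import random

def auc(scores, labels):
    """ROC AUC via pairwise comparison (Mann-Whitney); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def signature_auc(expr, labels, genes):
    """Score each patient by mean expression over `genes` (assumed rule)."""
    scores = [sum(row[g] for g in genes) / len(genes) for row in expr]
    a = auc(scores, labels)
    # Assumption: treat AUC below 0.5 as an inverse predictor of equal strength.
    return max(a, 1.0 - a)

def random_signature_null(expr, labels, size, n_random=1000, seed=0):
    """AUCs of `n_random` randomly drawn signatures with `size` genes each."""
    rng = random.Random(seed)
    n_genes = len(expr[0])
    return [
        signature_auc(expr, labels, rng.sample(range(n_genes), size))
        for _ in range(n_random)
    ]

def empirical_p(observed, null):
    """Fraction of random signatures doing at least as well (add-one corrected)."""
    return (1 + sum(a >= observed for a in null)) / (1 + len(null))

# Hypothetical toy data: 40 patients x 200 genes of pure noise, binary outcome.
rng = random.Random(42)
labels = [i % 2 for i in range(40)]
expr = [[rng.gauss(0, 1) for _ in range(200)] for _ in labels]

null = random_signature_null(expr, labels, size=10, n_random=500)
observed = signature_auc(expr, labels, genes=list(range(10)))
print(f"observed AUC = {observed:.3f}, empirical p = {empirical_p(observed, null):.3f}")
```

Because the null distribution is rebuilt for each dataset, the resulting threshold is dataset-specific, which is the point the conclusions make about arbitrary cut-offs.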

SUBMITTER: Starmans MH

PROVIDER: S-EPMC3233554 | biostudies-literature | 2011

REPOSITORIES: biostudies-literature

Publications

A simple but highly effective approach to evaluate the prognostic performance of gene expression signatures.

Starmans MH, Fung G, Steck H, Wouters BG, Lambin P

PLoS One, 2011-12-07, issue 12

PMID: 22163293

Similar Datasets

Gene expression signatures for highly effective dermal sheath cup cells
2025-02-18 | GSE197025 | GEO

Scanning single-molecule counting system for Eprobe with highly simple and effective approach.
| S-EPMC7737986 | biostudies-literature

Gene expression signatures for highly effective dermal sheath cup cells
| PRJNA808415 | ENA

Inferring Diagnostic and Prognostic Gene Expression Signatures Across WHO Glioma Classifications: A Network-Based Approach.
| S-EPMC11403688 | biostudies-literature

A gene sets approach for identifying prognostic gene signatures for outcome prediction.
| S-EPMC2364634 | biostudies-literature

Prognostic Cancer Gene Expression Signatures: Current Status and Challenges.
| S-EPMC8000474 | biostudies-literature

Comparison of prognostic gene expression signatures for breast cancer.
| S-EPMC2533026 | biostudies-literature

Robust method for identification of prognostic gene signatures from gene expression profiles.
| S-EPMC5717170 | biostudies-literature

Identification of common prognostic gene expression signatures with biological meanings from microarray gene expression datasets.
| S-EPMC3448701 | biostudies-literature

RNAflow: An Effective and Simple RNA-Seq Differential Gene Expression Pipeline Using Nextflow.
| S-EPMC7763471 | biostudies-literature