
Dataset Information


Self-contained gene-set analysis of expression data: an evaluation of existing and novel methods.


ABSTRACT: Gene set methods aim to assess the overall evidence of association of a set of genes with a phenotype, such as disease or a quantitative trait. Multiple approaches for gene set analysis of expression data have been proposed. They can be divided into two types: competitive and self-contained. Benefits of self-contained methods include that they can be used for genome-wide, candidate gene, or pathway studies, and have been reported to be more powerful than competitive methods. We therefore investigated ten self-contained methods that can be used for continuous, discrete and time-to-event phenotypes. To assess the power and type I error rate for the various previously proposed and novel approaches, an extensive simulation study was completed in which the scenarios varied according to: number of genes in a gene set, number of genes associated with the phenotype, effect sizes, correlation between expression of genes within a gene set, and the sample size. In addition to the simulated data, the various methods were applied to a pharmacogenomic study of the drug gemcitabine. Simulation results demonstrated that overall Fisher's method and the global model with random effects have the highest power for a wide range of scenarios, while the analysis based on the first principal component and Kolmogorov-Smirnov test tended to have lowest power. The methods investigated here are likely to play an important role in identifying pathways that contribute to complex traits.
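As an illustration of one method named in the abstract, the following is a minimal sketch of Fisher's method for combining gene-level p-values into a single gene-set p-value. The function name `fisher_combined_p` and the input p-values are hypothetical and not taken from the study. The closed-form chi-square tail used here assumes the gene-level tests are independent. Correlated expression within a gene set violates that assumption, so in practice a permutation-based reference distribution would typically be needed. This sketch does not reproduce the study's procedure.

```python
import math


def fisher_combined_p(p_values):
    """Combine p-values with Fisher's method, assuming independent tests.

    The statistic X = -2 * sum(log p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, where k = len(p_values).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2

    # For even degrees of freedom 2k, the chi-square upper tail is
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Hypothetical gene-level p-values for a three-gene set.
    print(fisher_combined_p([0.04, 0.20, 0.01]))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the tail formula.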

SUBMITTER: Fridley BL 

PROVIDER: S-EPMC2941449 | biostudies-literature | 2010 Sep

REPOSITORIES: biostudies-literature


Publications

Self-contained gene-set analysis of expression data: an evaluation of existing and novel methods.

Fridley BL, Jenkins GD, Biernacka JM

PLoS One, 2010 Sep 17; 9



Similar Datasets

| S-EPMC3330217 | biostudies-literature
| S-EPMC2712751 | biostudies-literature
| S-EPMC3509490 | biostudies-literature
| S-EPMC2238724 | biostudies-literature
| S-EPMC4625461 | biostudies-literature
| S-EPMC2781749 | biostudies-literature
| S-EPMC5854612 | biostudies-literature
| S-EPMC3142525 | biostudies-literature
| S-EPMC5053608 | biostudies-literature
| S-EPMC1590054 | biostudies-literature