Dataset Information

An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies.


ABSTRACT: Next-generation sequencing data pose a severe curse of dimensionality, complicating traditional "single marker-single trait" analysis. We propose a two-stage combined p-value method for pathway analysis. The first stage is at the gene level, where we integrate effects within a gene using the Sequence Kernel Association Test (SKAT). The second stage is at the pathway level, where we perform a correlated Lancaster procedure to detect joint effects from multiple genes within a pathway. We show that the Lancaster procedure is optimal in Bahadur efficiency among all combined p-value methods. The Bahadur efficiency, [Formula: see text], compares sample sizes among different statistical tests when signals become sparse in sequencing data, i.e. ε → 0. The optimal Bahadur efficiency ensures that the Lancaster procedure asymptotically requires a minimal sample size to detect sparse signals ([Formula: see text]). The Lancaster procedure can also be applied to meta-analysis. Extensive empirical assessments of exome sequencing data show that the proposed method outperforms Gene Set Enrichment Analysis (GSEA). We applied the competitive Lancaster procedure to meta-analysis data generated by the Global Lipids Genetics Consortium to identify pathways significantly associated with high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, triglycerides, and total cholesterol.
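The pathway-level combination step can be illustrated with the classical (independent) Lancaster procedure, which maps each gene-level p-value to a chi-square quantile with its own degrees of freedom, sums the quantiles, and refers the sum to a chi-square distribution with the summed degrees of freedom. The sketch below is not the authors' implementation: it ignores the between-gene correlation that the correlated procedure in the abstract accounts for, it assumes the gene-level p-values (for example from SKAT) are already available, and, to stay within the Python standard library, it restricts the weights to positive even integers. The function names, the weights, and the example p-values are all hypothetical.

```python
import math


def chi2_sf_even(x, df):
    """Survival function P(X > x) of a chi-square variable with even df.

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only supports positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi2_isf_even(p, df):
    """Inverse survival function: the x with chi2_sf_even(x, df) == p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p-values must lie in (0, 1]")
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, df) > p:  # grow the bracket until it contains x
        hi *= 2.0
    for _ in range(200):  # plain bisection; the tail is decreasing in x
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def lancaster_combine(pvalues, weights):
    """Combine independent p-values with Lancaster's weighted method.

    weights[i] is the chi-square degrees of freedom assigned to pvalues[i]
    (assumed here to be a positive even integer).
    """
    if not pvalues or len(pvalues) != len(weights):
        raise ValueError("need one weight per p-value")
    stat = sum(chi2_isf_even(p, w) for p, w in zip(pvalues, weights))
    return chi2_sf_even(stat, sum(weights))


if __name__ == "__main__":
    # Made-up gene-level p-values and weights for one hypothetical pathway.
    gene_pvalues = [0.01, 0.20, 0.50]
    gene_weights = [4, 2, 2]
    print(lancaster_combine(gene_pvalues, gene_weights))
```

When every weight equals 2, this reduces to Fisher's combined p-value method, which gives a convenient sanity check. In practice a numerical library with a general chi-square quantile function would remove the even-weight restriction.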

SUBMITTER: Dai H 

PROVIDER: S-EPMC4933358 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature

Publications

An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies.

Dai Hongying, Wu Guodong, Wu Michael, Zhi Degui

PLoS One. 2016-07-05; issue 7


Similar Datasets

| S-EPMC11326560 | biostudies-literature
| S-EPMC9222514 | biostudies-literature
| S-EPMC10634611 | biostudies-literature
| S-EPMC7118831 | biostudies-literature
| S-EPMC9407540 | biostudies-literature
| S-EPMC3399813 | biostudies-literature
| S-EPMC3118166 | biostudies-literature
| S-EPMC10950455 | biostudies-literature
| S-EPMC7946427 | biostudies-literature
| S-EPMC3440237 | biostudies-literature