Dataset Information

Practical issues in screening and variable selection in genome-wide association analysis.


ABSTRACT: Variable selection methods play an important role in high-dimensional statistical modeling and analysis. Computational cost and estimation accuracy are the two main concerns for statistical inference from ultrahigh-dimensional data. In particular, genome-wide association studies (GWAS), which focus on identifying single nucleotide polymorphisms (SNPs) associated with a disease of interest, have produced ultrahigh-dimensional data. Numerous methods have been proposed to handle GWAS data. Most statistical methods have adopted a two-stage approach: pre-screening for dimensional reduction and variable selection to identify causal SNPs. The pre-screening step selects SNPs in terms of their P-values or the absolute values of the regression coefficients in single SNP analysis. Penalized regressions, such as the ridge, lasso, adaptive lasso, and elastic-net regressions, are commonly used for the variable selection step. In this paper, we investigate which combination of pre-screening method and penalized regression performs best on a quantitative phenotype using two real GWAS datasets.
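The two-stage approach described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the data are simulated, all function and variable names are hypothetical, genotypes are standardized before screening (an assumption), and the lasso is fitted by plain cyclic coordinate descent as one example of the penalized regressions listed above.

```python
import random

def standardize(col):
    """Center a column and scale it to unit (population) variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd if sd > 0 else 0.0 for v in col]

def marginal_screen(columns, y, keep):
    """Stage 1: keep the SNPs with the largest |slope| in single-SNP regressions."""
    n = len(y)
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    scores = []
    for j, col in enumerate(columns):
        z = standardize(col)
        # With unit-variance z, the least-squares slope is the mean of z * yc.
        slope = sum(zi * yi for zi, yi in zip(z, yc)) / n
        scores.append((abs(slope), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:keep])

def soft_threshold(x, lam):
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def lasso_cd(columns, y, lam, n_iter=200):
    """Stage 2: lasso on standardized columns via cyclic coordinate descent.

    Minimizes (1 / 2n) * ||y - mean(y) - Z b||^2 + lam * ||b||_1.
    """
    n = len(y)
    y_mean = sum(y) / n
    Z = [standardize(c) for c in columns]
    beta = [0.0] * len(Z)
    resid = [v - y_mean for v in y]
    for _ in range(n_iter):
        for j, z in enumerate(Z):
            # Correlation of z with the partial residual (SNP j's fit added back).
            rho = sum(zi * ri for zi, ri in zip(z, resid)) / n + beta[j]
            new = soft_threshold(rho, lam)
            if new != beta[j]:
                delta = new - beta[j]
                resid = [ri - delta * zi for ri, zi in zip(resid, z)]
                beta[j] = new
    return beta

# Simulated data (assumption): 300 samples, 200 SNPs coded 0/1/2,
# with SNPs 5 and 50 affecting a quantitative phenotype.
random.seed(0)
n, p, maf = 300, 200, 0.3
genotypes = [[(random.random() < maf) + (random.random() < maf) for _ in range(n)]
             for _ in range(p)]
phenotype = [2.0 * genotypes[5][i] - 1.5 * genotypes[50][i] + random.gauss(0, 1)
             for i in range(n)]

screened = marginal_screen(genotypes, phenotype, keep=20)
beta = lasso_cd([genotypes[j] for j in screened], phenotype, lam=0.3)
selected = [j for j, b in zip(screened, beta) if b != 0.0]
print("screened SNPs:", screened)
print("SNPs with nonzero lasso coefficient:", selected)
```

The screening size (`keep=20`) and penalty (`lam=0.3`) are arbitrary choices for the simulation; in practice both would be tuned, for example by cross-validation, and ranking by P-value instead of absolute coefficient only changes the score computed in `marginal_screen`.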

SUBMITTER: Hong S 

PROVIDER: S-EPMC4298256 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

Practical issues in screening and variable selection in genome-wide association analysis.

Hong Sungyeon, Kim Yongkang, Park Taesung

Cancer Informatics, 2014-01-01, Suppl 7


Similar Datasets

S-EPMC6863898 | biostudies-literature
S-EPMC3025714 | biostudies-literature
S-EPMC4715655 | biostudies-literature
S-EPMC6302495 | biostudies-other
S-EPMC4866522 | biostudies-literature
S-EPMC3172928 | biostudies-literature
S-EPMC3499564 | biostudies-literature
S-EPMC4736152 | biostudies-literature
S-EPMC4275962 | biostudies-literature
S-EPMC2585794 | biostudies-literature