Dataset Information

Elastic-net regularization approaches for genome-wide association studies of rheumatoid arthritis.


ABSTRACT: The current trend in genome-wide association studies is to identify regions where the true disease-causing genes may lie by evaluating thousands of single-nucleotide polymorphisms (SNPs) across the whole genome. However, detecting disease-causing genes among thousands of SNPs poses many challenges. Examples include multicollinearity and multiple testing, especially when a large number of correlated SNPs are tested simultaneously. Multicollinearity arises when predictor variables in a multiple regression model are highly correlated, and it can cause imprecise estimation of association. In this study, we propose a simple stepwise procedure that identifies disease-causing SNPs simultaneously by employing elastic-net regularization, a variable selection method that allows one to address multicollinearity. In Step 1, single-marker association analysis was conducted to screen SNPs. In Step 2, multiple-marker association was assessed using elastic-net regularization. The proposed approach was applied to the rheumatoid arthritis (RA) case-control data set of Genetic Analysis Workshop 16. While the SNPs selected at the screening step are located mostly on chromosome 6, the elastic-net approach identified an increased proportion of putative RA-related SNPs on other chromosomes. For some of those putative RA-related SNPs, we identified interactions with sex, a well-known factor affecting RA susceptibility.
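
The two-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the screening statistic (a trend test computed as n times the squared genotype-phenotype correlation), the squared-error loss on a 0/1 phenotype, the coordinate-descent solver, and all names and parameter values (`trend_pvalue`, `elastic_net`, `two_step`, `p_cut`, `lam`, `alpha`) are assumptions chosen for brevity.

```python
import math


def trend_pvalue(g, y):
    """Step 1 (assumed test): trend test, chi2 = n * r^2 with 1 df."""
    n = len(y)
    mg, my = sum(g) / n, sum(y) / n
    sgg = sum((a - mg) ** 2 for a in g)
    syy = sum((b - my) ** 2 for b in y)
    if sgg == 0 or syy == 0:
        return 1.0  # a constant column carries no association signal
    sgy = sum((a - mg) * (b - my) for a, b in zip(g, y))
    chi2 = n * sgy * sgy / (sgg * syy)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def standardize(col):
    """Center a column and scale it to unit (population) variance."""
    n = len(col)
    m = sum(col) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in col) / n)
    return [(v - m) / sd for v in col] if sd > 0 else [0.0] * n


def elastic_net(columns, y, lam, alpha=0.5, n_iter=200):
    """Step 2: coordinate descent for
    (1/2n)||y - Zb||^2 + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||_2^2),
    where Z holds the standardized SNP columns and y is centered."""
    n = len(y)
    Z = [standardize(c) for c in columns]
    my = sum(y) / n
    resid = [v - my for v in y]
    beta = [0.0] * len(Z)
    for _ in range(n_iter):
        for j, z in enumerate(Z):
            # Correlation of SNP j with the partial residual (SNP j removed).
            rho = sum(zi * (ri + zi * beta[j]) for zi, ri in zip(z, resid)) / n
            new = soft_threshold(rho, lam * alpha) / (1 + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [ri - zi * delta for zi, ri in zip(z, resid)]
                beta[j] = new
    return beta


def two_step(columns, y, p_cut=0.05, lam=0.05, alpha=0.5):
    """Screen SNPs one at a time, then fit the elastic net on the survivors.
    Returns {SNP index: coefficient} for SNPs with nonzero coefficients."""
    kept = [j for j, g in enumerate(columns) if trend_pvalue(g, y) < p_cut]
    beta = elastic_net([columns[j] for j in kept], y, lam, alpha)
    return {j: b for j, b in zip(kept, beta) if b != 0.0}


# Toy data (hypothetical): SNPs 0 and 1 are perfectly correlated with each
# other and with the phenotype; SNP 2 is unrelated to the phenotype.
phenotype = [0, 1] * 10
snps = [[0, 1] * 10, [0, 2] * 10, [0, 0, 1, 1] * 5]
selected = two_step(snps, phenotype)
print(sorted(selected))
```

Because of the ridge part of the penalty, the two perfectly correlated SNPs in the toy data share the signal rather than one being dropped arbitrarily, which is the property that makes the elastic net suited to correlated markers. For real case-control data, a logistic loss and cross-validated choices of `lam` and `alpha` would be the more natural design.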

SUBMITTER: Cho S 

PROVIDER: S-EPMC2795922 | biostudies-literature | 2009 Dec

REPOSITORIES: biostudies-literature

Publications

Elastic-net regularization approaches for genome-wide association studies of rheumatoid arthritis.

Seoae Cho, Haseong Kim, Sohee Oh, Kyunga Kim, Taesung Park

BMC Proceedings, 2009-12-15


Similar Datasets

| S-EPMC3850240 | biostudies-literature
| S-EPMC5005471 | biostudies-literature
| S-EPMC3856324 | biostudies-literature
| S-EPMC3118953 | biostudies-literature
| S-EPMC2891422 | biostudies-other