Dataset Information

Significance testing in ridge regression for genetic data.

ABSTRACT:

Background

Technological developments have increased the feasibility of large-scale genetic association studies. Densely typed genetic markers are obtained using SNP arrays, next-generation sequencing technologies and imputation. However, SNPs typed using these methods can be highly correlated because of linkage disequilibrium, and standard multiple regression techniques fail with these data sets because of their high dimensionality and correlation structure. There has been increasing interest in using penalised regression in the analysis of high-dimensional data. Ridge regression is one such penalised regression technique. It does not perform variable selection; instead, it estimates a regression coefficient for each predictor variable. It is therefore desirable to obtain an estimate of the significance of each ridge regression coefficient.

Results

We develop and evaluate a test of significance for ridge regression coefficients. Using simulation studies, we demonstrate that the performance of the test is comparable to that of a permutation test, with the advantage of a much-reduced computational cost. We introduce the p-value trace, a plot of the negative logarithm of the p-values of ridge regression coefficients with increasing shrinkage parameter, which enables the visualisation of the change in p-value of the regression coefficients with increasing penalisation. We apply the proposed method to a lung cancer case-control data set from EPIC, the European Prospective Investigation into Cancer and Nutrition.
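
The idea of a coefficient-wise test and a p-value trace can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear (not logistic) ridge model, a Wald-type statistic (coefficient divided by its estimated standard error), a normal reference distribution, and a residual variance estimate that uses n minus the trace of the hat matrix as degrees of freedom. The function names `ridge_wald_pvalues` and `pvalue_trace` are hypothetical.

```python
import math
import numpy as np


def ridge_wald_pvalues(X, y, k):
    """Ridge coefficients and approximate two-sided p-values (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardise predictors and centre the response so no intercept is needed.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    y = y - y.mean()
    n, p = X.shape

    XtX = X.T @ X
    A = np.linalg.inv(XtX + k * np.eye(p))  # (X'X + kI)^{-1}
    beta = A @ X.T @ y                      # ridge estimate

    # Assumption: residual variance with effective degrees of freedom n - tr(H).
    H = X @ A @ X.T
    resid = y - X @ beta
    sigma2 = (resid @ resid) / (n - np.trace(H))

    # Covariance of the ridge estimator: sigma^2 * A X'X A.
    cov = sigma2 * (A @ XtX @ A)
    z = beta / np.sqrt(np.diag(cov))

    # Two-sided p-value under a standard normal reference distribution.
    pvals = np.array([math.erfc(abs(t) / math.sqrt(2.0)) for t in z])
    return beta, pvals


def pvalue_trace(X, y, ks):
    """Rows: shrinkage parameters in ks; columns: -log10 p-value per predictor."""
    return np.array([-np.log10(ridge_wald_pvalues(X, y, k)[1]) for k in ks])
```

Plotting each column of the array returned by `pvalue_trace` against the shrinkage parameter gives one line per predictor, which is the kind of plot the p-value trace describes. The base of the logarithm (10 here) is also an assumption.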

Conclusions

The proposed test is a useful alternative to a permutation test for the estimation of the significance of ridge regression coefficients, at a much-reduced computational cost. The p-value trace is an informative graphical tool for evaluating the results of a test of significance of ridge regression coefficients as the shrinkage parameter increases, and the proposed test makes its production computationally feasible.
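
To make the computational-cost comparison concrete, a permutation test for ridge coefficients can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure used in the paper: it assumes a linear ridge model, permutes the response to generate the null distribution, and compares absolute coefficients. The names `ridge_coefs` and `permutation_pvalues` are hypothetical.

```python
import numpy as np


def ridge_coefs(X, y, k):
    """Ridge estimate (X'X + kI)^{-1} X'y for already standardised data."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)


def permutation_pvalues(X, y, k, n_perm=1000, seed=0):
    """Permutation p-value for each ridge coefficient (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    y = y - y.mean()

    observed = np.abs(ridge_coefs(X, y, k))
    exceed = np.zeros(X.shape[1])
    for _ in range(n_perm):
        # Assumption: shuffling the response breaks any association with X.
        perm_beta = ridge_coefs(X, rng.permutation(y), k)
        exceed += np.abs(perm_beta) >= observed

    # Add-one correction keeps p-values strictly positive.
    return (exceed + 1.0) / (n_perm + 1.0)
```

Each call refits the model `n_perm` times for a single shrinkage parameter, so producing a trace over many shrinkage values multiplies that cost again. A closed-form test needs only one fit per shrinkage value.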

SUBMITTER: Cule E

PROVIDER: S-EPMC3228544 | biostudies-literature | 2011 Sep

REPOSITORIES: biostudies-literature

Publications

Significance testing in ridge regression for genetic data.

Cule E, Vineis P, De Iorio M

BMC Bioinformatics, 2011-09-19

PMID: 21929786

Similar Datasets

Ridge regression and its applications in genetic studies.
| S-EPMC8031387 | biostudies-literature

Fractional ridge regression: a fast, interpretable reparameterization of ridge regression.
| S-EPMC7702219 | biostudies-literature

Accommodating linkage disequilibrium in genetic-association analyses via ridge regression.
| S-EPMC2427310 | biostudies-literature

Variable selection for recurrent event data with broken adaptive ridge regression.
| S-EPMC7523880 | biostudies-literature

Bayesian regression for group testing data.
| S-EPMC5638690 | biostudies-literature

Eigenvalue significance testing for genetic association.
| S-EPMC6069632 | biostudies-literature

Identification of correlated genetic variants jointly associated with rheumatoid arthritis using ridge regression.
| S-EPMC2795968 | biostudies-literature

Ridge regression in prediction problems: automatic choice of the ridge parameter.
| S-EPMC4377081 | biostudies-literature

Generalized additive regression for group testing data.
| S-EPMC8511943 | biostudies-literature

Feature-space selection with banded ridge regression.
| S-EPMC9807218 | biostudies-literature