Dataset Information

Efficient ℓ₀-norm feature selection based on augmented and penalized minimization.

ABSTRACT: Advances in high-throughput technologies in genomics and imaging yield unprecedentedly large numbers of prognostic biomarkers. To accommodate the scale of biomarkers and study their association with disease outcomes, penalized regression is often used to identify important biomarkers. The ideal variable selection procedure would search for the best subset of predictors, which is equivalent to imposing an ℓ₀-penalty on the regression coefficients. Since this optimization is a nondeterministic polynomial-time hard (NP-hard) problem that does not scale with the number of biomarkers, alternative methods mostly place smooth penalties on the regression parameters, which lead to computationally feasible optimization problems. However, empirical studies and theoretical analyses show that convex approximations of the ℓ₀-norm (e.g., ℓ₁) do not outperform their ℓ₀ counterpart. Progress on ℓ₀-norm feature selection has been slower; the main methods are greedy algorithms such as stepwise regression or orthogonal matching pursuit. Penalized regression based on regularizing the ℓ₀-norm remains much less explored in the literature. In this work, inspired by the recently popular augmenting and data splitting algorithms, including the alternating direction method of multipliers, we propose a 2-stage procedure for ℓ₀-penalty variable selection, referred to as augmented penalized minimization-L₀ (APM-L₀). APM-L₀ targets the ℓ₀-norm as closely as possible while keeping computation tractable, efficient, and simple, which is achieved by iterating between a convex regularized regression and a simple hard-thresholding estimation. The procedure can be viewed as arising from regularized optimization with a truncated ℓ₁ norm. We therefore treat the regularization parameter and the thresholding parameter as tuning parameters and select them by cross-validation. A 1-step coordinate descent algorithm is used in the first stage to significantly improve computational efficiency. Through extensive simulation studies and a real data application, we demonstrate superior performance of the proposed method in terms of selection accuracy and computational speed as compared to existing methods. The proposed APM-L₀ procedure is implemented in the R package APML0.
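
The two building blocks named in the abstract, a convex (ℓ₁) regularized regression followed by hard thresholding, can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm and not the APML0 package interface: it runs a single pass rather than iterating, uses a plain cyclic coordinate descent lasso, and fixes the tuning parameters by hand instead of by cross-validation. All names (`lasso_cd`, `hard_threshold_top_k`, `lam`, `k`) and the simulated data are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200):
    """Minimize (1/2n)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()          # residual y - X @ beta
    col_sq = (X ** 2).sum(axis=0) / n       # per-column curvature
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Partial correlation of column j with the residual, adding back its own fit
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * (new - beta[j])
            beta[j] = new
    return beta

def hard_threshold_top_k(beta, k):
    """Keep the k largest coefficients in absolute value; set the rest to zero."""
    out = np.zeros_like(beta)
    if k <= 0:
        return out
    keep = np.argsort(np.abs(beta))[-k:]
    out[keep] = beta[keep]
    return out

# Simulated data (hypothetical): 3 true signals among 20 candidate biomarkers.
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [3.0, -2.0, 1.5]
y = X @ true_beta + 0.1 * rng.standard_normal(n)

beta_l1 = lasso_cd(X, y, lam=0.05)               # convex regularized step
beta_l0 = hard_threshold_top_k(beta_l1, k=3)     # hard-thresholding step
selected = sorted(np.flatnonzero(beta_l0).tolist())
print("selected features:", selected)
```

In this sketch `lam` plays the role of the regularization parameter and `k` the role of the thresholding parameter; the abstract states that both are chosen by cross-validation in the proposed method.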

SUBMITTER: Li X

PROVIDER: S-EPMC5768461 | biostudies-literature | 2018 Feb

REPOSITORIES: biostudies-literature

Publications

Efficient ℓ₀-norm feature selection based on augmented and penalized minimization.

Li X, Xie S, Zeng D, Wang Y

Statistics in Medicine, 2017-10-30, issue 3

PMID: 29082539

Similar Datasets

MIP-BOOST: Efficient and Effective L₀ Feature Selection for Linear Regression.
| S-EPMC9673824 | biostudies-literature

Sparse Index Clones via the sorted ℓ₁-Norm.
| S-EPMC9031478 | biostudies-literature

Unsupervised feature selection algorithm based on L2,p-norm feature reconstruction.
| S-EPMC11875355 | biostudies-literature

Sparse Canonical Correlation Analysis via Truncated ℓ₁-norm with Application to Brain Imaging Genetics.
| S-EPMC5627624 | biostudies-literature

Fast, Exact Model Selection and Permutation Testing for ℓ₂-Regularized Logistic Regression.
| S-EPMC3875235 | biostudies-literature

Efficient Regularized Regression with L₀ Penalty for Variable Selection and Network Construction.
| S-EPMC5098106 | biostudies-literature

A surrogate ℓ₀ sparse Cox's regression with applications to sparse high-dimensional massive sample size time-to-event data.
| S-EPMC8386178 | biostudies-literature

Efficient cross-validation traversals in feature subset selection.
| S-EPMC9744898 | biostudies-literature

Efficient feature selection and classification for microarray data.
| S-EPMC6101392 | biostudies-literature

Kruskal-Wallis-based computationally efficient feature selection for face recognition.
| S-EPMC4054616 | biostudies-other
