Dataset Information

Efficient Regularized Regression with L0 Penalty for Variable Selection and Network Construction.

ABSTRACT: Variable selections for regression with high-dimensional big data have found many applications in bioinformatics and computational biology. One appealing approach is the L0 regularized regression which penalizes the number of nonzero features in the model directly. However, it is well known that L0 optimization is NP-hard and computationally challenging. In this paper, we propose efficient EM (L0EM) and dual L0EM (DL0EM) algorithms that directly approximate the L0 optimization problem. While L0EM is efficient with large sample size, DL0EM is efficient with high-dimensional (n ≪ m) data. They also provide a natural solution to all Lp, p ∈ [0,2], problems, including lasso with p = 1 and elastic net with p ∈ [1,2]. The regularized parameter λ can be determined through cross validation or AIC and BIC. We demonstrate our methods through simulation and high-dimensional genomic data. The results indicate that L0 has better performance than lasso, SCAD, and MC+, and L0 with AIC or BIC has similar performance as computationally intensive cross validation. The proposed algorithms are efficient in identifying the nonzero variables with less bias and constructing biologically important networks with high-dimensional big data.
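To make the idea in the abstract concrete, here is a minimal sketch of one common way to approximate an L0 penalty: an iteratively reweighted ridge update, in a primal form (an m-by-m solve) and an algebraically equivalent dual form (an n-by-n solve, cheaper when n is much smaller than m). This record does not give the update equations, so the formulas, the function names `reweighted_step` and `l0_reweighted_ridge`, the ridge initialization, and the stopping rule are all assumptions made for illustration. The sketch should not be read as the authors' exact L0EM or DL0EM algorithms.

```python
import numpy as np


def reweighted_step(X, y, eta, lam, dual=False):
    """One reweighted ridge update (illustrative assumption, not the paper's code).

    Minimizes 0.5*||y - X b||^2 + 0.5*lam*sum(b_j^2 / eta_j^2), written so that
    it stays well defined when some eta_j are exactly zero.
    """
    n, m = X.shape
    w = eta ** 2  # diagonal of W = diag(eta^2)
    if dual:
        # b = W X^T (X W X^T + lam I_n)^{-1} y : n-by-n system
        XW = X * w  # scales column j of X by w_j, i.e. X @ diag(w)
        return w * (X.T @ np.linalg.solve(XW @ X.T + lam * np.eye(n), y))
    # b = W (X^T X W + lam I_m)^{-1} X^T y : m-by-m system
    A = (X.T @ X) * w + lam * np.eye(m)
    return w * np.linalg.solve(A, X.T @ y)


def l0_reweighted_ridge(X, y, lam, n_iter=100, tol=1e-10, dual=False):
    """Iterate the update from a ridge start until the coefficients stop moving."""
    n, m = X.shape
    # Ridge initialization (an assumption; other starting points are possible).
    if dual:
        eta = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    else:
        eta = np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)
    for _ in range(n_iter):
        beta = reweighted_step(X, y, eta, lam, dual=dual)
        converged = np.max(np.abs(beta - eta)) < tol
        eta = beta
        if converged:
            break
    return eta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 20))
    true_beta = np.zeros(20)
    true_beta[:3] = [3.0, -2.0, 1.5]
    y = X @ true_beta + 0.1 * rng.standard_normal(100)
    beta = l0_reweighted_ridge(X, y, lam=1.0)
    print("selected variables:", np.flatnonzero(np.abs(beta) > 1e-6))
```

At a fixed point each ratio b_j^2 / eta_j^2 is 1 for a nonzero coefficient and 0 otherwise, so the penalty behaves like a count of nonzero coefficients. The value of `lam` would in practice be chosen by cross validation or by AIC or BIC, as the abstract describes.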

SUBMITTER: Liu Z

PROVIDER: S-EPMC5098106 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature

Publications

Efficient Regularized Regression with L0 Penalty for Variable Selection and Network Construction.

Liu Zhenqiu, Li Gang

Computational and Mathematical Methods in Medicine, 2016-10-24

PMID: 27843486

Similar Datasets

Multilevel regularized regression for simultaneous taxa selection and network construction with metagenomic count data.
| S-EPMC4382903 | biostudies-other

Stable variable ranking and selection in regularized logistic regression for severely imbalanced big binary data
| S-EPMC9844919 | biostudies-literature

Improving stability and understandability of genotype-phenotype mapping in Saccharomyces using regularized variable selection in L-PLS regression.
| S-EPMC3598729 | biostudies-literature

Variable selection in ROC regression.
| S-EPMC3838845 | biostudies-other

Joint Bayesian variable and graph selection for regression models with network-structured predictors.
| S-EPMC4775388 | biostudies-literature

Weighted Graph Regularized Sparse Brain Network Construction for MCI Identification.
| S-EPMC6774646 | biostudies-literature

Variable Selection in Function-on-Scalar Regression.
| S-EPMC4943585 | biostudies-literature

Bayesian Approximate Kernel Regression with Variable Selection.
| S-EPMC6383716 | biostudies-literature

Fast, Exact Model Selection and Permutation Testing for ℓ2-Regularized Logistic Regression.
| S-EPMC3875235 | biostudies-literature

Network-Regularized High-Dimensional Cox Regression for Analysis of Genomic Data.
| S-EPMC4549005 | biostudies-literature