Dataset Information

Sparse principal component analysis by choice of norm.


ABSTRACT: Recent years have seen the development of several methods for sparse principal component analysis, owing to its importance in the analysis of high-dimensional data. Although these methods have proved useful in practical applications, they are limited by the lack of orthogonality in the loadings (coefficients) of different principal components, the existence of correlation among the principal components, the expensive computation needed, and the lack of theoretical results such as consistency in high-dimensional situations. In this paper, we propose a new sparse principal component analysis method that introduces a new norm to replace the usual norm in traditional eigenvalue problems, and we propose an efficient iterative algorithm to solve the resulting optimization problems. With this method, we can efficiently obtain uncorrelated principal components or orthogonal loadings, and achieve the goal of explaining a high percentage of the variation with sparse linear combinations. Due to the strict convexity of the new norm, we can prove the convergence of the iterative method and provide a detailed characterization of the limits. We also prove that the obtained principal component is consistent for a single-component model in high-dimensional situations. As an illustration, we apply this method to real gene expression data with competitive results.
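
To make the idea of "explaining a high percentage of the variation with sparse linear combinations" concrete, here is a minimal sketch in Python. It is not the norm-based algorithm of the paper, whose norm and update rule are not given in this record. It swaps in a generic soft-thresholded power iteration as a stand-in, and the function names, the `threshold` parameter, and the stopping rule are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink each entry of x toward zero by t; entries smaller than t become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def sparse_leading_loading(cov, threshold, n_iter=200, tol=1e-10):
    """Return one sparse, unit-length loading vector for the covariance `cov`.

    Generic soft-thresholded power iteration (an illustrative stand-in,
    NOT the method proposed in the paper).
    """
    # Start from the ordinary (dense) leading eigenvector of the covariance.
    _, eigvecs = np.linalg.eigh(cov)
    v = eigvecs[:, -1]
    for _ in range(n_iter):
        # Power step, then shrinkage to zero out small loadings.
        w = soft_threshold(cov @ v, threshold)
        norm = np.linalg.norm(w)
        if norm == 0.0:
            raise ValueError("threshold too large: every loading was shrunk to zero")
        w = w / norm
        converged = np.linalg.norm(w - v) < tol
        v = w
        if converged:
            break
    return v


def explained_variance_ratio(cov, v):
    """Fraction of total variance captured by the unit-length loading v."""
    return float(v @ cov @ v / np.trace(cov))
```

A call such as `sparse_leading_loading(cov, threshold=0.5)` returns a loading vector in which small coefficients are exactly zero, and `explained_variance_ratio(cov, v)` reports how much of the total variance that sparse combination retains. Larger values of the hypothetical `threshold` trade explained variance for sparsity.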

SUBMITTER: Qi X 

PROVIDER: S-EPMC3601508 | biostudies-literature | 2013 Feb

REPOSITORIES: biostudies-literature

Publications

Sparse principal component analysis by choice of norm.

Qi Xin, Luo Ruiyan, Zhao Hongyu

Journal of Multivariate Analysis, 2012-07-16

Similar Datasets

| S-EPMC3746759 | biostudies-literature
| S-EPMC4032817 | biostudies-other
| S-EPMC4394907 | biostudies-literature
| S-EPMC7449232 | biostudies-literature
| S-EPMC5912177 | biostudies-literature
| S-EPMC3215429 | biostudies-literature
| S-EPMC4510534 | biostudies-literature
| S-EPMC2902448 | biostudies-literature
2011-08-15 | GSE31375 | GEO
| S-EPMC2835171 | biostudies-literature