Dataset Information

Promoting Similarity of Sparsity Structures in Integrative Analysis with Penalization.


ABSTRACT: For data with high-dimensional covariates but small sample sizes, the analysis of single datasets often generates unsatisfactory results. The integrative analysis of multiple independent datasets provides an effective way of pooling information and outperforms single-dataset and several alternative multi-dataset methods. In many scenarios, multiple datasets are expected to share common important covariates; that is, the corresponding models have similar sparsity structures. However, existing methods have no mechanism to promote similarity in sparsity structures in integrative analysis. In this study, we consider penalized variable selection and estimation in integrative analysis. We develop an L0-penalty-based method that explicitly promotes similarity in sparsity structures. Computationally, it is realized using a coordinate descent algorithm. Theoretically, it has the selection and estimation consistency properties. Under a wide spectrum of simulation scenarios, it has identification and estimation performance comparable to or better than the alternatives. In the analysis of three lung cancer datasets with gene expression measurements, it identifies genes with sound biological implications and achieves satisfactory prediction performance.
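As a rough illustration of the ingredients the abstract names (an L0 penalty, coordinate descent, and a sparsity structure), the sketch below runs cyclic coordinate descent on a single dataset with a plain L0 penalty. It is not the authors' estimator: the similarity-promoting term across datasets is omitted, and the function names, the least-squares loss, and the tuning value `lam` are assumptions made only for this example.

```python
def l0_coordinate_descent(X, y, lam, n_sweeps=100, tol=1e-10):
    """Cyclic coordinate descent for
        (1 / (2n)) * ||y - X b||^2 + lam * #{j : b_j != 0}.

    X is a list of n rows with p floats each; y is a list of n floats.
    Single-dataset illustration only (hypothetical helper, not the paper's method).
    """
    n, p = len(X), len(X[0])
    # Per-column scale (1/n) * sum_i x_ij^2, used in the univariate update.
    scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if scale[j] == 0.0:
                continue  # an all-zero column carries no information
            # Least-squares value of b_j with all other coordinates held fixed.
            z = sum(X[i][j] * resid[i] for i in range(n)) / n / scale[j] + beta[j]
            # Keeping z lowers the loss by scale[j] * z^2 / 2 relative to zero;
            # keep it only when that gain exceeds the L0 cost lam.
            new = z if 0.5 * scale[j] * z * z > lam else 0.0
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


def sparsity_structure(beta):
    """Indices of the nonzero coefficients, i.e. the selected covariates."""
    return {j for j, b in enumerate(beta) if b != 0.0}


# Toy data with orthogonal columns: y = 2 * column 0 + 0.1 * column 2.
X = [[1.0, 1.0, 1.0],
     [1.0, -1.0, -1.0],
     [-1.0, 1.0, -1.0],
     [-1.0, -1.0, 1.0]]
y = [2.1, 1.9, -2.1, -1.9]
beta_hat = l0_coordinate_descent(X, y, lam=0.1)
selected = sparsity_structure(beta_hat)
```

With this toy design, the strong signal in column 0 is retained while the weak signal in column 2 is thresholded to zero, so the estimated sparsity structure contains only the first covariate. In an integrative analysis, such structures would be compared across the datasets' models.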

SUBMITTER: Huang Y 

PROVIDER: S-EPMC6086364 | biostudies-literature | 2017

REPOSITORIES: biostudies-literature

Publications

Promoting Similarity of Sparsity Structures in Integrative Analysis with Penalization.

Yuan Huang, Qingzhao Zhang, Sanguo Zhang, Jian Huang, Shuangge Ma

Journal of the American Statistical Association, 2017-05-03, 517

Similar Datasets

| S-EPMC3933169 | biostudies-literature
| S-EPMC4355402 | biostudies-literature
| S-EPMC4015993 | biostudies-literature
| S-EPMC4560757 | biostudies-literature
| S-EPMC8653861 | biostudies-literature
| S-EPMC8647179 | biostudies-literature
| S-EPMC5931565 | biostudies-literature
| S-EPMC6988016 | biostudies-literature
| S-EPMC6157158 | biostudies-literature
| S-EPMC3804306 | biostudies-literature