Dataset Information

A fast divide-and-conquer sparse Cox regression.


ABSTRACT: We propose a computationally and statistically efficient divide-and-conquer (DAC) algorithm to fit sparse Cox regression to massive datasets where the sample size $n_0$ is exceedingly large and the covariate dimension $p$ is not small but $n_0\gg p$. The proposed algorithm achieves computational efficiency through a one-step linear approximation followed by a least squares approximation to the partial likelihood (PL). This sequence of linearizations enables us to maximize the PL using only a small subset of the data and to perform penalized estimation via a fast approximation to the PL. The algorithm is applicable to the analysis of both time-independent and time-dependent survival data. Simulations suggest that the proposed DAC algorithm substantially outperforms the full sample-based estimators and the existing DAC algorithm in computational speed, while achieving statistical efficiency similar to that of the full sample-based estimators. The proposed algorithm was applied to extraordinarily large survival datasets for the prediction of heart failure-specific readmission within 30 days among Medicare heart failure patients.
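
The three steps named in the abstract (maximizing the PL on a small subset, a one-step linear update, and a least squares approximation for penalized estimation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes time-independent covariates, no tied event times, and an adaptive-lasso penalty solved by coordinate descent; the abstract only says "penalized", so the penalty is an assumption. All function names (`cox_score_info`, `fit_subset`, `dac_one_step`, `lsa_adaptive_lasso`, `simulate`) are hypothetical.

```python
import numpy as np

def cox_score_info(X, time, event, beta):
    """Score and observed information of the Cox log PL (assumes no tied times)."""
    order = np.argsort(-time)  # latest time first, so cumulative sums run over risk sets
    X, event = X[order], event[order].astype(bool)
    w = np.exp(X @ beta)
    s0 = np.cumsum(w)
    s1 = np.cumsum(w[:, None] * X, axis=0)
    s2 = np.cumsum(w[:, None, None] * X[:, :, None] * X[:, None, :], axis=0)
    xbar = s1 / s0[:, None]  # risk-set weighted covariate means
    score = (X[event] - xbar[event]).sum(axis=0)
    outer = xbar[event][:, :, None] * xbar[event][:, None, :]
    info = (s2[event] / s0[event][:, None, None] - outer).sum(axis=0)
    return score, info

def fit_subset(X, time, event, n_iter=25):
    """Initial estimate: plain Newton iterations on one small subset only."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        score, info = cox_score_info(X, time, event, beta)
        beta = beta + np.linalg.solve(info, score)
    return beta

def dac_one_step(blocks, beta_init):
    """One Newton step from beta_init using scores and information summed over blocks."""
    p = len(beta_init)
    score_sum, info_sum = np.zeros(p), np.zeros((p, p))
    for X, time, event in blocks:
        score, info = cox_score_info(X, time, event, beta_init)
        score_sum += score
        info_sum += info
    return beta_init + np.linalg.solve(info_sum, score_sum), info_sum

def lsa_adaptive_lasso(beta_tilde, info, lam, n_sweeps=200):
    """Minimize 0.5 (b - bt)' A (b - bt) + lam * sum_j |b_j| / |bt_j| by coordinate descent.

    The adaptive-lasso penalty is an assumption made for this sketch.
    """
    weights = 1.0 / np.abs(beta_tilde)
    b = beta_tilde.copy()
    for _ in range(n_sweeps):
        for j in range(len(b)):
            z = info[j] @ (beta_tilde - b) + info[j, j] * b[j]
            b[j] = np.sign(z) * max(abs(z) - lam * weights[j], 0.0) / info[j, j]
    return b

def simulate(n, beta, rng):
    """Toy survival data: exponential event times with independent censoring."""
    X = rng.standard_normal((n, len(beta)))
    t = rng.exponential(1.0, n) / np.exp(X @ beta)
    c = rng.exponential(2.0, n)
    return X, np.minimum(t, c), t <= c
```

A hypothetical run would split the simulated data into blocks, call `fit_subset` on the first block, pass the result to `dac_one_step`, and then call `lsa_adaptive_lasso` on the returned estimate and summed information matrix. The quadratic form in the last step depends on the data only through a $p$-vector and a $p\times p$ matrix, so the penalized fit never revisits the $n_0$ observations. In practice the tuning parameter `lam` would be chosen by a criterion such as BIC or cross-validation rather than fixed by hand.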

SUBMITTER: Wang Y 

PROVIDER: S-EPMC8036003 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC7446356 | biostudies-literature
| S-EPMC1952108 | biostudies-literature
| S-EPMC2853773 | biostudies-literature
| S-EPMC7727519 | biostudies-literature
| S-EPMC5679152 | biostudies-literature
| S-EPMC5860120 | biostudies-literature
| S-EPMC3936251 | biostudies-literature
| S-EPMC5812974 | biostudies-literature
| S-EPMC6642500 | biostudies-literature
| S-EPMC3236195 | biostudies-literature