
Dataset Information


Statistical Inference for High-Dimensional Generalized Linear Models with Binary Outcomes.


ABSTRACT: This paper develops a unified statistical inference framework for high-dimensional binary generalized linear models (GLMs) with general link functions. Both unknown and known design distribution settings are considered. A two-step weighted bias-correction method is proposed for constructing confidence intervals and simultaneous hypothesis tests for individual components of the regression vector. A minimax lower bound for the expected length is established, and the proposed confidence intervals are shown to be rate-optimal up to a logarithmic factor. The numerical performance of the proposed procedure is demonstrated through simulation studies and an analysis of a single-cell RNA-seq data set, which yields interesting biological insights that integrate well into the current literature on cellular immune response mechanisms as characterized by single-cell transcriptomics. The theoretical analysis provides important insights into the adaptivity of optimal confidence intervals with respect to the sparsity of the regression vector. New lower bound techniques are introduced, and they may be of independent interest for solving other inference problems in high-dimensional binary GLMs.

SUBMITTER: Cai TT 

PROVIDER: S-EPMC10292730 | biostudies-literature | 2023

REPOSITORIES: biostudies-literature


Publications

Statistical Inference for High-Dimensional Generalized Linear Models with Binary Outcomes.

Cai TT, Guo Z, Ma R

Journal of the American Statistical Association, 2021-12-09, issue 542



Similar Datasets

S-EPMC9427730 | biostudies-literature
S-EPMC8442657 | biostudies-literature
S-EPMC6750760 | biostudies-literature
S-EPMC10982637 | biostudies-literature
S-EPMC9933885 | biostudies-literature
S-EPMC2883299 | biostudies-literature
S-EPMC3866838 | biostudies-literature
S-EPMC6032567 | biostudies-literature
S-EPMC7425805 | biostudies-literature
S-EPMC6431156 | biostudies-literature