Dataset Information

Bayesian variable selection for binary outcomes in high-dimensional genomic studies using non-local priors.


ABSTRACT:

Motivation

The advent of new genomic technologies has resulted in the production of massive data sets. Analyses of these data require new statistical and computational methods. In this article, we propose one such method that is useful in selecting explanatory variables for prediction of a binary response. Although this problem has recently been addressed using penalized likelihood methods, we adopt a Bayesian approach that utilizes a mixture of non-local prior densities and point masses on the binary regression coefficient vectors.

Results

The resulting method, which we call iMOMLogit, provides improved performance in identifying true models and reducing estimation and prediction error in a number of simulation studies. More importantly, its application to several genomic datasets produces predictions that have high accuracy using far fewer explanatory variables than competing methods. We also describe a novel approach for setting prior hyperparameters by examining the total variation distance between the prior distributions on the regression parameters and the distribution of the maximum likelihood estimator under the null distribution. Finally, we describe a computational algorithm that can be used to implement iMOMLogit in ultrahigh-dimensional settings ([Formula: see text]) and provide diagnostics to assess the probability that this algorithm has identified the highest posterior probability model.
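What "non-local" means for the prior densities named in the abstract can be illustrated with a short sketch. This record does not give the prior's formula, so the code assumes the inverse-moment (iMOM) form commonly used for non-local priors, with density proportional to |beta|^-(r+1) * exp(-tau / beta^2). The function names and the hyperparameter values `tau=0.5` and `r=1.0` are illustrative assumptions, not the settings chosen by the authors. The point is only that such a density is exactly zero at beta = 0, whereas a conventional (local) normal prior is at its maximum there.

```python
import math


def imom_density(beta, tau=0.5, r=1.0):
    """Assumed inverse-moment (iMOM) density; tau and r are illustrative.

    f(beta) = tau^(r/2) / Gamma(r/2) * |beta|^-(r+1) * exp(-tau / beta^2)
    The density vanishes at beta == 0, which is what makes it non-local.
    """
    if beta == 0.0:
        return 0.0
    log_norm = 0.5 * r * math.log(tau) - math.lgamma(0.5 * r)
    # Dividing twice avoids forming beta * beta, which can underflow to zero.
    log_kernel = -(r + 1.0) * math.log(abs(beta)) - (tau / beta) / beta
    return math.exp(log_norm + log_kernel)


def normal_density(beta, sd=1.0):
    """A local prior for comparison: positive (and maximal) at beta == 0."""
    return math.exp(-0.5 * (beta / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


# Compare the two priors near and away from zero.
for b in (0.0, 0.05, 0.5, 1.0, 3.0):
    print(f"beta={b:4.2f}  iMOM={imom_density(b):.4e}  normal={normal_density(b):.4e}")
```

Under this assumed form, coefficients close to zero receive almost no prior density. That is one intuition for why a mixture of such priors with point masses at zero can favour models with fewer explanatory variables.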

Availability and implementation

Software to implement this method can be downloaded at: http://www.stat.tamu.edu/~amir/code.html

Contact

wwang7@mdanderson.org or vjohnson@stat.tamu.edu

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Nikooienejad A 

PROVIDER: S-EPMC4848399 | biostudies-literature | 2016 May

REPOSITORIES: biostudies-literature

Publications

Bayesian variable selection for binary outcomes in high-dimensional genomic studies using non-local priors.

Amir Nikooienejad, Wenyi Wang, Valen E. Johnson

Bioinformatics (Oxford, England), 2016-01-06, issue 9


Similar Datasets

S-EPMC4275962 | biostudies-literature
S-EPMC7487595 | biostudies-literature
S-EPMC5025605 | biostudies-literature
S-EPMC8036799 | biostudies-literature
S-EPMC6222001 | biostudies-literature
S-EPMC7486995 | biostudies-literature
S-EPMC3700986 | biostudies-literature
S-EPMC5478010 | biostudies-literature
S-EPMC5885321 | biostudies-literature
S-EPMC3587767 | biostudies-literature