
Dataset Information


High-dimensional variable selection for ordinal outcomes with error control.


ABSTRACT: Many high-throughput genomic applications involve a large set of potential covariates and a response which is frequently measured on an ordinal scale, and it is crucial to identify which variables are truly associated with the response. Effectively controlling the false discovery rate (FDR) without sacrificing power has been a major challenge in variable selection research. This study reviews two existing variable selection frameworks, model-X knockoffs and a modified version of reference distribution variable selection (RDVS), both of which utilize artificial variables as benchmarks for decision making. Model-X knockoffs constructs a 'knockoff' variable for each covariate to mimic the covariance structure, while RDVS generates only one null variable and forms a reference distribution by performing multiple runs of model fitting. Herein, we describe how different importance measures for ordinal responses can be constructed that fit into these two selection frameworks, using either penalized regression or machine learning techniques. We compared these measures in terms of the FDR and power using simulated data. Moreover, we applied these two frameworks to high-throughput methylation data for identifying features associated with the progression from normal liver tissue to hepatocellular carcinoma to further compare and contrast their performances.
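The knockoff selection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each covariate j already has a statistic W_j, taken here as the importance of the original variable minus the importance of its knockoff, and applies the standard knockoff+ threshold for a target FDR level q. The function names and the values in `w` are hypothetical and made up for illustration; this is not the authors' implementation and does not cover how the ordinal importance measures or the knockoffs themselves are constructed.

```python
def knockoff_plus_threshold(w, q):
    """Return the smallest t among the nonzero |W_j| whose estimated
    false discovery proportion is at most q, or infinity if none qualifies."""
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        # Large negative W_j act as a proxy count for false positives.
        false_est = 1 + sum(1 for x in w if x <= -t)
        selected = max(1, sum(1 for x in w if x >= t))
        if false_est / selected <= q:
            return t
    return float("inf")


def knockoff_select(w, q):
    """Return the indices of covariates whose W_j reaches the threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical statistics: large positive values suggest a real signal,
# values scattered around zero suggest a null covariate.
w = [5.0, 4.2, 3.9, 3.1, 2.8, 2.5, -0.4, 0.3, -0.2, 0.1]
selected = knockoff_select(w, q=0.2)
print(selected)
```

The "+1" in the numerator is what distinguishes knockoff+ from the plain knockoff filter; it makes the rule more conservative when few variables are selected.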

SUBMITTER: Fu H 

PROVIDER: S-EPMC7820886 | biostudies-literature | 2021 Jan

REPOSITORIES: biostudies-literature


Publications

High-dimensional variable selection for ordinal outcomes with error control.

Fu H, Archer KJ

Briefings in Bioinformatics, 2021-01-01, issue 1



Similar Datasets

| S-EPMC5002494 | biostudies-literature
| S-EPMC5478010 | biostudies-literature
| S-EPMC7487595 | biostudies-literature
| S-EPMC4848399 | biostudies-literature
| S-EPMC7133715 | biostudies-literature
| S-EPMC6222001 | biostudies-literature
| S-EPMC3478096 | biostudies-literature
| S-EPMC5885321 | biostudies-literature
| S-EPMC3587767 | biostudies-literature