
Dataset Information


A comparative study of variable selection methods in the context of developing psychiatric screening instruments.


ABSTRACT: The development of screening instruments for psychiatric disorders involves item selection from a pool of items in existing questionnaires assessing clinical and behavioral phenotypes. A screening instrument should consist of only a few items and have good accuracy in classifying cases and non-cases. Variable/item selection methods such as Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net, Classification and Regression Tree, Random Forest, and the two-sample t-test can be used in this context. Unlike situations where variable selection methods are most commonly applied (e.g., ultra-high-dimensional genetic or imaging data), psychiatric data usually have lower dimensions and are characterized by the following factors: correlations and possible interactions among predictors, unobservability of important variables (i.e., true variables not measured by available questionnaires), amount and pattern of missing values in the predictors, and prevalence of cases in the training data. We investigate how these factors affect the performance of several variable selection methods and compare them with respect to selection performance and prediction error rate via simulations. Our results demonstrated that: (1) for complete data, LASSO and Elastic Net outperformed other methods with respect to variable selection and future data prediction, and (2) for certain types of incomplete data, Random Forest induced bias in imputation, leading to incorrect ranking of variable importance. We propose the Imputed-LASSO, which combines Random Forest imputation and LASSO; this approach offsets the bias in Random Forest and offers a simple yet efficient item selection approach for missing data. As an illustration, we apply the methods to items from the standard Autism Diagnostic Interview-Revised version.
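The LASSO selection step named in the abstract can be illustrated with a minimal sketch. It assumes a squared-error loss on centred toy data rather than the classification setting and software used in the study, and it omits the Random Forest imputation step of Imputed-LASSO. The function names, the penalty value, and the data are hypothetical and chosen only for illustration.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows (items already centred); y is a centred outcome.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of item j with the partial residual
            z = 0.0    # sum of squares of item j
            for i in range(n):
                # Fit from every item except j (the partial residual leaves j out).
                fit_without_j = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                z += X[i][j] ** 2
            b[j] = soft_threshold(rho / n, lam) / (z / n)
    return b


# Hypothetical toy data: three mutually orthogonal items. The outcome depends
# strongly on item 0, weakly on item 2, and not at all on item 1.
X = [
    [1, 1, 1],
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]
y = [2.1, 1.9, -2.1, -1.9]  # equals 2.0 * item0 + 0.1 * item2

coef = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, bj in enumerate(coef) if bj != 0.0]
print("coefficients:", coef)
print("selected items:", selected)
```

With this penalty the weak item is shrunk to exactly zero and only item 0 is retained, which is the behaviour that makes LASSO suitable for building a short instrument. In practice the penalty would be tuned, for example by cross-validation, rather than fixed by hand.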

SUBMITTER: Lu F 

PROVIDER: S-EPMC4026268 | biostudies-literature | 2014 Feb

REPOSITORIES: biostudies-literature


Publications

A comparative study of variable selection methods in the context of developing psychiatric screening instruments.

Feihan Lu, Eva Petkova

Statistics in Medicine, 2013-08-11, issue 3



Similar Datasets

| S-EPMC10495967 | biostudies-literature
| S-EPMC8465777 | biostudies-literature
| S-EPMC5877510 | biostudies-literature
| S-EPMC11018091 | biostudies-literature
| S-EPMC6738588 | biostudies-literature
| S-EPMC6214199 | biostudies-literature
| S-EPMC8189011 | biostudies-literature
| S-EPMC9235102 | biostudies-literature
| S-EPMC3155752 | biostudies-literature
| S-EPMC3169938 | biostudies-literature