Dataset Information

A quasi-Monte-Carlo comparison of parametric and semiparametric regression methods for heavy-tailed and non-normal data: an application to healthcare costs.


ABSTRACT: We conduct a quasi-Monte-Carlo comparison of the recent developments in parametric and semiparametric regression methods for healthcare costs, both against each other and against standard practice. The population of English National Health Service hospital in-patient episodes for the financial year 2007-2008 (summed for each patient) is randomly divided into two equally sized subpopulations to form an estimation set and a validation set. Evaluating out-of-sample using the validation set, a conditional density approximation estimator shows considerable promise in forecasting conditional means, performing best for accuracy of forecasting and among the best four for bias and goodness of fit. The best performing model for bias is linear regression with square-root-transformed dependent variables, whereas a generalized linear model with square-root link function and Poisson distribution performs best in terms of goodness of fit. Commonly used models utilizing a log-link are shown to perform badly relative to other models considered in our comparison.
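The split-sample design described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic, the function name `split_sample_evaluation` is hypothetical, and the metric definitions (mean prediction error for bias, mean absolute prediction error for accuracy, root-mean-square error for goodness of fit) are assumptions. Only one of the compared models is shown, linear regression on the square-root scale, and the retransformation assumes homoskedastic errors.

```python
import numpy as np


def split_sample_evaluation(X, y, seed=0):
    """Fit OLS on sqrt(y) using one random half; score it on the other half."""
    rng = np.random.default_rng(seed)
    n = len(y)
    perm = rng.permutation(n)
    est, val = perm[: n // 2], perm[n // 2:]  # estimation / validation sets

    # Design matrix with an intercept column.
    A = np.column_stack([np.ones(n), X])

    # Least-squares fit on the square-root scale, estimation set only.
    beta, *_ = np.linalg.lstsq(A[est], np.sqrt(y[est]), rcond=None)
    resid = np.sqrt(y[est]) - A[est] @ beta
    sigma2 = resid.var()

    # Retransform to the cost scale: E[y | x] = (x'beta)^2 + sigma^2,
    # which holds only under the assumed homoskedastic errors.
    pred = (A[val] @ beta) ** 2 + sigma2

    err = pred - y[val]
    return {
        "bias": float(err.mean()),             # mean prediction error
        "mae": float(np.abs(err).mean()),      # mean absolute prediction error
        "rmse": float(np.sqrt((err ** 2).mean())),
    }


# Synthetic, right-skewed "costs" (purely illustrative, not NHS data).
rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 2))
mu = (3.0 + X @ np.array([1.0, 0.5])) ** 2
y = rng.gamma(shape=0.5, scale=mu / 0.5)

metrics = split_sample_evaluation(X, y)
print(metrics)
```

Other candidate models would be fitted on the same estimation half and scored on the same validation half, so that the three metrics are directly comparable across models.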

SUBMITTER: Jones AM 

PROVIDER: S-EPMC5053270 | biostudies-literature | 2016 Oct

REPOSITORIES: biostudies-literature

Publications

A quasi-Monte-Carlo comparison of parametric and semiparametric regression methods for heavy-tailed and non-normal data: an application to healthcare costs.

Andrew M. Jones, James Lomas, Peter T. Moore, Nigel Rice

Journal of the Royal Statistical Society. Series A (Statistics in Society), 2015-10-15, 4


Similar Datasets

| S-EPMC8020077 | biostudies-literature
| S-EPMC5482548 | biostudies-literature
| S-EPMC4061254 | biostudies-literature
| S-EPMC7143414 | biostudies-literature
| S-EPMC7461964 | biostudies-literature
| S-EPMC6484004 | biostudies-literature
| S-EPMC8378757 | biostudies-literature
| S-EPMC10069785 | biostudies-literature
| S-EPMC10909586 | biostudies-literature
| S-EPMC10082974 | biostudies-literature