Dataset Information

Comparison of methods for early-readmission prediction in a high-dimensional heterogeneous covariates and time-to-event outcome framework.

ABSTRACT: BACKGROUND: Choosing the best-performing method in terms of outcome prediction or variable selection is a recurring problem in prognosis studies, leading to many publications on method comparison. But some aspects have received little attention. First, most comparison studies treat prediction performance and variable selection separately. Second, methods are compared either within a binary outcome setting (where we want to predict whether readmission will occur within an arbitrarily chosen delay) or within a survival analysis setting (where the outcomes are directly the censored times), but not both. In this paper, we propose a comparison methodology to weigh up those settings in terms of both prediction and variable selection, while incorporating advanced machine learning strategies.
METHODS: Using a high-dimensional case study on a sickle-cell disease (SCD) cohort, we compare 8 statistical methods. In the binary outcome setting, we consider logistic regression (LR), support vector machine (SVM), random forest (RF), gradient boosting (GB) and neural network (NN); in the survival analysis setting, we consider the Cox Proportional Hazards (PH), the CURE and the C-mix models. We also propose a method using Gaussian Processes to extract meaningful structured covariates from longitudinal data.
RESULTS: Among all assessed statistical methods, the survival analysis ones obtain the best results. In particular, the C-mix model yields the best performance in both settings considered (AUC = 0.94 in the binary outcome setting), as well as interesting interpretation aspects. There is some consistency in selected covariates across methods within a setting, but not much across the two settings.
CONCLUSIONS: It appears that learning within the survival analysis setting first (thus using all the temporal information), and then going back to a binary prediction using the survival estimates, gives significantly better prediction performance than models trained "directly" within the binary outcome setting.
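
The idea of learning in the survival setting and then going back to a binary prediction can be illustrated with a minimal sketch. It is not the paper's method: it swaps in a simple per-group exponential model in place of Cox PH, CURE or C-mix, and the toy data, the `delay` value and all function names are hypothetical, chosen only for illustration.

```python
import math

def binary_labels(times, events, delay):
    """1 if readmission is observed within `delay`, 0 if follow-up exceeds
    `delay`, None if censored before `delay` (label unknown)."""
    labels = []
    for t, e in zip(times, events):
        if e == 1 and t <= delay:
            labels.append(1)
        elif t > delay:
            labels.append(0)
        else:
            labels.append(None)
    return labels

def exponential_rate(times, events):
    """Maximum-likelihood rate of an exponential model with right censoring:
    number of observed events divided by total follow-up time."""
    return sum(events) / sum(times)

def prob_event_within(rate, delay):
    """P(T <= delay) under the exponential model, i.e. 1 - S(delay)."""
    return 1.0 - math.exp(-rate * delay)

def auc(scores, labels):
    """Rank-based AUC; ties count one half, unknown labels are ignored."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy cohort: one binary covariate, time in days, event flag
# (1 = readmission observed, 0 = censored).
group = [0, 0, 0, 0, 1, 1, 1, 1]
times = [50, 60, 45, 20, 10, 5, 40, 8]
events = [0, 1, 0, 1, 1, 1, 0, 1]
delay = 30  # arbitrarily chosen delay defining "early" readmission

# Step 1: learn in the survival setting, using all times and censoring flags.
rates = {}
for g in set(group):
    t_g = [t for t, gg in zip(times, group) if gg == g]
    e_g = [e for e, gg in zip(events, group) if gg == g]
    rates[g] = exponential_rate(t_g, e_g)

# Step 2: go back to a binary prediction via the survival estimate at `delay`.
scores = [prob_event_within(rates[g], delay) for g in group]

# Step 3: evaluate those scores against the binary outcome.
labels = binary_labels(times, events, delay)
result = auc(scores, labels)
print(f"AUC at delay={delay}: {result:.2f}")
```

The point of the sketch is that censored patients still contribute follow-up time when the rate is estimated, whereas a model trained directly on the binary labels would discard the timing information.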

SUBMITTER: Bussy S

PROVIDER: S-EPMC6404305 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature


Publications

Comparison of methods for early-readmission prediction in a high-dimensional heterogeneous covariates and time-to-event outcome framework.

Bussy S, Veil R, Looten V, Burgun A, Gaïffas S, Guilloux A, Ranque B, Jannot AS

BMC Medical Research Methodology, 2019-03-06; 1

PMID: 30841867

Similar Datasets

Project description:
Background & aims: Patients with cirrhosis have high rates of hospital readmission, but prediction models are suboptimal and have not included important patient-reported outcome measures (PROMs). In a large prospective cohort, we examined the impact of PROMs on prediction of 30-day readmissions.
Methods: We performed a prospective cohort study of adults with cirrhosis admitted to a tertiary center between June 2014 and March 2020. We collected clinical information, socioeconomic status, and PROMs addressing functional status and quality of life. We used hierarchical competing risk time-to-event analysis to examine the impact of PROMs on readmission prediction.
Results: A total of 654 patients were discharged alive, and 247 (38%) were readmitted within 30 days. Readmission was independently associated with cerebrovascular disease, ascites, prior hospital admission, admission via the emergency department, lower albumin, higher Model for End-Stage Liver Disease, discharge with public transportation, and impaired basic activities of daily living and quality-of-life activity domain. Reduced readmission was associated with cancer, admission for infection, children at home, and impaired emotional function. Compared with a model including only clinical variables, addition of functional status and quality-of-life variables improved the area under the receiver-operating characteristic curve from 0.72 to 0.73 and 0.75, with net reclassification indices of 0.22 and 0.18, respectively. Socioeconomic variables did not significantly improve prediction compared with clinical variables alone. Compared with a model using electronically available variables only, no models improved prediction when examined with integrated discrimination improvement.
Conclusions: PROMs may marginally add to the prediction of 30-day readmissions for patients with cirrhosis. Poor social support and disability are associated with readmissions and may be high-yield targets for future interventions.
