Dataset Information

Predicting hospitalization following psychiatric crisis care using machine learning.

ABSTRACT: Background: Accurate prediction models for whether patients on the verge of a psychiatric crisis need hospitalization are lacking, and machine learning methods may help improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate the accuracy of ten machine learning algorithms, including the generalized linear model (GLM/logistic regression), to predict psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact. We also evaluate an ensemble model to optimize the accuracy, and we explore individual predictors of hospitalization.

Methods: We included data from 2084 patients in the longitudinal Amsterdam Study of Acute Psychiatry who had at least one reported psychiatric crisis care contact. The target variable for the prediction models was whether the patient was hospitalized in the 12 months following inclusion. The predictive power of 39 variables related to patients' socio-demographics, clinical characteristics and previous mental health care contacts was evaluated. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared, and we also estimated the relative importance of each predictor variable. The best and least performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis, and the five best performing algorithms were combined in an ensemble model using stacking.
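
The AUC used to compare the algorithms can be read as the probability that a randomly chosen hospitalized patient receives a higher predicted risk than a randomly chosen non-hospitalized patient. The following is a minimal sketch of that pairwise definition in plain Python. The function name `auc` and the toy labels and scores are illustrative assumptions, not the study's code or data.

```python
def auc(y_true, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    y_true: 0/1 labels (1 = hospitalized, by assumption for this sketch)
    scores: predicted risks, higher means more likely hospitalized
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy example with made-up values
labels = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, risks))
```

This quadratic loop is fine for illustration. For a cohort of a few thousand patients a rank-based implementation would be faster but gives the same value.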

Results: All models performed above chance level. We found Gradient Boosting to be the best performing algorithm (AUC = 0.774) and K-Nearest Neighbors to be the least performing (AUC = 0.702). The performance of GLM/logistic regression (AUC = 0.76) was slightly above average among the tested algorithms. In a Net Reclassification Improvement analysis Gradient Boosting outperformed GLM/logistic regression by 2.9% and K-Nearest Neighbors by 11.3%. GLM/logistic regression outperformed K-Nearest Neighbors by 8.7%. Nine of the top-10 most important predictor variables were related to previous mental health care use.
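
A net reclassification improvement (NRI) summarizes how often a new model moves patients into a more appropriate risk category than a reference model. The abstract does not state which NRI variant or cut-off was used, so the sketch below assumes a two-category NRI with a hypothetical risk threshold of 0.5. The function name and the toy data are illustrative only.

```python
def net_reclassification_improvement(y_true, p_old, p_new, threshold=0.5):
    """Two-category NRI of a new model relative to an old model.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)]
    The threshold of 0.5 is an assumption made for this sketch.
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = n_nonevent = 0

    for y, old, new in zip(y_true, p_old, p_new):
        old_high = old >= threshold
        new_high = new >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if y == 1:
            n_event += 1
            up_event += moved_up
            down_event += moved_down
        else:
            n_nonevent += 1
            up_nonevent += moved_up
            down_nonevent += moved_down

    if n_event == 0 or n_nonevent == 0:
        raise ValueError("NRI needs both events and non-events")

    return ((up_event - down_event) / n_event
            + (down_nonevent - up_nonevent) / n_nonevent)


# Toy example with made-up predictions from two models
y = [1, 1, 0, 0]
old_model = [0.4, 0.6, 0.6, 0.3]
new_model = [0.7, 0.6, 0.4, 0.3]
print(net_reclassification_improvement(y, old_model, new_model))
```

A positive value means the new model reclassifies more patients in the right direction than in the wrong one. The value ranges from -2 to 2.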

Conclusions: Gradient Boosting led to the highest predictive accuracy and AUC, while GLM/logistic regression performed about average among the tested algorithms. Although statistically significant, the differences between the machine learning algorithms were in most cases modest. The results show that a predictive accuracy similar to that of the best performing model can be achieved by combining multiple algorithms in an ensemble model.

SUBMITTER: Blankers M

PROVIDER: S-EPMC7731561 | biostudies-literature | 2020 Dec

REPOSITORIES: biostudies-literature


Publications

Predicting hospitalization following psychiatric crisis care using machine learning.

Matthijs Blankers, Louk F. M. van der Post, Jack J. M. Dekker

BMC Medical Informatics and Decision Making, 2020-12-10 (1)

PMID: 33302948

Similar Datasets

Project description: BACKGROUND: Suicide is a major public health concern globally. Accurately predicting suicidal behavior remains challenging. This study aimed to use machine learning approaches to examine the potential of the Swedish national registry data for prediction of suicidal behavior. METHODS AND FINDINGS: The study sample consisted of 541,300 inpatient and outpatient visits by 126,205 Sweden-born patients (54% female and 46% male) aged 18 to 39 (mean age at the visit: 27.3) years to psychiatric specialty care in Sweden between January 1, 2011 and December 31, 2012. The most common psychiatric diagnoses at the visit were anxiety disorders (20.0%), major depressive disorder (16.9%), and substance use disorders (13.6%). A total of 425 candidate predictors covering demographic characteristics, socioeconomic status (SES), electronic medical records, criminality, as well as family history of disease and crime were extracted from the Swedish registry data. The sample was randomly split into an 80% training set containing 433,024 visits and a 20% test set containing 108,276 visits. Models were trained separately for suicide attempt/death within 90 and 30 days following a visit using multiple machine learning algorithms. Model discrimination and calibration were both evaluated. Among all eligible visits, 3.5% (18,682) were followed by a suicide attempt/death within 90 days and 1.7% (9,099) within 30 days. The final models were based on ensemble learning that combined predictions from elastic net penalized logistic regression, random forest, gradient boosting, and a neural network. The area under the receiver operating characteristic (ROC) curves (AUCs) on the test set were 0.88 (95% confidence interval [CI] = 0.87-0.89) and 0.89 (95% CI = 0.88-0.90) for the outcome within 90 days and 30 days, respectively, both being significantly better than chance (i.e., AUC = 0.50) (p < 0.01). Sensitivity, specificity, and predictive values were reported at different risk thresholds. A limitation of our study is that our models have not yet been externally validated, and thus, the generalizability of the models to other populations remains unknown. CONCLUSIONS: By combining the ensemble method of multiple machine learning algorithms and high-quality data solely from the Swedish registers, we developed prognostic models to predict short-term suicide attempt/death with good discrimination and calibration. Whether novel predictors can improve predictive performance requires further investigation.

Project description: Background: Carotid endarterectomy (CEA) is a major vascular operation for stroke prevention that carries significant perioperative risks; however, outcome prediction tools remain limited. The authors developed machine learning algorithms to predict outcomes following CEA. Methods and Results: The National Surgical Quality Improvement Program targeted vascular database was used to identify patients who underwent CEA between 2011 and 2021. Input features included 36 preoperative demographic/clinical variables. The primary outcome was 30-day major adverse cardiovascular events (composite of stroke, myocardial infarction, or death). The data were split into training (70%) and test (30%) sets. Using 10-fold cross-validation, 6 machine learning models were trained using preoperative features. The primary metric for evaluating model performance was area under the receiver operating characteristic curve. Model robustness was evaluated with calibration plot and Brier score. Overall, 38 853 patients underwent CEA during the study period. Thirty-day major adverse cardiovascular events occurred in 1683 (4.3%) patients. The best performing prediction model was XGBoost, achieving an area under the receiver operating characteristic curve of 0.91 (95% CI, 0.90-0.92). In comparison, logistic regression had an area under the receiver operating characteristic curve of 0.62 (95% CI, 0.60-0.64), and existing tools in the literature demonstrate area under the receiver operating characteristic curve values ranging from 0.58 to 0.74. The calibration plot showed good agreement between predicted and observed event probabilities with a Brier score of 0.02. The strongest predictive feature in our algorithm was carotid symptom status. Conclusions: The machine learning models accurately predicted 30-day outcomes following CEA using preoperative data and performed better than existing tools. They have potential for important utility in guiding risk-mitigation strategies to improve outcomes for patients being considered for CEA.
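
The Brier score mentioned in this description is the mean squared difference between predicted probabilities and observed 0/1 outcomes, with lower values indicating better calibrated and sharper predictions. A minimal sketch in plain Python follows. The function name and the toy values are assumptions for illustration and are unrelated to the CEA data.

```python
def brier_score(y_true, probabilities):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(probabilities) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for y, p in zip(y_true, probabilities)]
    return sum(squared_errors) / len(squared_errors)


# Toy example with made-up predictions
outcomes = [1, 0, 0, 1]
predicted = [0.8, 0.2, 0.1, 0.6]
print(brier_score(outcomes, predicted))
```

Note that with a rare outcome (4.3% here), a model that always predicts a low probability already achieves a small Brier score, so the value is best read alongside the calibration plot and AUC.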

Project description: Background: Pulmonary tuberculosis (PTB) is a prevalent chronic disease associated with a significant economic burden on patients. Using machine learning to predict hospitalization costs can help allocate medical resources effectively and optimize the cost structure, so as to better control the hospitalization costs of patients. Methods: This research analyzed data (2020-2022) from a Kashgar pulmonary hospital's information system, involving 9570 eligible PTB patients. SPSS 26.0 was used for multiple regression analysis, while Python 3.7 was used for random forest regression (RFR) and MLP. The training set included data from 2020 and 2021, while the test set included data from 2022. The models predicted seven costs related to PTB patients: diagnostic cost, medical service cost, material cost, treatment cost, drug cost, other cost, and total hospitalization cost. The models' predictive performance was evaluated using R-squared (R²), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE). Results: Among the 9570 PTB patients included in the study, the median (quartiles) of total hospitalization cost was 13,150.45 (9891.34, 19,648.48) yuan. Nine factors, including age, marital status, admission condition, length of hospital stay, initial treatment, presence of other diseases, transfer, drug resistance, and admission department, significantly influenced hospitalization costs for PTB patients. Overall, MLP demonstrated superior performance in most cost predictions, outperforming RFR and multiple regression; the performance of RFR was between that of MLP and multiple regression; the predictive performance of multiple regression was the lowest, but it showed the best results for other costs. Conclusion: The MLP can effectively leverage patient information and accurately predict various hospitalization costs, achieving a rationalized structure of hospitalization costs by adjusting higher-cost inpatient items and balancing different cost categories. The insights of this predictive model also hold relevance for research in other medical conditions.
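
The three regression metrics named in this description can be computed directly from observed and predicted costs. The sketch below uses only the Python standard library. The function name `regression_metrics` and the toy cost values are assumptions for illustration, not data from the PTB study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R-squared, RMSE, MAE) for observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r ** 2 for r in residuals)              # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total variation
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observed values are equal")

    r_squared = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r_squared, rmse, mae


# Toy example with made-up costs (arbitrary units)
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(r2, rmse, mae)
```

RMSE penalizes large errors more heavily than MAE, which matters for right-skewed quantities such as hospitalization costs.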

OmicsDI is part of the ELIXIR infrastructure
