Dataset Information

Machine learning for patient risk stratification: standing on, or looking over, the shoulders of clinicians?

ABSTRACT: Machine learning can help clinicians to make individualized patient predictions only if researchers demonstrate models that contribute novel insights, rather than learning the most likely next step in a set of actions a clinician will take. We trained deep learning models using only clinician-initiated, administrative data for 42.9 million admissions using three subsets of data: demographic data only, demographic data and information available at admission, and the previous data plus charges recorded during the first day of admission. Models trained on charges during the first day of admission achieve performance close to published full EMR-based benchmarks for inpatient outcomes: in-hospital mortality (0.89 AUC), prolonged length of stay (0.82 AUC), and 30-day readmission rate (0.71 AUC). Similar performance between models trained with only clinician-initiated data and those trained with full EMR data purporting to include information about patient state and physiology should raise concern about the deployment of these models. Furthermore, these models exhibited significant declines in performance when evaluated over only myocardial infarction (MI) patients relative to models trained over MI patients alone, highlighting the importance of physician diagnosis in the prognostic performance of these models. These results provide a benchmark for predictive accuracy trained only on prior clinical actions and indicate that models with similar performance may derive their signal by looking over clinicians' shoulders, using clinical behavior as the expression of preexisting intuition and suspicion to generate a prediction. For models to guide clinicians in individual decisions, performance exceeding these benchmarks is necessary.

SUBMITTER: Beaulieu-Jones BK

PROVIDER: S-EPMC8010071 | biostudies-literature

REPOSITORIES: biostudies-literature


Similar Datasets

Project description: Diabetes mellitus is a group of metabolic diseases in which blood sugar levels are too high. About 8.8% of the world's population was diabetic in 2017, and this is projected to reach nearly 10% by 2045. The major challenge is that machine learning-based classifiers applied to such data sets for risk stratification tend to perform poorly. Thus, our objective is to develop an optimized and robust machine learning (ML) system under the assumption that replacing missing values or outliers with a median will yield higher risk stratification accuracy. The ML-based risk stratification system is designed, optimized, and evaluated as follows: features are extracted and optimized using six feature selection techniques (random forest, logistic regression, mutual information, principal component analysis, analysis of variance, and Fisher discriminant ratio) and combined with ten different types of classifiers (linear discriminant analysis, quadratic discriminant analysis, naïve Bayes, Gaussian process classification, support vector machine, artificial neural network, Adaboost, logistic regression, decision tree, and random forest), under the hypothesis that replacing both missing values and outliers with computed medians will improve the risk stratification accuracy. The Pima Indian diabetic dataset (768 patients: 268 diabetic and 500 controls) was used. Our results demonstrate that replacing missing values with group medians and outliers with median values, and then combining random forest feature selection with random forest classification, yields an accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve of 92.26%, 95.96%, 79.72%, 91.14%, 91.20%, and 0.93, respectively. This is an improvement of 10% over previously developed techniques published in the literature. The system was validated for its stability and reliability. The RF-based model showed the best performance when outliers were replaced by median values.
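
The median replacement described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and the abstract does not say how outliers were detected, so the 1.5 × IQR rule used here is an assumption made only for illustration.

```python
from statistics import median, quantiles


def impute_missing_by_group_median(values, labels):
    """Replace None entries with the median of the observed values
    that share the same class label (the "group median")."""
    group_medians = {}
    for label in set(labels):
        observed = [v for v, lab in zip(values, labels)
                    if lab == label and v is not None]
        group_medians[label] = median(observed)
    return [group_medians[lab] if v is None else v
            for v, lab in zip(values, labels)]


def replace_outliers_by_median(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with the overall median.
    The IQR rule is an assumption; the source does not name a detection rule."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    med = median(values)
    return [med if (v < low or v > high) else v for v in values]


# Toy feature column with two missing entries (made-up numbers, not Pima data).
glucose = [85, None, 95, 150, None, 160, 170]
diabetic = [0, 0, 0, 1, 1, 1, 1]
filled = impute_missing_by_group_median(glucose, diabetic)

# Toy column with one extreme value.
cleaned = replace_outliers_by_median([10, 12, 11, 13, 12, 11, 100])
```

In this sketch the missing entry in each class is filled from that class only, while an extreme value is replaced by the median of the whole column, mirroring the "group median and median values, respectively" wording of the abstract.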

Project description: Background: Papillary thyroid cancer (PTC) is one of the most common endocrine malignancies, with different risk levels. However, preoperative risk assessment of PTC remains a challenge worldwide. Here, we first report a Preoperative Risk Assessment Classifier for PTC (PRAC-PTC) based on multidimensional features including clinical indicators, immune indices, genetic features, and proteomics. Materials and methods: The 558 patients collected from June 2013 to November 2020 were allocated to three groups: discovery set (274 patients, 274 FFPE), retrospective test set (166 patients, 166 FFPE), and prospective test set (118 patients, 118 FNA). Proteomic profiling was conducted on formalin-fixed paraffin-embedded (FFPE) and fine-needle aspiration (FNA) tissues from the patients. Preoperative clinical information and blood immunological indices were collected. The BRAFV600E mutation was detected by the amplification refractory mutation system (ARMS). Results: We developed a machine learning model of 17 variables based on multidimensional features of 274 PTC patients from a retrospective cohort. The PRAC-PTC achieved an area under the curve (AUC) of 0.925 in the discovery set and was validated externally by blinded analyses in a retrospective cohort of 166 PTC patients (0.787 AUC) and a prospective cohort of 118 PTC patients (0.799 AUC) from two independent clinical centres. Meanwhile, the preoperative risk prediction of clinicians improved with the assistance of PRAC-PTC, with accuracies reaching 84.4% (95% CI 82.9-84.4) and 83.5% (95% CI 82.2-84.2) in the retrospective and prospective test sets, respectively. Conclusion: This study demonstrated that the PRAC-PTC, which integrates clinical data, gene mutation information, immune indices, high-throughput proteomics, and machine learning technology in multi-centre retrospective and prospective clinical cohorts, can effectively stratify the preoperative risk of PTC and may decrease unnecessary surgery or overtreatment.

Project description: Importance: Thyroid nodules are common incidental findings. Ultrasonography and molecular testing can be used to assess risk of malignant neoplasm. Objective: To examine whether a model developed through automated machine learning can stratify thyroid nodules as high or low genetic risk by ultrasonography imaging alone compared with stratification by molecular testing for high- and low-risk mutations. Design, Setting, and Participants: This diagnostic study was conducted at a single tertiary care urban academic institution and included patients (n = 121) who underwent ultrasonography and molecular testing for thyroid nodules from January 1, 2017, through August 1, 2018. Nodules were classified as high risk or low risk on the basis of results of an institutional molecular testing panel for thyroid risk genes. All thyroid nodules that underwent genetic sequencing for cytological results with Bethesda System categories III and IV were reviewed. Patients without diagnostic ultrasonographic images within 6 months of fine-needle aspiration or who received definitive treatment at an outside medical center were excluded. Main Outcomes and Measures: Thyroid nodules were categorized by the model as high risk or low risk using ultrasonographic images. Results were compared using genetic testing. Results: Among the 134 lesions identified in 121 patients (mean [SD] age, 55.7 [14.2] years; 102 women [84.3%]), 683 diagnostic ultrasonographic images were selected. Of the 683 images, 556 (81.4%) were used for training the model, 74 (10.8%) for validation, and 53 (7.8%) for testing. Most nodules had no mutation (75 [56.0%]), whereas 43 nodules (32.1%) had a high-risk mutation and 16 (11.9%) had an unknown or a low-risk mutation (χ² = 39.060; P < .001). In total, 228 images (33.4%) were of nodules classified as genetically high risk (n = 43), and 455 (66.6%) were of low-risk nodules (n = 91). The model performed with a sensitivity of 45% (95% CI, 23.1%-68.5%), a specificity of 97% (95% CI, 84.2%-99.9%), a positive predictive value of 90% (95% CI, 55.2%-98.5%), a negative predictive value of 74.4% (95% CI, 66.1%-81.3%), and an overall accuracy of 77.4% (95% CI, 63.8%-97.7%). Conclusions and Relevance: The study found that the model developed through automated machine learning could produce high specificity for identifying nodules with high-risk mutations on molecular testing. This finding shows promise for the diagnostic applications of machine learning interpretation of sonographic imaging of indeterminate thyroid nodules.
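
To make the reported metrics concrete, the following Python sketch computes them from a confusion matrix. The abstract does not give the underlying counts; the values below (9 true positives, 11 false negatives, 1 false positive, 32 true negatives, summing to the 53 test images) are an inference chosen because they reproduce the reported percentages, and should be read as an assumption rather than as data from the study. The function name is hypothetical.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Assumed counts (not stated in the source) that sum to the 53 test images.
metrics = confusion_metrics(tp=9, fn=11, fp=1, tn=32)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these assumed counts, the high specificity and low sensitivity reflect a model that rarely flags a low-risk nodule as high risk but misses more than half of the high-risk ones.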
