Project description: Background: The coronavirus disease (COVID-19) pandemic has spread rapidly across the world, creating an urgent need for predictive models that can help healthcare providers prepare for and respond to outbreaks more quickly and effectively, and ultimately improve patient care. Early detection and warning systems are crucial for preventing and controlling epidemic spread. Objective: In this study, we aimed to propose a machine learning-based method to predict the transmission trend of COVID-19 and a new approach to detect the start time of new outbreaks by analyzing epidemiological data. Methods: We developed a risk index to measure the change in the transmission trend. We applied machine learning (ML) techniques to predict COVID-19 transmission trends, categorized into three labels: decrease (L0), maintain (L1), and increase (L2). We used Support Vector Machine (SVM), Random Forest (RF), and XGBoost (XGB) as ML models, and employed grid search to determine the optimal hyperparameters for each. We proposed a new method to detect the start time of a new outbreak based on label L2 being sustained for at least 14 days (the duration of maintenance). We compared the performance of the ML models to identify the most accurate approach for outbreak detection, and conducted a sensitivity analysis on the duration of maintenance between 7 and 28 days. Results: The ML methods demonstrated high accuracy (over 94%) in classifying the transmission trends. Our proposed method successfully predicted the start time of new outbreaks, detecting seven estimated outbreaks against five reported outbreaks in Korea between March 2020 and October 2022, which suggests that the method can also detect minor outbreaks. Among the ML models, the RF and XGB classifiers exhibited the highest accuracy in outbreak detection. Conclusion: The study highlights the strength of our method in accurately predicting the timing of an outbreak using an interpretable and explainable approach. It could provide a standard for predicting the start time of new outbreaks and detecting future transmission trends, contributing to targeted prevention and control measures and improved resource management during a pandemic.
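The outbreak-start rule described in the Methods (declare a new outbreak when the increase label L2 is sustained for at least 14 consecutive days) can be sketched as follows; the function name and the toy label sequence are illustrative, not from the paper:

```python
def detect_outbreak_starts(labels, duration=14):
    """Return the indices (days) where a new outbreak is declared: the
    first day of any run in which the 'increase' label (2) is sustained
    for at least `duration` consecutive days."""
    starts = []
    run = 0
    for day, label in enumerate(labels):
        if label == 2:
            run += 1
            if run == duration:
                # sustained long enough: the outbreak began `duration` days ago
                starts.append(day - duration + 1)
        else:
            run = 0
    return starts

# toy daily labels: 0 = decrease, 1 = maintain, 2 = increase
labels = [1, 1] + [2] * 20 + [0] * 5 + [2] * 10 + [1]
print(detect_outbreak_starts(labels))  # → [2]: only the 20-day run qualifies
```

Shortening `duration` toward 7 days (as in the sensitivity analysis) would make the second, 10-day run count as an outbreak as well.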
Project description: Background: The number of cases from the coronavirus disease 2019 (COVID-19) global pandemic has overwhelmed existing medical facilities and forced clinicians, patients, and families to make pivotal decisions with limited time and information. Main body: While machine learning (ML) methods have been previously used to augment clinical decisions, there is now a demand for "Emergency ML." Throughout the patient care pathway, there are opportunities for ML-supported decisions based on collected vitals, laboratory results, medication orders, and comorbidities. With rapidly growing datasets, there also remain important considerations when developing and validating ML models. Conclusion: This perspective highlights the utility of evidence-based prediction tools in a number of clinical settings, and how similar models can be deployed during the COVID-19 pandemic to guide hospital frontlines and healthcare administrators to make informed decisions about patient care and managing hospital volume.
Project description: Human contact and interaction are considered among the important factors affecting epidemic transmission, so modeling the heterogeneity of individual activities is critical for epidemiological risk assessment. In a digital society, massive data make it possible to implement this idea at large scale. Here, we use mobile phone signaling to track users' trajectories and construct a contact network that dynamically describes the topology of daily contact between individuals. We show the spatiotemporal contact features of about 7.5 million mobile phone users during the outbreak of COVID-19 in Shanghai, China. Furthermore, the individual feature matrix extracted from the contact network enables us to carry out extreme event learning and predict regional transmission risk, which can be further decomposed into the risk due to the inflow of people from epidemic hot zones and the risk due to close contacts within the observed area. This method is flexible and adaptive, and can serve as an epidemic precaution ahead of a large-scale outbreak, with high efficiency and low cost.
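The risk decomposition mentioned above can be illustrated with a toy sketch: derive close contacts from hourly co-presence in users' trajectories, then score each region as a weighted sum of an inflow term (visits by users who passed through hot zones) and a local close-contact term. All names, weights, and data here are hypothetical, not from the study:

```python
from collections import defaultdict

def regional_risk(trajectories, hot_zones, w_inflow=0.6, w_contact=0.4):
    """Score each region as w_inflow * inflow + w_contact * contacts
    (weights hypothetical). A 'contact' is two users co-present in the
    same region during the same hour."""
    presence = defaultdict(set)   # (region, hour) -> users present
    for user, visits in trajectories.items():
        for region, hour in visits:
            presence[(region, hour)].add(user)

    contacts = defaultdict(int)   # region -> close-contact pairs
    for (region, _), users in presence.items():
        n = len(users)
        contacts[region] += n * (n - 1) // 2

    inflow = defaultdict(int)     # region -> visits by users who touched a hot zone
    for user, visits in trajectories.items():
        if any(r in hot_zones for r, _ in visits):
            for region, _ in visits:
                if region not in hot_zones:
                    inflow[region] += 1

    regions = set(contacts) | set(inflow)
    return {r: w_inflow * inflow[r] + w_contact * contacts[r] for r in regions}

# toy trajectories: user -> list of (region, hour) visits; region "B" is a hot zone
trajectories = {
    "u1": [("A", 9), ("B", 12)],
    "u2": [("A", 9), ("A", 10)],
    "u3": [("B", 12), ("B", 13)],
}
print(regional_risk(trajectories, hot_zones={"B"}))
```

Region A accumulates both terms here (one local contact pair at hour 9, plus one inflow visit by u1, who also visited hot zone B), while B's score comes only from its local contact term.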
Project description: OBJECTIVE: To propose nonparametric ensemble machine learning for mental health and substance use disorder (MHSUD) spending risk adjustment formulas, including the use of Clinical Classification Software (CCS) categories as diagnostic covariates in place of the commonly used Hierarchical Condition Category (HCC) system. DATA SOURCES: 2012-2013 Truven MarketScan database. STUDY DESIGN: We implement 21 algorithms to predict MHSUD spending, as well as a weighted combination of these algorithms called super learning. The algorithm collection included seven unique algorithms supplied with three differing sets of MHSUD-related predictors alongside demographic covariates: HCC, CCS, and HCC + CCS diagnostic variables. Performance was evaluated based on cross-validated R2 and predictive ratios. PRINCIPAL FINDINGS: Super learning had the best performance on both metrics. The top single algorithm was random forests, which improved on ordinary least squares regression by 10 percent in relative efficiency. CCS-based formulas were generally more predictive of MHSUD spending than HCC-based formulas. CONCLUSIONS: The literature supports the potential benefit of implementing a separate MHSUD spending risk adjustment formula. Our results suggest there is an incentive to explore machine learning for MHSUD-specific risk adjustment, as well as to consider CCS categories over HCCs.
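The super learning step, a weighted combination of base learners chosen to minimize cross-validated error, can be sketched as follows. The held-out prediction matrix, the weight grid, and the data are illustrative, not from the study:

```python
import numpy as np

# toy held-out (cross-validated) predictions from three base learners:
# rows = observations, columns = learners (e.g., OLS, random forest, boosting)
Z = np.array([[1.0, 1.2, 0.8],
              [2.0, 1.9, 2.2],
              [3.0, 3.3, 2.9],
              [4.0, 3.8, 4.1]])
y = np.array([1.1, 2.0, 3.1, 3.9])

# super learner step: search convex weights minimizing cross-validated MSE
best_w, best_mse = None, np.inf
grid = np.linspace(0, 1, 21)
for w1 in grid:
    for w2 in grid:
        if w1 + w2 > 1:
            continue
        w = np.array([w1, w2, 1 - w1 - w2])
        mse = np.mean((Z @ w - y) ** 2)
        if mse < best_mse:
            best_w, best_mse = w, mse

print(best_w, best_mse)
```

Because every single-learner weighting (e.g., all weight on learner 1) is in the search grid, the combination can never do worse in cross-validated error than the best individual learner, which is why super learning topped both metrics in the study.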
Project description: Several (inter)national longitudinal dementia observational datasets, encompassing demographic information, neuroimaging, biomarkers, neuropsychological evaluations, and multi-omics data, have ushered in a new era of potential for integrating machine learning (ML) into dementia research and clinical practice. ML, with its proficiency in handling multi-modal and high-dimensional data, has emerged as an innovative technique to facilitate early diagnosis and differential diagnosis, and to predict the onset and progression of mild cognitive impairment and dementia. In this review, we evaluate current and potential applications of ML, including its history in dementia research, how it compares to traditional statistics, the types of datasets it uses, and the general workflow. Moreover, we identify the technical barriers and challenges of implementing ML in clinical practice. Overall, this review provides a comprehensive understanding of ML with non-technical explanations for broader accessibility to biomedical scientists and clinicians.
Project description: Machine learning (ML) models have proven their potential for acquiring and analyzing large amounts of data to help solve complex, real-world problems. Their use in healthcare is expected to help physicians make diagnoses, prognoses, treatment decisions, and disease outcome predictions. However, ML solutions are not currently deployed in most healthcare systems, and one of the main reasons is the provenance, transparency, and clinical utility of the training data. Physicians reject ML solutions that are not based on accurate data or that do not clearly reflect the decision-making process used in clinical practice. In this paper, we present a hybrid human-machine intelligence method for creating predictive models driven by clinical practice. We promote the use of quality-approved data and the inclusion of physician reasoning in the ML process. Instead of training the ML algorithms on the given data to create predictive models (the conventional method), we propose pre-categorizing the data according to expert physicians' knowledge and experience. Comparing the conventional method against this hybrid physician-algorithm method showed that models based on the latter can perform better. Physicians' engagement is the most promising condition for the safe and innovative use of ML in healthcare.
Project description: Importance: While machine learning approaches may enhance prediction ability, little is known about their utility in emergency department (ED) triage. Objectives: To examine the performance of machine learning approaches in predicting clinical outcomes and disposition in children in the ED, and to compare their performance with conventional triage approaches. Design, setting, and participants: Prognostic study of ED data from the National Hospital Ambulatory Medical Care Survey from January 1, 2007, through December 31, 2015. A nationally representative sample of 52 037 children aged 18 years or younger who presented to the ED was included. Data analysis was performed in August 2018. Main outcomes and measures: The outcomes were critical care (admission to an intensive care unit and/or in-hospital death) and hospitalization (direct hospital admission or transfer). In the training set (70% random sample), using routinely available triage data as predictors (eg, demographic characteristics and vital signs), we derived 4 machine learning-based models: lasso regression, random forest, gradient-boosted decision tree, and deep neural network. In the test set (the remaining 30% of the sample), we measured the models' prediction performance by computing C statistics, prospective prediction results, and decision curves. These machine learning models were built for each outcome and compared with a reference model using the conventional triage classification information. Results: Of 52 037 eligible ED visits by children (median [interquartile range] age, 6 [2-14] years; 24 929 [48.0%] female), 163 (0.3%) had the critical care outcome and 2352 (4.5%) had the hospitalization outcome. For the critical care prediction, all machine learning approaches had higher discriminative ability than the reference model, although the difference was not statistically significant (eg, C statistics of 0.85 [95% CI, 0.78-0.92] for the deep neural network vs 0.78 [95% CI, 0.71-0.85] for the reference; P = .16), and a lower number of undertriaged critically ill children in the conventional triage levels 3 to 5 (urgent to nonurgent). For the hospitalization prediction, all machine learning approaches had significantly higher discriminative ability (eg, C statistic, 0.80 [95% CI, 0.78-0.81] for the deep neural network vs 0.73 [95% CI, 0.71-0.75] for the reference; P < .001) and fewer overtriaged children who did not require inpatient management in the conventional triage levels 1 to 3 (immediate to urgent). The decision curve analysis demonstrated a greater net benefit of machine learning models over ranges of clinical thresholds. Conclusions and relevance: Machine learning-based triage had better discrimination ability to predict clinical outcomes and disposition, with reductions in undertriage of critically ill children and overtriage of children who are less ill.
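The C statistic used to compare the models is the probability that a randomly chosen positive case is scored higher than a randomly chosen negative case (ties counted as one half). A minimal sketch with toy scores, not the study's data:

```python
def c_statistic(y_true, scores):
    """C statistic (AUROC): probability that a random positive case
    scores higher than a random negative case, ties counted as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# toy example: continuous model scores vs a coarse reference triage level
y = [0, 0, 1, 0, 1, 1]
model_scores = [0.1, 0.3, 0.8, 0.2, 0.9, 0.6]
reference    = [1, 2, 2, 1, 3, 2]
print(c_statistic(y, model_scores))  # → 1.0 (perfect separation)
print(c_statistic(y, reference))     # lower: the coarse levels produce ties
```

A coarse ordinal scale like a triage level inevitably produces ties between positives and negatives, which is one reason continuous ML risk scores can reach higher C statistics than conventional triage categories.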
Project description: Gene expression profiles were generated from 199 primary breast cancer patients. Samples 1-176 were used in another study, GEO Series GSE22820, and form the training set in this study; samples 200-222 form a validation set. These data were used to build a machine learning classifier for estrogen receptor (ER) status. RNA was isolated from all 199 patients, and a classifier was built to predict ER status using only three gene features.
Project description: Fast and reliable detection of patients with severe and heterogeneous illnesses is a major goal of precision medicine [1,2]. Patients with leukaemia can be identified using machine learning on the basis of their blood transcriptomes [3]. However, there is an increasing divide between what is technically possible and what is allowed, because of privacy legislation [4,5]. Here, to facilitate the integration of any medical data from any data owner worldwide without violating privacy laws, we introduce Swarm Learning, a decentralized machine-learning approach that unites edge computing with blockchain-based peer-to-peer networking and coordination, maintaining confidentiality without the need for a central coordinator and thereby going beyond federated learning. To illustrate the feasibility of using Swarm Learning to develop disease classifiers using distributed data, we chose four use cases of heterogeneous diseases (COVID-19, tuberculosis, leukaemia and lung pathologies). With more than 16,400 blood transcriptomes derived from 127 clinical studies with non-uniform distributions of cases and controls and substantial study biases, as well as more than 95,000 chest X-ray images, we show that Swarm Learning classifiers outperform those developed at individual sites. In addition, Swarm Learning completely fulfils local confidentiality regulations by design. We believe that this approach will notably accelerate the introduction of precision medicine.
Project description: Objectives: To develop and implement a machine learning algorithm to predict severe sepsis and septic shock, and to evaluate its impact on clinical practice and patient outcomes. Design: Retrospective cohort for algorithm derivation and validation; pre-post impact evaluation. Setting: Tertiary teaching hospital system in Philadelphia, PA. Patients: All non-ICU admissions; algorithm derivation July 2011 to June 2014 (n = 162,212); algorithm validation October to December 2015 (n = 10,448); silent versus alert comparison January 2016 to February 2017 (silent n = 22,280; alert n = 32,184). Interventions: A random-forest classifier, derived and validated using electronic health record data, was deployed first silently and later with an alert to notify clinical teams of a sepsis prediction. Measurements and main results: Patients identified for training the algorithm were required to have International Classification of Diseases, 9th Edition codes for severe sepsis or septic shock and a positive blood culture during their hospital encounter, with either a lactate greater than 2.2 mmol/L or a systolic blood pressure less than 90 mm Hg. The algorithm demonstrated a sensitivity of 26% and specificity of 98%, with a positive predictive value of 29% and a positive likelihood ratio of 13. The alert resulted in a small but statistically significant increase in lactate testing and IV fluid administration. There was no significant difference in mortality, discharge disposition, or transfer to the ICU, although there was a reduction in time to ICU transfer. Conclusions: Our machine learning algorithm can predict, with low sensitivity but high specificity, the impending occurrence of severe sepsis and septic shock. Algorithm-generated predictive alerts modestly impacted clinical measures. Next steps include describing clinical perception of this tool and optimizing algorithm design and delivery.
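The reported screening metrics are related by standard formulas (for example, LR+ = sensitivity / (1 - specificity) = 0.26 / 0.02 = 13). A small sketch with hypothetical confusion-matrix counts chosen only to reproduce the reported figures:

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and positive likelihood ratio
    from confusion-matrix counts."""
    sens = tp / (tp + fn)          # true positive rate
    spec = tn / (tn + fp)          # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    lr_pos = sens / (1 - spec)     # positive likelihood ratio
    return sens, spec, ppv, lr_pos

# hypothetical counts, not the study's data, chosen to roughly match
# the reported sensitivity 26%, specificity 98%, PPV 29%, LR+ 13
sens, spec, ppv, lr = screening_metrics(tp=26, fp=64, fn=74, tn=3136)
print(round(sens, 2), round(spec, 2), round(ppv, 2), round(lr, 1))  # → 0.26 0.98 0.29 13.0
```

The combination of low sensitivity with high specificity in the abstract is consistent with an alert tuned to fire rarely: most septic patients are missed, but when the alert fires it is comparatively informative (LR+ of 13).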