Project description: Background: Coronavirus disease 2019 (COVID-19) continues to spread globally. Machine learning techniques have been used in disease diagnosis and in predicting treatment outcomes, with favorable performance. The present study aims to predict COVID-19 severity at admission using different machine learning techniques, including random forest (RF), support vector machine (SVM), and logistic regression (LR). Features important to COVID-19 severity were further identified. Methods: A retrospective design was adopted at the JinYinTan Hospital from January 26 to March 28, 2020. Feature selection was performed on eighty-six demographic, clinical, and laboratory features using the LassoCV method, Spearman's rank correlation, experts' opinions, and literature evaluation. RF, SVM, and LR were used to predict severe COVID-19, and the performance of the models was compared by the area under the curve (AUC). Additionally, feature importance for COVID-19 severity was analyzed using the best-performing model. Results: A total of 287 patients were enrolled, with 36.6% severe cases and 63.4% non-severe cases. The median age was 60.0 years (interquartile range: 49.0-68.0 years). Three models were established using 23 features, including 1 clinical feature, 1 chest computed tomography (CT) feature, and 21 laboratory features. Among the three models, RF yielded the best overall performance, with the highest AUC of 0.970 compared with 0.948 for SVM and 0.928 for LR. RF also achieved a sensitivity of 96.7%, specificity of 69.5%, and accuracy of 84.5%. SVM had a sensitivity of 93.9%, specificity of 79.0%, and accuracy of 88.5%. LR achieved a sensitivity of 92.3%, specificity of 72.3%, and accuracy of 85.2%. Additionally, chest CT had the highest importance to illness severity, followed by neutrophil-to-lymphocyte ratio, lactate dehydrogenase, and D-dimer. Conclusions: Our results indicate that RF could be a useful predictive tool to identify patients with severe COVID-19, which may facilitate effective care and further optimize resources.
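The metrics used to compare the models above (AUC, sensitivity, specificity, accuracy) can be computed from predicted scores as in the following minimal sketch. It is an illustration only, not the study's code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions introduced here.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)          # share of severe cases detected
    specificity = tn / (tn + fp)          # share of non-severe cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def auc_from_scores(labels, scores):
    """AUC as the probability that a randomly chosen severe case receives a
    higher score than a randomly chosen non-severe case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not from the study: 1 = severe, 0 = non-severe.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]

auc = auc_from_scores(labels, scores)

# Assumed decision threshold of 0.5 to turn scores into class predictions.
predicted = [1 if s >= 0.5 else 0 for s in scores]
tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
tn = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 0)
fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
sensitivity, specificity, accuracy = classification_rates(tp, fn, tn, fp)
```

AUC is threshold-free, whereas sensitivity and specificity depend on the chosen cutoff, which is one reason a model can rank first on AUC while another has higher specificity at its operating point.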
Project description: Since the onset of the COVID-19 pandemic, cases with variable outcomes have continued to increase globally, driven by new variants and despite vaccines and therapies. There is a need to identify, early on, at-risk individuals who would benefit from timely medical interventions. DNA methylation provides an opportunity to identify an epigenetic signature of individuals at increased risk. We utilized machine learning to identify DNA methylation signatures of COVID-19 disease from data available through the NCBI Gene Expression Omnibus. A training cohort of 460 individuals (164 COVID-19-infected and 296 non-infected) and an external validation dataset of 128 individuals (102 COVID-19-infected and 26 with non-COVID-associated pneumonia) were reanalyzed. Data were processed using ChAMP, and beta values were logit transformed. The JADBio AutoML platform was leveraged to identify a methylation signature associated with severe COVID-19 disease. We identified a random forest classification model built from 4 unique methylation sites with the power to discern individuals with severe COVID-19 disease. The average area under the receiver operating characteristic curve (AUC-ROC) of the model was 0.933, and the average area under the precision-recall curve (AUC-PRC) was 0.965. When applied to our external validation dataset, this model produced an AUC-ROC of 0.898 and an AUC-PRC of 0.864. These results further our understanding of the utility of DNA methylation in COVID-19 disease pathology and serve as a platform to inform future COVID-19-related studies.
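The logit transformation of beta values mentioned above can be sketched as follows. The abstract does not state the logarithm base or how values of exactly 0 or 1 were handled; this sketch assumes the common base-2 M-value convention and a small clipping constant `eps`, both of which are assumptions, and the function names are hypothetical.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Logit-transform a methylation beta value in [0, 1] to an M-value.

    Assumes base-2 logit: M = log2(beta / (1 - beta)). Values are clipped
    to [eps, 1 - eps] so that beta = 0 or 1 does not produce infinities.
    """
    clipped = min(max(beta, eps), 1.0 - eps)
    return math.log2(clipped / (1.0 - clipped))


def m_to_beta(m):
    """Inverse transform: beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# Hypothetical beta values for one CpG site across a few samples.
betas = [0.5, 0.8, 0.2, 0.0]
m_values = [beta_to_m(b) for b in betas]
```

The transform spreads out values near 0 and 1, which tends to make the data better behaved for downstream statistical modeling than raw beta values.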
Project description: The oral mucosa is the first site of SARS-CoV-2 entry and replication, and it plays a central role in the early defense against infection. Thus, SARS-CoV-2 viral load, miRNAs, cytokines, and neutralizing activity (NA) were assessed in saliva and plasma from mild (MD) and severe (SD) COVID-19 patients. Here we show that, of the 84 miRNAs analysed, 8 are differentially expressed in plasma and saliva of SD. In particular: 1) miRNAs let-7a-5p, let-7b-5p, and let-7c-5p are significantly downregulated; and 2) miR-23a and b, miR-29c, as well as three immunomodulatory miRNAs (miR-34a-5p, miR-181d-5p, miR-146) are significantly upregulated. The production of pro-inflammatory cytokines (IL-1β, IL-2, IL-6, IL-8, IL-9 and TNFα) and chemokines (CCL2 and RANTES) increases in both saliva and plasma of SD and MD. Notably, disease severity correlates with NA and immune activation. Monitoring these parameters could help to predict disease outcome and identify new markers of disease progression.
Project description: Background: Accurately predicting outcomes for cancer patients with COVID-19 has been clinically challenging. Numerous clinical variables have been retrospectively associated with disease severity, but the predictive value of these variables, and how multiple variables interact to increase risk, remains unclear. Methods: We used machine learning algorithms to predict COVID-19 severity in 348 cancer patients at Memorial Sloan Kettering Cancer Center in New York City. Using only clinical variables collected on or before a patient's COVID-19 positive date (time zero), we sought to classify patients into one of three possible future outcomes: Severe-early (the patient required high levels of oxygen support within 3 days of testing positive for COVID-19), Severe-late (the patient required high levels of oxygen support after 3 days), and Non-severe (the patient never required oxygen support). Results: Our algorithm classified patients into these classes with an area under the receiver operating characteristic curve (AUROC) ranging from 70 to 85%, significantly outperforming prior methods and univariate analyses. Critically, classification accuracy is highest when using a potpourri of clinical variables - including basic patient information, pre-existing diagnoses, laboratory and radiological work, and underlying cancer type - suggesting that COVID-19 in cancer patients comes with numerous, combinatorial risk factors. Conclusions: Overall, we provide a computational tool that can identify high-risk patients early in their disease progression, which could aid in clinical decision-making and in selecting treatment options.
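The three-class outcome definition above can be written as a small labeling rule. This is only a sketch under stated assumptions: the function name and its input encoding are hypothetical, "within 3 days" is taken to include day 3, and `None` is used to mean the patient never required high-level oxygen support. The abstract does not say how intermediate cases (for example, low-level oxygen only) were handled, so they are not modeled here.

```python
from typing import Optional


def severity_class(days_to_high_oxygen: Optional[float]) -> str:
    """Map time from the COVID-19 positive date (time zero) to the first
    need for high-level oxygen support onto the three outcome classes.

    days_to_high_oxygen: days after time zero, or None if never required
    (hypothetical encoding chosen for this illustration).
    """
    if days_to_high_oxygen is None:
        return "Non-severe"
    if days_to_high_oxygen <= 3:      # assumption: day 3 counts as "within"
        return "Severe-early"
    return "Severe-late"


# Hypothetical patients: time to high-level oxygen support in days.
outcomes = [severity_class(d) for d in (1, 3, 7, None)]
```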
Project description: Background: Controlling the COVID-19 outbreak in Brazil is a challenge due to the population's size and urban density, inefficient maintenance of social distancing and testing strategies, and limited availability of testing resources. Objective: The purpose of this study is to effectively prioritize patients who are symptomatic for testing to assist early COVID-19 detection in Brazil, addressing problems related to inefficient testing and control strategies. Methods: Raw data from 55,676 Brazilians were preprocessed, and the chi-square test was used to confirm the relevance of the following features: gender, health professional, fever, sore throat, dyspnea, olfactory disorders, cough, coryza, taste disorders, and headache. Classification models were implemented using the preprocessed data sets, supervised learning, and the following algorithms: multilayer perceptron (MLP), gradient boosting machine (GBM), decision tree (DT), random forest (RF), extreme gradient boosting (XGBoost), k-nearest neighbors (KNN), support vector machine (SVM), and logistic regression (LR). The models' performances were analyzed using 10-fold cross-validation, classification metrics, and the Friedman and Nemenyi statistical tests. The permutation feature importance method was applied to rank the features used by the classification models with the highest performances. Results: Gender, fever, and dyspnea were among the highest-ranked features used by the classification models. The comparative analysis presents MLP, GBM, DT, RF, XGBoost, and SVM as the highest-performing models, with similar results. KNN and LR were outperformed by the other algorithms. With ease of interpretation applied as an additional comparison criterion, the DT was considered the most suitable model. Conclusions: The DT classification model can effectively (with a mean accuracy ≥ 89.12%) assist COVID-19 test prioritization in Brazil. The model can be applied to recommend prioritizing a symptomatic patient for COVID-19 testing.
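The permutation feature importance method named above can be illustrated with a minimal sketch: shuffle one feature column at a time and measure how much accuracy drops. Everything here is an assumption made for illustration, including the function names, the toy rows, and `toy_model`, a hand-written rule standing in for a trained classifier; it is not the study's implementation.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    correct = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return correct / len(labels)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = {}
    for feature in rows[0]:
        drops = []
        for _ in range(n_repeats):
            column = [row[feature] for row in rows]
            rng.shuffle(column)  # break the link between feature and label
            shuffled = [dict(row, **{feature: v}) for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances[feature] = sum(drops) / n_repeats
    return importances


def toy_model(row):
    """Hypothetical stand-in classifier: predicts positive when fever is present."""
    return 1 if row["fever"] == 1 else 0


# Hypothetical toy data: the label depends on fever only, not on headache.
rows = [{"fever": f, "headache": h}
        for f, h in [(1, 0), (1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0), (0, 1)]]
labels = [row["fever"] for row in rows]

importances = permutation_importance(toy_model, rows, labels)
```

A feature the model does not rely on (here `headache`) gets an importance of zero, while shuffling a feature the model depends on lowers accuracy, which is what makes the resulting ranking interpretable.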
Project description: Individuals with SARS-CoV-2 infection may experience a wide range of symptoms, from being asymptomatic, to having a mild fever and cough, to a severe respiratory impairment that results in death. MicroRNA (miRNA), which plays a role in the antiviral response to SARS-CoV-2 infection, has the potential to be used as a novel marker to distinguish between patients with different COVID-19 clinical severities. In the current study, the blood expression profiles reported in two previous studies were combined for in-depth analysis. The final profiles contained 1444 miRNAs in 375 individuals from six categories: 30 patients with mild COVID-19 symptoms, 81 patients with moderate COVID-19 symptoms, 30 non-COVID-19 patients with mild symptoms, 137 patients with severe COVID-19 symptoms, 31 non-COVID-19 patients with severe symptoms, and 66 healthy controls. An efficient computational framework containing four feature selection methods (LASSO, LightGBM, MCFS, and mRMR) and four classification algorithms (DT, KNN, RF, and SVM) was designed to screen clinical miRNA markers, and a high-precision RF model with a weighted F1 of 0.780 was constructed. Some miRNAs were identified, including miR-24-3p, whose differential expression was discovered in patients with acute lung injury complications brought on by severe COVID-19, and miR-148a-3p, which is differentially expressed against SARS-CoV-2 structural proteins, thereby suggesting the effectiveness and accuracy of our framework. Meanwhile, we extracted classification rules based on the DT model to quantitatively represent the role of miRNA expression in differentiating COVID-19 patients with different severities. The search for novel biomarkers that could predict the severity of the disease could aid in the clinical diagnosis of COVID-19 and in exploring the specific mechanisms of the complications caused by SARS-CoV-2 infection. Moreover, new therapeutic targets for the disease may be found.
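The weighted F1 reported above averages per-class F1 scores, weighting each class by its number of true samples, which matters when class sizes are as uneven as the six categories listed. The sketch below shows the calculation on hypothetical labels; the function name and the toy data are assumptions, not the study's code or results.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support
    (its number of true samples)."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        score += (n / total) * f1
    return score


# Hypothetical toy labels, not data from the study.
y_true = ["mild", "mild", "severe", "severe", "severe", "healthy"]
y_pred = ["mild", "severe", "severe", "severe", "mild", "healthy"]

score = weighted_f1(y_true, y_pred)
```

In this toy case the per-class F1 values are 0.5 (mild), 2/3 (severe), and 1.0 (healthy), so the support-weighted average is (2 × 0.5 + 3 × 2/3 + 1 × 1.0) / 6 = 2/3.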
Project description: Infection with SARS-CoV-2 has highly variable clinical manifestations, ranging from asymptomatic infection through to life-threatening disease. Host whole blood transcriptomics can offer unique insights into the biological processes underpinning infection and disease, as well as severity. We performed whole blood RNA-Sequencing of individuals with varying degrees of COVID-19 severity. We used differential expression analysis and pathway enrichment analysis to explore how the blood transcriptome differs between individuals with mild, moderate, and severe COVID-19, performing pairwise comparisons between groups.
Project description: The European Covid-19 outbreak in February 2020 challenged the world's health systems, creating an urgent need for effective and highly reliable diagnostic instruments to help medical personnel. Deep learning (DL) has been demonstrated to be useful for diagnosis using both computed tomography (CT) scans and chest X-rays (CXR), with the former typically yielding more accurate results. However, the pivotal role of CT scans during the pandemic comes with several drawbacks, including high cost and cross-contamination problems. Radiation-free lung ultrasound (LUS) imaging, which requires high expertise and is thus underutilised, has demonstrated a strong correlation with CT scan results and high reliability in pneumonia detection, even in the early stages. In this study, we developed a system based on modern DL methodologies in close collaboration with the Emergency Department (ED) of Fondazione IRCCS Policlinico San Matteo in Pavia. Using a reliable dataset comprising ultrasound clips originating from linear and convex probes, with 2908 frames from 450 hospitalised patients, we investigated detecting Covid-19 patterns and ranking them according to two severity scales. This study differs from other research projects in its novel approach involving four and seven classes. Patients admitted to the ED underwent 12 LUS examinations in different chest parts, each evaluated according to standardised severity scales. We adopted residual convolutional neural networks (CNNs), transfer learning, and data augmentation techniques. With methodical hyperparameter tuning, we obtained state-of-the-art results, with F1 scores averaged over the number of classes considered exceeding 98% and stable precision and recall.