Project description: Background: Deep learning-based head and neck lymph node level (HN_LNL) autodelineation is of high relevance to radiotherapy research and clinical treatment planning but still underinvestigated in the academic literature. In particular, there is no publicly available open-source solution for large-scale autosegmentation of HN_LNL in the research setting. Methods: An expert-delineated cohort of 35 planning CTs was used for training of an nnU-Net 3D-fullres/2D-ensemble model for autosegmentation of 20 different HN_LNL. A second cohort acquired at the same institution later in time served as the test set (n = 20). In a completely blinded evaluation, 3 clinical experts rated the quality of deep learning autosegmentations in a head-to-head comparison with expert-created contours. For a subgroup of 10 cases, intraobserver variability was compared to the average deep learning autosegmentation accuracy on the original and recontoured sets of expert segmentations. A postprocessing step to adjust craniocaudal boundaries of level autosegmentations to the CT slice plane was introduced, and the effect of autocontour consistency with CT slice plane orientation on geometric accuracy and expert rating was investigated. Results: Blinded expert ratings for deep learning segmentations and expert-created contours were not significantly different. Deep learning segmentations with slice plane adjustment were rated numerically higher (mean, 81.0 vs. 79.6, p = 0.185) and deep learning segmentations without slice plane adjustment were rated numerically lower (77.2 vs. 79.6, p = 0.167) than manually drawn contours. In a head-to-head comparison, deep learning segmentations with CT slice plane adjustment were rated significantly better than deep learning contours without slice plane adjustment (81.0 vs. 77.2, p = 0.004). Geometric accuracy of deep learning segmentations was not different from intraobserver variability (mean Dice per level, 0.76 vs. 0.77, p = 0.307).
Clinical significance of contour consistency with CT slice plane orientation was not captured by geometric accuracy metrics (volumetric Dice, 0.78 vs. 0.78, p = 0.703). Conclusions: We show that an nnU-Net 3D-fullres/2D-ensemble model trained on only a limited dataset can deliver highly accurate autodelineation of HN_LNL, making it ideally suited for large-scale standardized autodelineation of HN_LNL in the research setting. Geometric accuracy metrics are only an imperfect surrogate for blinded expert rating.
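The per-level Dice coefficient used as the geometric accuracy metric in this entry can be computed directly from paired binary masks. A minimal pure-Python sketch (the mask arguments are illustrative flat 0/1 sequences, not data from the study):

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks.

    mask_a, mask_b: flat sequences of 0/1 voxel labels of equal length.
    Returns 2*|A∩B| / (|A| + |B|); defined as 1.0 when both masks are empty.
    """
    inter = sum(a and b for a, b in zip(mask_a, mask_b))
    size = sum(mask_a) + sum(mask_b)
    return 2.0 * inter / size if size else 1.0
```

For example, `dice([1, 1, 0, 0], [1, 0, 1, 0])` evaluates to 0.5 (one overlapping voxel, two voxels in each mask).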
Project description: Background: Decision-making in epilepsy surgery is strongly connected to the interpretation of the intracranial EEG (iEEG). Although deep learning approaches have demonstrated efficiency in processing extracranial EEG, few studies have addressed iEEG seizure detection, in part due to the small number of seizures per patient typically available from intracranial investigations. This study aims to evaluate the efficiency of deep learning methodology in detecting iEEG seizures using a large dataset of ictal patterns collected from epilepsy patients implanted with a responsive neurostimulation system (RNS). Methods: Five thousand two hundred and twenty-six ictal events were collected from 22 patients implanted with RNS. A convolutional neural network (CNN) architecture was created to provide personalized seizure annotations for each patient. Accuracy of seizure identification was tested in two scenarios: patients with seizures occurring following a period of chronic recording (scenario 1) and patients with seizures occurring immediately following implantation (scenario 2). The accuracy of the CNN in identifying RNS-recorded iEEG ictal patterns was evaluated against human neurophysiology expertise. Statistical performance was assessed via the area under the precision-recall curve (AUPRC). Results: In scenario 1, the CNN achieved a maximum mean binary classification AUPRC of 0.84 ± 0.19 (95% CI, 0.72-0.93) and a mean regression accuracy of 6.3 ± 1.0 s (95% CI, 4.3-8.5 s) at 30 seed samples. In scenario 2, the maximum mean AUPRC was 0.80 ± 0.19 (95% CI, 0.68-0.91) and the mean regression accuracy was 6.3 ± 0.9 s (95% CI, 4.8-8.3 s) at 20 seed samples. We obtained near-maximum accuracies at a seed size of 10 in both scenarios.
CNN classification failures can be explained by ictal electro-decrements, brief seizures, single-channel ictal patterns, highly concentrated interictal activity, changes in the sleep-wake cycle, and progressive modulation of electrographic ictal features. Conclusions: We developed a deep learning neural network that performs personalized detection of RNS-derived ictal patterns with expert-level accuracy. These results suggest the potential for automated techniques to significantly improve the management of closed-loop brain stimulation, including during the initial period of recording when the device is otherwise naïve to a given patient's seizures.
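The AUPRC reported in this entry summarizes the precision-recall curve; one common estimator is average precision, i.e. the mean of the precision values at the rank of each true positive in score-sorted order. A minimal sketch under that assumption (ties are broken arbitrarily here, which a production implementation would handle explicitly):

```python
def average_precision(y_true, scores):
    """Area under the precision-recall curve estimated as average
    precision: mean precision at the rank of each true positive."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranked = [y_true[i] for i in order]
    total_pos = sum(ranked)
    if total_pos == 0:
        return 0.0  # undefined without positives; 0.0 by convention here
    tp, ap = 0, 0.0
    for k, label in enumerate(ranked, start=1):
        if label:
            tp += 1
            ap += tp / k  # precision at this positive's rank
    return ap / total_pos
```

For example, `average_precision([1, 0, 1], [0.9, 0.8, 0.7])` gives (1 + 2/3)/2 ≈ 0.833.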
Project description: Background: Diagnosis of skin diseases is often challenging, and computer-aided diagnostic tools are urgently needed to underpin decision making. Objective: To develop a convolutional neural network model that classifies clinically relevant multiple-lesion skin diseases, in accordance with the STARD guidelines. Methods: This was an image-based retrospective study using multi-task learning for binary classification. A VGG-16 model was trained on 16,543 non-standardized images. Image data were split into a training set (80%), a validation set (10%), and a test set (10%). All images were collected from a clinical database of a Danish population attending one dermatological department. Included were patients categorized with ICD-10 codes related to acne, rosacea, psoriasis, eczema, and cutaneous T-cell lymphoma. Results: Acne was distinguished from rosacea with a sensitivity of 85.42% (CI 72.24-93.93%) and a specificity of 89.53% (CI 83.97-93.68%), cutaneous T-cell lymphoma was distinguished from eczema with a sensitivity of 74.29% (CI 67.82-80.05%) and a specificity of 84.09% (CI 80.83-86.99%), and psoriasis was distinguished from eczema with a sensitivity of 81.79% (CI 78.51-84.76%) and a specificity of 73.57% (CI 69.76-77.13%). All results were based on the test set. Conclusion: The performance rates reported here were equal or superior to those reported for general practitioners with dermatological training, indicating that computer-aided diagnostic models based on convolutional neural networks may potentially be employed for diagnosing multiple-lesion skin diseases.
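Sensitivity, specificity, and their confidence intervals like those in this entry follow from the binary confusion matrix. A minimal sketch; the counts in the usage note are made up for illustration, and the Wilson score interval shown is one common choice of binomial CI, not necessarily the method used in the study:

```python
import math

def sens_spec(tp, fn, tn, fp):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    return tp / (tp + fn), tn / (tn + fp)

def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half
```

With made-up counts, `sens_spec(90, 10, 80, 20)` gives (0.9, 0.8), and `wilson_ci(90, 100)` returns an interval bracketing 0.9.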
Project description: Aim: As the completed studies have small sample sizes and use different algorithms, a meta-analysis was conducted to assess the accuracy of wireless capsule endoscopy (WCE) in identifying polyps using deep learning. Method: Two independent reviewers searched PubMed, Embase, the Web of Science, and the Cochrane Library for potentially eligible studies published up to December 8, 2021, which were analysed on a per-image basis. Stata, RevMan, and Meta-DiSc were used to conduct this meta-analysis. A random effects model was used, and subgroup and regression analyses were performed to explore sources of heterogeneity. Results: Eight studies published between 2017 and 2021, comprising 819 patients and 18,414 frames, were eventually included in the meta-analysis. The summary estimates for WCE in identifying polyps by deep learning were: sensitivity, 0.97 (95% confidence interval (CI), 0.95-0.98); specificity, 0.97 (95% CI, 0.94-0.98); positive likelihood ratio, 27.19 (95% CI, 15.32-50.42); negative likelihood ratio, 0.03 (95% CI, 0.02-0.05); diagnostic odds ratio, 873.69 (95% CI, 387.34-1970.74); and area under the sROC curve, 0.99. Conclusion: WCE with deep learning identifies polyps with high accuracy, but multicentre prospective randomized controlled studies are needed in the future.
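The likelihood ratios and diagnostic odds ratio in this entry are related to sensitivity and specificity by simple identities (LR+ = sens/(1−spec), LR− = (1−sens)/spec, DOR = LR+/LR−). Note that meta-analytic pooled values are estimated from the per-study 2×2 tables, so plugging the summary sensitivity and specificity into these formulas will not reproduce the reported pooled ratios exactly. A minimal sketch of the definitions:

```python
def diagnostic_ratios(sens, spec):
    """Likelihood ratios and diagnostic odds ratio of a binary test.

    LR+ = sens / (1 - spec)   (how much a positive result raises the odds)
    LR- = (1 - sens) / spec   (how much a negative result lowers the odds)
    DOR = LR+ / LR-
    """
    lr_pos = sens / (1.0 - spec)
    lr_neg = (1.0 - sens) / spec
    return lr_pos, lr_neg, lr_pos / lr_neg
```

With sens = spec = 0.97 these identities give LR+ ≈ 32.3 and DOR ≈ 1045; the pooled LR+ of 27.19 and DOR of 873.69 above differ because they are pooled across studies rather than derived from the summary operating point.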
Project description: Background: Therapeutic decisions for degenerative cervical myelopathy (DCM) are complex and should consider various factors. We aimed to develop machine learning (ML) models for classifying expert-level therapeutic decisions in patients with DCM. Methods: This retrospective cross-sectional study included patients diagnosed with DCM, and the diagnosis of DCM was confirmed clinically and radiologically. The target outcomes were defined as conservative treatment, anterior surgical approaches (ASA), and posterior surgical approaches (PSA). We performed the following classifications using ML algorithms: multiclass, one-versus-rest, and one-versus-one. Two ensemble ML algorithms were used: random forest (RF) and extreme gradient boosting (XGB). The area under the receiver operating characteristic curve (AUC-ROC) was the primary metric. We also identified the variable importance for each classification. Results: In total, 304 patients were included (109 conservative, 66 ASA, 125 PSA, and 4 combined surgeries). For multiclass classification, the AUC-ROC of RF and XGB models were 0.91 and 0.92, respectively. In addition, ML models showed AUC-ROC values of >0.9 for all types of binary classifications. Variable importance analysis revealed that the modified Japanese Orthopaedic Association score and central motor conduction time were the two most important variables for distinguishing between conservative and surgical treatments. When classifying ASA and PSA, the number of involved levels, age, and body mass index were important contributing factors. Conclusion: ML-based classification of DCM therapeutic options is valid and feasible. This study can be a basis for establishing generalizable ML-based surgical decision models for DCM. Further studies are needed with a large multicenter database.
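One-versus-rest AUC-ROC as used in this entry reduces the multiclass problem to one binary problem per class; the binary AUC itself equals the probability that a randomly chosen positive outscores a randomly chosen negative (the Mann-Whitney formulation). A minimal sketch with illustrative inputs, not the study's data or models:

```python
def binary_auc(y_true, scores):
    """AUC-ROC via the Mann-Whitney formulation: the probability that a
    random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def one_vs_rest_auc(y_true, score_matrix, classes):
    """Macro-averaged one-versus-rest AUC-ROC: binarize each class in
    turn against the rest and average the per-class binary AUCs."""
    aucs = []
    for k, c in enumerate(classes):
        y_bin = [1 if y == c else 0 for y in y_true]
        aucs.append(binary_auc(y_bin, [row[k] for row in score_matrix]))
    return sum(aucs) / len(aucs)
```

For example, `binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` evaluates to 0.75 (three of four positive-negative pairs are correctly ordered).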
Project description: Develop a high-performing, automated sleep scoring algorithm that can be applied to long-term scalp electroencephalography (EEG) recordings. Using a clinical dataset of polysomnograms from 6,431 patients (MGH-PSG dataset), we trained a deep neural network to classify sleep stages based on scalp EEG data. The algorithm consists of a convolutional neural network for feature extraction, followed by a recurrent neural network that extracts temporal dependencies of sleep stages. The algorithm's inputs are four scalp EEG bipolar channels (F3-C3, C3-O1, F4-C4, and C4-O2), which can be derived from any standard PSG or scalp EEG recording. We initially trained the algorithm on the MGH-PSG dataset and used transfer learning to fine-tune it on a dataset of long-term (24-72 h) scalp EEG recordings from 112 patients (scalpEEG dataset). The algorithm achieved a Cohen's kappa of 0.74 on the MGH-PSG holdout testing set and a cross-validated Cohen's kappa of 0.78 after optimization on the scalpEEG dataset. The algorithm also performed well on two publicly available PSG datasets, demonstrating high generalizability. Performance on all datasets was comparable to the inter-rater agreement of human sleep staging experts (Cohen's kappa ~ 0.75 ± 0.11). The algorithm's performance on long-term scalp EEGs was robust over a wide age range and across common EEG background abnormalities. We developed a deep learning algorithm that achieves human-expert-level sleep staging performance on long-term scalp EEG recordings. This algorithm, which we have made publicly available, greatly facilitates the use of large long-term EEG clinical datasets for sleep-related research.
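Cohen's kappa, the agreement metric used in this entry, corrects observed agreement for the agreement expected by chance given each rater's label frequencies. A minimal sketch (the label sequences in the test below are illustrative, not sleep-stage data from the study):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa between two equal-length label sequences:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # chance agreement from each rater's marginal label frequencies
    p_exp = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (p_obs - p_exp) / (1.0 - p_exp)  # undefined if p_exp == 1
```

For example, `cohens_kappa(['W', 'N1', 'W', 'N1'], ['W', 'N1', 'N1', 'N1'])` has observed agreement 0.75 and chance agreement 0.5, giving kappa 0.5.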
Project description: Deep learning, a state-of-the-art machine learning approach, has shown outstanding performance over traditional machine learning in identifying intricate structures in complex high-dimensional data, especially in the domain of computer vision. The application of deep learning to early detection and automated classification of Alzheimer's disease (AD) has recently gained considerable attention, as rapid progress in neuroimaging techniques has generated large-scale multimodal neuroimaging data. A systematic review of publications using deep learning approaches and neuroimaging data for diagnostic classification of AD was performed. A PubMed and Google Scholar search was used to identify deep learning papers on AD published between January 2013 and July 2018. These papers were reviewed, evaluated, and classified by algorithm and neuroimaging type, and the findings were summarized. Of 16 studies meeting full inclusion criteria, 4 used a combination of deep learning and traditional machine learning approaches, and 12 used only deep learning approaches. The combination of traditional machine learning for classification and stacked auto-encoder (SAE) for feature selection produced accuracies of up to 98.8% for AD classification and 83.7% for prediction of conversion from mild cognitive impairment (MCI), a prodromal stage of AD, to AD. Deep learning approaches, such as convolutional neural network (CNN) or recurrent neural network (RNN), that use neuroimaging data without pre-processing for feature selection have yielded accuracies of up to 96.0% for AD classification and 84.2% for MCI conversion prediction. The best classification performance was obtained when multimodal neuroimaging and fluid biomarkers were combined. Deep learning approaches continue to improve in performance and appear to hold promise for diagnostic classification of AD using multimodal neuroimaging data.
AD research that uses deep learning is still evolving: performance is improving through the incorporation of additional hybrid data types, such as omics data, and transparency is increasing through explainable approaches that add knowledge of specific disease-related features and mechanisms.
Project description: We propose a new method for the classification task of distinguishing atrial fibrillation (AFib) from regular atrial tachycardias including atrial flutter (AFlu) based on a surface electrocardiogram (ECG). Recently, many approaches for automatic classification of cardiac arrhythmias have been proposed, and to our knowledge none of them can distinguish between these two. We discuss reasons why deep learning may not yield satisfactory results for this task. We generate new and clinically interpretable features using mathematical optimization for subsequent use within a machine learning (ML) model. These features are generated from the same input data by solving an additional regression problem with complicated combinatorial substructures. The result can be seen as a novel machine learning model that incorporates expert knowledge on the pathophysiology of atrial flutter. Our approach achieves an unprecedented accuracy of 82.84% and an area under the receiver operating characteristic (ROC) curve of 0.9, which is classified as "excellent" according to the classification indicator for diagnostic tests. One additional advantage of our approach is the inherent interpretability of the classification results. Our features give insight into a possibly occurring multilevel atrioventricular blocking mechanism, which may improve treatment decisions beyond the classification itself. Our research ideally complements existing textbook cardiac arrhythmia classification methods, which cannot provide a classification for the important case of AFib↔AFlu. The main contribution is the successful use of a novel mathematical model for multilevel atrioventricular block and optimization-driven inverse simulation to enhance machine learning for classification of the arguably most difficult cases in cardiac arrhythmia. A tailored Branch-and-Bound algorithm was implemented for the domain knowledge part, while standard algorithms such as Adam could be used for training.
Project description: Here we present miR-eCLIP analysis of AGO2 in HEK293 cells to address the small RNA repertoire and uncover their physiological targets. We developed an optimized bioinformatics approach for chimeric read identification to detect high-confidence chimeras, which were used as biologically validated input for miRBind, a deep learning method and web server that can be used to accurately predict the potential of miRNA:target site binding.
Project description: Objectives: Scoring laboratory polysomnography (PSG) data remains a manual task of visually annotating 3 primary categories: sleep stages, sleep disordered breathing, and limb movements. Attempts to automate this process have been hampered by the complexity of PSG signals and physiological heterogeneity between patients. Deep neural networks, which have recently achieved expert-level performance for other complex medical tasks, are ideally suited to PSG scoring, given sufficient training data. Methods: We used a combination of deep recurrent and convolutional neural networks (RCNN) for supervised learning of clinical labels designating sleep stages, sleep apnea events, and limb movements. The data for testing and training were derived from 10,000 clinical PSGs and 5,804 research PSGs. Results: When trained on the clinical dataset, the RCNN reproduced PSG diagnostic scoring for sleep staging, sleep apnea, and limb movements with accuracies of 87.6%, 88.2%, and 84.7% on held-out test data, a level of performance comparable to human experts. The RCNN model performs equally well when tested on the independent research PSG database. Only small reductions in accuracy were noted when training on limited channels to mimic at-home monitoring devices: frontal leads only for sleep staging, and thoracic belt signals only for the apnea-hypopnea index. Conclusions: By creating accurate deep learning models for sleep scoring, our work opens the path toward broader and more timely access to sleep diagnostics. Accurate scoring automation can improve the utility and efficiency of in-lab and at-home approaches to sleep diagnostics, potentially extending the reach of sleep expertise beyond specialty clinics.