Dataset Information

A Deep Artificial Neural Network-Based Model for Prediction of Underlying Cause of Death From Death Certificates: Algorithm Development and Validation.

ABSTRACT: BACKGROUND: Coding of underlying causes of death from death certificates is a process that is nowadays undertaken mostly by humans with potential assistance from expert systems, such as the Iris software. It is, consequently, an expensive process that can, in addition, suffer from geospatial discrepancies, thus severely impairing the comparability of death statistics at the international level. The recent advances in artificial intelligence, specifically the rise of deep learning methods, have enabled computers to make efficient decisions on a number of complex problems that were typically considered out of reach without human assistance; these methods require a considerable amount of data to learn from, which is typically their main limiting factor. However, the CépiDc (Centre d'épidémiologie sur les causes médicales de Décès) stores an exhaustive database of death certificates at the French national scale, amounting to several million training examples available for the machine learning practitioner. OBJECTIVE: This article investigates the application of deep neural network methods to coding underlying causes of death. METHODS: The investigated dataset was based on data from every French death certificate from 2000 to 2015, containing information such as the subject's age and gender, as well as the chain of events leading to his or her death, for a total of around 8 million observations. The task of automatically coding the subject's underlying cause of death was then formulated as a predictive modelling problem. A deep neural network-based model was then designed and fit to the dataset. Its error rate was then assessed on an external test dataset and compared to the current state of the art (ie, the Iris software). Statistical significance of the proposed approach's superiority was assessed via bootstrap. RESULTS: The proposed approach resulted in a test accuracy of 97.8% (95% CI 97.7-97.9), which constitutes a significant improvement over the current state of the art and its accuracy of 74.5% (95% CI 74.0-75.0) assessed on the same test examples. Such an improvement opens up a whole field of new applications, from nosologist-level batch-automated coding to international and temporal harmonization of cause of death statistics. A typical example of such an application is demonstrated by recoding French overdose-related deaths from 2000 to 2010. CONCLUSIONS: This article shows that deep artificial neural networks are perfectly suited to the analysis of electronic health records and can learn a complex set of medical rules directly from voluminous datasets, without any explicit prior knowledge. Although not entirely free from mistakes, the derived algorithm constitutes a powerful decision-making tool that is able to handle structured medical data with unprecedented performance. We strongly believe that the methods developed in this article are highly reusable in a variety of settings related to epidemiology, biostatistics, and the medical sciences in general.
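
The abstract states only that significance was "assessed via bootstrap" and does not describe the procedure. As an illustration, the sketch below shows one common way to do this: a paired percentile bootstrap of the accuracy gap between two systems scored on the same test certificates. The function name, the resample count, and the 0/1 correctness encoding are assumptions made for this example, not details taken from the article.

```python
import random


def bootstrap_accuracy_gap(correct_a, correct_b, n_resamples=2000, seed=0):
    """Paired percentile bootstrap of accuracy(A) - accuracy(B).

    correct_a and correct_b are equal-length lists of 0/1 flags, one per test
    certificate, marking whether each system predicted the right code.
    Returns (observed_gap, ci_low, ci_high) for a 95% interval.
    Hypothetical helper: the article does not specify its bootstrap scheme.
    """
    if len(correct_a) != len(correct_b) or not correct_a:
        raise ValueError("need two non-empty flag lists of equal length")
    n = len(correct_a)
    rng = random.Random(seed)
    observed = (sum(correct_a) - sum(correct_b)) / n

    gaps = []
    for _ in range(n_resamples):
        # Resample certificates with replacement, keeping the pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        gaps.append(sum(correct_a[i] - correct_b[i] for i in idx) / n)

    gaps.sort()
    ci_low = gaps[int(0.025 * n_resamples)]
    ci_high = gaps[min(n_resamples - 1, int(0.975 * n_resamples))]
    return observed, ci_low, ci_high
```

If the lower bound of the interval stays above zero, the first system's accuracy advantage is unlikely to be a sampling artifact of the particular test set. Resampling certificates rather than each system independently preserves the fact that both systems were scored on the same examples.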

SUBMITTER: Falissard L

PROVIDER: S-EPMC7218605 | biostudies-literature | 2020 Apr

REPOSITORIES: biostudies-literature


Publications

A Deep Artificial Neural Network-Based Model for Prediction of Underlying Cause of Death From Death Certificates: Algorithm Development and Validation.

Louis Falissard, Claire Morgand, Sylvie Roussel, Claire Imbaud, Walid Ghosn, Karim Bounebache, Grégoire Rey

JMIR Medical Informatics, 2020-04-28, issue 4


PMID: 32343252

Similar Datasets

Project description: Background: Conservative management of adnexal mass is warranted when there is imaging-based and clinical evidence of benign characteristics. Malignancy risk is, however, a concern due to the mortality rate of ovarian cancer. Malignancy occurs in 10-15% of adnexal masses that go to surgery, whereas the rate of malignancy is much lower in masses clinically characterized as benign or indeterminate. Additional diagnostic tests could assist conservative management of these patients. Here we report the clinical validation of OvaWatch, a multivariate index assay, with real-world evidence of performance that supports conservative management of adnexal masses. Methods: OvaWatch utilizes a previously characterized neural network-based algorithm combining serum biomarkers and clinical covariates and was used to examine malignancy risk in prospective and retrospective samples of patients with an adnexal mass. Retrospective data sets were assembled from previous studies using patients who had adnexal mass and were scheduled for surgery. The prospective study was a multi-center trial of women with adnexal mass as identified on clinical examination and indeterminate or asymptomatic by imaging. The performance to detect ovarian malignancy was evaluated at a previously validated score threshold. Results: In retrospective, low-prevalence (N = 1,453, 1.5% malignancy rate) data from patients who received an independent physician assessment of benign, OvaWatch has a sensitivity of 81.8% [95% confidence interval (CI) 65.1-92.7] for identifying a histologically confirmed malignancy, and a negative predictive value (NPV) of 99.7%. OvaWatch identified 18/22 malignancies missed by physician assessment. A prospective data set had 501 patients, of whom 106 with adnexal mass went for surgery. The prevalence was 2% (10 malignancies). The sensitivity of OvaWatch for malignancy was 40% (95% CI: 16.8-68.7%), and the specificity was 87% (95% CI: 83.7-89.7) when patients who did not go to surgery and were evaluated as benign were included in the analysis. The NPV remained 98.6% (95% CI: 97.0-99.4%). In an independent analysis set with a high prevalence (45.8%), the NPV was 87.8% (95% CI: 75.8-94.3%). Conclusion: OvaWatch demonstrated high NPV across diverse data sets and promises utility as an effective diagnostic test supporting management of suspected benign or indeterminate mass to safely decrease or delay unnecessary surgeries.
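
The reported sensitivity of 81.8% is consistent with the stated count of 18 detected malignancies out of 22. A minimal arithmetic check is sketched below; the function name is an assumption for this example, and no counts beyond the 18/22 figure are taken from the description.

```python
def sensitivity(true_positives, false_negatives):
    """Share of actual positives that the test flags as positive."""
    total_positives = true_positives + false_negatives
    if total_positives == 0:
        raise ValueError("sensitivity is undefined with no positive cases")
    return true_positives / total_positives


# 18 of 22 malignancies detected, so 4 were missed.
retrospective_sensitivity = sensitivity(true_positives=18, false_negatives=4)
print(f"{retrospective_sensitivity:.1%}")
```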

Project description: Background/aims: Randomized controlled trials frequently use death review committees to assign a cause of death rather than relying on cause of death information from death certificates. The National Lung Screening Trial, a randomized controlled trial of lung cancer screening with low-dose computed tomography versus chest X-ray for heavy and/or long-term smokers ages 55-74 years at enrollment, used a committee blinded to arm assignment for a subset of deaths to determine whether cause of death was due to lung cancer. Methods: Deaths were selected for review using a pre-determined computerized algorithm. The algorithm, which considered cancers diagnosed during the trial, causes and significant conditions listed on the death certificate, and the underlying cause of death derived from death certificate information by trained nosologists, selected deaths that were most likely to represent a death due to lung cancer (either directly or indirectly) and deaths that might have been erroneously assigned lung cancer as the cause of death. The algorithm also selected deaths that might be due to adverse events of diagnostic evaluation for lung cancer. Using the review cause of death as the gold standard and lung cancer cause of death as the outcome of interest (dichotomized as lung cancer versus not lung cancer), we calculated performance measures of the death certificate cause of death. We also recalculated the trial primary endpoint using the death certificate cause of death. Results: In all, 1642 deaths were reviewed and assigned a cause of death (42% of the 3877 National Lung Screening Trial deaths). Sensitivity of death certificate cause of death was 91%; specificity, 97%; positive predictive value, 98%; and negative predictive value, 89%. About 40% of the deaths reclassified to lung cancer cause of death had a death certificate cause of death of a neoplasm other than lung. Using the death certificate cause of death, the lung cancer mortality reduction was 18% (95% confidence interval: 4.2-25.0), as compared with the published finding of 20% (95% confidence interval: 6.7-26.7). Conclusion: Death review may not be necessary for primary-outcome analyses in lung cancer screening trials. If deemed necessary, researchers should strive to streamline the death review process as much as possible.

Project description: Background: In recent years, mobile-based interventions have received more attention as an alternative to on-site obesity management. Despite increased mobile interventions for obesity, there are lost opportunities to achieve better outcomes due to the lack of a predictive model using current existing longitudinal and cross-sectional health data. Noom (Noom Inc) is a mobile app that provides various lifestyle-related logs including food logging, exercise logging, and weight logging. Objective: The aim of this study was to develop a weight change predictive model using an interpretable artificial intelligence algorithm for mobile-based interventions and to explore contributing factors to weight loss. Methods: Lifelog mobile app (Noom) user data of individuals who used the weight loss program for 16 weeks in the United States were used to develop an interpretable recurrent neural network algorithm for weight prediction that considers both time-variant and time-fixed variables. From a total of 93,696 users in the coaching program, we excluded users who did not take part in the 16-week weight loss program or who were not overweight or obese or had not entered weight or meal records for the entire 16-week program. This interpretable model was trained and validated with 5-fold cross-validation (training set: 70%; testing: 30%) using the lifelog data. Mean absolute percentage error between actual weight loss and predicted weight was used to measure model performance. To better understand the behavior factors contributing to weight loss or gain, we calculated contribution coefficients in test sets. Results: A total of 17,867 users' data were included in the analysis. The overall mean absolute percentage error of the model was 3.50%, and the error of the model declined from 3.78% to 3.45% by the end of the program. The time-level attention weighting was shown to be equally distributed at 0.0625 each week, but this gradually decreased (from 0.0626 to 0.0624) as it approached 16 weeks. Factors such as usage pattern, weight input frequency, meal input adherence, exercise, and sharp decreases in weight trajectories had negative contribution coefficients of -0.021, -0.032, -0.015, and -0.066, respectively. For time-fixed variables, being male had a contribution coefficient of -0.091. Conclusions: An interpretable algorithm, with both time-variant and time-fixed data, was used to precisely predict weight loss while preserving model transparency. This week-to-week prediction model is expected to improve weight loss and provide a global explanation of contributing factors, leading to better outcomes.
