Ontology highlight
ABSTRACT: Objective
We propose an interpretable disease prediction model that efficiently fuses multiple types of patient records using a self-attentive fusion encoder. We assessed the model performance in predicting cardiovascular disease events, given the records of a general patient population.Materials and methods
We extracted 798111 ses and 67 623 controls from the sample cohort database and nationwide healthcare claims data of South Korea. Among the information provided, our model used the sequential records of medical codes and patient characteristics, such as demographic profiles and the most recent health examination results. These two types of patient records were combined in our self-attentive fusion module, whereas previously dominant methods aggregated them using a simple concatenation. The prediction performance was compared to state-of-the-art recurrent neural network-based approaches and other widely used machine learning approaches.Results
Our model outperformed all the other compared methods in predicting cardiovascular disease events. It achieved an area under the curve of 0.839, while the other compared methods achieved between 0.74111 d 0.830. Moreover, our model consistently outperformed the other methods in a more challenging setting in which we tested the model's ability to draw an inference from more nonobvious, diverse factors.Discussion
We also interpreted the attention weights provided by our model as the relative importance of each time step in the sequence. We showed that our model reveals the informative parts of the patients' history by measuring the attention weights.Conclusion
We suggest an interpretable disease prediction model that efficiently fuses heterogeneous patient records and demonstrates superior disease prediction performance.
SUBMITTER: Kwak H
PROVIDER: S-EPMC8449612 | biostudies-literature |
REPOSITORIES: biostudies-literature