Dataset Information

Deep learning in clinical natural language processing: a methodical review.

ABSTRACT: OBJECTIVE:This article methodically reviews the literature on deep learning (DL) for natural language processing (NLP) in the clinical domain, providing quantitative analysis to answer 3 research questions concerning methods, scope, and context of current research. MATERIALS AND METHODS:We searched MEDLINE, EMBASE, Scopus, the Association for Computing Machinery Digital Library, and the Association for Computational Linguistics Anthology for articles using DL-based approaches to NLP problems in electronic health records. After screening 1,737 articles, we collected data on 25 variables across 212 papers. RESULTS:DL in clinical NLP publications more than doubled each year, through 2018. Recurrent neural networks (60.8%) and word2vec embeddings (74.1%) were the most popular methods; the information extraction tasks of text classification, named entity recognition, and relation extraction were dominant (89.2%). However, there was a "long tail" of other methods and specific tasks. Most contributions were methodological variants or applications, but 20.8% were new methods of some kind. The earliest adopters were in the NLP community, but the medical informatics community was the most prolific. DISCUSSION:Our analysis shows growing acceptance of deep learning as a baseline for NLP research, and of DL-based NLP in the medical community. A number of common associations were substantiated (eg, the preference of recurrent neural networks for sequence-labeling named entity recognition), while others were surprisingly nuanced (eg, the scarcity of French language clinical NLP with deep learning). CONCLUSION:Deep learning has not yet fully penetrated clinical NLP and is growing rapidly. This review highlighted both the popular and unique trends in this active field.

SUBMITTER: Wu S

PROVIDER: S-EPMC7025365 | biostudies-literature | 2020 Mar

REPOSITORIES: biostudies-literature

ACCESS DATA

Publications

Deep learning in clinical natural language processing: a methodical review.

Wu Stephen S Roberts Kirk K Datta Surabhi S Du Jingcheng J Ji Zongcheng Z Si Yuqi Y Soni Sarvesh S Wang Qiong Q Wei Qiang Q Xiang Yang Y Zhao Bo B Xu Hua H

Journal of the American Medical Informatics Association : JAMIA 20200301 3

<h4>Objective</h4>This article methodically reviews the literature on deep learning (DL) for natural language processing (NLP) in the clinical domain, providing quantitative analysis to answer 3 research questions concerning methods, scope, and context of current research.<h4>Materials and methods</h4>We searched MEDLINE, EMBASE, Scopus, the Association for Computing Machinery Digital Library, and the Association for Computational Linguistics Anthology for articles using DL-based approaches to N ...[more]

PMID: 31794016

Similar Datasets

Project description:BackgroundMachine learning systems are part of the field of artificial intelligence that automatically learn models from data to make better decisions. Natural language processing (NLP), by using corpora and learning approaches, provides good performance in statistical tasks, such as text classification or sentiment mining.ObjectiveThe primary aim of this systematic review was to summarize and characterize, in methodological and technical terms, studies that used machine learning and NLP techniques for mental health. The secondary aim was to consider the potential use of these methods in mental health clinical practice.MethodsThis systematic review follows the PRISMA (Preferred Reporting Items for Systematic Review and Meta-analysis) guidelines and is registered with PROSPERO (Prospective Register of Systematic Reviews; number CRD42019107376). The search was conducted using 4 medical databases (PubMed, Scopus, ScienceDirect, and PsycINFO) with the following keywords: machine learning, data mining, psychiatry, mental health, and mental disorder. The exclusion criteria were as follows: languages other than English, anonymization process, case studies, conference papers, and reviews. No limitations on publication dates were imposed.ResultsA total of 327 articles were identified, of which 269 (82.3%) were excluded and 58 (17.7%) were included in the review. The results were organized through a qualitative perspective. Although studies had heterogeneous topics and methods, some themes emerged. Population studies could be grouped into 3 categories: patients included in medical databases, patients who came to the emergency room, and social media users. The main objectives were to extract symptoms, classify severity of illness, compare therapy effectiveness, provide psychopathological clues, and challenge the current nosography. Medical records and social media were the 2 major data sources. With regard to the methods used, preprocessing used the standard methods of NLP and unique identifier extraction dedicated to medical texts. Efficient classifiers were preferred rather than transparent functioning classifiers. Python was the most frequently used platform.ConclusionsMachine learning and NLP models have been highly topical issues in medicine in recent years and may be considered a new paradigm in medical research. However, these processes tend to confirm clinical hypotheses rather than developing entirely new information, and only one major category of the population (ie, social media users) is an imprecise cohort. Moreover, some language-specific features can improve the performance of NLP methods, and their extension to other languages should be more closely investigated. However, machine learning and NLP techniques provide useful information from unexplored data (ie, patients' daily habits that are usually inaccessible to care providers). Before considering It as an additional tool of mental health care, ethical issues remain and should be discussed in a timely manner. Machine learning and NLP methods may offer multiple perspectives in mental health research but should also be considered as tools to support clinical practice.

Project description:BackgroundNovel approaches that complement and go beyond evidence-based medicine are required in the domain of chronic diseases, given the growing incidence of such conditions on the worldwide population. A promising avenue is the secondary use of electronic health records (EHRs), where patient data are analyzed to conduct clinical and translational research. Methods based on machine learning to process EHRs are resulting in improved understanding of patient clinical trajectories and chronic disease risk prediction, creating a unique opportunity to derive previously unknown clinical insights. However, a wealth of clinical histories remains locked behind clinical narratives in free-form text. Consequently, unlocking the full potential of EHR data is contingent on the development of natural language processing (NLP) methods to automatically transform clinical text into structured clinical data that can guide clinical decisions and potentially delay or prevent disease onset.ObjectiveThe goal of the research was to provide a comprehensive overview of the development and uptake of NLP methods applied to free-text clinical notes related to chronic diseases, including the investigation of challenges faced by NLP methodologies in understanding clinical narratives.MethodsPreferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed and searches were conducted in 5 databases using "clinical notes," "natural language processing," and "chronic disease" and their variations as keywords to maximize coverage of the articles.ResultsOf the 2652 articles considered, 106 met the inclusion criteria. Review of the included papers resulted in identification of 43 chronic diseases, which were then further classified into 10 disease categories using the International Classification of Diseases, 10th Revision. The majority of studies focused on diseases of the circulatory system (n=38) while endocrine and metabolic diseases were fewest (n=14). This was due to the structure of clinical records related to metabolic diseases, which typically contain much more structured data, compared with medical records for diseases of the circulatory system, which focus more on unstructured data and consequently have seen a stronger focus of NLP. The review has shown that there is a significant increase in the use of machine learning methods compared to rule-based approaches; however, deep learning methods remain emergent (n=3). Consequently, the majority of works focus on classification of disease phenotype with only a handful of papers addressing extraction of comorbidities from the free text or integration of clinical notes with structured data. There is a notable use of relatively simple methods, such as shallow classifiers (or combination with rule-based methods), due to the interpretability of predictions, which still represents a significant issue for more complex methods. Finally, scarcity of publicly available data may also have contributed to insufficient development of more advanced methods, such as extraction of word embeddings from clinical notes.ConclusionsEfforts are still required to improve (1) progression of clinical NLP methods from extraction toward understanding; (2) recognition of relations among entities rather than entities in isolation; (3) temporal extraction to understand past, current, and future clinical events; (4) exploitation of alternative sources of clinical knowledge; and (5) availability of large-scale, de-identified clinical corpora.

Project description:Purpose Advances in artificial intelligence have produced a few predictive models in glaucoma, including a logistic regression model predicting glaucoma progression to surgery. However, uncertainty exists regarding how to integrate the wealth of information in free-text clinical notes. The purpose of this study was to predict glaucoma progression requiring surgery using deep learning (DL) approaches on data from electronic health records (EHRs), including features from structured clinical data and from natural language processing of clinical free-text notes. Design Development of DL predictive model in an observational cohort. Participants Adult patients with glaucoma at a single center treated from 2008 through 2020. Methods Ophthalmology clinical notes of patients with glaucoma were identified from EHRs. Available structured data included patient demographic information, diagnosis codes, prior surgeries, and clinical information including intraocular pressure, visual acuity, and central corneal thickness. In addition, words from patients’ first 120 days of notes were mapped to ophthalmology domain-specific neural word embeddings trained on PubMed ophthalmology abstracts. Word embeddings and structured clinical data were used as inputs to DL models to predict subsequent glaucoma surgery. Main Outcome Measures Evaluation metrics included area under the receiver operating characteristic curve (AUC) and F1 score, the harmonic mean of positive predictive value, and sensitivity on a held-out test set. Results Seven hundred forty-eight of 4512 patients with glaucoma underwent surgery. The model that incorporated both structured clinical features as well as input features from clinical notes achieved an AUC of 73% and F1 of 40%, compared with only structured clinical features, (AUC, 66%; F1, 34%) and only clinical free-text features (AUC, 70%; F1, 42%). All models outperformed predictions from a glaucoma specialist’s review of clinical notes (F1, 29.5%). Conclusions We can successfully predict which patients with glaucoma will need surgery using DL models on EHRs unstructured text. Models incorporating free-text data outperformed those using only structured inputs. Future predictive models using EHRs should make use of information from within clinical free-text notes to improve predictive performance. Additional research is needed to investigate optimal methods of incorporating imaging data into future predictive models as well.

Project description:BackgroundThe lack of automated tools for measuring care quality has limited the implementation of a national program to assess and improve guideline-directed care in heart failure with reduced ejection fraction (HFrEF). A key challenge for constructing such a tool has been an accurate, accessible approach for identifying patients with HFrEF at hospital discharge, an opportunity to evaluate and improve the quality of care.MethodsWe developed a novel deep learning-based language model for identifying patients with HFrEF from discharge summaries using a semi-supervised learning framework. For this purpose, hospitalizations with heart failure at Yale New Haven Hospital (YNHH) between 2015 to 2019 were labeled as HFrEF if the left ventricular ejection fraction was under 40% on antecedent echocardiography. The model was internally validated with model-based net reclassification improvement (NRI) assessed against chart-based diagnosis codes. We externally validated the model on discharge summaries from hospitalizations with heart failure at Northwestern Medicine, community hospitals of Yale New Haven Health in Connecticut and Rhode Island, and the publicly accessible MIMIC-III database, confirmed with chart abstraction.ResultsA total of 13,251 notes from 5,392 unique individuals (mean age 73 ± 14 years, 48% female), including 2,487 patients with HFrEF (46.1%), were used for model development (train/held-out test: 70/30%). The deep learning model achieved an area under receiving operating characteristic (AUROC) of 0.97 and an area under precision-recall curve (AUPRC) of 0.97 in detecting HFrEF on the held-out set. In external validation, the model had high performance in identifying HFrEF from discharge summaries with AUROC 0.94 and AUPRC 0.91 on 19,242 notes from Northwestern Medicine, AUROC 0.95 and AUPRC 0.96 on 139 manually abstracted notes from Yale community hospitals, and AUROC 0.91 and AUPRC 0.92 on 146 manually reviewed notes at MIMIC-III. Model-based prediction of HFrEF corresponded to an overall NRI of 60.2 ± 1.9% compared with the chart diagnosis codes (p-value < 0.001) and an increase in AUROC from 0.61 [95% CI: 060-0.63] to 0.91 [95% CI 0.90-0.92].ConclusionsWe developed and externally validated a deep learning language model that automatically identifies HFrEF from clinical notes with high precision and accuracy, representing a key element in automating quality assessment and improvement for individuals with HFrEF.

Dataset Information

Deep learning in clinical natural language processing: a methodical review.

Publications

Deep learning in clinical natural language processing: a methodical review.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets