
Dataset Information


Text Mining of Electronic Health Records Can Accurately Identify and Characterize Patients With Systemic Lupus Erythematosus.


ABSTRACT:

Objective

Electronic health records (EHRs) are increasingly recognized as a major source of reusable data for medical research and quality monitoring. However, identifying patients and assessing their symptoms (characterization) remain challenging, especially in complex diseases such as systemic lupus erythematosus (SLE). Current coding systems cannot capture information recorded in the physician's free-text notes. This study shows that text mining can be used as a reliable alternative.

Methods

A multidisciplinary research team of data scientists and medical experts developed a text mining algorithm on 4607 patient records to assess the diagnosis of 14 different immune-mediated inflammatory diseases and the presence of 18 different symptoms in the EHR. The algorithm searched the EHR for key words while mining the surrounding context for exclusion phrases. Its accuracy was assessed by manually checking the EHRs of 100 random patients suspected of having SLE for diagnoses and symptoms, and comparing the results of this manual review with the output of the algorithm.
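
The key word and exclusion-phrase approach described above can be illustrated with a minimal sketch. This is not the authors' algorithm: the key words, the exclusion phrases, the 40-character context window, and the function name `mentions_condition` are all assumptions chosen for illustration.

```python
import re

# Hypothetical key words and exclusion phrases; the study's actual lists are not given here.
KEYWORDS = ["systemic lupus erythematosus", "sle"]
EXCLUSION_PHRASES = ["no evidence of", "ruled out", "not suggestive of", "family history of"]
WINDOW = 40  # assumed number of characters of context inspected on each side of a match


def mentions_condition(note, keywords=KEYWORDS, exclusions=EXCLUSION_PHRASES, window=WINDOW):
    """Return True if any key word occurs without an exclusion phrase in its context."""
    text = note.lower()
    for keyword in keywords:
        # Word boundaries prevent "sle" from matching inside words such as "sleep".
        pattern = r"\b" + re.escape(keyword) + r"\b"
        for match in re.finditer(pattern, text):
            start = max(0, match.start() - window)
            context = text[start:match.end() + window]
            if not any(phrase in context for phrase in exclusions):
                return True  # at least one mention that is not excluded
    return False


if __name__ == "__main__":
    print(mentions_condition("Patient diagnosed with SLE in 2015."))
    print(mentions_condition("There is no evidence of SLE at this time."))
```

A fixed character window is a deliberately crude notion of context. A production system would more likely work at sentence level and handle abbreviations and spelling variants.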

Results

In the evaluation of 100 patient records, the text mining algorithm had a sensitivity of 96.4% and a specificity of 93.3% in assessing the presence of SLE. The algorithm detected potentially life-threatening symptoms (nephritis, pleuritis) with good sensitivity (80%-82%) and high specificity (97% for both).
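
For readers unfamiliar with these metrics, the sketch below shows how sensitivity and specificity can be computed by comparing algorithm output against manual review. The function name `sensitivity_specificity` and the example labels are hypothetical and are not the study's data.

```python
def sensitivity_specificity(reference, predicted):
    """Compute (sensitivity, specificity) from paired boolean labels.

    reference: labels from manual chart review (True = condition present)
    predicted: labels from the text mining algorithm
    Returns None for a metric whose denominator is zero.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    tp = fn = tn = fp = 0
    for ref, pred in zip(reference, predicted):
        if ref and pred:
            tp += 1  # true positive: both agree the condition is present
        elif ref and not pred:
            fn += 1  # false negative: algorithm missed a confirmed case
        elif not ref and not pred:
            tn += 1  # true negative: both agree the condition is absent
        else:
            fp += 1  # false positive: algorithm flagged a non-case

    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


if __name__ == "__main__":
    # Made-up labels for five records, for illustration only.
    manual = [True, True, True, False, False]
    algorithm = [True, True, False, False, True]
    print(sensitivity_specificity(manual, algorithm))
```

Sensitivity is the share of manually confirmed cases that the algorithm also flags, and specificity is the share of manually confirmed non-cases that it correctly leaves unflagged.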

Conclusion

We present a text mining algorithm that can accurately identify and characterize patients with SLE using routinely collected data from the EHR. Our study shows that, with text mining, EHR data can be reused for research and quality control.

SUBMITTER: Brunekreef TE 

PROVIDER: S-EPMC7882527 | biostudies-literature | 2021 Feb

REPOSITORIES: biostudies-literature


Publications

Text Mining of Electronic Health Records Can Accurately Identify and Characterize Patients With Systemic Lupus Erythematosus.

Tammo E. Brunekreef, Henny G. Otten, Suzanne C. van den Bosch, Imo E. Hoefer, Jacob M. van Laar, Maarten Limper, Saskia Haitjema

ACR Open Rheumatology, 2021-01-12, issue 2



Similar Datasets

| S-EPMC5219863 | biostudies-literature
| S-EPMC10787061 | biostudies-literature
| S-EPMC5568397 | biostudies-literature
| S-EPMC10060252 | biostudies-literature
2014-06-03 | E-GEOD-46923 | biostudies-arrayexpress
2014-06-03 | GSE46923 | GEO
| S-EPMC1440614 | biostudies-literature
| S-EPMC2048842 | biostudies-literature
| S-EPMC4716148 | biostudies-literature
| PRJNA203032 | ENA