
Dataset Information


Automated and flexible identification of complex disease: building a model for systemic lupus erythematosus using noisy labeling.


ABSTRACT: Accurate and efficient identification of complex chronic conditions in the electronic health record (EHR) is an important but challenging task that has historically relied on tedious clinician review and oversimplification of the disease. Here we adapt methods that allow for automated "noisy labeling" of positive and negative controls to create a "silver standard" for machine learning to automate identification of systemic lupus erythematosus (SLE). Our final model, which includes both structured data and text processing of clinical notes, outperformed all existing algorithms for SLE (AUC 0.97). In addition, we demonstrate how the probabilistic outputs of this model can be adapted to various clinical needs, selecting high thresholds when specificity is the priority and lower thresholds when a more inclusive patient population is desired. Deploying a similar methodology to other complex diseases has the potential to dramatically simplify the landscape of population identification in the EHR.

MeSH terms: Electronic Health Records, Machine Learning, Lupus Erythematosus, Phenotype, Algorithms.
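
The threshold trade-off described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the scores and labels are invented toy values, not data from the study, and the function names (confusion_at_threshold, sensitivity_specificity) are hypothetical helpers rather than anything from the authors' code. The sketch only shows how a high cut-off on a probabilistic output tends to favor specificity, while a lower cut-off yields a more inclusive cohort.

```python
from typing import Sequence, Tuple


def confusion_at_threshold(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[int, int, int, int]:
    """Count (tp, fp, tn, fn) when scores >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at the given threshold."""
    tp, fp, tn, fn = confusion_at_threshold(scores, labels, threshold)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy, hand-made values (NOT from the study): model probabilities and
# the corresponding true labels (1 = has the condition, 0 = does not).
toy_scores = [0.95, 0.90, 0.70, 0.40, 0.60, 0.30, 0.20, 0.05]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

# A high threshold prioritizes specificity; a low one is more inclusive.
strict = sensitivity_specificity(toy_scores, toy_labels, threshold=0.80)
inclusive = sensitivity_specificity(toy_scores, toy_labels, threshold=0.35)

print("strict    (threshold 0.80): sensitivity, specificity =", strict)
print("inclusive (threshold 0.35): sensitivity, specificity =", inclusive)
```

On these toy values the strict threshold misses half of the true cases but admits no false positives, whereas the inclusive threshold captures every true case at the cost of one false positive. Which operating point is appropriate depends on the clinical use, as the abstract notes.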

SUBMITTER: Murray SG 

PROVIDER: S-EPMC6308013 | biostudies-literature | 2019 Jan

REPOSITORIES: biostudies-literature


Publications

Automated and flexible identification of complex disease: building a model for systemic lupus erythematosus using noisy labeling.

Murray SG, Avati A, Schmajuk G, Yazdany J

Journal of the American Medical Informatics Association: JAMIA, 2019-01-01, 1



Similar Datasets

2014-06-03 | E-GEOD-46923 | biostudies-arrayexpress
2014-06-03 | GSE46923 | GEO
| S-EPMC1440614 | biostudies-literature
| S-EPMC2048842 | biostudies-literature
| PRJNA203032 | ENA
| S-EPMC3042628 | biostudies-other
| S-EPMC3303577 | biostudies-other
| S-EPMC4423270 | biostudies-other
| S-EPMC8708507 | biostudies-literature
| S-EPMC2577778 | biostudies-other