
Dataset Information


Improving a full-text search engine: the importance of negation detection and family history context to identify cases in a biomedical data warehouse.


ABSTRACT: Objective: The repurposing of electronic health records (EHRs) can improve clinical and genetic research for rare diseases. However, significant information in rare disease EHRs is embedded in the narrative reports, which contain many negated clinical signs and family medical history. This paper presents a method to detect family history and negation in narrative reports and evaluates its impact on selecting populations from a clinical data warehouse (CDW). Materials and Methods: We developed a pipeline to process 1.6 million reports from multiple sources. This pipeline is part of the load process of the Necker Hospital CDW. Results: We identified patients with "Lupus and diarrhea," "Crohn's and diabetes," and "NPHP1" from the CDW. The overall precision, recall, specificity, and F-measure were 0.85, 0.98, 0.93, and 0.91, respectively. Conclusion: The proposed method generates a highly accurate identification of cases from a CDW of rare disease EHRs.
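As a quick consistency check on the reported figures, the sketch below recomputes the F-measure from the reported precision and recall. It assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` and the `beta` parameter are illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 gives the harmonic mean (F1).

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        # Both inputs are zero; define the score as 0 to avoid dividing by zero.
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


# Values reported in the abstract.
reported_precision = 0.85
reported_recall = 0.98

f1 = f_measure(reported_precision, reported_recall)
print(round(f1, 2))  # prints 0.91
```

Under that assumption, 2 × 0.85 × 0.98 / (0.85 + 0.98) ≈ 0.91, which agrees with the F-measure given in the abstract.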

SUBMITTER: Garcelon N 

PROVIDER: S-EPMC7651926 | biostudies-literature | 2017 May

REPOSITORIES: biostudies-literature


Publications

Improving a full-text search engine: the importance of negation detection and family history context to identify cases in a biomedical data warehouse.

Nicolas Garcelon, Antoine Neuraz, Vincent Benoit, Rémi Salomon, Anita Burgun

Journal of the American Medical Informatics Association: JAMIA, 2017-05-01, issue 3



Similar Datasets

| S-EPMC2572701 | biostudies-literature
| S-EPMC6513154 | biostudies-literature
| S-EPMC2939881 | biostudies-literature
| S-EPMC3667078 | biostudies-literature
| S-EPMC6602493 | biostudies-literature
| S-EPMC4177555 | biostudies-literature
| S-EPMC6602571 | biostudies-literature
| S-EPMC6398827 | biostudies-literature
| S-EPMC1780044 | biostudies-literature
| PRJEB4647 | ENA