Dataset Information

Cohort selection for clinical trials: n2c2 2018 shared task track 1.


ABSTRACT:

OBJECTIVE: Track 1 of the 2018 National NLP Clinical Challenges shared tasks focused on identifying which patients in a corpus of longitudinal medical records meet and do not meet identified selection criteria.

MATERIALS AND METHODS: To address this challenge, we annotated American English clinical narratives for 288 patients according to whether they met these criteria. We chose criteria from existing clinical trials that represented a variety of natural language processing tasks, including concept extraction, temporal reasoning, and inference.

RESULTS: A total of 47 teams participated in this shared task, with 224 participants in total. The participants represented 18 countries, and the teams submitted 109 total system outputs. The best-performing system achieved a micro F1 score of 0.91 using a rule-based approach. The top 10 teams used rule-based and hybrid systems to approach the problems.

DISCUSSION: Clinical narratives are open to interpretation, particularly in cases where the selection criterion may be underspecified. This leaves room for annotators to use domain knowledge and intuition in selecting patients, which may lead to error in system outputs. However, teams who consulted medical professionals while building their systems were more likely to have high recall for patients, which is preferable for patient selection systems.

CONCLUSIONS: There is not yet a 1-size-fits-all solution for natural language processing systems approaching this task. Future research in this area can look to examining criteria requiring even more complex inferences, temporal reasoning, and domain knowledge.
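
As a rough illustration of the micro F1 score mentioned in the results, the sketch below pools true positives, false positives, and false negatives across all criteria before computing a single F1. This is a minimal sketch of generic micro-averaging, not the shared task's official evaluation script; the criterion names and labels are hypothetical, and how the organizers handled the "met" and "not met" classes is not stated in the abstract.

```python
from typing import Dict, List


def micro_f1(gold: Dict[str, List[bool]], pred: Dict[str, List[bool]]) -> float:
    """Pool TP/FP/FN over every criterion, then compute one F1 score."""
    tp = fp = fn = 0
    for criterion, gold_labels in gold.items():
        for g, p in zip(gold_labels, pred[criterion]):
            if g and p:
                tp += 1      # patient meets the criterion and the system agrees
            elif p and not g:
                fp += 1      # system says "met" but the annotation says "not met"
            elif g and not p:
                fn += 1      # system missed a patient who meets the criterion
    if tp == 0:
        return 0.0           # avoid division by zero when nothing is matched
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical criteria and labels (True = patient meets the criterion),
# one entry per patient; not taken from the actual corpus.
gold = {
    "CRITERION_A": [True, False, True, True],
    "CRITERION_B": [False, False, True, False],
}
pred = {
    "CRITERION_A": [True, True, True, False],
    "CRITERION_B": [False, False, True, True],
}

score = micro_f1(gold, pred)  # pooled counts here: 3 TP, 2 FP, 1 FN
print(f"micro F1 = {score:.3f}")
```

Because the counts are pooled before the score is computed, criteria with many positive patients weigh more heavily than rare ones; a macro average would instead compute F1 per criterion and take the mean.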

SUBMITTER: Stubbs A 

PROVIDER: S-EPMC6798568 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature

Publications

Cohort selection for clinical trials: n2c2 2018 shared task track 1.

Stubbs Amber, Filannino Michele, Soysal Ergin, Henry Samuel, Uzuner Özlem

Journal of the American Medical Informatics Association (JAMIA), 2019 Nov 1, issue 11



Similar Datasets

| S-EPMC6798565 | biostudies-literature
| S-EPMC6798560 | biostudies-literature
| S-EPMC7647215 | biostudies-literature
| S-EPMC6087448 | biostudies-literature
| S-EPMC5836398 | biostudies-literature
| S-EPMC7489085 | biostudies-literature
| S-EPMC3706420 | biostudies-literature
| S-EPMC6282861 | biostudies-other
| S-EPMC4713836 | biostudies-other
| S-EPMC3499900 | biostudies-literature