Dataset Information

Cohort selection for clinical trials: n2c2 2018 shared task track 1.

ABSTRACT:

Objective

Track 1 of the 2018 National NLP Clinical Challenges shared tasks focused on identifying which patients in a corpus of longitudinal medical records meet and do not meet identified selection criteria.

Materials and methods

To address this challenge, we annotated American English clinical narratives for 288 patients according to whether they met these criteria. We chose criteria from existing clinical trials that represented a variety of natural language processing tasks, including concept extraction, temporal reasoning, and inference.

Results

A total of 47 teams participated in this shared task, with 224 participants in total. The participants represented 18 countries, and the teams submitted 109 total system outputs. The best-performing system achieved a micro F1 score of 0.91 using a rule-based approach. The top 10 teams used rule-based and hybrid systems to approach the problems.
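A micro-averaged F1 score, as reported above, pools the individual decisions across all criteria before computing precision and recall, so that frequent criteria weigh more than rare ones. The following is a minimal sketch of that calculation, not the official n2c2 evaluation script. It assumes that "met" is the positive class, and the criterion names (`criterion_a`, `criterion_b`) and labels are hypothetical.

```python
from typing import Dict, List, Tuple


def micro_prf(
    gold: Dict[str, List[bool]],
    pred: Dict[str, List[bool]],
) -> Tuple[float, float, float]:
    """Return micro-averaged (precision, recall, F1) over all criteria.

    Each dict maps a criterion name to one boolean per patient
    (True = "met"). Treating "met" as the positive class is an
    assumption made for this illustration.
    """
    tp = fp = fn = 0
    for criterion, gold_labels in gold.items():
        for g, p in zip(gold_labels, pred[criterion]):
            if g and p:
                tp += 1          # correctly predicted "met"
            elif p and not g:
                fp += 1          # predicted "met", gold says "not met"
            elif g and not p:
                fn += 1          # missed a "met" patient
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical labels for four patients and two criteria.
gold = {
    "criterion_a": [True, False, True, True],
    "criterion_b": [False, False, True, False],
}
pred = {
    "criterion_a": [True, True, False, True],
    "criterion_b": [False, False, True, True],
}

precision, recall, f1 = micro_prf(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because the counts are summed before dividing, a criterion with many patients in the positive class contributes more to the score than one with few, which is the main difference from macro averaging.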

Discussion

Clinical narratives are open to interpretation, particularly in cases where the selection criterion may be underspecified. This leaves room for annotators to use domain knowledge and intuition in selecting patients, which may lead to error in system outputs. However, teams who consulted medical professionals while building their systems were more likely to have high recall for patients, which is preferable for patient selection systems.

Conclusions

There is not yet a one-size-fits-all solution for natural language processing systems approaching this task. Future research in this area can examine criteria requiring even more complex inferences, temporal reasoning, and domain knowledge.

SUBMITTER: Stubbs A

PROVIDER: S-EPMC6798568 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature


Publications

Cohort selection for clinical trials: n2c2 2018 shared task track 1.

Stubbs A, Filannino M, Soysal E, Henry S, Uzuner Ö

Journal of the American Medical Informatics Association : JAMIA, 2019-11-01, issue 11


PMID: 31562516

Similar Datasets

Project description:

Objective: Cohort selection for clinical trials is a key step for clinical research. We proposed a hierarchical neural network to determine whether a patient satisfied selection criteria or not.

Materials and methods: We designed a hierarchical neural network (denoted as CNN-Highway-LSTM or LSTM-Highway-LSTM) for track 1 of the 2018 National NLP Clinical Challenges (n2c2) on cohort selection for clinical trials. The neural network is composed of 5 components: (1) sentence representation using a convolutional neural network (CNN) or long short-term memory (LSTM) network; (2) a highway network to adjust information flow; (3) a self-attention neural network to reweight sentences; (4) document representation using an LSTM, which takes sentence representations in chronological order as input; (5) a fully connected neural network to determine whether each criterion is met or not. We compared the proposed method with its variants, including methods that use only the first component to represent documents directly, followed by the fully connected neural network for classification (denoted as CNN-only or LSTM-only), and methods without the highway network (denoted as CNN-LSTM or LSTM-LSTM). The performance of all methods was measured by micro-averaged precision, recall, and F1 score.

Results: The micro-averaged F1 scores of CNN-only, LSTM-only, CNN-LSTM, LSTM-LSTM, CNN-Highway-LSTM, and LSTM-Highway-LSTM were 85.24%, 84.25%, 87.27%, 88.68%, 88.48%, and 90.21%, respectively. The highest micro-averaged F1 score is higher than that of our submitted system (88.55%), which was one of the top-ranked results in the challenge. The results indicate that the proposed method is effective for cohort selection for clinical trials.

Discussion: Although the proposed method achieved promising results, some mistakes were caused by word ambiguity, negation, number analysis, and an incomplete dictionary. Moreover, imbalanced data is another challenge that needs to be tackled in the future.

Conclusion: In this article, we proposed a hierarchical neural network for cohort selection. Experimental results show that this method is effective at cohort selection.

Project description: There has been an exponential increase in randomized controlled trials (RCTs) on cerebrovascular disease within neurosurgery. The goal of this study was to review, outline the scope, and summarize all phase 2b and phase 3 RCTs impacting cerebrovascular neurosurgery practice since 2018. We searched PubMed, MEDLINE, Embase, ClinicalTrials.gov, and the Cochrane Central Register of Controlled Trials (CENTRAL) databases for relevant RCTs published between January 1, 2018, and July 1, 2022. We searched for studies related to eight major cerebrovascular disorders relevant to neurosurgery, including acute ischemic stroke, cerebral aneurysms and subarachnoid hemorrhage, intracerebral hemorrhage, subdural hematomas, cerebral venous thrombosis, arteriovenous malformations, Moyamoya disease and extracranial-intracranial bypass, and carotid and intracranial atherosclerosis. We limited our search to phase 2b or 3 RCTs related to cerebrovascular disorders published during the study period. The titles and abstracts of all relevant studies meeting our search criteria were included. Pediatric studies, stroke studies related to rehabilitation or cardiovascular disease, study protocols without published results, prospective cohort studies, registry studies, cluster randomized trials, and nonrandomized pivotal trials were excluded. From an initial total of 2,797 records retrieved from the database searches, 1,641 records were screened after duplicates and studies outside of our time period were removed. After screening, 511 available reports within our time period of interest were assessed for eligibility. We found 80 unique phase 2b or 3 RCTs that fit our criteria, with 165 topic-relevant articles published within the study period. Numerous RCTs in cerebrovascular neurosurgery have been published since 2018. Ischemic stroke, including mechanical thrombectomy and thrombolysis, accounted for a majority of publications, but there were large trials in intracerebral hemorrhage, subdural hemorrhage, aneurysms, subarachnoid hemorrhage, and cerebral venous thrombosis, among others. This review helps define the scope of the large RCTs published in the last four years to guide future research and clinical care.
