Dataset Information

Towards generalizable entity-centric clinical coreference resolution.

ABSTRACT: OBJECTIVE: This work investigates the problem of clinical coreference resolution in a model that explicitly tracks entities, and aims to measure the performance of that model in both traditional in-domain train/test splits and cross-domain experiments that measure the generalizability of learned models. METHODS: The two methods we compare are a baseline mention-pair coreference system that operates over pairs of mentions with best-first conflict resolution and a mention-synchronous system that incrementally builds coreference chains. We develop new features that incorporate distributional semantics, discourse features, and entity attributes. We use two new coreference datasets with similar annotation guidelines - the THYME colon cancer dataset and the DeepPhe breast cancer dataset. RESULTS: The mention-synchronous system performs similarly on in-domain data but performs much better on new data. Part of speech tag features prove superior in feature generalizability experiments over other word representations. Our methods show generalization improvement but there is still a performance gap when testing in new domains. DISCUSSION: Generalizability of clinical NLP systems is important and under-studied, so future work should attempt to perform cross-domain and cross-institution evaluations and explicitly develop features and training regimens that favor generalizability. A performance-optimized version of the mention-synchronous system will be included in the open source Apache cTAKES software.
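The contrast between the two systems named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or the cTAKES code: the learned classifiers are replaced by a hypothetical `toy_score` function (head-word match), the 0.5 threshold is arbitrary, and averaging pairwise scores over a chain is only one assumed way to score a mention against an entity.

```python
from typing import Callable, List

Scorer = Callable[[str, str], float]


def toy_score(antecedent: str, mention: str) -> float:
    """Hypothetical stand-in for a learned classifier: 1.0 if head words match."""
    return 1.0 if antecedent.split()[-1].lower() == mention.split()[-1].lower() else 0.0


def best_first_mention_pair(mentions: List[str], score: Scorer,
                            threshold: float = 0.5) -> List[List[str]]:
    """Mention-pair baseline: link each mention to its single best-scoring
    earlier mention (best-first), provided the score exceeds the threshold."""
    chains: List[List[str]] = []
    chain_of = {}  # mention index -> index of its chain
    for j, mention in enumerate(mentions):
        best_i, best_s = None, threshold
        for i in range(j):
            s = score(mentions[i], mention)
            if s > best_s:
                best_i, best_s = i, s
        if best_i is None:
            chain_of[j] = len(chains)
            chains.append([mention])
        else:
            chain_of[j] = chain_of[best_i]
            chains[chain_of[j]].append(mention)
    return chains


def mention_synchronous(mentions: List[str], score: Scorer,
                        threshold: float = 0.5) -> List[List[str]]:
    """Entity-centric variant: process mentions left to right and compare each
    one against the chains built so far, extending the best chain or starting
    a new one."""
    chains: List[List[str]] = []
    for mention in mentions:
        best_chain, best_s = None, threshold
        for chain in chains:
            # Assumed entity-level score: mean pairwise score over the chain.
            s = sum(score(a, mention) for a in chain) / len(chain)
            if s > best_s:
                best_chain, best_s = chain, s
        if best_chain is None:
            chains.append([mention])
        else:
            best_chain.append(mention)
    return chains


mentions = ["the tumor", "a colonoscopy", "the rectal tumor", "the procedure"]
print(best_first_mention_pair(mentions, toy_score))
print(mention_synchronous(mentions, toy_score))
```

On this toy input both functions produce the same chains. The structural difference is that the second function scores a mention against whole entities, which is where entity-level attributes could be used as features.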

SUBMITTER: Miller T

PROVIDER: S-EPMC5508069 | biostudies-literature | 2017 May

REPOSITORIES: biostudies-literature

Publications

Towards generalizable entity-centric clinical coreference resolution.

Timothy Miller, Dmitriy Dligach, Steven Bethard, Chen Lin, Guergana Savova

Journal of Biomedical Informatics, 2017-04-21

PMID: 28438706

Similar Datasets

MCORES: a system for noun phrase coreference resolution for clinical records.
| S-EPMC3422821 | biostudies-literature

Using domain knowledge and domain-inspired discourse model for coreference resolution for clinical narratives.
| S-EPMC3638172 | biostudies-literature

Coreference analysis in clinical notes: a multi-pass sieve with alternate anaphora resolution modules.
| S-EPMC3422831 | biostudies-other

Evaluating the state of the art in coreference resolution for electronic medical records.
| S-EPMC3422835 | biostudies-literature

Sieve-based coreference resolution enhances semi-supervised learning model for chemical-induced disease relation extraction.
| S-EPMC4962668 | biostudies-literature

Towards generalizable predictions for G protein-coupled receptor variant expression.
| S-EPMC9382327 | biostudies-literature

Generalizable biomarker prediction from cancer pathology slides with self-supervised deep learning: A retrospective multi-centric study.
| S-EPMC10140458 | biostudies-literature

Towards reliable named entity recognition in the biomedical domain.
| S-EPMC6956779 | biostudies-literature

Coreference resolution of medical concepts in discharge summaries by exploiting contextual information.
| S-EPMC3422837 | biostudies-other

Towards accurate and reliable resolution of structural variants for clinical diagnosis.
| S-EPMC8892125 | biostudies-literature