Dataset Information

Understanding spatial language in radiology: Representation framework, annotation, and spatial relation extraction from chest X-ray reports using deep learning.

ABSTRACT: Radiology reports contain a radiologist's interpretations of images, and these interpretations frequently describe spatial relations. Important radiographic findings are mostly described in reference to an anatomical location through spatial prepositions. Such spatial relationships are also linked to various differential diagnoses and often described through uncertainty phrases. Structured representation of this clinically significant spatial information has the potential to be used in a variety of downstream clinical informatics applications. Our focus is to extract these spatial representations from the reports. For this, we first define a representation framework based on the Spatial Role Labeling (SpRL) scheme, which we refer to as Rad-SpRL. In Rad-SpRL, common radiological entities tied to spatial relations are encoded through four spatial roles: Trajector, Landmark, Diagnosis, and Hedge, all identified in relation to a spatial preposition (or Spatial Indicator). We annotated a total of 2,000 chest X-ray reports following Rad-SpRL. We then propose a deep learning-based natural language processing (NLP) method involving word and character-level encodings to first extract the Spatial Indicators followed by identifying the corresponding spatial roles. Specifically, we use a bidirectional long short-term memory (Bi-LSTM) conditional random field (CRF) neural network as the baseline model. Additionally, we incorporate contextualized word representations from pre-trained language models (BERT and XLNet) for extracting the spatial information. We evaluate both gold and predicted Spatial Indicators to extract the four types of spatial roles. The results are promising, with the highest average F1 measure for Spatial Indicator extraction being 91.29 (XLNet); the highest average overall F1 measure considering all four spatial roles being 92.9 using gold Indicators (XLNet); and 85.6 using predicted Indicators (BERT pre-trained on MIMIC notes).
The corpus is available in Mendeley at http://dx.doi.org/10.17632/yhb26hfz8n.1 and https://github.com/krobertslab/datasets/blob/master/Rad-SpRL.xml.
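
As a rough illustration of the representation described in the abstract, the sketch below encodes one Spatial Indicator together with its four Rad-SpRL roles and converts the frame into per-token tags of the kind a sequence labeler such as a Bi-LSTM-CRF could be trained on. The example sentence, the `SpatialFrame` class, the `to_bio_tags` helper, and the BIO tagging scheme are all assumptions made for illustration; they are not taken from the corpus or from the authors' code, and the distributed XML file may organize the annotations differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Token span as (start, end) offsets, end exclusive. Hypothetical convention.
Span = Tuple[int, int]


@dataclass
class SpatialFrame:
    """One Spatial Indicator plus the roles identified in relation to it."""
    indicator: Span
    roles: Dict[str, List[Span]] = field(default_factory=dict)


def to_bio_tags(tokens: List[str], frame: SpatialFrame) -> List[str]:
    """Turn a frame into one BIO tag per token (assumed tagging scheme)."""
    tags = ["O"] * len(tokens)

    def mark(span: Span, label: str) -> None:
        start, end = span
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"

    mark(frame.indicator, "SPATIAL_INDICATOR")
    for role, spans in frame.roles.items():
        for span in spans:
            mark(span, role.upper())
    return tags


# Made-up sentence, not drawn from the annotated reports.
tokens = "Mild opacity in the left lower lobe may represent pneumonia .".split()
frame = SpatialFrame(
    indicator=(2, 3),                # "in"
    roles={
        "Trajector": [(1, 2)],       # "opacity"
        "Landmark": [(4, 7)],        # "left lower lobe"
        "Hedge": [(7, 9)],           # "may represent"
        "Diagnosis": [(9, 10)],      # "pneumonia"
    },
)
tags = to_bio_tags(tokens, frame)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Keeping the roles attached to a single indicator mirrors the two-step setup in the abstract, where Spatial Indicators are extracted first and the roles are then identified relative to each one.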

SUBMITTER: Datta S

PROVIDER: S-EPMC7807990 | biostudies-literature | 2020 Aug

REPOSITORIES: biostudies-literature

Publications

Understanding spatial language in radiology: Representation framework, annotation, and spatial relation extraction from chest X-ray reports using deep learning.

Surabhi Datta, Yuqi Si, Laritza Rodriguez, Sonya E Shooshan, Dina Demner-Fushman, Kirk Roberts

Journal of Biomedical Informatics, 2020-06-18

PMID: 32562898

Similar Datasets

Weakly supervised spatial relation extraction from radiology reports.
| S-EPMC10122604 | biostudies-literature

Large language model-based information extraction from free-text radiology reports: a scoping review protocol.
| S-EPMC10729196 | biostudies-literature

Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study.
| S-EPMC8132979 | biostudies-literature

Utilizing Longitudinal Chest X-Rays and Reports to Pre-fill Radiology Reports.
| S-EPMC10947431 | biostudies-literature

A dataset of chest X-ray reports annotated with Spatial Role Labeling annotations.
| S-EPMC7451761 | biostudies-literature

Natural Language Processing to identify pneumonia from radiology reports.
| S-EPMC3811072 | biostudies-literature

Knowledge-enhanced visual-language pre-training on chest radiology images.
| S-EPMC10382552 | biostudies-literature

Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach.
| S-EPMC9879259 | biostudies-literature

Evaluating progress in automatic chest X-ray radiology report generation.
| S-EPMC10499844 | biostudies-literature

Joint Modeling of Chest Radiographs and Radiology Reports for Pulmonary Edema Assessment.
| S-EPMC7901713 | biostudies-literature