
Dataset Information


A hybrid approach to automatic de-identification of psychiatric notes.


ABSTRACT: De-identification, the task of identifying and removing protected health information (PHI) from clinical data, is a critical step in making clinical data available for clinical applications and research. This paper presents a natural language processing system for automatic de-identification of psychiatric notes, designed for participation in Track 1 of the 2016 CEGS N-GRID shared task. The system has a hybrid structure that combines machine learning techniques with rule-based approaches. The rule-based components exploit the structure of the psychiatric notes as well as characteristic surface patterns of PHI mentions. The machine learning components use supervised learning with rich features. In addition, system performance was boosted by integrating additional data into the training set through domain adaptation. The hybrid system achieved an overall micro-averaged F-score of 90.74 on the test set, the second-best result among all participants in the CEGS N-GRID task.
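
As a reading aid for the reported metric, the following is a minimal sketch of how a micro-averaged F-score is typically computed: true positives, false positives, and false negatives are pooled across all PHI categories before precision and recall are taken. The function name `micro_f1`, the category labels, and the counts are all hypothetical and chosen only for illustration; they are not the paper's data, and the shared task's official evaluation script may differ in detail.

```python
from typing import Dict, Tuple


def micro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Micro-averaged F-score from per-category (tp, fp, fn) counts.

    Counts are summed over all categories first, so frequent
    categories weigh more than rare ones.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        # No correct predictions: define the score as 0 to avoid division by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts per PHI category: (true pos, false pos, false neg).
example_counts = {
    "NAME": (90, 10, 5),
    "DATE": (180, 10, 25),
    "LOCATION": (30, 10, 10),
}

score = micro_f1(example_counts)
print(f"micro-averaged F-score: {score:.4f}")
```

With these made-up counts the pooled totals are 300 true positives, 30 false positives, and 40 false negatives, giving a score of 600/670, or roughly 0.8955.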

SUBMITTER: Lee HJ 

PROVIDER: S-EPMC5705430 | biostudies-literature | 2017 Nov

REPOSITORIES: biostudies-literature


Publications

A hybrid approach to automatic de-identification of psychiatric notes.

Hee-Jin Lee, Yonghui Wu, Yaoyun Zhang, Jun Xu, Hua Xu, Kirk Roberts

Journal of Biomedical Informatics, 2017-06-07


Similar Datasets

| S-EPMC7787254 | biostudies-literature
| S-EPMC5705329 | biostudies-literature
| S-EPMC10906319 | biostudies-literature
| S-EPMC4989091 | biostudies-literature
| S-EPMC3907029 | biostudies-literature
| S-EPMC4927370 | biostudies-literature
| S-EPMC5891936 | biostudies-literature
| S-EPMC5705296 | biostudies-literature
| S-EPMC10958393 | biostudies-literature