Dataset Information

Selective prediction for extracting unstructured clinical data.


ABSTRACT:

Objective

Existing approaches to handling unstructured clinical data, such as manual abstraction and structured proxy variables, can be time-consuming, difficult to scale, and imprecise. This article aims to determine whether selective prediction, which gives a model the option to abstain from generating a prediction, can improve the accuracy and efficiency of unstructured clinical data abstraction.

Materials and methods

We trained selective classifiers (logistic regression, random forest, support vector machine) to extract 5 variables from clinical notes: depression (n = 1563), glioblastoma (GBM, n = 659), rectal adenocarcinoma (DRA, n = 601), abdominoperineal resection (APR, n = 601) of adenocarcinoma, and low anterior resection (LAR, n = 601) of adenocarcinoma. We varied the costs assigned to false positives (FP), false negatives (FN), and abstained notes, and measured the total misclassification cost.
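The cost-based setup described above can be illustrated with a minimal sketch. The abstract does not state how the abstention decision was made, so the rule below (abstain when the abstention cost is lower than the expected cost of either prediction) is an assumption, as are the function names `selective_predict` and `total_misclassification_cost` and the toy probabilities and costs.

```python
from typing import Optional, Sequence


def selective_predict(
    p_positive: float, cost_fp: float, cost_fn: float, cost_abstain: float
) -> Optional[int]:
    """Return 1, 0, or None (abstain) by minimizing expected cost.

    Assumed rule, not taken from the source: compare the expected cost of
    each prediction with the fixed cost of abstaining.
    """
    expected_cost_if_positive = (1.0 - p_positive) * cost_fp  # risk of a false positive
    expected_cost_if_negative = p_positive * cost_fn          # risk of a false negative
    if cost_abstain < min(expected_cost_if_positive, expected_cost_if_negative):
        return None
    return 1 if expected_cost_if_positive <= expected_cost_if_negative else 0


def total_misclassification_cost(
    y_true: Sequence[int],
    decisions: Sequence[Optional[int]],
    cost_fp: float,
    cost_fn: float,
    cost_abstain: float,
) -> float:
    """Sum the costs of false positives, false negatives, and abstentions."""
    total = 0.0
    for truth, decision in zip(y_true, decisions):
        if decision is None:
            total += cost_abstain
        elif decision == 1 and truth == 0:
            total += cost_fp
        elif decision == 0 and truth == 1:
            total += cost_fn
    return total


# Hypothetical predicted probabilities and labels for four notes.
probabilities = [0.95, 0.55, 0.10, 0.40]
labels = [1, 0, 0, 1]

selective = [selective_predict(p, 1.0, 1.0, 0.25) for p in probabilities]
# An infinite abstention cost reduces the rule to a non-selective classifier.
non_selective = [selective_predict(p, 1.0, 1.0, float("inf")) for p in probabilities]

print(selective, total_misclassification_cost(labels, selective, 1.0, 1.0, 0.25))
print(non_selective, total_misclassification_cost(labels, non_selective, 1.0, 1.0, 0.25))
```

In this toy example the selective rule abstains on the two uncertain notes and pays the abstention cost twice, while the non-selective rule pays for one false positive and one false negative. Raising `cost_abstain` shrinks the set of abstained notes, which is one way to read the wide range of abstention rates reported in the results.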

Results

The depression selective classifiers abstained on anywhere from 0% to 97% of notes, and the change in total misclassification cost ranged from -58% to 9%. Selective classifiers abstained on 5%-43% of notes across the GBM and colorectal cancer models. The GBM selective classifier abstained on 43% of notes, which led to improvements in sensitivity (0.94 to 0.96), specificity (0.79 to 0.96), PPV (0.89 to 0.98), and NPV (0.88 to 0.91) when compared to a non-selective classifier and when compared to structured proxy variables.
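As a rough illustration of how the reported metrics relate to abstention, the sketch below computes sensitivity, specificity, PPV, NPV, and the abstention rate. It assumes the four performance metrics are computed only over notes the classifier did not abstain on; the abstract does not confirm this, and the function name `selective_metrics` and the toy data are hypothetical.

```python
from typing import Dict, Optional, Sequence


def selective_metrics(
    y_true: Sequence[int], decisions: Sequence[Optional[int]]
) -> Dict[str, float]:
    """Compute metrics over non-abstained notes (assumption) plus the abstention rate."""
    tp = fp = tn = fn = abstained = 0
    for truth, decision in zip(y_true, decisions):
        if decision is None:
            abstained += 1
        elif decision == 1:
            if truth == 1:
                tp += 1
            else:
                fp += 1
        else:
            if truth == 0:
                tn += 1
            else:
                fn += 1

    def ratio(numerator: int, denominator: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "abstention_rate": ratio(abstained, len(decisions)),
    }


# Hypothetical labels and decisions; None marks an abstained note.
example_labels = [1, 1, 0, 0, 1, 0]
example_decisions = [1, None, 0, 1, 1, None]
metrics = selective_metrics(example_labels, example_decisions)
print(metrics)
```

Under this assumption, abstained notes leave both the numerators and denominators, so the metrics describe only the notes the classifier was willing to label, and the abstention rate shows how many notes would still need another abstraction method.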

Discussion

We showed that selective classifiers outperformed both non-selective classifiers and structured proxy variables for extracting data from unstructured clinical notes.

Conclusion

Selective prediction should be considered when abstaining is preferable to making an incorrect prediction.

SUBMITTER: Swaminathan A 

PROVIDER: S-EPMC10746316 | biostudies-literature | 2023 Dec

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC7725544 | biostudies-literature
S-EPMC7559204 | biostudies-literature
S-EPMC8684269 | biostudies-literature
S-EPMC11462099 | biostudies-literature
S-EPMC10031450 | biostudies-literature
S-EPMC9196702 | biostudies-literature
S-EPMC7846756 | biostudies-literature
S-EPMC6005735 | biostudies-literature
S-EPMC9378826 | biostudies-literature
S-EPMC8895286 | biostudies-literature