ABSTRACT: Objective
While there are currently approaches to handle unstructured clinical data, such as manual abstraction and structured proxy variables, these methods may be time-consuming, not scalable, and imprecise. This article aims to determine whether selective prediction, which gives a model the option to abstain from generating a prediction, can improve the accuracy and efficiency of unstructured clinical data abstraction.
Materials and methods
We trained selective classifiers (logistic regression, random forest, support vector machine) to extract 5 variables from clinical notes: depression (n = 1563), glioblastoma (GBM, n = 659), rectal adenocarcinoma (DRA, n = 601), abdominoperineal resection (APR, n = 601), and low anterior resection (LAR, n = 601) of adenocarcinoma. We varied the cost of false positives (FP), false negatives (FN), and abstained notes and measured total misclassification cost.
Results
The depression selective classifiers abstained on anywhere from 0% to 97% of notes, and the change in total misclassification cost ranged from -58% to 9%. Selective classifiers abstained on 5%-43% of notes across the GBM and colorectal cancer models. The GBM selective classifier abstained on 43% of notes, which led to improvements in sensitivity (0.94 to 0.96), specificity (0.79 to 0.96), PPV (0.89 to 0.98), and NPV (0.88 to 0.91) when compared to a non-selective classifier and when compared to structured proxy variables.
Discussion
We showed that selective classifiers outperformed both non-selective classifiers and structured proxy variables for extracting data from unstructured clinical notes.
Conclusion
Selective prediction should be considered when abstaining is preferable to making an incorrect prediction.
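The cost-based evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstention rule (two probability thresholds), the names `selective_predict`, `total_cost`, and `Costs`, and all probabilities, labels, and cost values below are assumptions chosen only to show how abstaining can lower total misclassification cost.

```python
from dataclasses import dataclass

ABSTAIN = None  # sentinel for "no prediction made" (hypothetical convention)


@dataclass
class Costs:
    """Per-note costs; the values used below are illustrative assumptions."""
    fp: float       # cost of a false positive
    fn: float       # cost of a false negative
    abstain: float  # cost of sending a note to manual review


def selective_predict(prob, lower, upper):
    """Return 1 if prob >= upper, 0 if prob <= lower, otherwise abstain."""
    if prob >= upper:
        return 1
    if prob <= lower:
        return 0
    return ABSTAIN


def total_cost(probs, labels, lower, upper, costs):
    """Sum FP, FN, and abstention costs over all notes."""
    total = 0.0
    for prob, label in zip(probs, labels):
        pred = selective_predict(prob, lower, upper)
        if pred is ABSTAIN:
            total += costs.abstain
        elif pred == 1 and label == 0:
            total += costs.fp
        elif pred == 0 and label == 1:
            total += costs.fn
    return total


# Made-up predicted probabilities and true labels for six notes.
probs = [0.05, 0.40, 0.55, 0.95, 0.60, 0.10]
labels = [0, 1, 0, 1, 1, 0]
costs = Costs(fp=1.0, fn=1.0, abstain=0.25)

# lower == upper means the classifier never abstains (non-selective).
non_selective = total_cost(probs, labels, 0.5, 0.5, costs)
# A wider band abstains on uncertain notes.
selective = total_cost(probs, labels, 0.3, 0.7, costs)

print(non_selective, selective)  # prints 2.0 0.75
```

In this toy setting the selective classifier abstains on three of six notes and pays a smaller total cost because abstaining is cheaper than an error. If abstention were priced higher than a misclassification, the same thresholds would raise the total cost, which is why the relative costs matter.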
SUBMITTER: Swaminathan A
PROVIDER: S-EPMC10746316 | biostudies-literature | 2023 Dec
REPOSITORIES: biostudies-literature
Akshay Swaminathan, Ivan Lopez, William Wang, Ujwal Srivastava, Edward Tran, Aarohi Bhargava-Shah, Janet Y Wu, Alexander L Ren, Kaitlin Caoili, Brandon Bui, Layth Alkhani, Susan Lee, Nathan Mohit, Noel Seo, Nicholas Macedo, Winson Cheng, Charles Liu, Reena Thomas, Jonathan H Chen, Olivier Gevaert
Journal of the American Medical Informatics Association: JAMIA, 2023-12-01, 1