Project description:
Background: Accurate prediction is essential for the effective management of spontaneous pneumothorax (SP). To improve prediction, this study primarily focuses on using simple chest X-rays to predict ipsilateral recurrence and contralateral occurrence of SP.
Methods: All consecutive subjects diagnosed with SP from July 2017 to June 2023 were retrospectively reviewed. Ipsilateral recurrence and contralateral occurrence of SP within two years of completing treatment were analyzed. Using simple chest X-rays and clinical parameters such as age, sex, smoking, chronic obstructive pulmonary disease (COPD) and surgery, machine learning algorithms were applied to predict SP development. Gradient-weighted Class Activation Mapping (Grad-CAM) was used to highlight the X-ray regions associated with SP development.
Results: The study included 1,086 cases of SP, with 546 right-side and 540 left-side developments. Surgeries were performed in 243 right and 204 left cases. Ipsilateral recurrence occurred in 93 cases total, while contralateral occurrence occurred in 60 right and 34 left cases. For predicting ipsilateral recurrence in the young group, gradient boosting (GB) [area under curve (AUC) of 0.686, accuracy of 0.769, F1 score of 0.733, precision of 0.706, and recall of 0.769] for the right side and logistic regression (AUC of 0.628, accuracy of 0.781, F1 score of 0.753, precision of 0.737, and recall of 0.781) for the left side were the top-performing models. In the older group, K-nearest neighbors (KNN) (AUC of 0.615, accuracy of 0.801, F1 score of 0.760, precision of 0.735, and recall of 0.801) for the right side and logistic regression (AUC of 0.623, accuracy of 0.824, F1 score of 0.804, precision of 0.794, and recall of 0.824) for the left side were the best models.
For predicting contralateral occurrence in the young group, random forest (RF) (AUC of 0.597, accuracy of 0.774, F1 score of 0.741, precision of 0.709, and recall of 0.774) for the right side and KNN (AUC of 0.650, accuracy of 0.893, F1 score of 0.849, precision of 0.809, and recall of 0.893) for the left side were the most effective models. In the older group, logistic regression (AUC of 0.630, accuracy of 0.935, F1 score of 0.914, precision of 0.894, and recall of 0.935) for the right side and neural network (NN) (AUC of 0.765, accuracy of 0.961, F1 score of 0.948, precision of 0.936, and recall of 0.961) for the left side were the top performers. Grad-CAM analysis revealed that apical lung portions were strongly associated with both ipsilateral recurrence and contralateral occurrence of SP.
Conclusions: The results of this study suggest that machine learning algorithms using simple X-rays and basic clinical data can predict SP development with fair performance. The apical regions of the lung were strongly associated with SP development, consistent with clinical knowledge.
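Note on the reported metrics: in every model above, recall equals accuracy while precision and F1 differ. The abstract does not state how the per-class metrics were averaged. This pattern is what support-weighted averaging produces, because weighted recall is algebraically identical to accuracy. The sketch below illustrates that identity under this assumption. The function name `weighted_metrics` and the labels are hypothetical and are not study data.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Assumption: per-class scores are averaged with weights equal to each
    class's share of the true labels. The abstract does not confirm this.
    """
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        prec_c = tp / predicted if predicted else 0.0
        rec_c = tp / count
        denom = prec_c + rec_c
        f1_c = 2 * prec_c * rec_c / denom if denom else 0.0

        weight = count / n  # class share of the true labels
        precision += weight * prec_c
        recall += weight * rec_c
        f1 += weight * f1_c

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels (1 = recurrence, 0 = no recurrence); not study data.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```

With imbalanced classes such as these, weighted accuracy-like scores can stay high even when the minority (event) class is often missed, which is one reason to read them alongside the AUC values.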
| S-EPMC11898339 | biostudies-literature