Dataset Information

Limited generalizability of single deep neural network for surgical instrument segmentation in different surgical environments.

ABSTRACT: Clarifying the generalizability of deep-learning-based surgical-instrument segmentation networks in diverse surgical environments is important in recognizing the challenges of overfitting in surgical-device development. This study comprehensively evaluated deep neural network generalizability for surgical instrument segmentation using 5238 images randomly extracted from 128 intraoperative videos. The video dataset contained 112 laparoscopic colorectal resection, 5 laparoscopic distal gastrectomy, 5 laparoscopic cholecystectomy, and 6 laparoscopic partial hepatectomy cases. Deep-learning-based surgical-instrument segmentation was performed for test sets with (1) the same conditions as the training set; (2) the same recognition target surgical instrument and surgery type but different laparoscopic recording systems; (3) the same laparoscopic recording system and surgery type but slightly different recognition target laparoscopic surgical forceps; (4) the same laparoscopic recording system and recognition target surgical instrument but different surgery types. The mean average precision and mean intersection over union for test sets 1, 2, 3, and 4 were 0.941 and 0.887, 0.866 and 0.671, 0.772 and 0.676, and 0.588 and 0.395, respectively. Therefore, the recognition accuracy decreased even under slightly different conditions. The results of this study reveal the limited generalizability of deep neural networks in the field of surgical artificial intelligence and caution against deep-learning-based biased datasets and models. Trial Registration Number: 2020-315, date of registration: October 5, 2020.
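The abstract reports mean intersection over union (IoU) for each test set. As a rough illustration of that metric only, the following sketch computes IoU for binary instrument masks and averages it over images. The function names, the nested-list mask representation, and the handling of two empty masks are assumptions made for this example; they are not taken from the study, whose evaluation code is not shown here.

```python
from typing import Sequence

# A 2D binary mask: 1 = instrument pixel, 0 = background (illustrative representation).
Mask = Sequence[Sequence[int]]


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumption: two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


def mean_iou(preds: Sequence[Mask], truths: Sequence[Mask]) -> float:
    """Average the per-image IoU over a test set."""
    scores = [iou(p, t) for p, t in zip(preds, truths)]
    return sum(scores) / len(scores)


# Toy 3x3 example: 3 overlapping pixels, 5 pixels in the union.
truth = [[0, 1, 1],
         [0, 1, 1],
         [0, 0, 0]]
pred = [[0, 1, 1],
        [0, 0, 1],
        [0, 0, 1]]

print(iou(pred, truth))                          # 3 / 5
print(mean_iou([pred, truth], [truth, truth]))   # mean of 0.6 and 1.0
```

Under this definition, a drop in mean IoU from 0.887 to 0.395 means that the overlap between predicted and annotated instrument regions, relative to their union, falls by more than half.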

SUBMITTER: Kitaguchi D

PROVIDER: S-EPMC9307578 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

Project description: PURPOSE: To describe and evaluate a new segmentation method using deep convolutional neural network (CNN), 3D fully connected conditional random field (CRF), and 3D simplex deformable modeling to improve the efficiency and accuracy of knee joint tissue segmentation. METHODS: A segmentation pipeline was built by combining a semantic segmentation CNN, 3D fully connected CRF, and 3D simplex deformable modeling. A convolutional encoder-decoder network was designed as the core of the segmentation method to perform high resolution pixel-wise multi-class tissue classification for 12 different joint structures. The 3D fully connected CRF was applied to regularize contextual relationship among voxels within the same tissue class and between different classes. The 3D simplex deformable modeling refined the output from 3D CRF to preserve the overall shape and maintain a desirable smooth surface for joint structures. The method was evaluated on 3D fast spin-echo (3D-FSE) MR image data sets. Quantitative morphological metrics were used to evaluate the accuracy and robustness of the method in comparison to the ground truth data. RESULTS: The proposed segmentation method provided good performance for segmenting all knee joint structures. There were 4 tissue types with high mean Dice coefficient above 0.9 including the femur, tibia, muscle, and other non-specified tissues. There were 7 tissue types with mean Dice coefficient between 0.8 and 0.9 including the femoral cartilage, tibial cartilage, patella, patellar cartilage, meniscus, quadriceps and patellar tendon, and infrapatellar fat pad. There was 1 tissue type with mean Dice coefficient between 0.7 and 0.8 for joint effusion and Baker's cyst. Most musculoskeletal tissues had a mean value of average symmetric surface distance below 1 mm. CONCLUSION: The combined CNN, 3D fully connected CRF, and 3D deformable modeling approach was well-suited for performing rapid and accurate comprehensive tissue segmentation of the knee joint. The deep learning-based segmentation method has promising potential applications in musculoskeletal imaging.
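The results above are stated as per-tissue Dice coefficients. A minimal sketch of how a per-class Dice score can be computed from a multi-class label map is given below, assuming voxels are flattened into a plain list of integer labels. The function name, the label numbering, and the convention for a class absent from both maps are illustrative assumptions, not details from the described pipeline.

```python
from typing import Sequence


def dice(pred: Sequence[int], truth: Sequence[int], label: int) -> float:
    """Dice coefficient 2|A & B| / (|A| + |B|) for one class label."""
    pred_count = 0
    truth_count = 0
    overlap = 0
    for p, t in zip(pred, truth):
        in_pred = p == label
        in_truth = t == label
        pred_count += in_pred
        truth_count += in_truth
        overlap += in_pred and in_truth
    total = pred_count + truth_count
    # Assumption: a class absent from both maps scores 1.0.
    return 2 * overlap / total if total else 1.0


# Toy flattened label maps with hypothetical labels 0, 1, and 2.
truth_labels = [0, 1, 1, 2, 2, 2]
pred_labels = [0, 1, 2, 2, 2, 1]

for label in (0, 1, 2):
    print(label, round(dice(pred_labels, truth_labels, label), 3))
```

Reporting the score per class, as the description does for the 12 joint structures, keeps small structures such as joint effusion from being masked by large, easy ones such as bone.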

Project description: Importance: Deep learning-based automatic surgical instrument recognition is an indispensable technology for surgical research and development. However, pixel-level recognition with high accuracy is required to make it suitable for surgical automation. Objective: To develop a deep learning model that can simultaneously recognize 8 types of surgical instruments frequently used in laparoscopic colorectal operations and evaluate its recognition performance. Design, setting, and participants: This quality improvement study was conducted at a single institution with a multi-institutional data set. Laparoscopic colorectal surgical videos recorded between April 1, 2009, and December 31, 2021, were included in the video data set. Deep learning-based instance segmentation, an image recognition approach that recognizes each object individually and pixel by pixel instead of roughly enclosing with a bounding box, was performed for 8 types of surgical instruments. Main outcomes and measures: Average precision, calculated from the area under the precision-recall curve, was used as an evaluation metric. The average precision represents the number of instances of true-positive, false-positive, and false-negative results, and the mean average precision value for 8 types of surgical instruments was calculated. Five-fold cross-validation was used as the validation method. The annotation data set was split into 5 segments, of which 4 were used for training and the remainder for validation. The data set was split at the per-case level instead of the per-frame level; thus, the images extracted from an intraoperative video in the training set never appeared in the validation set. Validation was performed for all 5 validation sets, and the average mean average precision was calculated. Results: In total, 337 laparoscopic colorectal surgical videos were used. Pixel-by-pixel annotation was manually performed for 81 760 labels on 38 628 static images, constituting the annotation data set. The mean average precisions of the instance segmentation for surgical instruments were 90.9% for 3 instruments, 90.3% for 4 instruments, 91.6% for 6 instruments, and 91.8% for 8 instruments. Conclusions and relevance: A deep learning-based instance segmentation model that simultaneously recognizes 8 types of surgical instruments with high accuracy was successfully developed. The accuracy was maintained even when the number of types of surgical instruments increased. This model can be applied to surgical innovations, such as intraoperative navigation and surgical automation.
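The per-case five-fold split described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's procedure: the function name `case_level_folds`, the frame and video identifiers, the fixed seed, and the round-robin assignment of cases to folds are all hypothetical. The point it demonstrates is that folds are assigned to whole cases first, so frames from one video never straddle training and validation.

```python
import random
from typing import Dict, List


def case_level_folds(frame_to_case: Dict[str, str], k: int = 5, seed: int = 0) -> List[List[str]]:
    """Split frames into k folds so that all frames of a case share one fold."""
    cases = sorted(set(frame_to_case.values()))
    random.Random(seed).shuffle(cases)  # fixed seed keeps the split reproducible
    fold_of_case = {case: i % k for i, case in enumerate(cases)}

    folds: List[List[str]] = [[] for _ in range(k)]
    for frame, case in frame_to_case.items():
        folds[fold_of_case[case]].append(frame)
    return folds


# Hypothetical data: 10 videos with 3 extracted frames each.
frames = {
    f"video{v:02d}_frame{f}": f"video{v:02d}"
    for v in range(10)
    for f in range(3)
}

folds = case_level_folds(frames)
for i, val_frames in enumerate(folds):
    train_frames = [fr for j, fold in enumerate(folds) if j != i for fr in fold]
    train_cases = {frames[fr] for fr in train_frames}
    val_cases = {frames[fr] for fr in val_frames}
    # No video contributes frames to both sides of the split.
    assert train_cases.isdisjoint(val_cases)
    print(f"fold {i}: {len(train_frames)} training frames, {len(val_frames)} validation frames")
```

A per-frame split would instead place near-identical neighbouring frames from the same operation on both sides, which tends to inflate validation scores.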

Project description: Purpose: To quantitatively evaluate the generalizability of a deep learning segmentation tool to MRI data from scanners of different MRI manufacturers and to improve the cross-manufacturer performance by using a manufacturer-adaptation strategy. Materials and methods: This retrospective study included 150 cine MRI datasets from three MRI manufacturers, acquired between 2017 and 2018 (n = 50 for manufacturer 1, manufacturer 2, and manufacturer 3). Three convolutional neural networks (CNNs) were trained to segment the left ventricle (LV), using datasets exclusively from images from a single manufacturer. A generative adversarial network (GAN) was trained to adapt the input image before segmentation. The LV segmentation performance, end-diastolic volume (EDV), end-systolic volume (ESV), LV mass, and LV ejection fraction (LVEF) were evaluated before and after manufacturer adaptation. Paired Wilcoxon signed rank tests were performed. Results: The segmentation CNNs exhibited a significant performance drop when applied to datasets from different manufacturers (Dice reduced from 89.7% ± 2.3 [standard deviation] to 68.7% ± 10.8, P < .05, from 90.6% ± 2.1 to 59.5% ± 13.3, P < .05, from 89.2% ± 2.3 to 64.1% ± 12.0, P < .05, for manufacturer 1, 2, and 3, respectively). After manufacturer adaptation, the segmentation performance was significantly improved (from 68.7% ± 10.8 to 84.3% ± 6.2, P < .05, from 72.4% ± 10.2 to 85.7% ± 6.5, P < .05, for manufacturer 2 and 3, respectively). Quantitative LV function parameters were also significantly improved. For LVEF, the manufacturer adaptation increased the Pearson correlation from 0.005 to 0.89 for manufacturer 2 and from 0.77 to 0.94 for manufacturer 3. Conclusion: A segmentation CNN well trained on datasets from one MRI manufacturer may not generalize well to datasets from other manufacturers. The proposed manufacturer adaptation can largely improve the generalizability of a deep learning segmentation tool without additional annotation. Supplemental material is available for this article. © RSNA, 2020.
