Dataset Information

Crowdsourced Assessment of Surgical Skill Proficiency in Cataract Surgery.

ABSTRACT:

Objective

To test whether crowdsourced lay raters can accurately assess cataract surgical skills.

Design

Two-armed study: independent cross-sectional and longitudinal cohorts.

Setting

Washington University Department of Ophthalmology.

Participants and methods

Sixteen cataract surgeons with varying experience levels submitted cataract surgery videos to be graded by 5 experts and 300+ crowdworkers masked to surgeon experience. Cross-sectional study: 50 videos from surgeons ranging from first-year resident to attending physician, pooled by years of training. Longitudinal study: 28 videos obtained at regular intervals as residents progressed through 180 cases. Surgical skill was graded using the modified Objective Structured Assessment of Technical Skill (mOSATS). Main outcome measures were overall technical performance, reliability indices, and correlation between expert and crowd mean scores.

Results

Experts demonstrated high interrater reliability and accurately predicted training level, establishing construct validity for the modified OSATS. Crowd scores were correlated with (r = 0.865, p < 0.0001) but consistently higher than expert scores for first, second, and third-year residents (p < 0.0001, paired t-test). Longer surgery duration negatively correlated with training level (r = -0.855, p < 0.0001) and expert score (r = -0.927, p < 0.0001). The longitudinal dataset reproduced cross-sectional study findings for crowd and expert comparisons. A regression equation transforming crowd score plus video length into expert score was derived from the cross-sectional dataset (r² = 0.92) and demonstrated excellent predictive modeling when applied to the independent longitudinal dataset (r² = 0.80). A group of student raters who had edited the cataract videos also graded them, producing scores that more closely approximated experts than the crowd.
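The crowd-to-expert adjustment reported above can be illustrated with an ordinary least-squares fit. This is a minimal sketch, not the authors' equation: the record does not give the coefficients or the exact model form, so the code assumes a linear model, expert ≈ b0 + b1·crowd + b2·minutes, and uses synthetic numbers only to show the fit-then-validate pattern (fit on one cohort, compute r² on an independent one). All function names and data are hypothetical.

```python
import numpy as np

def fit_expert_model(crowd, minutes, expert):
    """Least-squares fit of expert ~ b0 + b1*crowd + b2*minutes (assumed form)."""
    X = np.column_stack([np.ones(len(crowd)), crowd, minutes])
    coef, *_ = np.linalg.lstsq(X, np.asarray(expert, dtype=float), rcond=None)
    return coef  # array [b0, b1, b2]

def predict_expert(coef, crowd, minutes):
    """Apply fitted coefficients to new crowd scores and video lengths."""
    X = np.column_stack([np.ones(len(crowd)), crowd, minutes])
    return X @ coef

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def make_synthetic(rng, n):
    """Synthetic cohort (NOT study data): longer videos get lower expert scores,
    and crowd scores are a compressed, inflated version of expert scores."""
    minutes = rng.uniform(10, 45, n)
    expert = 5.0 - 0.08 * minutes + rng.normal(0, 0.2, n)
    crowd = 3.0 + 0.4 * expert + rng.normal(0, 0.15, n)
    return crowd, minutes, expert

rng = np.random.default_rng(0)
# Sizes mirror the 50 cross-sectional and 28 longitudinal videos.
train = make_synthetic(rng, 50)
test = make_synthetic(rng, 28)

coef = fit_expert_model(*train)
print("coefficients:", coef)
print("r2 on fitting cohort:", r_squared(train[2], predict_expert(coef, train[0], train[1])))
print("r2 on held-out cohort:", r_squared(test[2], predict_expert(coef, test[0], test[1])))
```

Evaluating on a cohort that played no part in fitting, as the study did with its longitudinal dataset, is what makes the second r² a test of prediction rather than of fit.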

Conclusions

Crowdsourced rankings correlated with expert scores, but were not equivalent; crowd scores overestimated technical competency, especially for novice surgeons. A novel approach of adjusting crowd scores with surgery duration generated a more accurate predictive model for surgical skill. More studies are needed before crowdsourcing can be reliably used for assessing surgical proficiency.

SUBMITTER: Paley GL

PROVIDER: S-EPMC8217126 | biostudies-literature | 2021 Jul-Aug

REPOSITORIES: biostudies-literature

Publications

Crowdsourced Assessment of Surgical Skill Proficiency in Cataract Surgery.

Paley GL, Grove R, Sekhar TC, Pruett J, Stock MV, Pira TN, Shields SM, Waxman EL, Wilson BS, Gordon MO, Culican SM

Journal of Surgical Education, 2021-02-25, issue 4

PMID: 33640326

Similar Datasets

Project description: Importance: Complex surgical interventions are inherently prone to variation, yet they are not objectively measured. The reasons for outcome differences following cancer surgery are unclear. Objective: To quantify surgical skill within advanced laparoscopic procedures and its association with histopathological and clinical outcomes. Design, setting, and participants: This analysis of data and video from the Australasian Laparoscopic Cancer of Rectum (ALaCaRT) and 2-dimensional/3-dimensional (2D3D) multicenter randomized laparoscopic total mesorectal excision trials, which were conducted at 28 centers in Australia, the United Kingdom, and New Zealand, was performed from 2018 to 2019 and included 176 patients with clinical T1 to T3 rectal adenocarcinoma 15 cm or less from the anal verge. Case videos underwent blinded objective analysis using a bespoke performance assessment tool developed with a 62-expert international Delphi exercise and workshop, interview, and pilot phases. Interventions: Laparoscopic total mesorectal excision undertaken with curative intent by 34 credentialed surgeons. Main outcomes and measures: Histopathological (plane of mesorectal dissection, ALaCaRT composite end point success [mesorectal fascial plane, circumferential margin, ≥1 mm; distal margin, ≥1 mm]) and 30-day morbidity. End points were analyzed using surgeon quartiles defined by tool scores. Results: The laparoscopic total mesorectal excision performance tool was produced and shown to be reliable and valid for the specialist level (intraclass correlation coefficient, 0.889; 95% CI, 0.832-0.926; P < .001). A substantial variation in tool scores was recorded (range, 25-48). Scores were associated with the number of intraoperative errors, plane of mesorectal dissection, and short-term patient morbidity, including the number and severity of complications. Upper quartile-scoring surgeons obtained excellent results compared with the lower quartile (mesorectal fascial plane: 93% vs 59%; number needed to treat [NNT], 2.9; P = .002; ALaCaRT end point success, 83% vs 58%; NNT, 4; P = .03; 30-day morbidity, 23% vs 50%; NNT, 3.7; P = .03). Conclusions and relevance: Intraoperative surgical skill can be objectively and reliably measured in complex cancer interventions. Substantial variation in technical performance among credentialed surgeons is seen and significantly associated with clinical and pathological outcomes.
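The NNT figures quoted above can be checked by hand. The sketch below assumes the conventional definition, NNT = 1 / |absolute difference in event rates|, applied to the upper- versus lower-quartile percentages; the abstract does not state the formula, and the helper name `nnt` is hypothetical.

```python
def nnt(rate_a: float, rate_b: float) -> float:
    """Number needed to treat: reciprocal of the absolute rate difference.

    Rates are proportions between 0 and 1 (e.g. 0.93 for 93%).
    """
    diff = abs(rate_a - rate_b)
    if diff == 0:
        raise ValueError("rates are equal; NNT is undefined")
    return 1.0 / diff

# Upper vs lower quartile rates as reported in the abstract.
comparisons = {
    "mesorectal fascial plane": (0.93, 0.59),
    "ALaCaRT end point success": (0.83, 0.58),
    "30-day morbidity": (0.23, 0.50),
}
for name, (upper, lower) in comparisons.items():
    print(f"{name}: NNT = {nnt(upper, lower):.1f}")
```

Under this assumption the three differences (34, 25, and 27 percentage points) give 2.9, 4.0, and 3.7, which agree with the reported values after rounding.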

Project description: Purpose: Surgeons' skill in the operating room is a major determinant of patient outcomes. Assessment of surgeons' skill is necessary to improve patient outcomes and quality of care through surgical training and coaching. Methods for video-based assessment of surgical skill can provide objective and efficient tools for surgeons. Our work introduces a new method based on attention mechanisms and provides a comprehensive comparative analysis of state-of-the-art methods for video-based assessment of surgical skill in the operating room. Methods: Using a dataset of 99 videos of capsulorhexis, a critical step in cataract surgery, we evaluated image feature-based methods and two deep learning methods to assess skill using RGB videos. In the first method, we predict instrument tips as keypoints and predict surgical skill using temporal convolutional neural networks. In the second method, we propose a frame-wise encoder (2D convolutional neural network) followed by a temporal model (recurrent neural network), both of which are augmented by visual attention mechanisms. We computed the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and predictive values through fivefold cross-validation. Results: To classify a binary skill label (expert vs. novice), the range of AUC estimates was 0.49 (95% confidence interval [CI] = 0.37 to 0.60) to 0.76 (95% CI = 0.66 to 0.85) for image feature-based methods. None of the methods achieved consistently high sensitivity and specificity. For the deep learning methods, the AUC was 0.79 (95% CI = 0.70 to 0.88) using keypoints alone, 0.78 (95% CI = 0.69 to 0.88) and 0.75 (95% CI = 0.65 to 0.85) with and without attention mechanisms, respectively. Conclusion: Deep learning methods are necessary for video-based assessment of surgical skill in the operating room. Attention mechanisms improved discrimination ability of the network. Our findings should be evaluated for external validity in other datasets.

Project description: OBJECTIVES: To investigate differences in surgical time, the distance the surgical instrument travelled and number of movements required to complete manual phacoemulsification cataract surgery versus femtosecond laser cataract surgery. DESIGN: Non-randomised comparative case series. SETTING: Single surgery site, Moorfields Eye Hospital, UK. PARTICIPANTS: 40 cataract surgeries of 40 patients. INTERVENTIONS: Laser-assisted and manual phacoemulsification cataract surgery. Laser-assisted surgery cases were performed using the AMO Catalys platform. PRIMARY AND SECONDARY OUTCOME MEASURES: The computer vision tracking software PhacoTracking was applied to the recordings to establish the distance the instrument travelled, total number of movements (the number of times an instrument stops and starts moving) and time taken for surgery steps including phacoemulsification, irrigation-aspiration (IA) and overall surgery time. The time taken for laser docking and delivery was not included in the analyses. RESULTS: Data on 19 laser-assisted and 19 manual phacoemulsification surgeries were analysed (two cases were excluded due to insufficient video-recording quality). There were no differences in the number of instrument moves, the distance the instrument travelled or time taken to complete the phacoemulsification stage. However, for IA, the number of instrument moves (manual: mean 20 (SD 15) vs laser: mean 38 (SD 22), P=0.008) and time taken (manual: mean 75 s (SD 24) vs laser: mean 108 s (SD 36), P=0.003) were significantly greater for laser cases. For laser versus manual cases overall, there was no difference in number of moves or the distance the instrument travelled, but laser cases took longer (mean 88 s, P=0.049). CONCLUSIONS: Laser cataract surgery cases took longer to complete without accounting for the time taken to complete the laser procedure itself. This appears to be in part due to IA requiring more instrument manoeuvres and taking longer to complete. Data from a large randomised series would better elucidate this relationship.

Project description: Aims: In cardiac device implantation, having both surgical skills and ability to manipulate catheter/lead/wire is crucial. Few cardiologists, however, receive formal surgical training prior to implanting. Skills are mostly acquired directly on-the-job and surgical technique varies across institutions; suboptimal approaches may increase complications. We investigated how novel proficiency-based progression (PBP) simulation training impacts the surgical quality of implantations, compared to traditional simulation (SIM) training. Methods and results: In this international prospective study, novice implanters were randomized (blinded) 1:1 to participate in a simulation-based procedure training curriculum, with proficiency demonstration requirements for advancing (PBP approach) or without (SIM). Ultimately, trainees performed the surgical tasks of an implant on a porcine tissue that was video-recorded and then scored by two independent assessors (blinded to group), using previously validated performance metrics. Primary outcomes were the number of procedural Steps Completed, Critical Errors, Errors (non-critical), and All Errors Combined. Thirty novice implanters from 10 countries participated. Baseline experiences were similar between groups. Compared to SIM-trained, the PBP-trained group completed on average 11% more procedural Steps (P < 0.001) and made 61.2% fewer Critical Errors (P < 0.001), 57.1% fewer Errors (P = 0.140), and 60.7% fewer All Errors Combined (P = 0.001); 11/15 (73%) PBP trainees demonstrated the predefined target performance level vs. 3/15 SIM trainees (20%) in the video-recorded performance. Conclusion: Proficiency-based progression training produces superior objectively assessed novice operators' surgical performance in device implantation compared with traditional (simulation) training. Systematic PBP incorporation into formal academic surgical skills training is recommended before in vivo device practice. Future studies will quantify PBP training's effect on surgery-related device complications.

Project description:The purpose of this study was to characterize the motion features of surgical devices associated with laparoscopic surgical competency and build an automatic skill-credential system in porcine cadaver organ simulation training. Participants performed tissue dissection around the aorta, dividing vascular pedicles after applying Hem-o-lok (tissue dissection task) and parenchymal closure of the kidney (suturing task). Movements of surgical devices were tracked by a motion capture (Mocap) system, and Mocap-metrics were compared according to the level of surgical experience (experts: ≥50 laparoscopic surgeries, intermediates: 10-49, novices: 0-9), using the Kruskal-Wallis test and principal component analysis (PCA). Three machine-learning algorithms: support vector machine (SVM), PCA-SVM, and gradient boosting decision tree (GBDT), were utilized for discrimination of the surgical experience level. The accuracy of each model was evaluated by nested and repeated k-fold cross-validation. A total of 32 experts, 18 intermediates, and 20 novices participated in the present study. PCA revealed that efficiency-related metrics (e.g., path length) significantly contributed to PC 1 in both tasks. Regarding PC 2, speed-related metrics (e.g., velocity, acceleration, jerk) of right-hand devices largely contributed to the tissue dissection task, while those of left-hand devices did in the suturing task. Regarding the three-group discrimination, in the tissue dissection task, the GBDT method was superior to the other methods (median accuracy: 68.6%). In the suturing task, SVM and PCA-SVM methods were superior to the GBDT method (57.4 and 58.4%, respectively). Regarding the two-group discrimination (experts vs. intermediates/novices), the GBDT method resulted in a median accuracy of 72.9% in the tissue dissection task, and, in the suturing task, the PCA-SVM method resulted in a median accuracy of 69.2%. 
Overall, the mocap-based credential system using machine-learning classifiers provides a correct judgment rate of around 70% (two-group discrimination). Together with motion analysis and wet-lab training, simulation training could be a practical method for objectively assessing the surgical competence of trainees.

Project description: Purpose: The aim of this article is to investigate the impact of a 1-minute video describing resident training with a cataract surgical simulator on patients' perceptions regarding resident involvement in cataract surgery and to identify factors associated with patient willingness to have cataract surgery performed by a resident. Design: Cross-sectional survey. Methods: An anonymous Likert-style survey was conducted among 430 consecutive adult patients who presented for eye examination at the Penn State Health Eye Center. The survey included questions regarding demographics, understanding of the medical training hierarchy, and patient willingness to have a resident perform their cataract surgery. There were six questions regarding patient willingness to have residents perform their cataract surgery and the second question in this set informs the patient that residents are supervised by an experienced cataract surgeon. Patients were randomly assigned to one of two groups: patients in Group 1 completed the survey only, while patients in Group 2 watched a 1-minute video describing resident training with a cataract surgical simulator prior to completing the survey. Results: Four hundred fourteen of the 430 patients (96.3%) completed the survey. Overall, 24.7% (n = 102) of respondents expressed willingness to allow an ophthalmology resident to perform their cataract surgery, and that proportion increased to 54.0% (n = 223) if the patient was informed that the resident would be supervised by an experienced cataract surgeon. Patients in Group 2 were twice as likely compared with patients in Group 1 to express willingness to allow an ophthalmology resident to perform their cataract surgery (odds ratio 1.92 [1.18-3.11], p = 0.009). Conclusions: A thorough informed consent process including information regarding attending supervision and a brief video detailing resident training with a cataract surgery simulator may increase patient willingness to allow resident participation in cataract surgery.
