Effect of Intelligent Tutor Induced Pausing on Learning Simulated Surgical Skills
- Conditions
- Surgical Education
- Interventions
- Behavioral: Experimental Group ICEMS verbal feedback with pause group
- Behavioral: Experimental Group ICEMS audio feedback with pause and expert video
- Registration Number
- NCT06235788
- Lead Sponsor
- McGill University
- Brief Summary
Traditional training of surgical technical skills relies on mentorship from experienced surgeons, who continuously evaluate trainee performance and correct it through verbal instructions to prevent errors and potential patient harm. These educators may also pause the procedure to explain the risks associated with the trainee's actions, and may personally demonstrate proper techniques to the students. Studies examining pausing during medical care suggest that these approaches support learning.
An artificial intelligence (AI) tutoring system, the Intelligent Continuous Expertise Monitoring System (ICEMS), improves learning in a simulated surgical operation by providing trainees with verbal instructions upon error identification. However, the effect of including a pause during this AI teaching has not been studied. Therefore, the ICEMS post-error identification methodology has been altered to include a pause with the intelligent tutor voice instruction.
The aim of this study is to determine the effect of pausing on surgical skill acquisition and transfer among pre-medical and medical students. This will be done by comparing their performance in repeated simulated tumour resection tasks.
- Detailed Description
Background: Surgical skill assessment is shifting from a quantitative, time-based approach towards a qualitative evaluation of a trainee's competency. During surgical procedures, instructors continuously monitor trainee performance and utilize various teaching methods focused on enhancing acquisition of surgical skills. One such method includes pausing the operation, either to outline the risks associated with the trainee's performance or to personally demonstrate the best practice technique(s). Pausing in such situations has been shown to allow learners to re-assess best practice, interrupt negative momentum, and allow for learning. Specifically, pausing after an error can prevent introduction of new information that may affect one's ability to reflect on their error and reduce stress before continuing.
Rationale: The ICEMS, an AI tutoring system, was developed by our group using a Long Short-Term Memory deep learning algorithm to assess surgical performance and provide guidance. This was then integrated with the NeuroVR simulation platform. Using this AI system, the provision of verbal feedback on error identification demonstrated the potential of intelligent tutoring to improve learning in two previous randomized controlled trials (RCTs). However, these RCTs did not incorporate pausing methodology post-error identification. To further emulate the mentorship of an experienced surgeon in a clinical setting, the ICEMS platform has been modified to both initiate pausing when learner error is identified and provide a video demonstrating expert performance.
Research aims: To compare the effect of incorporating a pause after intelligent tutor instruction to intelligent tutor instruction alone on medical and pre-medical students' surgical skill acquisition and skill transfer.
Hypotheses:
1. The pause with video group will significantly improve the composite score between the first and sixth repetition of the practice scenario.
2. The pause with video group will have a composite score statistically higher than the control group in the sixth repetition of the practice scenario.
3. The pause with video group will have a composite score statistically non-inferior to the pause without video group in the sixth repetition of the practice scenario.
4. The pause with video group will have a global OSATS score statistically higher than the control group in the realistic scenario.
5. The pause with video group will have a global OSATS score statistically non-inferior to the pause without video group in the realistic scenario.
6. The pause with video group will not elicit a difference in emotional stress or cognitive load compared to the control group.
Specific objectives:
1. To assess if the efficacy of AI mediated real-time tailored feedback combined with pausing methodology is statistically superior to AI mediated real-time tailored feedback alone in improving medical students' surgical skills on two virtual reality surgery tasks.
2. To determine if different emotions and cognitive load are elicited by the pausing methodology as compared to only hearing the AI mediated feedback.
Design: A three-arm single blinded randomized controlled trial of AI feedback with pausing methodology and an expert demonstration video versus AI feedback with only pausing methodology versus AI feedback alone.
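The record does not state how participants are randomized. As an illustration only, the following minimal Python sketch shows one common way to obtain equal arm sizes, permuted-block randomization with blocks of three. The arm labels, the function name `allocate`, and the block size are assumptions made for this example, not details of the trial.

```python
import random
from collections import Counter

# Hypothetical arm labels; the record describes three arms but not these names.
ARMS = ["feedback only", "feedback + pause", "feedback + pause + video"]


def allocate(n_participants, seed=None):
    """Assign participants to arms using permuted blocks of size len(ARMS).

    Each block contains every arm exactly once in random order, so arm sizes
    never differ by more than one at any point during recruitment.
    """
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_participants:
        block = ARMS[:]      # copy so the shuffle does not reorder ARMS
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_participants]


if __name__ == "__main__":
    # 129 is the target recruitment; 129 / 3 gives 43 participants per arm.
    print(Counter(allocate(129, seed=1)))
```

With the target recruitment of 129, every block is complete, so each arm receives exactly 43 participants regardless of the seed.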
Setting: Neurosurgical Simulation and Artificial Intelligence Learning Centre.
Participants: Students enrolled in a Quebec medical school in a preparatory year or in the first or second year of medical school.
Task: Using the NeuroVR surgical simulator by CAE Healthcare, resect a simulated practice tumour six times and a complex simulated realistic brain tumour once using an Ultrasonic Aspirator and Bipolar pincers while minimizing bleeding and preserving the surrounding, simulated healthy brain structures.
Intervention: A 90-minute training session where participants will have seven simulated subpial tumour resection attempts (six repetitions of a simple practice scenario and one attempt at a complex realistic scenario). All participants will receive auditory feedback from the ICEMS but will differ in what follows:
1. Continuously perform the procedure (i.e., no pause) (Group 1);
2. A pause followed by a reflection period (Group 2);
3. A pause followed by an expert-level demonstration video and a reflection period (Group 3).
Auditory feedback will be based on 4 metrics:
1. Instrument tip separation distance;
2. Low bipolar force;
3. High aspirator force;
4. High bipolar force.
Initially, feedback will be given only on the first metric (instrument tip separation distance). Once a repetition is completed without receiving feedback, the subsequent repetition will assess the next metric in the list above, and so on through the list.
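The metric progression rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the ICEMS implementation: the function names are hypothetical, and the behaviour after the fourth metric is cleared (staying on the last metric) is an assumption, since the record does not specify it.

```python
# Metric order as listed in the protocol.
METRICS = [
    "instrument tip separation distance",
    "low bipolar force",
    "high aspirator force",
    "high bipolar force",
]


def next_metric_index(current_index, feedback_count):
    """Advance to the next metric only after a feedback-free repetition.

    Assumption: once the last metric is reached, the index stays there.
    """
    if feedback_count == 0 and current_index < len(METRICS) - 1:
        return current_index + 1
    return current_index


def metrics_per_repetition(feedback_counts):
    """Return the metric assessed in each repetition.

    feedback_counts[i] is the number of feedback messages the participant
    received during repetition i (hypothetical input format).
    """
    index = 0
    assessed = []
    for count in feedback_counts:
        assessed.append(METRICS[index])
        index = next_metric_index(index, count)
    return assessed


if __name__ == "__main__":
    # Feedback received in repetitions 1 and 3, none in repetitions 2 and 4.
    for rep, metric in enumerate(metrics_per_repetition([3, 0, 2, 0]), start=1):
        print(rep, metric)
```

In this example the participant stays on tip separation for two repetitions, then moves to low bipolar force only after completing a repetition without triggering feedback.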
Main outcomes and measures:
The two co-primary outcomes are:
1. The improvement in surgical performance, as measured by the composite score computed by the previously validated evaluation module of the ICEMS. Performance improvement is measured by the difference in composite score between each of the 6 repetitions of the practice scenario. Transfer of learning will be measured by the participant's composite score for the complex realistic scenario.
2. The performance score of the participants in the complex realistic scenario, assessed by two blinded experts using the Objective Structured Assessment of Technical Skills (OSATS) global rating scale.
The secondary outcome is the difference in the strength of emotions elicited, measured before the practice scenario, immediately before the realistic scenario, and after completion of all attempts using Duffy's Medical Emotional Scale (MES). Cognitive load will also be measured following completion of all tasks using Leppink's Cognitive Load Index (CLI). Both outcomes are measured using self-reports.
Statistical Analysis Plan: Participant data will be anonymized and stored. The ICEMS will assess the participant's surgical performance and provide a performance score at 0.2-second intervals throughout each repetition of the simulated surgical task. An average composite score will then be calculated for each repetition. Using ANCOVA, improvement in performance and participant learning will be assessed by comparing the composite score of the first practice scenario repetition (baseline) and the composite score of the sixth repetition (summative). Meanwhile, the composite score of the complex realistic scenario will be used to assess the transfer of learning using a one-way ANOVA. With an effect size of 0.25 and a significance level of 0.05, a total sample size of 129 provides 80% power to detect a significant interaction.
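The per-repetition averaging described above can be illustrated with a short Python sketch. It only shows how one average composite score per repetition could be derived from the 0.2-second scores and how a baseline and a summative value are obtained. The function names and the input layout are assumptions, and the simple difference computed here is descriptive only: the planned analysis is the ANCOVA stated in the protocol.

```python
from statistics import mean

SAMPLE_INTERVAL_S = 0.2  # scoring interval stated in the protocol


def expected_samples(duration_s):
    """Number of 0.2-second scores expected for a repetition of this length."""
    return round(duration_s / SAMPLE_INTERVAL_S)


def repetition_composite(scores):
    """Average the per-interval scores of one repetition."""
    if not scores:
        raise ValueError("a repetition needs at least one score")
    return mean(scores)


def baseline_and_summative(repetitions):
    """Return (first, last) average composite scores.

    repetitions is a list of score lists, one per practice repetition
    (hypothetical layout). The first is the baseline, the last the summative.
    """
    composites = [repetition_composite(r) for r in repetitions]
    return composites[0], composites[-1]


if __name__ == "__main__":
    # Toy data only: two very short "repetitions" with made-up scores.
    baseline, summative = baseline_and_summative([[-0.5, -0.5], [0.25, 0.75]])
    print(baseline, summative, summative - baseline)
    print(expected_samples(5 * 60))  # scores in a 5-minute repetition
```

A 5-minute practice repetition scored every 0.2 seconds yields 1500 scores, which are reduced to a single average composite score on the protocol's scale from -1.00 (novice) to 1.00 (expert).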
Videos of participant performance in the complex realistic scenario will be evaluated by two blinded expert raters using the OSATS global rating scale. The OSATS score will be analyzed between groups using a one-way ANOVA to compare efficiency of learning and skill retention.
Emotional changes before, during, and after learning in the simulated scenarios will be evaluated using a two-way mixed ANOVA, while one-way ANOVA will be used to assess cognitive load after learning.
Recruitment & Eligibility
- Status
- RECRUITING
- Sex
- All
- Target Recruitment
- 129
- Inclusion Criteria
- First and second year medical students who are actively enrolled in any Quebec institution and who do not meet the exclusion criteria.
- Students actively enrolled in a preparatory year of medical school in any Quebec institution who do not meet the exclusion criteria.
- Exclusion Criteria
- Participation in previous trials involving the NeuroVR (CAE Healthcare) simulator
Study & Design
- Study Type
- INTERVENTIONAL
- Study Design
- PARALLEL
- Arm & Interventions
Group: Experimental Group ICEMS verbal feedback with pause group
Intervention: Experimental Group ICEMS verbal feedback with pause group
Description: 43 participants. Individuals receive standard information. They perform six 5-min practice scenario resections with a 5-min break between each one. The 7th attempt is the 13-min realistic scenario. Participants receive no feedback during the first repetition. They will then receive feedback on 4 metrics, one metric at a time: instrument tip separation, low bipolar force, high aspirator force, high bipolar force. Once an attempt is completed without receiving feedback, the next repetition will assess the next metric in the list above. During the 5-min break after an attempt is completed without receiving feedback, participants can watch an optional expert-level demonstration video corresponding to the next metric. Participants receive no feedback during the 6th repetition. They will have no feedback in their 7th repetition, the realistic scenario.
Group: Experimental Group ICEMS verbal feedback with pause and expert-level video demonstration
Intervention: Experimental Group ICEMS audio feedback with pause and expert video
Description: 43 participants. Individuals receive standard information. They perform six 5-min practice scenario resections with a 5-min break between each one. The 7th attempt is the 13-min realistic scenario. Participants receive no feedback during the first repetition. They will then receive feedback on 4 metrics, one metric at a time: instrument tip separation, low bipolar force, high aspirator force, high bipolar force. Once an attempt is completed without receiving feedback, the next repetition will assess the next metric in the list above. During the 5-min break after an attempt is completed without receiving feedback, participants can watch an optional expert-level demonstration video corresponding to the next metric. Participants receive no feedback during the 6th repetition. They will have no feedback in their 7th repetition, the realistic scenario.
- Primary Outcome Measures
Name: Change in performance
Time: 1 day of study
Method: Evaluated by comparing the average composite-score, calculated by the ICEMS, from each practice scenario. Scores range from expert/skilled level (a score of 1.00) to novice/less-skilled level (a score of -1.00).
Name: Objective Structured Assessment of Technical Skills (OSATS) global rating scale
Time: 1 day of study
Method: Performance score of the participants in the complex realistic scenario, assessed by two blinded experts using the Objective Structured Assessment of Technical Skills (OSATS) global rating scale on a 7-point Likert scale (1 = novice to 7 = expert). Efficacy in learning with pausing methodology and an expert-level demonstration video will be compared to pausing methodology alone and to no pausing methodology.
Name: Transfer of learning
Time: 1 day of study
Method: Evaluated by comparing the average composite-score, calculated by the ICEMS, from each practice scenario. Scores range from expert/skilled level (a score of 1.00) to novice/less-skilled level (a score of -1.00).
- Secondary Outcome Measures
Name: Differences in strength of emotions elicited
Time: 1 day of study
Method: Measured by Duffy's Medical Emotional Scale (MES) before, during, and after learning. Participants will self-report the intensity of each emotion on a 5-point Likert scale (1 = not at all to 5 = very strong).
Name: Difference in Cognitive Load
Time: 1 day of study
Method: Measured using Leppink's Cognitive Load Index (CLI) after the intervention. Participants will self-report their level of agreement with each statement on a 5-point Likert scale (1 = strongly disagree to 5 = strongly agree).
Trial Locations
- Locations (1)
Neurosurgical Simulation and Artificial Intelligence Learning Centre
🇨🇦Montreal, Quebec, Canada