Evaluating Teamwork in a Simulated Obstetric Environment

2007
Vol 106 (5)
pp. 907-915
Author(s):
Pamela J. Morgan
Richard Pittini
Glenn Regehr
Carol Marrs
Michèle F. Haley

Background The National Confidential Enquiry into Maternal Deaths identified "lack of communication and teamwork" as a leading cause of substandard obstetric care. The authors used high-fidelity simulation to present obstetric scenarios for team assessment. Methods Obstetric nurses, physicians, and resident physicians were repeatedly assigned to teams of five or six, each team managing one of four scenarios. Each person participated in two or three scenarios with differently constituted teams. Participants and nine external raters rated the teams' performances using a Human Factors Rating Scale (HFRS) and a Global Rating Scale (GRS). Interrater reliability was determined using intraclass correlations and Cronbach's alpha. Analyses of variance were used to determine the reliability of the two measures and the effects of both scenario and rater profession (R.N. vs. M.D.) on scores. Pearson product-moment correlations were used to compare external with self-generated assessments. Results The average of the nine external raters' scores showed good reliability for both the HFRS and the GRS; however, the intraclass correlation coefficient for a single rater was low. There was some effect of rater profession on the self-generated HFRS but not on the GRS. An analysis of profession-specific subscores on the HFRS revealed no interaction between the profession of the rater and the profession being rated. There was low correlation between externally and self-generated team assessments. Conclusions This study does not support the use of the HFRS for the assessment of obstetric teams. The GRS shows promise as a summative but not a formative assessment tool. A domain-specific behavioral marking system for obstetric teams still needs to be developed.
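The contrast reported above — good reliability for the average of nine raters alongside a low single-rater intraclass correlation — is exactly what the Spearman–Brown prophecy formula predicts. As an illustrative sketch (the single-rater ICC of 0.30 below is hypothetical, not a value from the study):

```python
def spearman_brown(icc_single: float, k: int) -> float:
    """Reliability of the mean of k raters, given the single-rater ICC."""
    return k * icc_single / (1 + (k - 1) * icc_single)

# A single-rater ICC of 0.30 is poor on its own, yet the average of
# nine such raters reaches a much higher reliability:
print(round(spearman_brown(0.30, 9), 2))  # 0.79
```

This is why panel-averaged scores can look dependable even when no individual rater's judgement is.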

2016
Vol 2016
pp. 1-13
Author(s):
Richard R. McNeer
Roman Dudaryk
Nicholas B. Nedeff
Christopher L. Bennett

Introduction. Medical simulators are used for assessing clinical skills and increasingly for testing hypotheses. We developed and tested an approach for assessing performance in anesthesia residents using screen-based simulation that ensures expert raters remain blinded to subject identity and experimental condition. Methods. Twenty anesthesia residents managed emergencies in an operating room simulator by logging actions through a custom graphical user interface. Two expert raters rated performance based on these entries using custom Global Rating Scale (GRS) and Crisis Management Checklist (CMC) instruments. Interrater reliability was measured by calculating intraclass correlation coefficients (ICC), and internal consistency of the instruments was assessed with Cronbach’s alpha. Agreement between GRS and CMC was measured using Spearman rank correlation (SRC). Results. Interrater agreement (GRS: ICC = 0.825, CMC: ICC = 0.878) and internal consistency (GRS: alpha = 0.838, CMC: alpha = 0.886) were good for both instruments. Subscale analysis indicated that several instrument items can be discarded. GRS and CMC scores were highly correlated (SRC = 0.948). Conclusions. In this pilot study, we demonstrated that screen-based simulation can allow blinded assessment of performance. GRS and CMC instruments demonstrated good rater agreement and internal consistency. We plan to further test construct validity of our instruments by measuring performance in our simulator as a function of training level.
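Cronbach's alpha, used above for internal consistency, can be computed directly from an item-score matrix; a minimal sketch with made-up scores (not the study's data):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_subjects, k_items) score matrix."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Two perfectly consistent items (the second is always the first plus one)
# give the maximum alpha of 1.0:
scores = np.array([[1, 2], [2, 3], [3, 4], [4, 5]], dtype=float)
print(round(cronbach_alpha(scores), 3))  # 1.0
```

Values around 0.8-0.9, as reported above, indicate items that largely measure the same underlying construct.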


Author(s):  
M Stavrakas
G Menexes
S Triaridis
P Bamidis
J Constantinidis
...  

Abstract Objective This study developed an assessment tool, based on objective structured assessment of technical skills principles, for the evaluation of surgical skills in cortical mastoidectomy. The objective structured assessment of technical skill is a well-established tool for the evaluation of surgical ability. This study also aimed to identify the best material and printing method for making a three-dimensional printed temporal bone model. Methods Twenty-four otolaryngologists in training were asked to perform a cortical mastoidectomy on a three-dimensional printed temporal bone (selective laser sintering resin). They were scored according to the objective structured assessment of technical skill in temporal bone dissection tool developed in this study and an already validated global rating scale. Results Two external assessors scored the candidates, and the objective structured assessment of technical skill in temporal bone dissection tool demonstrated several key aspects of validity and reliability, supporting its use in training and in the performance evaluation of technical skills in mastoid surgery. Conclusion Apart from validating the new tool for temporal bone dissection training, the study showed that evolving three-dimensional printing technologies are of high value in simulation training, with several advantages over traditional teaching methods.


2017
Vol 45 (9)
pp. 2125-2130
Author(s):
Lisa Phillips
Jeffrey J.H. Cheung
Daniel B. Whelan
Michael Lucas Murnaghan
Jas Chahal
...  

Background: Arthroscopic hip labral repair is a technically challenging and demanding surgical technique with a steep learning curve. Arthroscopic simulation allows trainees to develop these skills in a safe environment. Purpose: The purpose of this study was to evaluate the use of a combination of assessment ratings for the performance of arthroscopic hip labral repair on a dry model. Study Design: Cross-sectional study; Level of evidence, 3. Methods: A total of 47 participants including orthopaedic surgery residents (n = 37), sports medicine fellows (n = 5), and staff surgeons (n = 5) performed arthroscopic hip labral repair on a dry model. Prior arthroscopic experience was noted. Participants were evaluated by 2 orthopaedic surgeons using a task-specific checklist, the Arthroscopic Surgical Skill Evaluation Tool (ASSET), task completion time, and a final global rating scale. All procedures were video-recorded and scored by an orthopaedic fellow blinded to the level of training of each participant. Results: The internal consistency/reliability (Cronbach alpha) using the total ASSET score for the procedure was high (intraclass correlation coefficient > 0.9). One-way analysis of variance for the total ASSET score demonstrated a difference between participants based on the level of training (F(3,43) = 27.8, P < .001). A good correlation was seen between the ASSET score and previous exposure to arthroscopic procedures (r = 0.52-0.73, P < .001). The interrater reliability for the ASSET score was excellent (>0.9). Conclusion: The results of this study demonstrate that the use of dry models to assess the performance of arthroscopic hip labral repair by trainees is both valid and reliable. Further research will be required to demonstrate a correlation with performance on cadaveric specimens or in the operating room.


2006
Vol 104 (3)
pp. 475-481
Author(s):
Georges L. Savoldelli
Viren N. Naik
Hwan S. Joo
Patricia L. Houston
Marianne Graham
...  

Background Patient simulators possess features for performance assessment. However, the concurrent validity and the "added value" of simulator-based examinations over traditional examinations have not been adequately addressed. The current study compared a simulator-based examination with an oral examination for assessing the management skills of senior anesthesia residents. Methods Twenty senior anesthesia residents were assessed sequentially in resuscitation and trauma scenarios using two assessment modalities: an oral examination, followed by a simulator-based examination. Two independent examiners scored the performances with a previously validated global rating scale developed by the Anesthesia Oral Examination Board of the Royal College of Physicians and Surgeons of Canada. Different examiners were used to rate the oral and simulation performances. Results Interrater reliability was good to excellent across scenarios and modalities: intraclass correlation coefficients ranged from 0.77 to 0.87. The within-scenario between-modality score correlations (concurrent validity) were moderate: r = 0.52 (resuscitation) and r = 0.53 (trauma) (P < 0.05). Forty percent of the average score variance was accounted for by the participants, and 30% was accounted for by the participant-by-modality interaction. Conclusions Variance in participant scores suggests that the examination is able to perform as expected in terms of discriminating among test takers. The rather large participant-by-modality interaction, along with the pattern of correlations, suggests that an examinee's performance varies based on the testing modality and a trainee who "knows how" in an oral examination may not necessarily be able to "show how" in a simulation laboratory. Simulation may therefore be considered a useful adjunct to the oral examination.


2020
pp. 1-13
Author(s):
Louise Capling
Janelle A. Gifford
Kathryn L. Beck
Victoria M. Flood
Fiona Halar
...  

Abstract Diet quality indices are a practical, cost-effective method to evaluate dietary patterns, yet few have investigated diet quality in athletes. This study describes the relative validity and reliability of the recently developed Athlete Diet Index (ADI). Participants completed the electronic ADI on two occasions, 2 weeks apart, followed by a 4-d estimated food record (4-dFR). Relative validity was evaluated by directly comparing mean scores of the two administrations (mAdm) against scores derived from 4-dFR using Spearman’s rank correlation coefficient and Bland–Altman (B–A) plots. Construct validity was investigated by comparing mAdm scores and 4-dFR-derived nutrient intakes using Spearman’s coefficient and independent t test. Test–retest reliability was assessed using paired t test, intraclass correlation coefficients (ICC) and B–A plots. Sixty-eight elite athletes (18·8 (sd 4·2) years) from an Australian sporting institute completed the ADI on both occasions. Mean score was 84·1 (sd 15·2; range 42·5–114·0). The ADI had good reliability (ICC = 0·80, 95 % CI 0·69, 0·87; P < 0·001), and B–A plots (mean 1·9; level of agreement −17·8, 21·7) showed no indication of systematic bias (y = 4·57–0·03 × x) (95 % CI −0·2, 0·1; P = 0·70). Relative validity was evaluated in fifty athletes who completed all study phases. Comparison of mAdm scores with 4-dFR-derived scores was moderate (rs 0·69; P < 0·001) with no systematic bias between methods of measurement (y = 6·90–0·04 × x) (95 % CI −0·3, 0·2; P = 0·73). Higher scores were associated with higher absolute nutrient intake consistent with a healthy dietary pattern. The ADI is a reliable tool with moderate validity, demonstrating its potential for application to investigate the diet quality of athletes.
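The Bland–Altman limits of agreement used above are simply the mean difference between two methods plus or minus 1.96 standard deviations of the differences; a minimal sketch with invented paired scores (not the ADI data):

```python
import numpy as np

def bland_altman_limits(method_a, method_b):
    """Mean bias and 95% limits of agreement between paired measurements."""
    diff = np.asarray(method_a, float) - np.asarray(method_b, float)
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Invented index scores versus food-record-derived scores:
bias, lower, upper = bland_altman_limits([85, 90, 95], [84, 88, 92])
print(bias, round(lower, 2), round(upper, 2))  # 2.0 0.04 3.96
```

A small bias with limits that are narrow relative to the score range, and no trend in the differences across the measurement scale, is what "no systematic bias" means in the abstract.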


2019
Vol 6 (4)
pp. 235-238
Author(s):
Kate E Hughes
Patrick G Hughes
Thomas Cahir
Jennifer Plitt
Vivienne Ng
...  

Closed-loop communication (CLC) improves task efficiency and decreases medical errors; however, limited literature exists on strategies to improve its real-time use. The primary objective was to determine whether blindfolding a resuscitation leader improved crisis resource management (CRM) skills, as measured by an increased frequency of CLC. Secondary objectives included whether blindfolding affected overall CRM performance or perceived task load. Participants included emergency medicine (EM) or EM/paediatric dual resident physicians. Participants completed presurveys, were block randomised into intervention (blindfolded) or control groups, led both adult and paediatric resuscitations, and completed postsurveys before debriefing. Video recordings of the simulations were reviewed by simulation fellowship-trained EM physicians and rated using the Ottawa CRM Global Rating Scale (GRS). Frequency of CLC was assessed by one rater via video review. Summary statistics were performed, and the intraclass correlation coefficient was calculated. Data were analysed using the R program for analysis of variance and regression analysis. There were no significant differences between the intervention and control groups in any Ottawa CRM GRS category. Postgraduate year (PGY) significantly affected all Ottawa GRS categories. Frequency of CLC use was significantly higher in the blindfolded group (31.7, 95% CI 29.34 to 34.1) than in the non-blindfolded group (24.6, 95% CI 21.5 to 27.7). Participants’ self-rated NASA Task Load Index scores demonstrated no difference between the intervention and control groups via a Wilcoxon rank sum test. Blindfolding the resuscitation leader significantly increases the frequency of CLC. The blindfold code training exercise is an advanced technique that may increase the use of CLC.


2017
Vol 45 (4)
pp. 469-475
Author(s):
T. Jirativanont
K. Raksamani
N. Aroonpruksakul
P. Apidechakul
S. Suraseranivongse

We sought to evaluate the validity of two non-technical skills evaluation instruments, the Anaesthetists’ Non-Technical Skills (ANTS) behavioural marker system and the Ottawa Global Rating Scale (GRS), for application to anaesthesia training. Content validity, response process, internal structure, relations with other variables and consequences were described as validity evidence. Simulated crisis management sessions were conducted, during which two trained raters evaluated the performance of postgraduate first-, second- and third-year (PGY-1, PGY-2 and PGY-3) anaesthesia residents. The study included 70 participants: 24 PGY-1, 24 PGY-2 and 22 PGY-3 residents. Both instruments differentiated the non-technical skills of PGY-1 from PGY-3 residents (P < 0.05). Inter-rater agreement was measured using the intraclass correlation coefficient. For the ANTS instrument, the intraclass correlation coefficients for task management, team-working, situation awareness and decision-making were 0.79, 0.34, 0.81 and 0.70, respectively. For the Ottawa GRS, the intraclass correlation coefficients for overall performance, leadership, problem-solving, situation awareness, resource utilisation and communication skills were 0.86, 0.83, 0.84, 0.87, 0.80 and 0.86, respectively. Cronbach's alpha for internal consistency was 0.93 for the ANTS instrument and 0.96 for the Ottawa GRS. There was a high correlation between the ANTS and the Ottawa GRS. The raters reported that the Ottawa GRS was easier to use than the ANTS. We found sufficient evidence of validity for both the ANTS instrument and the Ottawa GRS in the evaluation of non-technical skills in a simulated anaesthesia setting, but the Ottawa GRS was more practical and had higher reliability.


2019
Vol 07 (05)
pp. E678-E684
Author(s):
Michael Scaffidi
Catharine Walsh
Rishad Khan
Colleen Parker
Ahmed Al-Mazroui
...  

Abstract Background and study aims Novice endoscopists are inaccurate in self-assessment of procedures. One means of improving self-assessment accuracy is through video-based feedback. We aimed to determine the comparative effectiveness of three video-based interventions on novice endoscopists’ self-assessment accuracy of endoscopic competence. Materials and methods Novice endoscopists (who had performed < 20 previous procedures) were recruited. Participants completed a simulated esophagogastroduodenoscopy (EGD) on a virtual reality simulator. They were then randomized to one of three groups: self-video review (SVR), which involved watching a recorded video of their own performance; benchmark video review (BVR), which involved watching a video of a simulated EGD completed by an expert; and self- and benchmark video review (SBVR), which involved both videos. Participants then completed two additional simulated EGD cases. Self-assessments were conducted immediately after the first procedure, after the video intervention, and after the additional two procedures. External assessments were conducted from video recordings by two experienced endoscopists, who were thereby blinded to participant identity and group assignment. External and self-assessments were completed using the global rating scale component of the Gastrointestinal Endoscopy Competency Assessment Tool (GiECAT GRS). Results Fifty-one participants completed the study. The BVR group had significantly improved short-term self-assessment accuracy compared to the SBVR group (P = .005). The SBVR group demonstrated significantly improved self-assessment accuracy over time (P = .016). There were no significant effects of group or time for the SVR group. Conclusions Video-based interventions, particularly the combined use of self- and benchmark video review, can improve the accuracy of self-assessment of endoscopic competence among novices.


2016
Vol 13 (1)
pp. 60-68
Author(s):
Gerben E. Breimer
Faizal A. Haji
Giuseppe Cinalli
Eelco W. Hoving
James M. Drake

Abstract BACKGROUND: Growing demand for transparent and standardized methods for evaluating surgical competence prompted the construction of the Neuro-Endoscopic Ventriculostomy Assessment Tool (NEVAT). OBJECTIVE: To provide validity evidence of the NEVAT by reporting on the tool's internal structure and its relationship with surgical expertise during simulation-based training. METHODS: The NEVAT was used to assess performance of trainees and faculty at an international neuroendoscopy workshop. All participants performed an endoscopic third ventriculostomy (ETV) on a synthetic simulator. Participants were simultaneously scored by 2 raters using the NEVAT procedural checklist and global rating scale (GRS). Evidence of internal structure was collected by calculating interrater reliability and internal consistency of raters' scores. Evidence of relationships with other variables was collected by comparing the ETV performance of experts, experienced trainees, and novices using Jonckheere's test (evidence of construct validity). RESULTS: Thirteen experts, 11 experienced trainees, and 10 novices participated. The interrater reliability by the intraclass correlation coefficient for the checklist and GRS was 0.82 and 0.94, respectively. Internal consistency (Cronbach's α) for the checklist and the GRS was 0.74 and 0.97, respectively. Median scores with interquartile range on the checklist and GRS for novices, experienced trainees, and experts were 0.69 (0.58-0.86), 0.85 (0.63-0.89), and 0.85 (0.81-0.91) and 3.1 (2.5-3.8), 3.7 (2.2-4.3) and 4.6 (4.4-4.9), respectively. Jonckheere's test showed that the median checklist and GRS score increased with performer expertise (P = .04 and .002, respectively). CONCLUSION: This study provides validity evidence for the NEVAT to support its use as a standardized method of evaluating neuroendoscopic competence during simulation-based training.
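Jonckheere's test, used above for the ordered novice → experienced trainee → expert comparison, is not available in every statistics library. A self-contained sketch of the statistic and its large-sample normal approximation (toy data, not the NEVAT scores; the tie correction for the variance is omitted):

```python
import math
from itertools import combinations

def jonckheere_terpstra(groups):
    """JT statistic and normal-approximation z for groups listed in
    hypothesised increasing order. Ties count as half; the variance
    formula here ignores the tie correction."""
    jt = sum((x < y) + 0.5 * (x == y)
             for g1, g2 in combinations(groups, 2)
             for x in g1 for y in g2)
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    mean = (n * n - sum(k * k for k in sizes)) / 4
    var = (n * n * (2 * n + 3) - sum(k * k * (2 * k + 3) for k in sizes)) / 72
    return jt, (jt - mean) / math.sqrt(var)

# Perfectly ordered toy scores (novices < experienced trainees < experts):
jt, z = jonckheere_terpstra([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(jt, z)  # 27.0 3.0
```

A positive z with a small one-sided P value supports the ordered alternative that scores increase with performer expertise, as the study reports.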


PLoS ONE
2021
Vol 16 (9)
pp. e0257887
Author(s):
Wen-Cheng Huang
Shih-Chang Hsu
Chih-Hao Yang
Che-Wei Lin
Fat-Moon Suk
...  

Study objective The purpose of this feasibility study was to develop and validate a new assessment tool and scoring system for multitasking competency for physicians in training in a timed simulated setting. Multitasking competency includes the ability to appropriately prioritize and implement tasks for different patients who present simultaneously. Methods We designed three single-task stations with different levels of difficulty and priority. These skill stations were then combined to create a multitasking simulation scenario. Skill checklists and a global rating scale were utilized to assess the participants’ performance. A multitasking score, multitasking index, and priority score were developed to measure the multitasking ability of participants. Results Thirty-three first-year postgraduate physicians were recruited for this prospective study. The total performance scores were significantly higher for the single-tasking stations than for the multitasking scenario, and participants spent more time on the multitasking scenario than on the single-tasking stations. There were significant correlations between the global rating scale and the multitasking score (rho = 0.693, p < 0.001) and between the global rating scale and the multitasking index (rho = 0.515, p < 0.001). The multitasking score, multitasking index, and priority score did not have any significant correlations with the total single-tasking score. Conclusion We demonstrated that the use of a simulated multitasking scenario could be an effective method of assessing multitasking ability and allow assessors to offer better quality feedback.
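The rho values reported above are Spearman rank correlations between paired assessments; a quick sketch with invented paired ratings (not the study's data):

```python
from scipy import stats

# Invented paired assessments for eight trainees: a global rating scale
# value and a multitasking score.
grs_scores       = [3.0, 3.5, 2.5, 4.0, 3.0, 4.5, 2.0, 3.5]
multitask_scores = [60, 68, 55, 75, 62, 80, 50, 70]

rho, p_value = stats.spearmanr(grs_scores, multitask_scores)
print(rho > 0.9)  # True: the two measures rank trainees almost identically
```

Because Spearman's rho works on ranks, it captures any monotonic relationship between the two instruments without assuming their scales are linearly related.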

