Simulation-based Assessment to Identify Critical Gaps in Safe Anesthesia Resident Performance

2014 · Vol. 120(1) · pp. 129–141
Author(s): Richard H. Blum, John R. Boulet, Jeffrey B. Cooper, Sharon L. Muret-Wagstaff

Abstract Background: Valid methods are needed to identify anesthesia resident performance gaps early in training. However, many assessment tools in medicine have not been properly validated. The authors designed and tested use of a behaviorally anchored scale, as part of a multiscenario simulation-based assessment system, to identify high- and low-performing residents with regard to domains of greatest concern to expert anesthesiology faculty. Methods: An expert faculty panel used a Delphi process to derive five key behavioral domains of interest: (1) synthesizes information to formulate a clear anesthetic plan; (2) implements a plan based on changing conditions; (3) demonstrates effective interpersonal and communication skills with patients and staff; (4) identifies ways to improve performance; and (5) recognizes own limits. Seven simulation scenarios spanning pre- to postoperative encounters were used to assess the performances of 22 first-year residents and 8 fellows from two institutions. Two of 10 trained faculty raters, blinded to trainee program and training level, scored each performance independently using a behaviorally anchored rating scale. Residents, fellows, facilitators, and raters completed surveys. Results: Evidence supporting the reliability and validity of the assessment scores was obtained, including a high generalizability coefficient (ρ² = 0.81) and expected performance differences between first-year resident and fellow participants. A majority of trainees, facilitators, and raters judged the assessment to be useful, realistic, and representative of critical skills required for safe practice. Conclusion: The study provides initial evidence to support the validity of a simulation-based performance assessment system for identifying critical gaps in safe anesthesia resident performance early in training.
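Editor's note: the generalizability coefficient quoted above (ρ² = 0.81) comes from a variance-components analysis of the person × scenario × rater design. As a hedged illustration only (not the authors' code, and assuming a fully crossed design with simulated ratings), the coefficient can be estimated like this:

```python
import numpy as np

def g_coefficient(X):
    """Relative G coefficient for a fully crossed person x scenario x rater
    design; X has shape (persons, scenarios, raters)."""
    n_p, n_s, n_r = X.shape
    grand = X.mean()
    m_p, m_s, m_r = X.mean(axis=(1, 2)), X.mean(axis=(0, 2)), X.mean(axis=(0, 1))
    m_ps, m_pr, m_sr = X.mean(axis=2), X.mean(axis=1), X.mean(axis=0)
    # Mean squares from the three-way ANOVA decomposition
    ms_p = n_s * n_r * np.sum((m_p - grand) ** 2) / (n_p - 1)
    ms_ps = n_r * np.sum((m_ps - m_p[:, None] - m_s[None, :] + grand) ** 2) \
            / ((n_p - 1) * (n_s - 1))
    ms_pr = n_s * np.sum((m_pr - m_p[:, None] - m_r[None, :] + grand) ** 2) \
            / ((n_p - 1) * (n_r - 1))
    resid = (X - m_ps[:, :, None] - m_pr[:, None, :] - m_sr[None, :, :]
             + m_p[:, None, None] + m_s[None, :, None] + m_r[None, None, :] - grand)
    ms_e = np.sum(resid ** 2) / ((n_p - 1) * (n_s - 1) * (n_r - 1))
    # Variance components via expected mean squares (negatives floored at 0)
    v_e = ms_e
    v_ps = max((ms_ps - ms_e) / n_r, 0.0)
    v_pr = max((ms_pr - ms_e) / n_s, 0.0)
    v_p = max((ms_p - ms_ps - ms_pr + ms_e) / (n_s * n_r), 0.0)
    return v_p / (v_p + v_ps / n_s + v_pr / n_r + v_e / (n_s * n_r))

# Demo on simulated ratings: 30 trainees, 7 scenarios, 2 raters
rng = np.random.default_rng(0)
X = (4.0 + rng.normal(0, 1.0, (30, 1, 1))   # person effect
         + rng.normal(0, 0.5, (30, 7, 1))   # person x scenario interaction
         + rng.normal(0, 0.5, (30, 7, 2)))  # rater/residual noise
print(round(g_coefficient(X), 2))
```

In the study itself, each performance was scored by two of ten raters rather than a fully crossed panel, so the authors' actual design differs from this simplified sketch.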

2018 · Vol. 128(4) · pp. 821–831
Author(s): Richard H. Blum, Sharon L. Muret-Wagstaff, John R. Boulet, Jeffrey B. Cooper, Emil R. Petrusa

Abstract Background Obtaining reliable and valid information on resident performance is critical to patient safety and training program improvement. The goals were to characterize important anesthesia resident performance gaps that are not typically evaluated, and to further validate scores from a multiscenario simulation-based assessment. Methods Seven high-fidelity scenarios reflecting core anesthesiology skills were administered to 51 first-year residents (CA-1s) and 16 third-year residents (CA-3s) from three residency programs. Twenty trained attending anesthesiologists rated resident performances using a seven-point behaviorally anchored rating scale for five domains: (1) formulate a clear plan, (2) modify the plan under changing conditions, (3) communicate effectively, (4) identify performance improvement opportunities, and (5) recognize limits. A second rater assessed 10% of encounters. Scores and variances for each domain, each scenario, and the total were compared. Low domain ratings (1, 2) were examined in detail. Results Interrater agreement was 0.76; reliability of the seven-scenario assessment was r = 0.70. CA-3s had a significantly higher average total score (4.9 ± 1.1 vs. 4.6 ± 1.1, P = 0.01, effect size = 0.33). CA-3s significantly outscored CA-1s for five of seven scenarios and domains 1, 2, and 3. CA-1s had a significantly higher proportion of worrisome ratings than CA-3s (chi-square = 24.1, P < 0.01, effect size = 1.50). Ninety-eight percent of residents rated the simulations more educational than an average day in the operating room. Conclusions Sensitivity of the assessment to CA-1 versus CA-3 performance differences for most scenarios and domains supports validity. No differences, by experience level, were detected for two domains associated with reflective practice. Smaller score variances for CA-3s likely reflect a training effect; however, worrisome performance scores for both CA-1s and CA-3s suggest room for improvement.
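Editor's note: the group comparison quoted above (4.9 ± 1.1 vs. 4.6 ± 1.1, P = 0.01, effect size = 0.33) follows the usual two-sample pattern. A minimal sketch, using simulated score vectors with the abstract's group sizes and moments rather than the study data, and an unpaired comparison as an assumption:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
ca1 = rng.normal(4.6, 1.1, size=51)  # simulated CA-1 total scores
ca3 = rng.normal(4.9, 1.1, size=16)  # simulated CA-3 total scores

t, p = stats.ttest_ind(ca3, ca1)
# Cohen's d with the pooled standard deviation
pooled_sd = np.sqrt(((len(ca1) - 1) * ca1.std(ddof=1) ** 2
                     + (len(ca3) - 1) * ca3.std(ddof=1) ** 2)
                    / (len(ca1) + len(ca3) - 2))
cohens_d = (ca3.mean() - ca1.mean()) / pooled_sd
print(f"t = {t:.2f}, p = {p:.3f}, d = {cohens_d:.2f}")
```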


2003 · Vol. 99(6) · pp. 1270–1280
Author(s): John R. Boulet, David Murray, Joe Kras, Julie Woodhouse, John McAllister, ...

Background Medical students and residents are expected to be able to manage a variety of critical events after training, but many of these individuals have limited clinical experience in the diagnosis and treatment of these conditions. Life-sized mannequins that model critical events can be used to evaluate the skills required to manage and treat acute medical conditions. The purpose of this study was to develop and test simulation exercises and associated scoring methods that could be used to evaluate the acute care skills of final-year medical students and first-year residents. Methods The authors developed and tested 10 simulated acute care situations that clinical faculty at a major medical school expect graduating physicians to be able to recognize and treat at the conclusion of training. Forty medical students and residents participated in the evaluation of the exercises. Four faculty members scored the students/residents. Results The reliability of the simulation scores was moderate and was most strongly influenced by the choice and number of simulated encounters. The validity of the simulation scores was supported through comparisons of students'/residents' performances in relation to their clinical backgrounds and experience. Conclusion Acute care skills can be validly and reliably measured using simulation technology. However, multiple simulated encounters, covering a broad domain, are needed to effectively and accurately estimate student/resident abilities in acute care settings.
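Editor's note: the finding that reliability is driven mainly by the number of encounters is the familiar Spearman-Brown relationship. A small illustrative sketch (the single-encounter reliability value is an assumption, not a figure from the paper):

```python
def spearman_brown(rel_one: float, k: int) -> float:
    """Projected reliability of a k-encounter assessment, given the
    reliability rel_one of a single encounter."""
    return k * rel_one / (1 + (k - 1) * rel_one)

# e.g. if one encounter alone has reliability 0.19,
# ten encounters project to about 0.70
print(round(spearman_brown(0.19, 10), 2))  # 0.7
```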


2020 · Vol. 12(5) · p. 1
Author(s): Ignacio Albero Roselló, Amparo Belltall Olmos, Eva Romero García, Pilar Argente Navarro

Can we use clinical simulation to evaluate medical interns during the training period?
Abstract: Medical residents must progressively acquire knowledge and skills during their training period so that they can handle any clinical situation. Clinical simulation is a tool that lets them exercise the knowledge and skills they have acquired and apply them to near-real situations before coming into contact with patients. The objective of this study was to identify which competencies depend on training (year of residency) and are key to performance, and to provide further data supporting validation of the HARP (Harvard Assessment of Anesthesia Resident Performance) scale, an instrument that rates resident performance across different simulated scenarios. A prospective observational study was conducted in which fifty-one first-year and sixteen third-year residents were selected from three different training programs. Seven scenarios were designed according to training level, with residents assuming the role of primary anesthesiologist as each case unfolded. The results add to the validity evidence that simulation can be used to assess resident performance and to find skill gaps in selected areas, including some not captured by classic assessments. They suggest that assessment with the HARP method, if properly structured, can provide detailed, high-quality information on each resident's clinical capabilities, with reasonably accurate and standardized scores.
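Editor's note: as a purely illustrative sketch of how ratings of this kind might be recorded, here is a minimal record type for one HARP-style rating (seven-point behaviorally anchored scale over the five domains named in the studies above). The domain keys paraphrase the abstract; nothing here is the official instrument:

```python
from dataclasses import dataclass
from typing import Dict

# Paraphrased domain labels (assumed, not the official HARP wording)
DOMAINS = ("formulate_plan", "modify_plan", "communicate",
           "identify_improvements", "recognize_limits")

@dataclass
class HarpRating:
    resident_id: str
    scenario: str
    rater_id: str
    scores: Dict[str, int]  # domain -> rating on the 7-point anchored scale

    def __post_init__(self):
        assert set(self.scores) == set(DOMAINS), "one rating per domain"
        assert all(1 <= v <= 7 for v in self.scores.values())

    @property
    def mean_score(self) -> float:
        return sum(self.scores.values()) / len(self.scores)

r = HarpRating("R01", "intraop_crisis", "rater_3", {d: 4 for d in DOMAINS})
print(r.mean_score)  # 4.0
```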


2017 · Vol. 158(1) · pp. 54–61
Author(s): Érika Mercier, Ségolène Chagnon-Monarque, François Lavigne, Tareck Ayad

Objectives The primary goal is the indexation of validated methods used to assess surgical competency in otorhinolaryngology–head and neck surgery (ORL-HNS) residents. Secondary goals include assessment of the reliability and validity of these tools, as well as the documentation of specific procedures in ORL-HNS involved. Data Sources MEDBASE, OVID, Medline, CINAHL, and EBM, as well as the printed references, available through the Université de Montréal library. Review Methods The PRISMA method was used to review digital and printed databases. Publications were reviewed by 2 independent reviewers, and selected articles were fully analyzed to classify evaluation methods and categorize them by procedure and subspecialty of ORL-HNS involved. Reliability and validity were assessed and scored for each assessment tool. Results Through the review of 30 studies, 5 evaluation methods were described and validated to assess the surgical competency of ORL-HNS residents. The evaluation method most often described was the combined Global Rating Scale and Task-Specific Checklist tool. Reliability and validity for this tool were overall high; however, considerable data were unavailable. Eleven distinctive surgical procedures were studied, encompassing many subspecialties of ORL-HNS: facial plastics, general ear-nose-throat, laryngology, otology, pediatrics, and rhinology. Conclusions Although assessment tools have been developed for an array of surgical procedures, involving most ORL-HNS subspecialties, the use of combined checklists has been repeatedly validated in the literature and shown to be easily applicable in practice. It has been applied to many ORL-HNS procedures but not in oncologic surgery to date.
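Editor's note: to make the "combined Global Rating Scale and Task-Specific Checklist" concrete, here is a hedged sketch. The item counts, the 1-to-5 rating range, and the equal weighting are assumptions for illustration, not the validated instrument:

```python
from typing import List

def combined_score(grs_items: List[int], checklist: List[bool]) -> float:
    """Composite of a global rating scale (items rated 1-5) and a
    task-specific checklist (done / not done), each normalised to 0-1."""
    grs = sum(grs_items) / (5 * len(grs_items))  # fraction of maximum rating
    chk = sum(checklist) / len(checklist)        # fraction of steps completed
    return 0.5 * grs + 0.5 * chk                 # equal weighting (assumed)

print(combined_score([4, 3, 5, 4, 4], [True, True, False, True]))  # 0.775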


2018 · Vol. 3(2)
Author(s): Liam Rooney

Background: Dementia is a disease affecting 55,000 Irish people. (1) It is characterised by progressive cognitive impairment, ranging from mild impairment, which may affect memory, to severe impairment, where the ability to communicate may be absent. These patients are at risk of having their pain under-assessed and under-managed. (2) A survey exploring Irish Paramedics' and Advanced Paramedics' views on the pain assessment tools currently available to them, and whether these tools are suitable for use with dementia patients, is proposed. Existing observational pain assessment tools used with dementia patients are examined and their suitability for pre-hospital use is discussed.
Introduction: Adults with cognitive impairments, such as dementia, are at a much higher risk of not receiving adequate analgesia for their pain. (3) An estimated 40% to 80% of dementia patients regularly experience pain. (4) The pain assessment tools currently used pre-hospital in Ireland are the Numerical Rating Scale for patients >8 years, the Wong-Baker Scale for paediatric patients, and the FLACC Scale for infants. There is no specific pain assessment tool for patients who are not capable of self-reporting their level of pain.
Objective: This research aimed to identify observational pain assessment tools used in this cohort, identify the most consistently recommended tools, and assess their suitability for use in the pre-hospital setting.
Findings: A literature review identified 29 observational pain assessment tools; literature relating to the pre-hospital setting is scarce. The American Geriatrics Society (AGS) identified six pain behaviours in dementia patients: changes in facial expression, changes in activity patterns, changes in interpersonal interactions, mental status changes, negative vocalisation, and changes in body language. These six criteria should be the foundation of any pain assessment tool. (5) The three most consistently recommended tools were:
Abbey Pain Scale: 6 items assessed; meets the AGS criteria; quick and easy to implement; moderate to good reliability and validity. (6)
Doloplus 2: 15 items assessed; meets 5 of 6 AGS criteria; requires observation over time and prior knowledge of the patient; moderate to good reliability and validity. (6)
PAINAD: 5 items assessed; meets 3 of 6 AGS criteria; takes less than 5 minutes to implement; may be influenced by psychological distress; good reliability and validity. (6)
Conclusion: The ability to self-report pain is deemed the "gold standard". Patients with mild to moderate disease, and indeed some with severe disease, may retain the ability to self-report. An observational tool is required once dementia has progressed to the point where the patient becomes unable to self-report or becomes non-verbal; it is in these patients that undetected, misinterpreted, or inaccurate assessment of pain becomes frequent. (7) The aim of any tool is a good assessment of pain, but the scale used should suit the clinical setting. Feasibility is an important factor alongside reliability and validity. No one assessment tool could be recommended over another. Abbey and PAINAD have potential for pre-hospital use, but further research, clinical evaluation, and trial in an ambulance service are required.
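Editor's note: for concreteness, PAINAD-style scoring sums five observed items, each rated 0-2, into a 0-10 total. A minimal sketch (the example observation is invented, and the item keys are paraphrased labels, not clinical guidance):

```python
PAINAD_ITEMS = ("breathing", "negative_vocalisation",
                "facial_expression", "body_language", "consolability")

def painad_total(ratings: dict) -> int:
    """Sum of five observational items, each scored 0, 1, or 2."""
    assert set(ratings) == set(PAINAD_ITEMS), "one rating per item"
    assert all(r in (0, 1, 2) for r in ratings.values())
    return sum(ratings.values())

obs = {"breathing": 1, "negative_vocalisation": 2, "facial_expression": 1,
       "body_language": 1, "consolability": 0}
print(painad_total(obs))  # 5 on the 0-10 scale
```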


2016 · Vol. 3 · pp. JMECD.S30035
Author(s): Hirotaka Onishi

Case presentation is used as a teaching and learning tool in almost all clinical education, and it is also associated with clinical reasoning ability. Despite this, no specific assessment tool utilizing case presentations has yet been established. SNAPPS (summarize, narrow, analyze, probe, plan, and select) and the One-Minute Preceptor are well-known educational tools for teaching how to improve consultations; however, these tools do not include a specific rating scale to determine the level of diagnostic reasoning. The mini clinical evaluation exercise (Mini-CEX) and RIME (reporter, interpreter, manager, and educator) are comprehensive assessment tools with appropriate reliability and validity. The Vague, Structured, Organized, and Pertinent (VSOP) model, previously proposed in Japan and derived from the RIME model, is a tool for the formative assessment and teaching of trainees through case presentations. Uses of the VSOP model in real settings are also discussed.


2021 · Vol. 21(1)
Author(s): Elizabeth Sinz, Arna Banerjee, Randolph Steadman, Matthew S. Shotwell, Jason Slagle, ...
...  

Abstract Introduction Even physicians who routinely work in complex, dynamic practices may be unprepared to optimally manage challenging critical events. High-fidelity simulation can realistically mimic critical clinically relevant events; however, the reliability and validity of simulation-based assessment scores for practicing physicians have not been established. Methods Standardised complex simulation scenarios were developed and administered to board-certified, practicing anesthesiologists who volunteered to participate in an assessment study during formative maintenance of certification activities. A subset of the study population agreed to participate as the primary responder in a second scenario for this study. The physicians were assessed independently by trained raters on both teamwork/behavioural and technical performance measures. Generalisability and Decision studies were completed for the two scenarios with two raters. Results The behavioural score was no more reliable than the technical score. With two raters, more than 20 scenarios would be required to achieve a reliability estimate of 0.7; increasing the number of raters for a given scenario would have little effect on reliability. Conclusions The performance of practicing physicians on simulated critical events may be highly context-specific. Realistic simulation-based assessment for practicing physicians is resource-intensive and may be best suited to individualized formative feedback. More importantly, aggregate data from a population of participants may have an even higher impact if used to identify skill or knowledge gaps to be addressed by training programs and to inform continuing education improvements across the profession.
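Editor's note: the decision-study arithmetic behind "more than 20 scenarios" and "more raters would have little effect" can be sketched directly from variance components. The component values below are illustrative assumptions, not the paper's estimates; they are chosen so that person-by-scenario variance dominates:

```python
def projected_reliability(v_p: float, v_ps: float, v_pr: float, v_e: float,
                          n_scen: int, n_raters: int) -> float:
    """D-study projection: reliability of the mean score over n_scen
    scenarios and n_raters raters, given the variance components."""
    error = v_ps / n_scen + v_pr / n_raters + v_e / (n_scen * n_raters)
    return v_p / (v_p + error)

# With scenario-specific variance dominant, adding raters barely helps:
for r in (1, 2, 4):
    rel = projected_reliability(v_p=1.0, v_ps=8.0, v_pr=0.1, v_e=1.0,
                                n_scen=20, n_raters=r)
    print(f"{r} rater(s), 20 scenarios -> reliability {rel:.2f}")
# 1 rater(s), 20 scenarios -> reliability 0.65
# 2 rater(s), 20 scenarios -> reliability 0.68
# 4 rater(s), 20 scenarios -> reliability 0.70
```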


2010 · Vol. 19(1) · pp. 55–61
Author(s): Terri Voepel-Lewis, Jennifer Zanotti, Jennifer A. Dammeyer, Sandra Merkel

Background Few investigators have evaluated pain assessment tools in the critical care setting. Objective To evaluate the reliability and validity of the Face, Legs, Activity, Cry, Consolability (FLACC) Behavioral Scale in assessing pain in critically ill adults and children unable to self-report pain. Methods Three nurses simultaneously, but independently, observed and scored pain behaviors twice in 29 critically ill adults and 8 children: before administration of an analgesic or during a painful procedure, and 15 to 30 minutes after the administration or procedure. Two nurses used the FLACC scale; the third used either the Checklist of Nonverbal Pain Indicators (for adults) or the COMFORT scale (for children). Results For 73 observations, FLACC scores correlated highly with the other 2 scores (ρ = 0.963 and 0.849, respectively), supporting criterion validity. Significant decreases in FLACC scores after analgesia (or at rest) supported construct validity of the tool (mean, 5.27; SD, 2.3 vs mean, 0.52; SD, 1.1; P < .001). Exact agreement and κ statistics, as well as intraclass correlation coefficients (0.67–0.95), support excellent interrater reliability of the tool. Internal consistency was excellent; the Cronbach α was 0.882 when all items were included. Conclusions Although similar in content to other behavioral pain scales, the FLACC can be used across populations of patients and settings, and the scores are comparable to those of the commonly used 0-to-10 number rating scale.
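Editor's note: the two validity checks reported above, criterion validity via rank correlation with a comparison scale and internal consistency via Cronbach's α over the five FLACC items, can be sketched as follows. The data are simulated placeholders, not the study's observations:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# 73 simulated observations of the 5 FLACC items (each scored 0-2),
# sharing a common severity so the items correlate
severity = rng.integers(0, 3, size=(73, 1))
flacc = np.clip(severity + rng.integers(-1, 2, size=(73, 5)), 0, 2)
total = flacc.sum(axis=1)

# Criterion validity: Spearman correlation with a second scale's scores
other_scale = total * 2 + rng.normal(0, 1, 73)  # simulated comparison tool
rho, _ = stats.spearmanr(total, other_scale)

def cronbach_alpha(items):
    """Internal consistency over item columns (standard alpha formula)."""
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum()
                          / items.sum(axis=1).var(ddof=1))

print(f"Spearman rho = {rho:.2f}, Cronbach alpha = {cronbach_alpha(flacc):.2f}")
```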

