Documentation of Clinical Reasoning in Admission Notes of Hospitalists: Validation of the CRANAPL Assessment Rubric

2019 ◽  
Vol 14 (12) ◽  
pp. 746-753
Author(s):  
Susrutha Kotwal

OBJECTIVE: To establish a metric for evaluating hospitalists’ documentation of clinical reasoning in admission notes. STUDY DESIGN: Retrospective study. SETTING: Admissions from 2014 to 2017 at three hospitals in Maryland. PARTICIPANTS: Hospitalist physicians. MEASUREMENTS: A subset of patients admitted with fever, syncope/dizziness, or abdominal pain was randomly selected. The nine-item Clinical Reasoning in Admission Note Assessment & Plan (CRANAPL) tool was developed to assess the comprehensiveness of clinical reasoning documented in the assessment and plans (A&Ps) of admission notes. Two authors scored all A&Ps by using this tool. A&Ps were also scored with global clinical reasoning and global readability/clarity measures. All data were deidentified prior to scoring. RESULTS: The 285 admission notes that were evaluated were authored by 120 hospitalists. The mean total CRANAPL score given by both raters was 6.4 (SD 2.2). The intraclass correlation measuring interrater reliability for the total CRANAPL score was 0.83 (95% CI, 0.76-0.87). Associations between the CRANAPL total score and global clinical reasoning score and global readability/clarity measures were statistically significant (P < .001). Notes from academic hospitals had higher CRANAPL scores (7.4 [SD 2.0] and 6.6 [SD 2.1]) than those from the community hospital (5.2 [SD 1.9]), P < .001. CONCLUSIONS: This study represents the first step to characterizing clinical reasoning documentation in hospital medicine. With some validity evidence established for the CRANAPL tool, it may be possible to assess the documentation of clinical reasoning by hospitalists.

2019 ◽  
Vol 28 (11) ◽  
pp. 925-933 ◽  
Author(s):  
Charles Kassardjian ◽  
Yoon Soo Park ◽  
Sherri Braksick ◽  
Jeremy Cutsforth-Gregory ◽  
Carrie Robertson ◽  
...  

Objectives: To develop neurology scenarios for use with the Quality Improvement Knowledge Application Tool Revised (QIKAT-R), gather and evaluate validity evidence, and project the impact of scenario number, rater number and rater type on score reliability. Methods: Six neurological case scenarios were developed. Residents were randomly assigned three scenarios before and after a quality improvement (QI) course in 2015 and 2016. For each scenario, residents crafted an aim statement, selected a measure and proposed a change to address a quality gap. Responses were scored by six faculty raters (two with and four without QI expertise) using the QIKAT-R. Validity evidence from content, response process, internal structure, relations to other variables and consequences was collected. A generalisability (G) study examined sources of score variability, and decision analyses estimated projected reliability for different numbers of raters and scenarios and raters with and without QI expertise. Results: Raters scored 163 responses from 28 residents. The mean QIKAT-R score was 5.69 (SD 1.06). G-coefficient and Phi-coefficient were 0.65 and 0.60, respectively. Interrater reliability was fair for raters without QI expertise (intraclass correlation = 0.53, 95% CI 0.30 to 0.72) and acceptable for raters with QI expertise (intraclass correlation = 0.66, 95% CI 0.02 to 0.88). Postcourse scores were significantly higher than precourse scores (6.05, SD 1.48 vs 5.22, SD 1.5; p < 0.001). Sufficient reliability for formative assessment (G-coefficient > 0.60) could be achieved by three raters scoring six scenarios or two raters scoring eight scenarios, regardless of rater QI expertise. Conclusions: Validity evidence was sufficient to support the use of the QIKAT-R with multiple scenarios and raters to assess resident QI knowledge application for formative or low-stakes summative purposes. The results provide practical information for educators to guide implementation decisions.


2018 ◽  
Vol 25 (3) ◽  
pp. 286-290 ◽  
Author(s):  
Elif Bilgic ◽  
Madoka Takao ◽  
Pepa Kaneva ◽  
Satoshi Endo ◽  
Toshitatsu Takao ◽  
...  

Background. Needs assessment identified a gap regarding laparoscopic suturing skills targeted in simulation. This study collected validity evidence for an advanced laparoscopic suturing task using an Endo Stitch™ device. Methods. Experienced (ES) and novice surgeons (NS) performed continuous suturing after watching an instructional video. Scores were based on time and accuracy, and Global Operative Assessment of Laparoscopic Surgery. Data are shown as medians [25th-75th percentiles] (ES vs NS). Interrater reliability was calculated using intraclass correlation coefficients (confidence interval). Results. Seventeen participants were enrolled. Experienced surgeons had significantly greater task (980 [964-999] vs 666 [391-711], P = .0035) and Global Operative Assessment of Laparoscopic Surgery scores (25 [24-25] vs 14 [12-17], P = .0029). Interrater reliability coefficients for time and accuracy were 1.0 and 0.9 (0.74-0.96), respectively. All experienced surgeons agreed that the task was relevant to practice. Conclusion. This study provides validity evidence for the task as a measure of laparoscopic suturing skill using an automated suturing device. It could help trainees acquire the skills they need to better prepare for clinical learning.


2020 ◽  
Vol 48 (3) ◽  
pp. 642-646
Author(s):  
Daniel W. Green ◽  
Sreetha Sidharthan ◽  
Lindsay M. Schlichte ◽  
Alexandra H. Aitchison ◽  
Douglas N. Mintz

Background: Osgood-Schlatter disease (OSD) is a traction apophysitis of the tibial tubercle caused by repetitive strain and chronic avulsion from the patellar tendon. No widely accepted anatomic risk factors have been associated with OSD. Purpose: To determine if OSD is associated with increased posterior tibial slope (PTS). Study Design: Cross-sectional study; Level of evidence, 3. Methods: Forty knees with OSD and 32 control knees examined by the senior author between 2008 and 2019 were included. Patients 10 to 15 years of age with a clinical diagnosis of OSD and available lateral radiograph and magnetic resonance imaging (MRI) were eligible. Age- and sex-matched patients with a history of knee pain but no evidence of OSD on clinical examination and without significant pathology on lateral radiograph and MRI were included in the control group. PTS was defined as the angle between a reference line perpendicular to the proximal anatomic axis and a line drawn tangent to the uppermost anterior and posterior edges of the medial tibial plateau. Measurements were carried out in duplicate on true lateral radiographs by 2 blinded investigators. Interrater reliability of PTS measurements was evaluated using intraclass correlation coefficient (ICC). The independent samples t test was used to compare PTS between the OSD and control knees. Results: The mean age was 12.6 ± 1.6 years and 51% (37/72) of the knees were from male youth. There were no differences in age, sex, and laterality of knees between the OSD and control groups. The mean PTS was significantly higher in the OSD group (12.23° ± 3.58°) compared with the control group (8.82° ± 2.76°; P < .001). The ICC was 0.931 (95% CI, 0.890-0.957), indicating almost perfect interrater reliability. Conclusion: This study identifies an association between OSD and increased PTS. The clinical implications of this finding have not yet been elucidated. It may be speculated that in patients with OSD, stress from the extensor mechanism through the patellar tendon loads the anterior portion of the tibia disproportionately to the posterior segment, thereby resulting in asymmetric growth and an increased PTS.


Hand ◽  
2017 ◽  
Vol 13 (6) ◽  
pp. 671-677 ◽  
Author(s):  
Joseph Said ◽  
Kevin Baker ◽  
Laviel Fernandez ◽  
David E. Komatsu ◽  
Elaine Gould ◽  
...  

Background: Lack of a universally accepted location and normal value limits the utility of measuring scapholunate diastasis. The primary purpose of this study was to define the optimal location to measure the scapholunate gap throughout sequential ligament transections. Secondary purposes were to compare plain radiographs with fluoroscopy and to evaluate interrater reliability in measuring scapholunate diastasis. Methods: Five cadaver forearms were imaged with intact carpal ligaments and after sequentially transecting the scapholunate, radioscaphocapitate, and scaphotrapezium-trapezoid ligaments. Plain radiographs and static fluoroscopic images were obtained with wrists in neutral and 30° ulnar deviation for each stage. Multiple reviewers performed measurements of the scapholunate interval at 3 separate locations. Mean distances were calculated and pairwise comparisons between groups were made. Intraclass correlation was calculated to determine interrater reliability. Results: Overall, measurements made in the middle of the scapholunate joint had the smallest margins of error for all imaging modalities, ligament disruptions, and wrist positions. For normal wrists, the mean scapholunate measurements were all less than 2.0 mm at the middle of the joint, regardless of imaging modality or wrist position. Fluoroscopy detected significance between more stages of instability than plain radiographs at the middle of the joint. Conclusions: Measurements in the middle of the scapholunate joint in neutral and 30° of ulnar deviation under fluoroscopic imaging best capture all stages of ligamentous disruptions. Measurements less than 2.0 mm at the middle of the scapholunate interval may be considered within normal range.


2018 ◽  
Vol 10 (3) ◽  
pp. 269-275 ◽  
Author(s):  
Shalini T. Reddy ◽  
Ara Tekian ◽  
Steven J. Durning ◽  
Shanu Gupta ◽  
Justin Endo ◽  
...  

ABSTRACT Background: Minimally anchored Standard Rating Scales (SRSs), which are widely used in medical education, are hampered by suboptimal interrater reliability. Expert-derived frameworks, such as the Accreditation Council for Graduate Medical Education (ACGME) Milestones, may be helpful in defining level-specific anchors to use on rating scales. Objective: We examined validity evidence for a Milestones-Based Rating Scale (MBRS) for scoring chart-stimulated recall (CSR). Methods: Two 11-item scoring forms with either an MBRS or SRS were developed. Items and anchors for the MBRS were adapted from the ACGME Internal Medicine Milestones. Six CSR standardized videos were developed. Clinical faculty scored videos using either the MBRS or SRS and following a randomized crossover design. Reliability of the MBRS versus the SRS was compared using intraclass correlation. Results: Twenty-two faculty were recruited for instrument testing. Some participants did not complete scoring, leaving a response rate of 15 faculty (7 in the MBRS group and 8 in the SRS group). A total of 529 ratings (number of items × number of scores) using SRSs and 540 using MBRSs were available. Percent agreement was higher for MBRSs for only 2 of 11 items—use of consultants (92 versus 75, P = .019) and unique characteristics of patients (96 versus 79, P = .011)—and the overall score (89 versus 82, P < .001). Interrater agreement was 0.61 for MBRSs and 0.51 for SRSs. Conclusions: Adding milestones to our rating form resulted in significant, but not substantial, improvement in intraclass correlation coefficient. Improvement was inconsistent across items.


2016 ◽  
Vol 30 (2) ◽  
pp. 99-103
Author(s):  
Kevin A. Rose ◽  
Jesika Babajanian

Objective: The objective structured practical examination (OSPE) is a case-based assessment that can be used to assess the clinical reasoning ability of students. The reliability of using an OSPE for this purpose has not been reported in the literature. The objective of this study was to determine the interrater reliability of the OSPE in measuring the clinical reasoning ability of chiropractic students. Methods: Two examiners tested each student simultaneously when enough were available as a check for interrater reliability. The scores for students over 4 exam administrations were compiled, and we calculated an intraclass correlation coefficient (ICC) using 1-way random single measures. Results: Paired scores were available for 133 students. The ICC was .685, showing a fair-to-good level of agreement for faculty in assessing the clinical reasoning ability of chiropractic students using an OSPE. Conclusion: The OSPE can be a valuable tool for testing clinical reasoning abilities because it can simulate the decision-making process that needs to be implemented in clinical practice. Faculty members at our chiropractic college were able to achieve an acceptable level of reliability in measuring the clinical reasoning abilities of students using an OSPE. Other health professional programs may consider using this tool for assessing the clinical reasoning skills of their students.
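The "1-way random single measures" ICC named in the abstract above can be illustrated with a short sketch. This is not the authors' analysis code: the function name `icc_one_way_single` and the paired scores are hypothetical, and the formula is the standard one-way random-effects ICC for a single rating, ICC(1,1) = (MSB − MSW) / (MSB + (k − 1) · MSW), where MSB and MSW are the between-subject and within-subject mean squares and k is the number of ratings per subject.

```python
def icc_one_way_single(scores):
    """One-way random-effects, single-measure ICC, i.e. ICC(1,1).

    `scores` is a list of rows, one row per subject (e.g. student),
    each row holding the k ratings that subject received.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every subject needs the same number (>= 2) of ratings")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]

    # Between-subject mean square (n - 1 degrees of freedom).
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-subject mean square (n * (k - 1) degrees of freedom).
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("all ratings are identical; ICC is undefined")
    return (ms_between - ms_within) / denominator


if __name__ == "__main__":
    # Hypothetical paired examiner scores, invented purely for illustration.
    paired_scores = [[78, 82], [65, 70], [90, 88], [72, 60], [85, 86]]
    print(round(icc_one_way_single(paired_scores), 3))
```

The one-way model treats the raters for each subject as interchangeable, which fits a design where different examiner pairs may score different students; a two-way model would be the usual alternative when the same raters score every subject.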


2020 ◽  
Vol 29 (2) ◽  
pp. 259-264 ◽  
Author(s):  
Hasan K. Saleh ◽  
Paula Folkeard ◽  
Ewan Macpherson ◽  
Susan Scollie

Purpose The original Connected Speech Test (CST; Cox et al., 1987) is a well-regarded and often utilized speech perception test. The aim of this study was to develop a new version of the CST using a neutral North American accent and to assess the use of this updated CST on participants with normal hearing. Method A female English speaker was recruited to read the original CST passages, which were recorded as the new CST stimuli. A study was designed to assess the newly recorded CST passages' equivalence and conduct normalization. The study included 19 Western University students (11 females and eight males) with normal hearing and with English as a first language. Results Raw scores for the 48 tested passages were converted to rationalized arcsine units, and average passage scores more than 1 rationalized arcsine unit standard deviation from the mean were excluded. The internal reliability of the 32 remaining passages was assessed, and the two-way random effects intraclass correlation was .944. Conclusion The aim of our study was to create new CST stimuli with a more general North American accent in order to minimize accent effects on the speech perception scores. The study resulted in 32 passages of equivalent difficulty for listeners with normal hearing.


1991 ◽  
Vol 34 (5) ◽  
pp. 989-999 ◽  
Author(s):  
Stephanie Shaw ◽  
Truman E. Coggins

This study examines whether observers reliably categorize selected speech production behaviors in hearing-impaired children. A group of experienced speech-language pathologists was trained to score the elicited imitations of 5 profoundly and 5 severely hearing-impaired subjects using the Phonetic Level Evaluation (Ling, 1976). Interrater reliability was calculated using intraclass correlation coefficients. Overall, the magnitude of the coefficients was found to be considerably below what would be accepted in published behavioral research. Failure to obtain acceptably high levels of reliability suggests that the Phonetic Level Evaluation may not yet be an accurate and objective speech assessment measure for hearing-impaired children.


GeroPsych ◽  
2014 ◽  
Vol 27 (1) ◽  
pp. 23-31 ◽  
Author(s):  
Anne Kuemmel ◽  
Julia Haberstroh ◽  
Johannes Pantel

Communication and communication behaviors in situational contexts are essential conditions for well-being and quality of life in people with dementia. Measuring methods, however, are limited. The CODEM instrument, a standardized observational communication behavior assessment tool, was developed and evaluated on the basis of the current state of research in dementia care and social-communicative behavior. Initially, interrater reliability was examined by means of videoratings (N = 10 people with dementia). Thereupon, six caregivers in six German nursing homes observed 69 residents suffering from dementia and used CODEM to rate their communication behavior. The interrater reliability of CODEM was excellent (mean κ = .79; intraclass correlation = .91). Statistical analysis indicated that CODEM had excellent internal consistency (Cronbach’s α = .95). CODEM also showed excellent convergent validity (Pearson’s R = .88) as well as discriminant validity (Pearson’s R = .63). Confirmatory factor analysis verified the two-factor solution of verbal/content aspects and nonverbal/relationship aspects. With regard to the severity of the disease, the content and relational aspects of communication exhibited different trends. CODEM proved to be a reliable, valid, and sensitive assessment tool for examining communication behavior in the field of dementia. CODEM also provides researchers a feasible examination tool for measuring effects of psychosocial intervention studies that strive to improve communication behavior and well-being in dementia.


2003 ◽  
Vol 37 (1) ◽  
pp. 40-46
Author(s):  
Rosemin Kassam ◽  
Linda G Martin ◽  
Karen B Farris ◽  
Homero A Monsanto ◽  
Jean-Marie Kaiser

Background The medication appropriateness index (MAI) has demonstrated reliability in selected outpatient clinics where medical data were easily accessible from medical charts. However, its use in the community setting where patient data may be limited has not been examined. Objective To evaluate the usefulness of a modified MAI for use in the community pharmacy setting by testing interrater reliability using 3 different rating schemes. Methods Two raters evaluated 160 medications for 32 elderly ambulatory patients. Patient information was acquired using community pharmacist-collected medication histories. A summated MAI score, percent agreement, κ, positive agreement, negative agreement, and intraclass correlation coefficient were calculated for each criterion using 3 scoring schemes. A paired samples t-test (95% CI) was used to test interrater reliability. Results The κ statistics were >0.75 for indication and effectiveness, but good (0.41–0.66) for the remaining criteria using the Hanlon scoring scheme. The intraclass coefficients (0.82, 0.86, 0.87) and overall κ (0.65, 0.66, 0.61) were similar for the 3 schemes. Conclusions This study suggests that the modified MAI has the potential to detect medication appropriateness and inappropriateness in the community pharmacy setting; however, it is not without limitations. Because the MAI has the most clinimetric and psychometric data available, the instrument should be studied further to increase its reliability and generalizability.

