Approaches to describing inter-rater reliability of the overall clinical appearance of febrile infants and toddlers in the Emergency Department.

Author(s):  
Paul Walsh ◽  
Justin M. Thornton ◽  
Nicholas Walker ◽  
John Gary McCoy ◽  
Joe Baal ◽  
...  

Objectives: To measure inter-rater agreement of overall clinical appearance of febrile children aged less than 24 months and to compare methods for doing so. Study design and setting: We performed an observational study of inter-rater reliability of the assessment of febrile children in a county hospital emergency department serving a mixed urban and rural population. Two emergency medicine healthcare providers independently evaluated the overall clinical appearance of children less than 24 months of age who had presented for fever. They recorded their initial ‘gestalt’ assessment of whether or not the child was ill appearing, or whether they were unsure. They then repeated this assessment after examining the child. Each rater was blinded to the other’s assessment. Our primary analysis was graphical. We also calculated Cohen’s κ, Gwet’s agreement coefficient, and other measures of agreement, along with weighted variants of these. We examined the effect of time between exams and of patient and provider characteristics on inter-rater agreement. Results: We analyzed 159 of the 173 patients enrolled. Median age was 9.5 months (lower and upper quartiles 4.9–14.6), 99/159 (62%) were boys and 22/159 (14%) were admitted. Overall 118/159 (74%) and 119/159 (75%) were classified as well appearing on initial ‘gestalt’ impression by both examiners. Summary statistics varied from 0.223 for weighted κ to 0.635 for Gwet’s AC2. Inter-rater agreement was affected by the time interval between the evaluations and by the age of the child, but not by the experience levels of the rater pairs. Classifications of ‘not ill appearing’ were more reliable than others. Conclusion: The inter-rater reliability of emergency providers' assessment of overall clinical appearance was adequate when described graphically and by Gwet’s AC. Different summary statistics yield different results for the same dataset.
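The abstract's closing point, that different summary statistics can give different answers for the same ratings, can be illustrated with a minimal sketch. It computes unweighted Cohen’s κ and the unweighted form of Gwet’s coefficient (AC1) from one contingency table. The counts in `HYPOTHETICAL_TABLE` are invented for illustration and are not the study's data, the function names are assumptions, and the weighted variants reported in the abstract (weighted κ, AC2) are not implemented here.

```python
from typing import List, Tuple

# Square contingency table of counts: rows = rater A, columns = rater B.
Table = List[List[float]]


def _proportions(table: Table) -> List[List[float]]:
    """Convert counts to cell proportions."""
    n = sum(sum(row) for row in table)
    return [[cell / n for cell in row] for row in table]


def _marginals(p: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Row and column marginal proportions."""
    rows = [sum(row) for row in p]
    cols = [sum(row[j] for row in p) for j in range(len(p))]
    return rows, cols


def observed_agreement(table: Table) -> float:
    """Proportion of subjects on which both raters chose the same category."""
    p = _proportions(table)
    return sum(p[k][k] for k in range(len(p)))


def cohen_kappa(table: Table) -> float:
    """Unweighted Cohen's kappa: chance agreement from the product of marginals."""
    p = _proportions(table)
    rows, cols = _marginals(p)
    po = sum(p[k][k] for k in range(len(p)))
    pe = sum(r * c for r, c in zip(rows, cols))
    return (po - pe) / (1 - pe)


def gwet_ac1(table: Table) -> float:
    """Unweighted Gwet's AC1: chance agreement from averaged marginals."""
    p = _proportions(table)
    rows, cols = _marginals(p)
    q = len(p)  # number of categories
    po = sum(p[k][k] for k in range(q))
    pi = [(r + c) / 2 for r, c in zip(rows, cols)]
    pe = sum(x * (1 - x) for x in pi) / (q - 1)
    return (po - pe) / (1 - pe)


# HYPOTHETICAL counts (not the study's data), categories in order:
# not ill appearing, unsure, ill appearing. Most ratings fall in the first.
HYPOTHETICAL_TABLE: Table = [
    [70, 5, 5],
    [5, 2, 1],
    [6, 1, 5],
]

if __name__ == "__main__":
    print("observed agreement:", round(observed_agreement(HYPOTHETICAL_TABLE), 3))
    print("Cohen's kappa:     ", round(cohen_kappa(HYPOTHETICAL_TABLE), 3))
    print("Gwet's AC1:        ", round(gwet_ac1(HYPOTHETICAL_TABLE), 3))
```

With heavily skewed marginals such as these, κ's chance-agreement term is large, so κ comes out well below AC1 even though the observed agreement is identical for both.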

2014 ◽  
Author(s):  
Paul Walsh ◽  
Justin M. Thornton ◽  
Nicholas Walker ◽  
John Gary McCoy ◽  
Joe Baal ◽  
...  



PeerJ ◽  
2014 ◽  
Vol 2 ◽  
pp. e651 ◽  
Author(s):  
Paul Walsh ◽  
Justin Thornton ◽  
Julie Asato ◽  
Nicholas Walker ◽  
Gary McCoy ◽  
...  

2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Sofi Varg ◽  
Veronica Vicente ◽  
Maaret Castren ◽  
Peter Lindgren ◽  
Clas Rehnberg

Background: A decision system in the ambulance allowing alternative pathways to alternative healthcare providers has been developed for older patients in Stockholm, Sweden. However, subsequent healthcare resource use resulting from these pathways has not yet been addressed. The aim of this study was therefore to describe patient pathways, healthcare utilisation and costs following ambulance transportation to alternative healthcare providers. Methods: The design of this study was descriptive and observational. Data from a previous RCT, in which a decision system in the ambulance enabled alternative healthcare pathways to alternative healthcare providers, were linked to register data. The receiving providers were a primary acute care centre or a secondary geriatric ward, both located at the same community hospital, or, via the conventional pathway, the emergency department at an acute hospital. Resource use over the 10 days following assessment with the decision system was mapped in terms of healthcare pathways, utilisation and costs for the 98 included cases. Results: Almost 90% were transported to the acute care centre or geriatric ward. The vast majority of those arriving at the geriatric ward stayed there until the end of follow-up or until discharge, whereas patients conveyed to the acute care centre were to a large extent admitted to hospital. The median patient had 6 hospital days and 2 outpatient visits, and cost roughly 4000 euros over the 10-day period. Arrival at the geriatric ward was associated with the longest hospital stay and arrival at the emergency department with the shortest. However, the cost for the 10-day period was lower for cases arriving at the geriatric ward than for those arriving at the emergency department. Conclusions: The findings support the appropriateness of direct admission to secondary geriatric care for older adults. However, patients conveyed to the acute care centre ought to be studied in more detail with regard to the appropriate level of care.


Medicine ◽  
2019 ◽  
Vol 98 (6) ◽  
pp. e14250 ◽  
Author(s):  
Hongjung Kim ◽  
Juncheol Lee ◽  
Sanghyun Lee ◽  
Jaehoon Oh ◽  
Boseung Kang ◽  
...  

CJEM ◽  
2016 ◽  
Vol 18 (S1) ◽  
pp. S39-S39 ◽  
Author(s):  
B. Borgundvaag ◽  
S.L. McLeod ◽  
T.E. Dear ◽  
S.M. Carver ◽  
N. Norouzi ◽  
...  

Introduction: Ideal management of alcohol withdrawal syndrome (AWS) incorporates a symptom-driven approach, whereby patients are regularly assessed using a standardized scoring system (Clinical Institute Withdrawal Assessment for Alcohol-Revised; CIWA-Ar) and treated according to severity. Among the domains assessed by the CIWA-Ar, tremor is the most objective indicator of withdrawal severity; however, the ability of clinicians to reliably quantify tremor is highly dependent on experience. The objective of this study was to prospectively validate an objective, reliable tool to standardize and quantify the severity of alcohol withdrawal tremor using the built-in accelerometer of an iOS application. Methods: A prospective observational study of patients ≥18 years presenting to an academic emergency department in alcohol withdrawal was conducted from Oct 2014 to Aug 2015. Assessments were videotaped by a research assistant and subsequently reviewed by 3 clinical experts, blinded to the primary clinical assessment. Tremor severity was scored using the 8-point CIWA scale (0=no tremor, 7=severe tremor). Accelerometer-derived results were compared to expert assessments of each video. Inter-rater agreement was estimated using Cohen’s kappa (κ) statistic. Results: 76 patients with 78 tremor recordings were included. Accelerometer-derived tremor scores matched expert assessor scores exactly in 36 (46.2%) cases, were within 1 point in 73 (93.6%) cases, and differed by ≥2 points in 5 (6.4%) cases. The overall kappa for agreement within 1 point for tremor severity was ‘very good’ at 0.92 (95% CI: 0.86, 0.99). Conclusion: iOS accelerometer-based assessment of the tremor component of the CIWA-Ar score is reliable and has the potential to assess the severity of alcohol withdrawal more accurately. We anticipate this resource will be easily disseminated and will improve the care of patients with alcohol withdrawal.
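The exact-match and within-1-point figures above are simple proportions of paired scores. The following minimal sketch shows that arithmetic. The function name `agreement_within` and the paired scores are hypothetical; only the counts 36, 73, 5 and 78 come from the abstract, and the sketch does not reproduce the study's kappa calculation.

```python
from typing import Sequence


def agreement_within(a: Sequence[int], b: Sequence[int], tolerance: int = 0) -> float:
    """Proportion of paired scores that differ by at most `tolerance` points."""
    if len(a) != len(b) or not a:
        raise ValueError("score lists must be non-empty and of equal length")
    hits = sum(1 for x, y in zip(a, b) if abs(x - y) <= tolerance)
    return hits / len(a)


def percent(count: int, total: int) -> float:
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * count / total, 1)


# HYPOTHETICAL paired tremor scores on the 0-7 scale (not the study's data).
accelerometer = [0, 1, 2, 3, 4, 5, 6, 7]
expert = [0, 2, 2, 5, 4, 4, 6, 7]

exact = agreement_within(accelerometer, expert, tolerance=0)
within_one = agreement_within(accelerometer, expert, tolerance=1)

# Counts taken from the abstract: 78 recordings in total.
reported = {
    "exact": percent(36, 78),
    "within 1 point": percent(73, 78),
    "differ by >= 2": percent(5, 78),
}

if __name__ == "__main__":
    print("hypothetical exact agreement:   ", exact)
    print("hypothetical within-1 agreement:", within_one)
    print("percentages from reported counts:", reported)
```

The reported counts are internally consistent: the 73 cases within 1 point and the 5 cases differing by 2 or more points sum to the 78 recordings.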


2017 ◽  
Vol 181 (24) ◽  
pp. 655-655 ◽  
Author(s):  
Rafael Alzola Domingo ◽  
Chris M Riggs ◽  
David S Gardner ◽  
Sarah L Freeman

Superficial digital flexor tendon (SDFT) tendinopathy is an important musculoskeletal problem in horses. The study objective was to validate an ultrasonographic scoring system for SDFT injuries. Ultrasonographic images from 14 Thoroughbred racehorses with SDFT lesions (seven core; seven diffuse) and two controls were blindly assessed by five clinicians on two occasions. The ultrasonographic parameters evaluated were: type and extent of the injury, location, echogenicity, cross-sectional area and longitudinal fibre pattern of the maximal injury zone (MIZ). Inter-rater variability and intra-rater reliability were assessed using Kendall’s coefficient of concordance (KC) and Lin’s concordance correlation coefficient (LC), respectively. Type of injury (core vs. diffuse) had perfect inter- and intra-rater agreement. Cases with core lesions had very strong inter-rater agreement (KC ≥0.74, P<0.001) and intra-rater reliability (LC ≥0.73) for all parameters apart from echogenicity. Cases with diffuse lesions had strong inter-rater agreement (KC ≥0.62) for all parameters except echogenicity, for which agreement was weak (KC=0.22); intra-rater reliability was excellent for MIZ location and fibre pattern (LC ≥0.82), and moderate (LC ≥0.58) for cross-sectional area and number of zones affected. This scoring system was reliable and repeatable for all parameters except echogenicity. A validated scoring system will facilitate reliable recording of SDFT injuries and inter-study meta-analyses.
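For readers unfamiliar with the two statistics named above, here is a minimal sketch of their textbook forms. The function names and sample values are hypothetical, the Kendall's W version assumes complete rankings with no tied ranks (no tie correction is applied), and nothing here is claimed to match the software or settings used in the study.

```python
from typing import List, Sequence


def lin_ccc(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient (population moments, 1/n)."""
    n = len(x)
    if n != len(y) or n == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


def kendall_w(rankings: List[List[int]]) -> float:
    """Kendall's W for m raters ranking n items; assumes no tied ranks."""
    m = len(rankings)       # number of raters
    n = len(rankings[0])    # number of items ranked
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = sum(totals) / n
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# HYPOTHETICAL repeated measurements by one rater (not the study's data).
first_read = [1.0, 2.0, 3.0, 4.0, 5.0]
second_read = [2.0, 3.0, 4.0, 5.0, 6.0]  # same ordering, shifted by one unit

# HYPOTHETICAL rankings of four items by three raters in full agreement.
agreeing = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
# Two raters with exactly reversed rankings.
opposed = [[1, 2, 3, 4], [4, 3, 2, 1]]

if __name__ == "__main__":
    print("Lin's CCC, shifted reads:", lin_ccc(first_read, second_read))
    print("Kendall's W, agreeing:   ", kendall_w(agreeing))
    print("Kendall's W, opposed:    ", kendall_w(opposed))
```

The shifted reads are perfectly correlated, yet Lin's coefficient falls below 1 because it penalizes the systematic offset between the two reads as well as scatter.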


Author(s):  
Ramesh Srivathsavai ◽  
Nicole Genco ◽  
Katja Hölttä-Otto ◽  
Carolyn C. Seepersad

In recent years, many new idea generation methods have been developed to generate innovative concepts. The effectiveness of those methods is evaluated by applying a set of metrics to the resulting concepts. Several metrics have been proposed for this purpose, including quality, novelty, and variety metrics, but the inter-rater reliability of those metrics has not been investigated extensively. In this paper, the inter-rater reliability of three existing metrics is analyzed by applying them to the results of a representative idea generation study. The effects on inter-rater agreement of analyzing concepts at the overall concept level versus the feature level are investigated, along with the impacts of alternative scales for specific metrics. In general, the inter-rater reliability of the metrics is found to be relatively low, with the most reliable results obtained at the feature level. The use of different scales also affects inter-rater reliability, but the effect is less significant. In addition to their low levels of repeatability, the metrics differ in how novelty is appraised.

