Inter-rater reliability of two paediatric early warning score tools

Dansk Tidsskrift for Akutmedicin ◽

10.7146/akut.v2i3.112944 ◽

2019 ◽

Vol 2 (3) ◽

pp. 37

Author(s):

Claus Sixtus Jensen

Keyword(s):

Early Warning ◽

Intraclass Correlation ◽

Healthcare Providers ◽

Correlation Coefficients ◽

Assessment Tools ◽

Early Warning Score ◽

Rater Reliability ◽

Intraclass Correlation Coefficients ◽

Paediatric Early Warning Score ◽

The Individual

Background: Paediatric early warning score (PEWS) assessment tools can help healthcare providers detect and recognise, in a timely manner, subtle changes in a patient's condition that signal clinical deterioration. However, data from PEWS tools are only as reliable and accurate as the caregivers who obtain and document the parameters. The aim of this study was to evaluate inter-rater reliability among nurses using PEWS systems. Method: The study was carried out in five paediatric departments in the Central Denmark Region. Inter-rater reliability was investigated through parallel observations. A total of 108 children and 69 nurses participated. Two nurses simultaneously performed a PEWS assessment on the same patient. Before the assessment, the two participating nurses drew lots to decide who would be the active observer. Intraclass correlation coefficients, Fleiss’ κ and Bland–Altman limits of agreement were used to determine inter-rater reliability. Results: The intraclass correlation coefficients for the aggregated PEWS score of the two PEWS models were 0.98 and 0.95, respectively. The κ values for the individual PEWS measurements ranged from 0.70 to 1.0, indicating good to very good agreement. The nurses assigned exactly the same aggregated score for both PEWS models in 76% of the cases. In 98% of the PEWS assessments, the aggregated PEWS scores assigned by the two nurses differed by 1 point or less in both models. Conclusion: The study showed good to very good inter-rater reliability in the two PEWS models used in the Central Denmark Region.
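
The agreement measures named in this abstract (the share of identical scores, the share of scores within 1 point, and Bland–Altman limits of agreement) can be illustrated with a minimal Python sketch. The scores below are invented for illustration and are not the study's data; the function names are hypothetical, and the limits use the conventional bias ± 1.96 × SD of the paired differences.

```python
from statistics import mean, stdev


def bland_altman_limits(a, b):
    """Return (bias, lower, upper): mean difference and 95% limits of agreement."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def agreement_within(a, b, tolerance=0):
    """Proportion of pairs whose absolute difference is at most `tolerance`."""
    hits = sum(1 for x, y in zip(a, b) if abs(x - y) <= tolerance)
    return hits / len(a)


# Hypothetical aggregated scores from two nurses for eight patients.
nurse_a = [0, 1, 2, 3, 1, 0, 4, 2]
nurse_b = [0, 1, 3, 3, 1, 0, 4, 1]

bias, lower, upper = bland_altman_limits(nurse_a, nurse_b)
exact = agreement_within(nurse_a, nurse_b)        # identical scores
within_one = agreement_within(nurse_a, nurse_b, 1)  # differ by <= 1 point
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
print(f"exact={exact:.0%}, within 1 point={within_one:.0%}")
```

With these made-up scores the two raters agree exactly in 6 of 8 cases and never differ by more than 1 point, which is the same kind of summary the abstract reports as 76% and 98%.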

Abstract 9778: Establishing Inter Rater Reliability of the National Early Warning Score (NEWS)

Circulation ◽

10.1161/circ.132.suppl_3.9778 ◽

2015 ◽

Vol 132 (suppl_3) ◽

Author(s):

Donna M Miller

Keyword(s):

Intensive Care ◽

Acute Care ◽

Early Warning ◽

Assessment Tools ◽

Early Warning Score ◽

Care Hospital ◽

Convenience Sample ◽

Rater Reliability ◽

National Early Warning Score ◽

Level Of Agreement

A change in patient condition is a dynamic process that can go unrecognized and result in a failure to rescue. Changes in patients’ vital signs can appear many hours before critical events such as cardiac arrest or emergent transfer to the Intensive Care Unit occur. Quantitative assessment tools are used to predict risk and the need for additional resources at the bedside. These tools are referred to as Early Warning Scoring Systems. The Royal College of Physicians developed a standardized tool called the National Early Warning Score (NEWS, 2012) that uses a variety of physiologic parameters to assess risk and establish a trigger threshold for summoning additional resources. Purpose: Early warning scoring tools have been found to be reliable and accurate in predicting patient decompensation. However, data from instruments are only as reliable and accurate as the caregiver who obtains and documents the parameters. The purpose of this study was to establish inter-rater reliability between the RN and PCA using NEWS. Design, Sample, Setting, Procedures: This study was conducted on the clinical units of a 104-bed Long Term Acute Care Hospital (LTCH) system. These units accept patients directly from Intensive Care Units who require intense services to maintain their trajectory toward recovery. The NEWS provides a way for early detection of patient decompensation, which can prevent readmission to acute care and the subsequent financial implications. The convenience sample consisted of 22 RNs and 6 PCAs. Consented subjects reviewed an unfolding case study that portrayed a typical patient on the LTCH unit. Subjects were asked to circle the parameter ranges on the NEWS tool that corresponded to physiologic values in the scenario. Findings: Krippendorff’s alpha was used to determine the level of agreement among the raters examining the three scenarios. An alpha value of 0.94 was obtained, indicating a high level of agreement among the raters.
Conclusion: The NEWS can serve as a reliable adjunct to the provision of safe patient care. While it is not the sole source for determining

Is the location of the signal intensity weighted centroid a reliable measurement of fluid displacement within the disc?

Biomedical Engineering / Biomedizinische Technik ◽

10.1515/bmt-2016-0178 ◽

2018 ◽

Vol 63 (4) ◽

pp. 453-460 ◽

Cited By ~ 7

Author(s):

Vahid Abdollah ◽

Eric C. Parent ◽

Michele C. Battié

Keyword(s):

Signal Intensity ◽

Water Distribution ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Region Of Interest ◽

Rater Reliability ◽

Fluid Displacement ◽

Intraclass Correlation Coefficients ◽

The Mean ◽

Standard Error Of Measurement

Degenerated discs have shorter T2 relaxation times and lower MR signal. The location of the signal-intensity-weighted centroid reflects the water distribution within a region of interest (ROI). This study compared the reliability of the location of the signal-intensity-weighted centroid to that of mean signal intensity and area measurements. L4-L5 and L5-S1 discs were measured on 43 mid-sagittal T2-weighted 3T MRI images in adults with back pain. One rater analysed images twice and another once, blinded to measurements. Discs were semi-automatically segmented into a whole disc, nucleus, anterior and posterior annulus. The coordinates of the signal-intensity-weighted centroid for all regions demonstrated excellent intraclass correlation coefficients for intra-rater (0.99–1.00) and inter-rater reliability (0.97–1.00). The standard error of measurement for the Y-coordinates of the signal-intensity-weighted centroid for all ROIs was 0 at both levels and 0 to 2.7 mm for X-coordinates. The mean signal intensity and area for the whole disc and nucleus presented excellent intra-rater reliability with intraclass correlation coefficients from 0.93 to 1.00, and 0.92 to 1.00 for inter-rater reliability. The mean signal intensity and area had lower reliability for annulus ROIs, with intra-rater intraclass correlation coefficients from 0.5 to 0.76 and inter-rater from 0.33 to 0.58. The location of the signal-intensity-weighted centroid is a reliable biomarker for investigating the effects of disc interventions.

Genetic variation in the quantitative levels of an NADP(H)-binding protein (FX) in human erythrocytes

Blood ◽

10.1182/blood.v57.2.209.209 ◽

1981 ◽

Vol 57 (2) ◽

pp. 209-217 ◽

Cited By ~ 4

Author(s):

L Lenzerini ◽

U Benatti ◽

A Morelli ◽

S Pontremoli ◽

A De Flora ◽

...

Keyword(s):

Genetic Variation ◽

Preliminary Data ◽

Binding Protein ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Red Cell ◽

Genetic Origin ◽

Intraclass Correlation Coefficients ◽

The Individual

FX is a red cell NADP(H)-binding protein that has been well defined biochemically and immunologically but whose function is still unknown. Preliminary data indicated that the levels of this protein are significantly increased in hemizygotes, heterozygotes, and homozygotes for the G6PD Mediterranean mutant, thus raising the question of whether or not the individual variation in FX levels is more or less directly influenced by X-linked genes. The present study, based on a large series of population and family data collected in Sardinia, confirms unequivocally the above-mentioned interaction, but shows at the same time that the variances in FX levels “between sibships” are 2–3 times larger than those “within sibships” when the analysis is done separately for the G6PD-normal or the G6PD-deficient sibs. From the comparison of the interclass and intraclass correlation coefficients, it appears that about 60% of the total variation of FX is of genetic origin. Moreover, the FX levels of children, analyzed in a pairwise manner, were found to be more positively correlated with those of their fathers (r = 0.39) than with those of their maternal grandfathers (r = 0.20). This latter finding favors the conclusion that “autosomal” rather than “X-linked” genes are involved in the determination of the FX levels.

Reliability and Validity of the Posture and Fine Motor Assessment of Infants

The Occupational Therapy Journal of Research ◽

10.1177/153944928900900501 ◽

1989 ◽

Vol 9 (5) ◽

pp. 259-272 ◽

Cited By ~ 3

Author(s):

Jane Case-Smith

Keyword(s):

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Fine Motor ◽

Motor Assessment ◽

Intraclass Correlation Coefficients ◽

Highly Correlated ◽

Peabody Developmental Motor Scales ◽

The Individual ◽

Test Retest Reliability

The Posture and Fine Motor Assessment of Infants (PFMAI) (Case-Smith, 1987) is a newly developed instrument for assessing the quality of motor function in infants. The test measures components of posture and fine motor control as they first develop. The purpose of this study was to support the test's reliability and validity. Interrater reliability, analyzed with intraclass correlation coefficients (ICCs), was high (.989 for total scores). Test-retest reliability, measured by ICCs, was .853 and .913 for the two test sections. The PFMAI demonstrated concurrent validity with the Peabody Developmental Motor Scales, Revised (Folio & Fewell, 1983) (correlations were .673 and .829 for the individual sections). Scores on the PFMAI were highly correlated with the infants' ages (.892 to .941); this finding provided one indication of construct validity.

The reliability of the augmented Lehnert-Schroth and Rigo classification in scoliosis management

South African Journal of Physiotherapy ◽

10.4102/sajp.v77i2.1568 ◽

2021 ◽

Vol 77 (2) ◽

Author(s):

Burçin Akçay ◽

Tuğba Kuru Çolak ◽

Adnan Apti ◽

İlker Çolak ◽

Önder Kızıltaş

Keyword(s):

Treatment Plan ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Error Measurement ◽

X Rays ◽

X Ray ◽

Intraclass Correlation Coefficients ◽

Observer Reliability ◽

The Individual ◽

Curve Patterns

Background: In pattern-specific scoliosis exercises and bracing, the corrective treatment plan differs according to the curve pattern. There are a limited number of studies investigating the reliability of the commonly used classification systems. Objective: To test the reliability of the augmented Lehnert-Schroth (ALS) classification and the Rigo classification. Methods: X-rays and posterior photographs of 45 patients with scoliosis were sent by the first author to three clinicians twice at 1-week intervals. The clinicians classified images according to the ALS and Rigo classifications, and the data were analysed using SPSS V-16. Intraclass correlation coefficients (ICCs) and the standard error of measurement (SEM) were calculated to evaluate the inter- and intra-observer reliability. Results: The inter-observer ICC values were 0.552 (ALS) and 0.452 (Rigo) for X-ray images and 0.494 (ALS) and 0.518 (Rigo) for the photographs. The average intra-observer ICC values were 0.720 (ALS) and 0.581 (Rigo) for the X-ray images and 0.726 (ALS) and 0.467 (Rigo) for the photographs. Conclusions: The results of our study indicate moderate inter-observer reliability for X-ray images using the ALS classification and clinical photographs using the Rigo classification. Intra-observer reliability was moderate to good for X-ray images and clinical photographs using the ALS classification and poor to moderate for X-ray images and clinical photographs using the Rigo classification. Clinical implications: Pattern classifications assist in creating a plan and indication of correction in specific scoliosis physiotherapy, pattern-specific brace applications and surgical treatment. More sub-types are needed to address the individual patterns of curvature. The optimisation of curve classification will likely reduce failures in diagnosis and treatment.

Reliability and validity of the iSense optical scanner for measuring volume of transtibial residual limb models

Prosthetics and Orthotics International ◽

10.1177/0309364618806038 ◽

2018 ◽

Vol 43 (2) ◽

pp. 213-220 ◽

Cited By ~ 1

Author(s):

Lucy Armitage ◽

Li Khim Kwah ◽

Lauren Kark

Keyword(s):

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Criterion Validity ◽

Residual Limb ◽

Limb Volume ◽

Rater Reliability ◽

Measuring Volume ◽

Intraclass Correlation Coefficients ◽

Optical Scanner

Background: Residual limb volume is often measured as part of routine care for people with amputations. These measurements assist in the timing of prosthetic fitting or replacement. In order to make well-informed decisions, clinicians need access to measurement tools that are valid and reliable. Objectives: To assess the reliability and criterion validity of the iSense optical scanner in measuring volume of transtibial residual limb models. Study Design: Three assessors performed two measurements each on 13 residual limb models with an iSense optical scanner (3D Systems, USA). Intra-rater and inter-rater reliability were calculated using intraclass correlation coefficients. Bland–Altman plots were inspected for agreement. Criterion validity was assessed using a steel rod of known dimensions. Ten repeated measurements were performed by one assessor. A t-test was used to determine differences between measured and true rod volume. Results: Intra-rater reliability was excellent (range of intraclass correlation coefficients: 0.991–0.997, all with narrow 95% confidence intervals). While the intraclass correlation coefficients suggest excellent inter-rater reliability between all three assessors (range of intraclass correlation coefficients: 0.952–0.986), the 95% confidence intervals were wide between assessor 3 and the other two assessors. Poor agreement with assessor 3 was also seen in the Bland–Altman plots. Criterion validity was very poor, with a significant difference between the mean iSense measurement and the true rod volume (difference: 221.18 mL; p < 0.001). Conclusions: Although intra-rater reliability was excellent for the iSense scanner, we did not find similar results for inter-rater reliability and validity. These results suggest that further testing of the iSense scanner is required prior to use in clinical practice. Clinical relevance: The iSense offers a low-cost scanning option for residual limb volume measurement. Intra-rater reliability was excellent, but inter-rater reliability and validity were such that clinical adoption is not indicated at present.
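
One way to see how a high ICC can coexist with poor agreement is to compute single-measure ICCs from a two-way ANOVA decomposition. The sketch below is illustrative only: the abstract does not say which ICC form was used, the function name and data are hypothetical, and the formulas are the standard two-way single-measure forms for absolute agreement and for consistency.

```python
def icc_two_way(ratings):
    """Single-measure ICCs from a subjects-by-raters table.

    ratings: list of rows, one per subject, each holding k rater scores.
    Returns (absolute_agreement_icc, consistency_icc).
    """
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns) and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    agreement = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    consistency = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    return agreement, consistency


# Hypothetical data: the second assessor always reads exactly 1 unit higher.
scores = [[1, 2], [2, 3], [3, 4]]
agreement, consistency = icc_two_way(scores)
print(f"absolute agreement={agreement:.3f}, consistency={consistency:.3f}")
```

In this made-up case the consistency ICC is 1 because the assessors rank the models identically, while the absolute-agreement ICC falls to about 0.67 because of the constant offset. A systematic offset of this kind is also what a Bland–Altman plot makes visible.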

Inter-Rater Reliability: Intraclass Correlation Coefficients

Educational and Psychological Measurement ◽

10.1177/001316448104100127 ◽

1981 ◽

Vol 41 (1) ◽

pp. 223-226 ◽

Cited By ~ 6

Author(s):

Dong Won Cho

Keyword(s):

Intraclass Correlation ◽

Correlation Coefficients ◽

Rater Reliability ◽

Intraclass Correlation Coefficients

Reliability of Autism-Tics, AD/HD, and other Comorbidities (A–TAC) Inventory in a Test-Retest Design

Psychological Reports ◽

10.2466/03.15.pr0.114k10w1 ◽

2014 ◽

Vol 114 (1) ◽

pp. 93-103 ◽

Cited By ~ 15

Author(s):

Tomas Larson ◽

Eva Norén Selinus ◽

Clara Hellner Gumpert ◽

Thomas Nilsson ◽

Nóra Kerekes ◽

...

Keyword(s):

Intraclass Correlation ◽

Correlation Coefficients ◽

Population Based ◽

Autism Spectrum ◽

Good Test ◽

Rater Reliability ◽

Retest Reliability ◽

Intraclass Correlation Coefficients ◽

Intraclass Correlations ◽

Test Retest Reliability

The Autism-Tics, AD/HD, and other Comorbidities (A–TAC) inventory is used in epidemiological research to assess neurodevelopmental problems and coexisting conditions. Although the A–TAC has been applied in various populations, data on retest reliability are limited. The objective of the present study was to present additional reliability data. The A–TAC was administered by lay assessors and was completed on two occasions by parents of 400 individual twins, with an average interval of 70 days between test sessions. Intra- and inter-rater reliability were analysed with intraclass correlations and Cohen's κ. A–TAC showed excellent test-retest intraclass correlations for both autism spectrum disorder and attention deficit hyperactivity disorder (each at .84). Most modules in the A–TAC had intra- and inter-rater reliability intraclass correlation coefficients of ≥ .60. Cohen's κ indicated acceptable reliability. The current study provides statistical evidence that the A–TAC yields good test-retest reliability in a population-based cohort of children.
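
Cohen's κ, used here alongside intraclass correlations, corrects the observed agreement between two ratings for the agreement expected by chance. A minimal sketch follows, with invented binary ratings (not A–TAC data) and a hypothetical function name.

```python
from collections import Counter


def cohens_kappa(r1, r2):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    n = len(r1)
    # Observed proportion of identical ratings.
    p_observed = sum(a == b for a, b in zip(r1, r2)) / n
    # Chance agreement from each rating's marginal category frequencies.
    c1, c2 = Counter(r1), Counter(r2)
    p_expected = sum(c1[c] * c2[c] for c in set(r1) | set(r2)) / n**2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical screen-positive (1) / screen-negative (0) results on two occasions.
first = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
second = [1, 1, 0, 1, 1, 0, 0, 0, 1, 1]

kappa = cohens_kappa(first, second)
print(f"kappa={kappa:.2f}")
```

Here the raw agreement is 80%, but because 52% agreement would be expected by chance from the marginal frequencies, κ is only about 0.58.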

Reliability of assessment of medical students’ non-technical skills using a behavioural marker system: does clinical experience matter?

BMJ Simulation and Technology Enhanced Learning ◽

10.1136/bmjstel-2020-000705 ◽

2020 ◽

pp. bmjstel-2020-000705

Author(s):

Benjamin Clarke ◽

Samantha E Smith ◽

Emma Claire Phillips ◽

Ailsa Hamilton ◽

Joanne Kerins ◽

...

Keyword(s):

Medical Students ◽

Clinical Experience ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Technical Skills ◽

Rater Reliability ◽

Single Measure ◽

Marker System ◽

Intraclass Correlation Coefficients ◽

Reliability Coefficients

Introduction: Non-technical skills are recognised to play an integral part in safe and effective patient care. Medi-StuNTS (Medical Students’ Non-Technical Skills) is a behavioural marker system developed to enable assessment of medical students’ non-technical skills. This study aimed to assess whether newly trained raters with high levels of clinical experience could achieve reliability coefficients of >0.7 and to compare differences in inter-rater reliability of raters with varying clinical experience. Methods: Forty-four raters attended a workshop on Medi-StuNTS before independently rating three videos of medical students participating in immersive simulation scenarios. Data were grouped by raters’ levels of clinical experience. Inter-rater reliability was assessed by calculating intraclass correlation coefficients (ICC). Results: Eleven raters with more than 10 years of clinical experience achieved single-measure ICC of 0.37 and average-measures ICC of 0.87. Fourteen raters with more than or equal to 5 years and less than 10 years of clinical experience achieved single-measure ICC of 0.09 and average-measures ICC of 0.59. Nineteen raters with less than 5 years of clinical experience achieved single-measure ICC of 0.09 and average-measures ICC of 0.65. Conclusions: Using 11 newly trained raters with high levels of clinical experience produced highly reliable ratings that surpassed the prespecified inter-rater reliability standard; however, a single rater from this group would not achieve sufficiently reliable ratings. This is consistent with previous studies using other medical behavioural marker systems. This study demonstrated a decrease in inter-rater reliability of raters with lower levels of clinical experience, suggesting caution when using this population as raters for assessment of non-technical skills.
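
The gap between the single-measure and average-measures ICCs reported above follows from the Spearman–Brown relation, which gives the reliability of the mean of k raters from the reliability of one rater. The sketch below applies that standard relation to the rounded figures in the abstract. The abstract does not describe its computation, and the function names are hypothetical.

```python
import math


def average_measures_icc(single_icc, k):
    """Spearman-Brown: reliability of the mean of k raters."""
    return k * single_icc / (1 + (k - 1) * single_icc)


def raters_needed(single_icc, target):
    """Smallest number of raters whose averaged rating reaches `target`."""
    return math.ceil(target * (1 - single_icc) / (single_icc * (1 - target)))


# (single-measure ICC, number of raters) as reported in the abstract.
groups = {"more than 10 years": (0.37, 11),
          "5 to 10 years": (0.09, 14),
          "less than 5 years": (0.09, 19)}

for label, (icc, k) in groups.items():
    print(f"{label}: average-measures ICC ~ {average_measures_icc(icc, k):.2f}")

print("raters needed for 0.7 at ICC 0.37:", raters_needed(0.37, 0.7))
```

From the rounded inputs this gives roughly 0.87, 0.58 and 0.65, close to the reported 0.87, 0.59 and 0.65; the small difference in the middle group is what rounding of the single-measure ICC would produce. By the same relation, about four raters at a single-measure ICC of 0.37 would be needed for their averaged rating to reach 0.7.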

Vein Measurement by Peripherally Inserted Central Catheter Nurses Using Ultrasound: A Reliability Study

Journal of the Association for Vascular Access ◽

10.1016/j.java.2013.08.001 ◽

2013 ◽

Vol 18 (4) ◽

pp. 234-238 ◽

Cited By ~ 8

Author(s):

Rebecca Sharp ◽

Andrea Gordon ◽

Antonina Mikocka-Walus ◽

Jessie Childs ◽

Carol Grech ◽

...

Keyword(s):

Intraclass Correlation ◽

Correlation Coefficients ◽

Cephalic Vein ◽

Basilic Vein ◽

Rater Reliability ◽

Vein Thrombosis ◽

Intraclass Correlation Coefficients ◽

Measurement Protocol ◽

Brachial Vein ◽

Deep Vein

Background: Peripherally inserted central catheters (PICCs) are increasingly inserted by trained registered nurses, necessitating the development of specialized skills such as the use of ultrasound. The selection of an adequately sized vein is an important factor in reducing adverse events such as deep vein thrombosis. However, PICC nurses may receive minimal training in the use of ultrasound for vein measurement. Objective: We aimed to demonstrate the reliability of a vein measurement protocol using ultrasound by a PICC nurse trained in sonography. Methods: The diameters of the basilic, brachial, and cephalic veins in the left arms of healthy participants (n = 12) were measured using ultrasound by a PICC nurse and a sonographer. The PICC nurse performed the measurement twice and the sonographer once; the PICC nurse's two sets of results were compared for intra-rater reliability and compared with the sonographer's for inter-rater reliability. The results were analyzed using intraclass correlation coefficients (ICCs). Results: Inter-rater reliability between the PICC nurse and the sonographer was adequate: the ICC for the brachial vein was 0.60 (95% confidence interval [CI], 0.06–0.87), the basilic vein ICC was 0.87 (95% CI, 0.58–0.96) and the cephalic vein ICC was 0.77 (95% CI, 0.39–0.93). Intra-rater reliability of the PICC nurse was higher: the ICC for the brachial vein was 0.80 (95% CI, 0.44–0.94), the basilic vein ICC was 0.92 (95% CI, 0.67–0.98), and the cephalic vein ICC was 0.78 (95% CI, 0.40–0.93). Conclusions: Using a suitable protocol, a PICC nurse was able to measure vein diameter reliably when compared with a sonographer and consistently replicate these results.
