Agreement of an Evaluation of the Forward-Step-Down Test by a Broad Cohort of Clinicians With That of an Expert Panel

2016 ◽  
Vol 25 (3) ◽  
pp. 227-232 ◽  
Author(s):  
Gidon Herman ◽  
Oren Nakdimon ◽  
Pazit Levinger ◽  
Shmuel Springer

Context: The forward-step-down (FSD) test may be used to identify underlying pathologies related to lower-extremity injuries. However, research on its interrater reliability is limited. Objective: To assess the interrater reliability of the FSD test with a broad cohort of clinicians and to compare the level of agreement with an expert panel. Design: Single-measure, interrater reliability. Setting: Annual conference of the Israeli Physical Therapy Society. Participants: 15 healthy subjects who performed the FSD test and 142 physical therapists (PTs) who evaluated performance. Methods: Each subject performed the FSD while being videoed. Six videos were selected by an expert panel for analysis. After viewing the videos, FSD performance was rated by 142 PTs, as well as by the expert panel, using a 3-level scale. Main Outcome Measures: Interrater reliability determined by intraclass correlation coefficient (ICC) and percentage of agreement with the expert panel. Results: Fair to good reliability and acceptable agreement were found for the entire sample of raters (ICC = .61, agreement 74%). The percentage of agreement was greater in the subgroup of raters who were familiar with the FSD than in those who were not (78.08% vs 69.32%, respectively, P = .004). Years of work experience did not affect the percentage of agreement (P = .141). Conclusions: Fair to good interrater reliability of the FSD test was demonstrated by a broad cohort of PTs. The findings support the clinical utility of the FSD test as an assessment tool for quality of movement.
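The two outcome measures named in this abstract, a single-measure ICC and the percentage of agreement with an expert panel, can be sketched as follows. This is a minimal illustration, not the authors' analysis: the abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)) is an assumption, and the function names and ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per rated subject (e.g., one video),
    each row holding one score per rater. Assumed form, see lead-in.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def percent_agreement(rater_scores, panel_scores):
    """Percentage of items on which a rater matches the reference panel."""
    matches = sum(a == b for a, b in zip(rater_scores, panel_scores))
    return 100.0 * matches / len(panel_scores)


# Hypothetical 3-level ratings (not study data): 6 videos x 3 raters
videos = [[1, 1, 2], [3, 3, 3], [2, 2, 2], [1, 2, 1], [3, 2, 3], [2, 2, 3]]
panel = [1, 3, 2, 1, 3, 2]
first_rater = [row[0] for row in videos]
print(icc_2_1(videos))
print(percent_agreement(first_rater, panel))
```

The absolute-agreement form penalizes raters who are systematically stricter or more lenient than others, which seems a reasonable choice when a broad cohort is compared against a fixed reference, but a consistency form would give different values.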

2016 ◽  
Vol 25 (4) ◽  
pp. 330-337 ◽  
Author(s):  
Brett Aefsky ◽  
Niles Fleet ◽  
Heather Myers ◽  
Robert J. Butler

Context: Currently, hip-rotation range of motion (ROM) is clinically measured in an open kinetic chain in either seated or prone position using passive or active ROM. However, during activities of daily living and during sports participation the hip must be able to rotate in a loaded position, and there is no standard measurement for this. Objective: To determine if a novel method for measuring hip rotation in weight bearing will result in good to very good reliability as demonstrated by an intraclass correlation coefficient (ICC) of >.80 and to investigate if weight-bearing hip measurements will result in significantly reduced hip ROM compared with non-weight-bearing methods. Design: Repeated measures. Setting: Outpatient sports physical therapy clinic. Participants: 20 healthy participants (10 men, 10 women) recruited for hip-rotation measurements. Methods: Three trials of both internal and external rotation were measured in sitting, prone, and weight bearing. Two therapists independently measured each participant on the same day. The participants returned the following day to repeat the same measurements with the same 2 therapists. Main Outcome Measures: Degrees of hip internal and external rotation measured in prone, sitting, and loaded positions. Results: In general, the measurement of hip ROM across the different conditions was reliable. The intrarater reliability was .67–.95, while interrater reliability was .59–.96. Interrater reliability was improved when values were averaged across the measures (.75–.97). Intrarater ICCs for active loaded ROM were .67–.81, while interrater ICCs were .53–.87. In general, prone hip ROM was greater than supine and supine was greater than loaded. Conclusions: Loaded hip rotation can be measured in a clinical setting with moderate to good reliability. The rotation ROM of a loaded hip can be significantly decreased compared with unloaded motion.


GeroPsych ◽  
2014 ◽  
Vol 27 (1) ◽  
pp. 23-31 ◽  
Author(s):  
Anne Kuemmel (This author contributed eq ◽  
Julia Haberstroh (This author contributed ◽  
Johannes Pantel

Communication and communication behaviors in situational contexts are essential conditions for well-being and quality of life in people with dementia. Measuring methods, however, are limited. The CODEM instrument, a standardized observational communication behavior assessment tool, was developed and evaluated on the basis of the current state of research in dementia care and social-communicative behavior. Initially, interrater reliability was examined by means of videoratings (N = 10 people with dementia). Thereupon, six caregivers in six German nursing homes observed 69 residents suffering from dementia and used CODEM to rate their communication behavior. The interrater reliability of CODEM was excellent (mean κ = .79; intraclass correlation = .91). Statistical analysis indicated that CODEM had excellent internal consistency (Cronbach’s α = .95). CODEM also showed excellent convergent validity (Pearson’s R = .88) as well as discriminant validity (Pearson’s R = .63). Confirmatory factor analysis verified the two-factor solution of verbal/content aspects and nonverbal/relationship aspects. With regard to the severity of the disease, the content and relational aspects of communication exhibited different trends. CODEM proved to be a reliable, valid, and sensitive assessment tool for examining communication behavior in the field of dementia. CODEM also provides researchers a feasible examination tool for measuring effects of psychosocial intervention studies that strive to improve communication behavior and well-being in dementia.


RMD Open ◽  
2019 ◽  
Vol 5 (2) ◽  
pp. e001057 ◽  
Author(s):  
Simon Krabbe ◽  
Mikkel Østergaard ◽  
Susanne J Pedersen ◽  
Ulrich Weber ◽  
Georg Kröber ◽  
...  

Objective To validate the Canada-Denmark (CANDEN) MRI scoring system for the spine in axial spondyloarthritis with updated lesion definitions. Methods Lesion definitions in the CANDEN system were updated and illustrated by a consensus set of reference images. Sagittal spine MRIs of 40 patients with axial spondyloarthritis obtained at baseline and at week 52 after initiation of treatment with the tumour necrosis factor inhibitor golimumab were evaluated in unknown chronology by seven readers blinded to all other data. Results CANDEN MRI spine inflammation score had very good reliability for status scores (single-measure intraclass correlation coefficient (ICC) of 21 reader pairs median of 0.91 (IQR 0.88–0.92)) and change scores (ICC 0.88 (0.86–0.92)). CANDEN MRI spine fat score had good to very good reliability for status scores (ICC 0.79 (0.75–0.86)) and moderate to good reliability for detecting change (ICC 0.59 (0.46–0.73)). CANDEN MRI spine bone erosion score and CANDEN MRI spine new bone formation score had slight to moderate reliability for status scores (ICC 0.38 (0.32–0.52) and 0.39 (0.27–0.49), respectively). Conclusion The CANDEN MRI spine scoring system allows a comprehensive evaluation of inflammation, fat, bone erosion and new bone formation of the spine in patients with axial spondyloarthritis. It demonstrated very good reliability for detecting change in inflammation, moderate to good reliability for detecting change in fat, and slight to moderate reliability for detecting bone erosions and new bone formation. Studies with longer follow-up or patients with more advanced spinal involvement may be needed to reliably detect change in bone erosion and new bone formation scores. Trial registration number NCT02011386.


2013 ◽  
Vol 48 (3) ◽  
pp. 331-336 ◽  
Author(s):  
Rebecca Shultz ◽  
Scott C. Anderson ◽  
Gordon O. Matheson ◽  
Brandon Marcello ◽  
Thor Besier

Context: The Functional Movement Screen (FMS) is a popular test to evaluate the degree of painful, dysfunctional, and asymmetric movement patterns. Despite great interest in the FMS, test-retest reliability data have not been published. Objective: To assess the test-retest and interrater reliability of the FMS and to compare the scoring by 1 rater during a live session and the same session on video. Design: Cross-sectional study. Setting: Human performance laboratory in the sports medicine center. Patients or Other Participants: A total of 21 female (age = 19.6 ± 1.5 years, height = 1.7 ± 0.1 m, mass = 64.4 ± 5.1 kg) and 18 male (age = 19.7 ± 1.0 years, height = 1.9 ± 0.1 m, mass = 80.1 ± 9.9 kg) National Collegiate Athletic Association Division IA varsity athletes volunteered. Intervention(s): Each athlete was tested and retested 1 week later by the same rater who also scored the athlete's first session from a video recording. Five other raters scored the video from the first session. Main Outcome Measure(s): The Krippendorff α (K α) was used to assess the interrater reliability, whereas intraclass correlation coefficients (ICCs) were used to assess the test-retest reliability and reliability of live-versus-video scoring. Results: Good reliability was found for the test-retest (ICC = 0.6), and excellent reliability was found for the live-versus-video sessions (ICC = 0.92). Poor reliability was found for the interrater reliability (K α = .38). Conclusions: The good test-retest and high live-versus-video session reliability show that the FMS is a usable tool within 1 rater. However, the low interrater K α values suggest that the FMS within the limits of generalization should not be used indiscriminately to detect deficiencies that place the athlete at greater risk for injury. The FMS interrater reliability may be improved with better training for the rater.


2020 ◽  
pp. 1-13
Author(s):  
Louise Capling ◽  
Janelle A. Gifford ◽  
Kathryn L. Beck ◽  
Victoria M. Flood ◽  
Fiona Halar ◽  
...  

Abstract Diet quality indices are a practical, cost-effective method to evaluate dietary patterns, yet few have investigated diet quality in athletes. This study describes the relative validity and reliability of the recently developed Athlete Diet Index (ADI). Participants completed the electronic ADI on two occasions, 2 weeks apart, followed by a 4-d estimated food record (4-dFR). Relative validity was evaluated by directly comparing mean scores of the two administrations (mAdm) against scores derived from 4-dFR using Spearman’s rank correlation coefficient and Bland–Altman (B–A) plots. Construct validity was investigated by comparing mAdm scores and 4-dFR-derived nutrient intakes using Spearman’s coefficient and independent t test. Test–retest reliability was assessed using paired t test, intraclass correlation coefficients (ICC) and B–A plots. Sixty-eight elite athletes (18·8 (sd 4·2) years) from an Australian sporting institute completed the ADI on both occasions. Mean score was 84·1 (sd 15·2; range 42·5–114·0). The ADI had good reliability (ICC = 0·80, 95 % CI 0·69, 0·87; P < 0·001), and B–A plots (mean 1·9; level of agreement −17·8, 21·7) showed no indication of systematic bias (y = 4·57–0·03 × x) (95 % CI −0·2, 0·1; P = 0·70). Relative validity was evaluated in fifty athletes who completed all study phases. Comparison of mAdm scores with 4-dFR-derived scores was moderate (rs 0·69; P < 0·001) with no systematic bias between methods of measurement (y = 6·90–0·04 × x) (95 % CI −0·3, 0·2; P = 0·73). Higher scores were associated with higher absolute nutrient intake consistent with a healthy dietary pattern. The ADI is a reliable tool with moderate validity, demonstrating its potential for application to investigate the diet quality of athletes.
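The Bland–Altman quantities reported above (mean difference, limits of agreement, and a regression line of the differences used to check for systematic bias) can be sketched as below. This is an illustrative sketch under the common convention of limits at the mean difference ± 1.96 standard deviations; the abstract does not spell out the exact procedure, and the function names and scores are hypothetical. The reported limits (−17·8, 21·7) are roughly symmetric about the reported mean of 1·9, as this convention implies.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences a - b; the limits of
    agreement are bias +/- 1.96 * SD of the differences (assumed convention).
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def proportional_bias(a, b):
    """Least-squares line of the differences on the pair means.

    Returns (intercept, slope). A slope near zero suggests the difference
    between methods does not change with the size of the measurement.
    """
    means = [(x + y) / 2 for x, y in zip(a, b)]
    diffs = [x - y for x, y in zip(a, b)]
    mx, my = mean(means), mean(diffs)
    sxx = sum((m - mx) ** 2 for m in means)
    sxy = sum((m - mx) * (d - my) for m, d in zip(means, diffs))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical scores from two administrations (not study data)
first = [80.0, 92.5, 71.0, 88.0, 101.5]
second = [78.5, 95.0, 69.0, 90.5, 99.0]
print(bland_altman(first, second))
print(proportional_bias(first, second))
```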


2012 ◽  
Vol 92 (6) ◽  
pp. 841-852 ◽  
Author(s):  
Alexandra De Kegel ◽  
Tina Baetens ◽  
Wim Peersman ◽  
Leen Maes ◽  
Ingeborg Dhooge ◽  
...  

Background Balance is a fundamental component of movement. Early identification of balance problems is important to plan early intervention. The Ghent Developmental Balance Test (GDBT) is a new assessment tool designed to monitor balance from the initiation of independent walking to 5 years of age. Objective The purpose of this study was to establish the psychometric characteristics of the GDBT. Methods To evaluate test-retest reliability, 144 children were tested twice on the GDBT by the same examiner, and to evaluate interrater reliability, videotaped GDBT sessions of 22 children were rated by 3 different raters. To evaluate the known-group validity of GDBT scores, z scores on the GDBT were compared between a clinical group (n=20) and a matched control group (n=20). Concurrent validity of GDBT scores with the subscale standardized scores of the Movement Assessment Battery for Children–Second Edition (M-ABC-2), the Peabody Developmental Motor Scales–Second Edition (PDMS-2), and the balance subscale of the Bruininks-Oseretsky Test–Second Edition (BOT-2) was evaluated in a combined group of the 20 children from the clinical group and 74 children who were developing typically. Results Test-retest and interrater reliability were excellent for the GDBT total scores, with intraclass correlation coefficients of .99 and .98, standard error of measurement values of 0.21 and 0.78, and small minimal detectable differences of 0.58 and 2.08, respectively. The GDBT was able to distinguish between the clinical group and the control group (t38=5.456, P<.001). Pearson correlations between the z scores on GDBT and the standardized scores of specific balance subscales of the M-ABC-2, PDMS-2, and BOT-2 were moderate to high, whereas correlations with subscales measuring constructs other than balance were low. Conclusions The GDBT is a reliable and valid clinical assessment tool for the evaluation of balance in toddlers and preschool-aged children.
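The relationship between the ICC, the standard error of measurement (SEM), and the minimal detectable difference (MDD) can be sketched with the commonly used formulas SEM = SD × √(1 − ICC) and MDD at 95% confidence = 1.96 × √2 × SEM. The abstract does not state which formulas the authors used, so both are assumptions here, and the function names are hypothetical. Under these assumptions, an SEM of 0.21 gives an MDD of about 0.58, matching the reported test-retest value; an SEM of 0.78 gives about 2.16 rather than the reported 2.08, so the authors may have worked from unrounded inputs or a different procedure.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from the score SD and an ICC.

    Assumed formula: SEM = SD * sqrt(1 - ICC).
    """
    return sd * math.sqrt(1.0 - icc)


def mdd95(sem):
    """Minimal detectable difference at 95% confidence.

    Assumed formula: MDD95 = 1.96 * sqrt(2) * SEM, where sqrt(2) accounts
    for measurement error in both of the two scores being compared.
    """
    return 1.96 * math.sqrt(2.0) * sem


# SEM values taken from the abstract; the formula is an assumption
print(round(mdd95(0.21), 2))
print(round(mdd95(0.78), 2))
```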


2016 ◽  
Vol 25 (4) ◽  
pp. 371-379 ◽  
Author(s):  
Robert H. Wellmon ◽  
Dawn T. Gulick ◽  
Mark L. Paterson ◽  
Colleen N. Gulick

Context: Smartphones are being used in a variety of practice settings to measure joint range of motion (ROM). A number of factors can affect the validity of the measurements generated. However, there are no studies examining smartphone-based goniometer applications focusing on measurement variability and error arising from the electromechanical properties of the device being used. Objective: To examine the concurrent validity and interrater reliability of 2 goniometric mobile applications (Goniometer Records, Goniometer Pro), an inclinometer, and a universal goniometer (UG). Design: Nonexperimental, descriptive validation study. Setting: University laboratory. Participants: 3 physical therapists having an average of 25 y of experience. Main Outcome Measures: Three standardized angles (acute, right, obtuse) were constructed to replicate the movement of a hinge joint in the human body. Angular changes were measured and compared across 3 raters who used 3 different devices (UG, inclinometer, and 2 goniometric apps installed on 3 different smartphones: Apple iPhone 5, LG Android, and Samsung SIII Android). Intraclass correlation coefficients (ICCs) and Bland-Altman plots were used to examine interrater reliability and concurrent validity. Results: Interrater reliability for each of the smartphone apps, inclinometer, and UG was excellent (ICC = .995–1.000). Concurrent validity was also good (ICC = .998–.999). Based on the Bland-Altman plots, the means of the differences between the devices were low (range = –0.4° to 1.2°). Conclusions: This study identifies the error inherent in measurement that is independent of patient factors and due to the smartphone, the installed apps, and examiner skill. Less than 2° of measurement variability was attributable to those factors alone. The data suggest that 3 smartphones with the 2 installed apps are a viable substitute for using a UG or an inclinometer when measuring angular changes that typically occur when examining ROM and demonstrate the capacity of multiple examiners to accurately use smartphone-based goniometers.


2020 ◽  
Vol 8 (4_suppl3) ◽  
pp. 2325967120S0018
Author(s):  
Andrea Stracciolini ◽  
Laura Boucher ◽  
Sarah Jackson ◽  
Naomi Brown ◽  
Danielle Magrini ◽  
...  

Background The medial patellofemoral ligament (MPFL) is an important soft tissue constraint to preventing patellar dislocations in young athletes. The anatomy of the MPFL has been investigated in cadaveric studies and magnetic resonance studies. No studies to date have provided anatomical data of the MPFL on ultrasonography. Purpose To investigate the feasibility of musculoskeletal ultrasonography for the evaluation of the MPFL, and to determine interrater and intrarater reliability for MPFL ultrasound measures. Methods Ten control participants (20 knees) 20 to 50 years underwent ultrasonography performed by 3 researchers (musculoskeletal ultrasound radiologist, athletic trainer/biomechanist, primary care sports medicine physician) from 3 different institutions for interrater reliability testing. Intrarater reliability testing was performed at 2 separate institutions by 4 physicians, each performing the same knee ultrasound protocol on 20 knees in 10 study participants 2 to 3 weeks apart. In total, 180 images were created for interrater reliability, and 480 images for intrarater reliability. Examinations were performed with linear high-frequency transducers (10-18 MHz) with the participant in the supine position and the extremity flexed at 45°. Measurements included ligament length (long axis to ligament) from the patellar to the femoral attachment sites, ligament width (short axis to ligament) at the patellar attachment, and ligament thickness (long axis to ligament) midway between the patella and femur. Mean and SD were calculated for all measurements. Intraclass correlation coefficient (ICC) analysis was used to assess intrarater and interrater reliability. ICC values < 0.40 indicated poor reliability, whereas those between 0.40 and 0.75 indicated fair to good reliability, and those > 0.75 indicated excellent reliability. Results The mean US value for MPFL length was 44.83 mm (SD 6.68), mean thickness 2.66 mm (SD 0.85), and mean width 11.76 mm (SD 2.99). The overall ICC values for interrater reliability testing indicated fair to good reliability for length measures (0.7) and poor reliability for thickness (–0.1) and width (0.3; Table 1.1). Overall ICC values for intrarater reliability indicated fair to good reliability for length (0.5), excellent for thickness (0.9), and poor reliability for width (–0.3; Table 1.2). Conclusions Musculoskeletal ultrasonography is a feasible and reliable office-based method of measuring MPFL length and thickness. These quantitative measures set the groundwork for establishing normative anatomical measures of the MPFL in athletes and establish a protocol for testing and measuring the MPFL using musculoskeletal ultrasonography. [Table: see text][Table: see text]
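The interpretation bands stated in this abstract (ICC below 0.40 poor, 0.40 to 0.75 fair to good, above 0.75 excellent) can be written as a small lookup. This is only a sketch of the stated cutoffs; the function name is hypothetical, and treating the boundary values 0.40 and 0.75 as "fair to good" is an assumption, since the abstract says only "between".

```python
def interpret_icc(icc):
    """Map an ICC value to the reliability bands quoted in the abstract.

    Boundary handling (0.40 and 0.75 inclusive in the middle band) is an
    assumption; the abstract does not specify it.
    """
    if icc < 0.40:
        return "poor"
    if icc <= 0.75:
        return "fair to good"
    return "excellent"


# Values quoted in the abstract for intrarater length and thickness
for label, value in [("length", 0.5), ("thickness", 0.9)]:
    print(label, interpret_icc(value))
```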


2011 ◽  
Vol 20 (4) ◽  
pp. 393-405 ◽  
Author(s):  
Christopher Melton ◽  
David R. Mullineaux ◽  
Carl G. Mattacola ◽  
Scott D. Mair ◽  
Tim L. Uhl

Context: Dynamic shoulder motion can be captured using video capture systems, but reliability has not yet been established. Objective: To compare the reliability of 2 systems in measuring dynamic shoulder kinematics during forward-elevation movements and to determine differences in these kinematics between healthy and injured subjects. Design: Reliability and cohort. Setting: Research laboratory. Participants: 11 healthy subjects and 10 post–superior labrum anteroposterior lesion patients (SLAP). Intervention: Contrasting markers were placed at the hip, elbow, and shoulder to represent shoulder elevation and were videotaped in 2 dimensions. Subjects performed 6 repetitions of active elevation (AE) and active assisted elevation of the shoulder, and 3 trials were analyzed using Datapac (comprehensive system) and Dartfish (basic system). Main Outcome Measures: Amplitudes and velocities of the shoulder angle were calculated. Intraclass correlation coefficient (ICC), standard error of measurement (SEM), and levels of agreement (LOA) were used to determine intersystem and intertrial reliability. Results: For AE, the amplitude maximum (ICC = .98–.99, SEM = 2–3°, LOA = −9° to 5°) and average velocity (ICC = .94–.97, SEM = 1°/s, LOA = −4° to 1°/s) indicated excellent intersystem reliability between systems. Intratrial reliability for minimum velocity was moderate for Datapac (ICC = .64, SEM = 4°/s, LOA = 7°/s) and poor for Dartfish (ICC = .52, SEM = 20°/s, LOA = 37°/s). Cohort results demonstrated for AE a greater amplitude for healthy v SLAP (139° ± 11° v 113° ± 13°; P = .001) and interaction for an average velocity increase of 2°/s in healthy and decrease of 2°/s in SLAP patients over the 3 trials (P = .02). Conclusions: Reliability ranges provide the means to assess the clinical meaningfulness of results. The cohort differences are supported when the values exceed the ranges of the SEM; hence the amplitude results are meaningful. For dynamic shoulder elevation measured using video, the assessment of velocity was found to produce moderate to good reliability. The results suggest that with these measures subtle changes in both measures may be possible with further investigations.


2021 ◽  
pp. 152483992098479
Author(s):  
Joseph G. L. Lee ◽  
Mahdi Sesay ◽  
Paula A. Acevedo ◽  
Zachary A. Chichester ◽  
Beth H. Chaney

The quality of patient education materials is an important issue for health educators, clinicians, and community health workers. We describe a challenge achieving reliable scores between coders when using the Patient Educational Materials Assessment Tool (PEMAT) to evaluate farmworker health materials in spring 2020. Four coders were unable to achieve reliability after three attempts at coding calibration. Further investigation identified improvements to the PEMAT codebook and evidence of the difficulty of achieving traditional interrater reliability in the form of Krippendorff’s alpha. Our solution was to use multiple raters and average ratings to achieve an acceptable score with an intraclass correlation coefficient. Practitioners using the PEMAT to evaluate materials should consider averaging the scores of multiple raters as PEMAT results otherwise may be highly sensitive to who is doing the rating. Not doing so may inadvertently result in the use of suboptimal patient education materials.
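Why averaging across raters can raise reliability may be illustrated with the Spearman–Brown relation between a single-rater ICC and the ICC of the mean of k raters. This is a sketch under the assumption that this relation applies to the ICC form used; the abstract does not report the ICC form or any values, so the function name and numbers below are hypothetical.

```python
def average_measures_icc(single_icc, k):
    """Reliability of the mean of k raters from a single-rater ICC.

    Spearman-Brown relation: ICC_k = k * ICC_1 / (1 + (k - 1) * ICC_1).
    Assumes raters are interchangeable and the same ICC model is used.
    """
    return k * single_icc / (1.0 + (k - 1) * single_icc)


# Hypothetical single-rater reliability, averaged over 1 to 4 coders
for k in range(1, 5):
    print(k, round(average_measures_icc(0.5, k), 2))
```

Under this relation the averaged score is always at least as reliable as a single rater's score when the single-rater ICC is positive, which is consistent with the abstract's recommendation to average multiple raters.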

