Test-Retest and Interrater Reliability of the Functional Movement Screen

Context: The Functional Movement Screen (FMS) is a popular test to evaluate the degree of painful, dysfunctional, and asymmetric movement patterns. Despite great interest in the FMS, test-retest reliability data have not been published. Objective: To assess the test-retest and interrater reliability of the FMS and to compare the scoring by 1 rater during a live session and the same session on video. Design: Cross-sectional study. Setting: Human performance laboratory in the sports medicine center. Patients or Other Participants: A total of 21 female (age = 19.6 ± 1.5 years, height = 1.7 ± 0.1 m, mass = 64.4 ± 5.1 kg) and 18 male (age = 19.7 ± 1.0 years, height = 1.9 ± 0.1 m, mass = 80.1 ± 9.9 kg) National Collegiate Athletic Association Division IA varsity athletes volunteered. Intervention(s): Each athlete was tested and retested 1 week later by the same rater who also scored the athlete's first session from a video recording. Five other raters scored the video from the first session. Main Outcome Measure(s): The Krippendorff α (K α) was used to assess the interrater reliability, whereas intraclass correlation coefficients (ICCs) were used to assess the test-retest reliability and reliability of live-versus-video scoring. Results: Good reliability was found for the test-retest (ICC = 0.6), and excellent reliability was found for the live-versus-video sessions (ICC = 0.92). Poor reliability was found for the interrater reliability (K α = .38). Conclusions: The good test-retest and high live-versus-video session reliability show that the FMS is a usable tool within 1 rater. However, the low interrater K α values suggest that the FMS within the limits of generalization should not be used indiscriminately to detect deficiencies that place the athlete at greater risk for injury. The FMS interrater reliability may be improved with better training for the rater.
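As an illustration of the ICC statistic reported above, here is a minimal Python sketch of one common form, ICC(2,1) (two-way random effects, absolute agreement, single measurement). The abstract does not say which ICC form the authors used, so this choice is an assumption, and the function name `icc_2_1` and the demo scores are hypothetical.

```python
import numpy as np

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: 2-D array-like, rows = subjects, columns = sessions or raters.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # between sessions/raters
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols  # residual
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical test/retest composite scores for five athletes (not study data)
demo = [[14, 15], [16, 16], [12, 13], [18, 17], [15, 15]]
print(round(icc_2_1(demo), 2))
```

Because this form penalizes systematic shifts between sessions as well as random error, it tends to give lower values than consistency-type ICCs on the same data.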

Reliability of the Polar Vantage M Sports Watch when Measuring Heart Rate at Different Treadmill Exercise Intensities

Sports, 2020, Vol 8 (9), pp. 117. DOI: 10.3390/sports8090117

Author(s): Mike Climstein, Jessica L. Alder, Alyce M. Brooker, Elissa J. Cartwright, Kevin Kemp-Smith, ...

Keyword(s): Heart Rate, Treadmill Exercise, Intraclass Correlation, Good Test, Good Reliability, Measurement Variability, Retest Reliability, Activity Monitors, Test Retest Reliability, Standard Error Of Measurement

Background: Usage of wrist-worn activity monitors has rapidly increased in recent years, and these devices are being used by both fitness enthusiasts and in clinical populations. We, therefore, assessed the test–retest reliability of the Polar Vantage M (PVM) watch when measuring heart rate (HR) during various treadmill exercise intensities. Methods: HR was measured every 30 s (simultaneous electrocardiography (ECG) and PVM). Test–retest reliability was determined using an intraclass correlation coefficient (ICC) with 95% confidence intervals (CIs). Standard error of measurement (SEM) and smallest real difference (SRD) were used to determine measurement variability. Results: A total of 29 participants completed the trials. ICC values for PVM during stages 1, 2 and 5 demonstrated good to excellent test–retest reliability (0.78, 0.78 and 0.92; 95% CI (0.54–0.90, 0.54–0.9, 0.79–0.97)). For PVM during stages 0 (rest), 3 and 4, the ICC values indicated poor to good reliability (0.42, 0.68 and 0.58; 95% CI (−0.27–0.73, 0.32–0.85, 0.14–0.80)). Conclusion: This study identified that the test–retest reliability of the PVM was comparable at low and high exercise intensities; however, it revealed a poor to good test–retest reliability at moderate intensities. The PVM should not be used in a clinical setting where monitoring of an accurate HR is crucial to the patients’ safety.
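The SEM and SRD mentioned in the Methods are commonly derived from the ICC and the between-subject standard deviation. The sketch below uses the widely cited formulas SEM = SD × sqrt(1 − ICC) and SRD = 1.96 × sqrt(2) × SEM; the abstract does not state the exact formulas the authors applied, so treat both as assumptions. The function names and the 10 bpm standard deviation are hypothetical; only the ICC of 0.78 comes from the abstract.

```python
import math

def sem(sd, icc):
    """Standard error of measurement from a between-subject SD and an ICC."""
    return sd * math.sqrt(1.0 - icc)

def srd(sem_value, z=1.96):
    """Smallest real difference at the confidence level implied by z (95% for 1.96)."""
    return z * math.sqrt(2.0) * sem_value

# Hypothetical between-subject SD of 10 bpm, with the stage 1 ICC of 0.78
stage1_sem = sem(sd=10.0, icc=0.78)
print(round(stage1_sem, 2), round(srd(stage1_sem), 2))
```

The sqrt(2) factor reflects that a difference between two measurements carries the error of both.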

A new visual feedback-based system for the assessment of pinch force, endurance, accuracy and precision. A test-retest reliability study

Hand Therapy, 2021, pp. 175899832110025. DOI: 10.1177/17589983211002550

Author(s): Alberto Dottor, Eleonora Camerone, Mirko Job, Diletta Barbiani, Elisa Frisaldi, ...

Keyword(s): Visual Feedback, Intraclass Correlation, Precision Grip, Voluntary Contraction, Good Reliability, Retest Reliability, Relative Reliability, Time Accuracy, In The Beginning, Test Retest Reliability

Introduction: Given that pinch is a precision grip involved in sustained submaximal activities, a Sustained Contraction (SC) task could be associated with Maximal Voluntary Contraction (MVC). To better evaluate the thumb-index system, the test-retest reliability of pinch MVC and SC, measured by a visual feedback-based pinch gauge, was assessed. Methods: Twenty-six healthy participants performed MVC and SC in two separate sessions. SC required maintaining 40% MVC for as long as possible and was evaluated in terms of time, accuracy (Mean Distance between force trace and target force, MD), and precision (Coefficient of Variability of force trace, CV). MD and CV analyses were conducted by dividing the SC task into three equivalent time stages (beginning, middle, exhaustion). Relative Reliability (RR) was measured by the Intraclass Correlation Coefficient, and Absolute Reliability (AR) was measured by the Standard Error of Measurement and by Bland-Altman plots. Results: MVC and Time showed high RR and AR in both hands. RR of MD and CV in the right hand was excellent in the beginning and middle stages and fair in the exhaustion stage, showing decreasing reliability as fatigue increases. In the left hand, RR of MD and CV was generally lower. MD showed excellent reliability in the beginning stage and good reliability in the other stages. CV showed fair relative reliability at both the beginning and middle stages and excellent reliability in the last one. Conversely, high AR of MD and CV was observed in all stages in both hands. Conclusions: All indices are reliable to assess motor control of thumb-index pinch in both hands.

Reliability and Validity of Turkish Myasthenia Gravis-Activities of Daily Living Scale

OTJR Occupation Participation and Health, 2020, pp. 153944922096107. DOI: 10.1177/1539449220961077

Author(s): Ecem Karanfil, Yeliz Salcı, Ayla Fil-Balkan, Can Ebru Bekircan-Kurt, Sevim Erdem Özdamar, ...

Keyword(s): Construct Validity, Myasthenia Gravis, Activities Of Daily Living, Daily Living, Intraclass Correlation, Reliability And Validity, Life Questionnaire, Good Test, Retest Reliability, Test Retest Reliability

Linguistic, reliable, and valid secondary efficacy measures are important in clinical settings and studies. The aim of the study is to report the test–retest reliability and construct validity of the Turkish version of the Myasthenia Gravis-Activities of Daily Living Scale (MG-ADL-T) in Myasthenia Gravis (MG) patients. Fifty-two individuals with ocular or generalized MG who presented to a rehabilitation center were included in the study. MG-ADL-T, MG quality-of-life questionnaire (MG-QoL), MG composite (MGC), quantitative MG score (QMGS), and pulmonary function test were administered. Reliability was assessed with the intraclass correlation coefficient (ICC) and Cronbach’s alpha. Spearman correlation test and receiver operating characteristic (ROC) analysis were performed for construct validity. MG-ADL-T had fair internal consistency (Cronbach’s α = .67), excellent test–retest reliability (ICC = 0.96) and moderate construct validity (MG-QoL, r = .59; QMGS, r = .58; MGC, r = .68). MG-ADL, a unique scale that evaluates activities of daily living (ADL), has good test–retest reliability and construct validity in Turkish MG patients.

Reliability of Safe Maximum Lifting Determinations of a Functional Capacity Evaluation

Physical Therapy, 2002, Vol 82 (4), pp. 364-371. DOI: 10.1093/ptj/82.4.364. Cited By ~ 73

Author(s): Douglas P Gross, Michele C Battié

Keyword(s): Functional Capacity, Repeated Measures, Interrater Reliability, Intraclass Correlation, Correlation Coefficients, Functional Capacity Evaluation, Measurement Variability, Retest Reliability, Repeated Measures Design, Test Retest Reliability

Background and Purpose. Functional capacity evaluations (FCEs) are measurement tools used in predicting readiness to return to work following injury. The interrater and test-retest reliability of determinations of maximal safe lifting during kinesiophysical FCEs were examined in a sample of people who were off work and receiving workers' compensation. Subjects. Twenty-eight subjects with low back pain who had plateaued with treatment were enrolled. Five occupational therapists, trained and experienced in kinesiophysical methods, conducted testing. Methods. A repeated-measures design was used, with raters testing subjects simultaneously, yet independently. Subjects were rated on 2 occasions, separated by 2 to 4 days. Analyses included intraclass correlation coefficients (ICCs) and 95% confidence intervals. Results. The ICC values for interrater reliability ranged from .95 to .98. Test-retest values ranged from .78 to .94. Discussion and Conclusion. Inconsistencies in subjects' performance across sessions were the greatest source of FCE measurement variability. Overall, however, test-retest reliability was good and interrater reliability was excellent.

Test-Retest Reliability of the StepWatch Activity Monitor Outputs in Healthy Adults

Journal of Physical Activity and Health, 2010, Vol 7 (5), pp. 671-676. DOI: 10.1123/jpah.7.5.671. Cited By ~ 27

Author(s): Suzie Mudge, Denise Taylor, Oliver Chang, Rosita Wong

Keyword(s): Activity Index, Intraclass Correlation, Objective Measure, Activity Monitor, Healthy Adults, Peak Activity, Good Test, Retest Reliability, Step Rate, Test Retest Reliability

Background: Activity monitors give an objective measure of usual walking performance. This study aimed to examine the test-retest reliability of the StepWatch Activity Monitor outputs (mean steps/day; peak activity index; sustained activity indices of 1, 5, 20, 30, 60 minutes; steps at high, medium, and low stepping rates). Methods: Thirty healthy adults aged 18 to 49 years wore the StepWatch for two 3-day periods at least 1 week apart. Results: The intraclass correlation coefficients of the StepWatch outputs ranged from 0.44 to 0.91 over 3 days. The coefficient of variation ranged from 3.0% to 51.3% over the monitoring periods, with higher variation shown for shorter monitoring periods. The most reliable 5 outputs had 95% limits of agreement between 3-day periods that were less than 40%. These were mean steps/day (±39.1%), highest step rate in 1 (±17.3%) and 5 (±37.4%) minutes, peak activity index (±25.6%), and percentage of inactive time (±9.52%). Conclusions: Mean steps/day, highest step rate in 1 and 5 minutes, peak activity index, and percentage of inactive time have good test-retest reliability over a 3-day monitoring period, with lower reliability shown by the other StepWatch outputs. Monitoring over 1 or 2 days is less reliable.

Reliability of Subjective Pain Ratings and Nociceptive Flexion Reflex Responses as Measures of Conditioned Pain Modulation

Pain Research and Management, 2014, Vol 19 (2), pp. 93-96. DOI: 10.1155/2014/698246. Cited By ~ 22

Author(s): Carlo Jurth, Benno Rehberg, Falk von Dincklage

Keyword(s): Intraclass Correlation, Pain Modulation, Hot Water, Conditioned Pain Modulation, Good Test, Retest Reliability, Reflex Responses, Pain Ratings, Subjective Pain, Test Retest Reliability

BACKGROUND: The endogenous modulation of pain can be assessed through conditioned pain modulation (CPM), which can be quantified using subjective pain ratings or nociceptive flexion reflexes. However, to date, the test-retest reliability has only been investigated for subjective pain ratings. OBJECTIVE: To compare the test-retest reliability of CPM-induced changes, measured using subjective pain ratings and nociceptive flexion reflexes, to provide a reliable scoring parameter for future studies. METHOD: A total of 40 healthy volunteers each received painful electrical stimuli to the sural nerve to elicit nociceptive flexion reflexes. Reflex sizes and subjective pain ratings were recorded before and during the immersion of the contralateral hand in hot water to induce CPM as well as innocuous water as control. Measurements were repeated in a retest 28 days later. RESULTS: Intraclass correlation coefficients showed good test-retest reliabilities of CPM during the hot water stimulus for both scoring parameters. Subjective pain ratings also correlated between test and retest during the control stimulus. CONCLUSIONS: Subjective pain ratings and nociceptive flexion reflexes show comparable test-retest reliabilities, but they reflect different components of CPM. While subjective pain ratings appear to incorporate cognitive influences to a larger degree, reflex responses appear to reflect spinal nociception more purely.

Reliability of the performance-based measure of executive functions in people with schizophrenia

BMC Psychiatry, 2021, Vol 21 (1). DOI: 10.1186/s12888-021-03562-y

Author(s): En-Chi Chiu, Ya-Chen Lee, Shu-Chun Lee, I-Ping Hsueh

Keyword(s): Executive Functions, Intraclass Correlation, Minimal Detectable Change, Rater Agreement, Good Test, Retest Reliability, Purposive Action, And Performance, Agreement Study, Test Retest Reliability

Background: The Performance-based measure of Executive Functions (PEF) with four domains is designed to assess executive functions in people with schizophrenia. The purpose of this study was to examine the test-retest reliability of the PEF administered by the same rater (intra-rater agreement) and by different raters (inter-rater agreement) in people with schizophrenia and to estimate the values of minimal detectable change (MDC) and MDC%. Methods: Two convenience samples of people with schizophrenia (n = 60 each) each completed two assessments, two weeks apart. The intraclass correlation coefficient (ICC) was analyzed to examine intra-rater and inter-rater agreements of the test-retest reliability of the PEF. The MDC was calculated from the standard error of measurement. Results: For the intra-rater agreement study, the ICC values of the four domains were 0.88–0.92. The MDC (MDC%) of the four domains (volition, planning, purposive action, and performance effective) were 13.0 (13.0%), 12.2 (16.4%), 16.2 (16.2%), and 16.3 (18.8%), respectively. For the inter-rater agreement study, the ICC values of the four domains were 0.82–0.89. The MDC (MDC%) were 15.8 (15.8%), 17.4 (20.0%), 20.9 (20.9%), and 18.6 (18.6%) for the volition, planning, purposive action, and performance effective domains, respectively. Conclusions: The PEF has good test-retest reliability, including intra-rater and inter-rater agreements, for people with schizophrenia. Clinicians and researchers can use the MDC values to verify whether an individual with schizophrenia shows any real change (improvement or deterioration) between repeated PEF assessments by the same or different raters.

Test–Retest Reliability of a New Device Versus a Long-Arm Goniometer to Evaluate Knee Proprioception

Journal of Sport Rehabilitation, 2021, pp. 1-6. DOI: 10.1123/jsr.2021-0146

Author(s): Fei Tian, Yaqi Zhao, Jixin Li, Wenjin Wang, Danni Wu, ...

Keyword(s): Standard Error, Intraclass Correlation, Correlation Coefficients, Joint Position Sense, Repeated Measurements, Good Reliability, Retest Reliability, New Device, Intraclass Correlation Coefficients, Test Retest Reliability

Context: Many methods used to evaluate knee proprioception have shortcomings that limit their use in clinical settings. Based on an inexpensive 3D camera, a new portable device was recently used to evaluate the joint position sense (JPS) of the knee joint. However, the test–retest reliability of the new method remains unclear. This study aimed to evaluate the test–retest reliability of the new device and a long-arm goniometer for assessing knee JPS, and to compare the variability of the 2 methods. Design: Prospective observational study of the test–retest reliability of knee JPS measurements. Methods: Twenty-one healthy adults were tested in 2 sessions with a 1-week interval. Three target knee flexion angles (30°, 45°, and 60°) were reproduced in each session. Target and reproduced angles were measured with both methods. Intraclass correlation coefficients, standard error of the measurement, and Bland–Altman plots were used to quantify test–retest reliability. Paired t tests were used to compare knee JPS (absolute error of the target-reproduced angle) between the methods. Results: The new device (good to excellent intraclass correlation coefficients .74–.80; standard error of the measurement 0.52°–0.61°) demonstrated better test–retest reliability than the goniometer (poor to fair intraclass correlation coefficients .23–.43; standard error of the measurement 0.89°–2.07°) and better test–retest agreement (respective mean differences for the 30°, 45°, and 60° knee angles: 0.11°, 0.13°, and 0.41° for the new system; 0.84°, 1.52°, and 1.18° for the goniometer). The measurements (absolute errors of the target-reproduced angles) with the goniometer were significantly greater than those with the new device (P < .05); the SDs of repeated measurements with the goniometer (1.50°–2.41°) were greater than with the new device (1.08°–1.38°). Conclusions: Given that the new device has good reliability and sufficient precision, it is the better alternative for evaluating knee JPS. Goniometers should be used with caution to assess knee JPS.
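The Bland–Altman analysis mentioned in the Methods summarizes test–retest agreement as a mean difference (bias) with 95% limits of agreement. A minimal Python sketch follows, assuming the conventional limits of bias ± 1.96 × SD of the paired differences; the function name `bland_altman` and the angle values are hypothetical and are not taken from the study.

```python
import numpy as np

def bland_altman(session1, session2):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    a = np.asarray(session1, dtype=float)
    b = np.asarray(session2, dtype=float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical absolute JPS errors (degrees) from two sessions
week1 = [2.1, 1.4, 3.0, 2.6, 1.9]
week2 = [2.4, 1.1, 2.8, 2.9, 1.7]
bias, lower, upper = bland_altman(week1, week2)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

A bias near zero with narrow limits indicates that the two sessions agree closely, which is the sense in which the abstract reports smaller mean differences for the new device.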

Test-retest reliability of the human functional connectome over consecutive days: identifying highly reliable portions and assessing the impact of methodological choices

Network Neuroscience, 2020, Vol 4 (3), pp. 925-945. DOI: 10.1162/netn_a_00148

Author(s): Leonardo Tozzi, Scott L. Fleming, Zachary D. Taylor, Cooper D. Raterink, Leanne M. Williams

Keyword(s): Human Brain, Functional Neuroimaging, Intraclass Correlation, Good Reliability, Retest Reliability, Methodological Choices, Functional Connectome, Global Signal, Test Retest Reliability, The Impact

Countless studies have advanced our understanding of the human brain and its organization by using functional magnetic resonance imaging (fMRI) to derive network representations of human brain function. However, we do not know to what extent these “functional connectomes” are reliable over time. In a large public sample of healthy participants (N = 833) scanned on two consecutive days, we assessed the test-retest reliability of fMRI functional connectivity and the consequences on reliability of three common sources of variation in analysis workflows: atlas choice, global signal regression, and thresholding. By adopting the intraclass correlation coefficient as a metric, we demonstrate that only a small portion of the functional connectome is characterized by good (6–8%) to excellent (0.08–0.14%) reliability. Connectivity between prefrontal, parietal, and temporal areas is especially reliable, and average connectivity within known networks also has good reliability. In general, while unreliable edges are weak, reliable edges are not necessarily strong. Methodologically, reliability of edges varies between atlases, global signal regression decreases reliability for networks and most edges (but increases it for some), and thresholding based on connection strength reduces reliability. Focusing on the reliable portion of the connectome could help quantify brain trait-like features and investigate individual differences using functional neuroimaging.

The Mandarin Voice Handicap Index for Laryngectomees With Pneumatic Artificial Laryngeal and Esophageal Speech

American Journal of Speech-Language Pathology, 2021, pp. 1-12. DOI: 10.1044/2021_ajslp-20-00241

Author(s): Chen-Chi Wang, Jia-Shiou Liao, Hsiu-Chin Lai, Yi-Hsuan Lo

Keyword(s): Internal Consistency, Intraclass Correlation, Correlation Coefficients, Voice Handicap Index, Good Test, Retest Reliability, Related Quality, Valid Instrument, Test Retest Reliability, The Impact

Purpose: The Voice Handicap Index (VHI) questionnaire assesses the impact of total laryngectomy on voice-related quality of life. This study evaluates the Mandarin VHI, including its internal consistency, test–retest reliability, content validity, and differences in scores for Mandarin alaryngeal patients with pneumatic artificial laryngeal (PA) and esophageal (ES) speech. Method: Translation and validation of the VHI questionnaire was performed through the forward–backward translation technique. This study used a sample of 78 PA and 23 ES participants from Taiwan who completed the Mandarin VHI. Forty-two of the alaryngeal participants completed the Mandarin VHI twice over a period of 7–63 days. Results: The measurement of the internal consistency of the Mandarin VHI showed a high Cronbach's alpha coefficient for the total score (.975) and the functional (.930), physical (.939), and emotional (.938) subscales. Based on the results of the intraclass correlation coefficients, good test–retest reliability for the total and domain scores was found (intraclass correlation coefficient = .827–.863). Conclusion: The Mandarin VHI was validated as an instrument with proper internal consistency and reliability, which supports the Mandarin VHI as a valid instrument for the self-evaluation of handicaps related to voice problems in PA and ES speakers.
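Cronbach's alpha, the internal-consistency statistic reported above, can be computed from a respondents-by-items score matrix. The sketch below uses the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores); the function name `cronbach_alpha` and the responses are hypothetical and do not come from the Mandarin VHI data.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a 2-D array: rows = respondents, columns = items."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    item_var_sum = x.var(axis=0, ddof=1).sum()  # sum of per-item variances
    total_var = x.sum(axis=1).var(ddof=1)       # variance of respondents' total scores
    return k / (k - 1) * (1.0 - item_var_sum / total_var)

# Hypothetical responses from five respondents to four items scored 0-4
responses = [
    [3, 4, 3, 4],
    [1, 1, 2, 1],
    [2, 3, 2, 2],
    [4, 4, 3, 4],
    [0, 1, 1, 0],
]
print(round(cronbach_alpha(responses), 3))
```

Alpha rises toward 1 as items covary more strongly, so very high values such as those reported for the total score can also reflect a large number of closely related items.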