Assessing test–retest reliability of patient-reported outcome measures using intraclass correlation coefficients: recommendations for selecting and documenting the analytical formula

2018 ◽  
Vol 28 (4) ◽  
pp. 1029-1033 ◽  
Author(s):  
Shanshan Qin ◽  
Lauren Nelson ◽  
Lori McLeod ◽  
Sonya Eremenco ◽  
Stephen Joel Coons

Cephalalgia ◽  
2021 ◽  
pp. 033310242110345
Author(s):  
Raquel Gil-Gouveia ◽  
António Gouveia Oliveira

Background: Several patient-reported outcome measures are available to monitor headache impact, but are they reliable in real-life clinical practice? Methods: Two identical patient-reported outcome measures (HALT-90 and MIDAS) were applied simultaneously at each clinical visit to a series of patients treated with monoclonal antibodies for migraine, and intra-individual agreement was evaluated using intraclass correlation coefficients (ICCs). Results: Our sample included 92 patients, 92.4% female, 45 years old on average. Moderate (0.50 to 0.75) and even poor (<0.50) ICCs were observed in all but the first item of these patient-reported outcome measures in at least one evaluation. Over time, missing data became more frequent and no learning effect was detected. Discussion: We observed intra-personal variation in reliability when answering patient-reported outcome measures, persisting in repeated applications, and a decrease in the motivation to respond, which should alert clinicians to these additional challenges in real-life clinical practice.


2020 ◽  
Vol 33 (4) ◽  
pp. 480-489 ◽  
Author(s):  
Lazar Tosic ◽  
Elior Goldberger ◽  
Nicolai Maldaner ◽  
Marketa Sosnova ◽  
Anna M. Zeitlberger ◽  
...  

OBJECTIVE: The 6-minute walking test (6WT) is used to determine restrictions in a subject’s 6-minute walking distance (6WD) due to lumbar degenerative disc disease. To facilitate simple and convenient patient self-measurement, a free and reliable smartphone app using Global Positioning System coordinates was previously designed. The authors aimed to determine normative values for app-based 6WD measurements. METHODS: The maximum 6WD was determined three times using app-based measurement in a sample of 330 volunteers without previous spine surgery or current spine-related disability, recruited at 8 centers in 5 countries (mean subject age 44.2 years, range 16–91 years; 48.5% male; mean BMI 24.6 kg/m², range 16.3–40.2 kg/m²; 67.9% working; 14.2% smokers). Subjects provided basic demographic information, including comorbidities and patient-reported outcome measures (PROMs): visual analog scale (VAS) for both low-back and lower-extremity pain, Core Outcome Measures Index (COMI), Zurich Claudication Questionnaire (ZCQ), and subjective walking distance and duration. The authors determined the test-retest reliability across three measurements (intraclass correlation coefficient [ICC], standard error of measurement [SEM], and mean 6WD [95% CI]) stratified for age and sex, and content validity (linear regression coefficients) between 6WD and PROMs. RESULTS: The ICC for repeated app-based 6WD measurements was 0.89 (95% CI 0.87–0.91, p < 0.001) and the SEM was 34 meters. The overall mean 6WD was 585.9 meters (95% CI 574.7–597.0 meters), with significant differences across age categories (p < 0.001). The 6WD was on average about 32 meters less in females (570.5 vs 602.2 meters, p = 0.005). There were linear correlations between average 6WD and VAS back pain, VAS leg pain, COMI Back and COMI subscores of pain intensity and disability, ZCQ symptom severity, ZCQ physical function, and ZCQ pain and neuroischemic symptoms subscores, as well as with subjective walking distance and duration, indicating that subjects with higher pain, higher disability, and lower subjective walking capacity had significantly lower 6WD (all p < 0.001). CONCLUSIONS: This study provides normative data for app-based 6WD measurements in a multicenter sample from 8 institutions and 5 countries. These values can now be used as reference to compare 6WT results and quantify objective functional impairment in patients with degenerative diseases of the spine using z-scores. The authors found a good to excellent test-retest reliability of the 6WT app, a low area of uncertainty, and high content validity of the average 6WD with commonly used PROMs.
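As a rough illustration of the z-score comparison mentioned in the conclusions, the sketch below standardizes one observed 6WD against a normative mean and standard deviation. The mean of 585.9 m is the overall value reported in the abstract; the standard deviation is not reported there, so `HYPOTHETICAL_NORM_SD` is a placeholder, and the function name `six_wd_z_score` is an assumption made for this example. The study stratifies its norms by age and sex, which this sketch does not attempt.

```python
def six_wd_z_score(observed_m, norm_mean_m, norm_sd_m):
    """Standardize an observed 6-minute walking distance against normative values."""
    if norm_sd_m <= 0:
        raise ValueError("normative SD must be positive")
    return (observed_m - norm_mean_m) / norm_sd_m


NORM_MEAN_M = 585.9            # overall mean 6WD reported in the abstract
HYPOTHETICAL_NORM_SD = 100.0   # placeholder: the abstract does not report an SD

# A hypothetical patient who walks 400 m in six minutes
z = six_wd_z_score(400.0, NORM_MEAN_M, HYPOTHETICAL_NORM_SD)
print(f"z = {z:.2f}")
```

A negative z-score indicates a shorter distance than the normative mean; in practice the age- and sex-specific mean and SD from the full paper would replace the two constants.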


2000 ◽  
Vol 9 (2) ◽  
pp. 117-123 ◽  
Author(s):  
Michael D. Ross ◽  
Elizabeth G. Fontenot

Context: The standing heel-rise test has been recommended as a means of assessing calf-muscle performance. To the authors' knowledge, the reliability of the test using intraclass correlation coefficients (ICCs) has not been reported. Objective: To determine the test-retest reliability of the standing heel-rise test. Design: Single-group repeated measures. Participants: Seventeen healthy subjects. Setting and Intervention: Each subject was asked to perform as many standing heel raises as possible during 2 testing sessions separated by 7 days. Main Outcome Measures: Reliability data for the standing heel-rise test were studied through a repeated-measures analysis of variance, ICC(2,1), and SEMs. Results: The ICC(2,1) and SEM values for the standing heel-rise test were .96 and 2.07 repetitions, respectively. Conclusions: The standing heel-rise test offers clinicians a reliable assessment of calf-muscle performance. Further study is necessary to determine the ability of the standing heel-rise test to detect functional deficiencies in patients recovering from lower leg injury or surgery.
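For readers unfamiliar with the notation, the following is a minimal sketch of how ICC(2,1) and an SEM could be derived from a repeated-measures ANOVA table, assuming the Shrout and Fleiss two-way random-effects, single-measurement formula and the common convention SEM = SD × sqrt(1 − ICC). The abstract does not state which SEM formula the authors used, and the function names and the heel-rise counts below are hypothetical.

```python
import math
from statistics import mean, stdev


def two_way_mean_squares(scores):
    """Return (n, k, MSR, MSC, MSE) for a subjects-by-sessions table."""
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)    # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)    # between sessions
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols                   # residual
    return (n, k, ss_rows / (n - 1), ss_cols / (k - 1),
            ss_error / ((n - 1) * (k - 1)))


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement."""
    n, k, msr, msc, mse = two_way_mean_squares(scores)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def sem_from_icc(scores, icc):
    """SEM = SD * sqrt(1 - ICC), using the SD of all pooled scores (one convention)."""
    return stdev(v for row in scores for v in row) * math.sqrt(1 - icc)


# Hypothetical heel-rise counts: one row per subject, columns = session 1, session 2
heel_rises = [[25, 27], [31, 30], [18, 20], [40, 38], [22, 23]]
icc = icc_2_1(heel_rises)
print(f"ICC(2,1) = {icc:.3f}, SEM = {sem_from_icc(heel_rises, icc):.2f} repetitions")
```

Because ICC(2,1) includes the between-session variance in its denominator, a systematic change from the first to the second session lowers the coefficient.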


2008 ◽  
Vol 22 (6) ◽  
pp. 737-744 ◽  
Author(s):  
I-Ping Hsueh ◽  
Miao-Ju Hsu ◽  
Ching-Fan Sheu ◽  
Su Lee ◽  
Ching-Lin Hsieh ◽  
...  

Objective. To provide empirical justification for selecting motor scales for stroke patients, the authors compared the psychometric properties (validity, responsiveness, test-retest reliability, and smallest real difference [SRD]) of the Fugl-Meyer Motor Scale (FM), the simplified FM (S-FM), the Stroke Rehabilitation Assessment of Movement instrument (STREAM), and the simplified STREAM (S-STREAM). Methods. For the validity and responsiveness study, 50 inpatients were assessed with the FM and the STREAM at admission and discharge to a rehabilitation department. The scores of the S-FM and the S-STREAM were retrieved from their corresponding scales. For the test-retest reliability study, a therapist administered both scales on a different sample of 60 chronic patients on 2 occasions. Results. Only the S-STREAM had no notable floor or ceiling effects at admission and discharge. The 4 motor scales had good concurrent validity (rho ≥ .91) and satisfactory predictive validity (rho = .72-.77). The scales showed responsiveness (effect size d ≥ 0.34; standardized response mean ≥ 0.95; P < .0001), with the S-STREAM most responsive. The test-retest agreements of the scales were excellent (intraclass correlation coefficients ≥ .96). The SRD of the 4 scales was 10% of their corresponding highest score, indicating acceptable level of measurement error. The upper extremity and the lower extremity subscales of the 4 showed similar results. Conclusions. The 4 motor scales showed acceptable levels of reliability, validity, and responsiveness in stroke patients. The S-STREAM is recommended because it is short, responsive to change, and able to discriminate patients with severe or mild stroke.


2020 ◽  
Vol 47 (4) ◽  
pp. 479-486
Author(s):  
Yuki Kondo ◽  
Kyota Bando ◽  
Yosuke Ariake ◽  
Wakana Katsuta ◽  
Kyoko Todoroki ◽  
...  

BACKGROUND: The Balance Evaluation Systems Test (BESTest) and its two abbreviated versions have confirmed reliability for evaluating balance characteristics. However, they are not utilized in cases of spinocerebellar ataxia (SCA). OBJECTIVE: We aimed to examine the test-retest reliability and minimal detectable change (MDC) of the BESTest and its abbreviated versions in persons with mild to moderate spinocerebellar ataxia. METHODS: The BESTest was performed in 20 persons with SCA at baseline and one month later. The scores of the abbreviated versions of the BESTest were determined from the BESTest scores. The intraclass correlation coefficient (1,1) was used as a measure of relative reliability. Furthermore, we calculated the MDC in the BESTest and its abbreviated versions. RESULTS: The intraclass correlation coefficients (1,1) and MDCs at the 95% confidence level were 0.92 and 8.7 (8.1%), 0.91 and 4.1 (14.5%), and 0.81 and 5.2 (21.6%) for the Balance, Mini-Balance, and Brief-Balance Evaluation Systems Tests, respectively. CONCLUSIONS: The BESTest and its abbreviated versions had high test-retest reliability. The MDC values of the BESTest could enable clinicians and researchers to interpret changes in the balance of patients with SCA more precisely.
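The quantities named in this abstract can be sketched as follows, assuming the Shrout and Fleiss one-way random-effects formula for ICC(1,1) and the widely used relation MDC95 = 1.96 × sqrt(2) × SEM. The abstract does not spell out its formulas, so these are assumptions, as are the function names. The reported percentages look roughly consistent with dividing each MDC by the maximum score of the corresponding test (108 for the BESTest), but that reading is also an inference rather than something the abstract states.

```python
import math
from statistics import mean


def icc_1_1(scores):
    """ICC(1,1): one-way random effects, single measurement."""
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    # Between-subject and within-subject mean squares from a one-way ANOVA
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((v - m) ** 2
              for row, m in zip(scores, row_means) for v in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def mdc95(sem):
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2) * sem


def mdc_percent(mdc, reference):
    """MDC as a percentage of a reference value (for example, the maximum score)."""
    return 100.0 * mdc / reference


# MDC of 8.7 points on the BESTest, expressed against its 108-point maximum
print(f"{mdc_percent(8.7, 108):.1f}%")
```

The factor sqrt(2) reflects that a change score combines the measurement error of two sessions.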


2016 ◽  
Vol 2016 ◽  
pp. 1-8 ◽  
Author(s):  
Taher I. Omari ◽  
Johanna Savilampi ◽  
Karmen Kokkinn ◽  
Mistyka Schar ◽  
Kristin Lamvik ◽  
...  

Purpose. We evaluated the intra- and interrater agreement and test-retest reliability of analyst derivation of swallow function variables based on repeated high resolution manometry with impedance measurements. Methods. Five subjects swallowed 10 × 10 mL saline on two occasions one week apart producing a database of 100 swallows. Swallows were repeat-analysed by six observers using software. Swallow variables were indicative of contractility, intrabolus pressure, and flow timing. Results. The average intraclass correlation coefficients (ICC) for intra- and interrater comparisons of all variable means showed substantial to excellent agreement (intrarater ICC 0.85–1.00; mean interrater ICC 0.77–1.00). Test-retest results were less reliable. ICC for test-retest comparisons ranged from slight to excellent depending on the class of variable. Contractility variables differed most in terms of test-retest reliability. Amongst contractility variables, UES basal pressure showed excellent test-retest agreement (mean ICC 0.94), measures of UES postrelaxation contractile pressure showed moderate to substantial test-retest agreement (mean Interrater ICC 0.47–0.67), and test-retest agreement of pharyngeal contractile pressure ranged from slight to substantial (mean Interrater ICC 0.15–0.61). Conclusions. Test-retest reliability of HRIM measures depends on the class of variable. Measures of bolus distension pressure and flow timing appear to be more test-retest reliable than measures of contractility.


2021 ◽  
pp. 1-4
Author(s):  
Jamon Couch ◽  
Marc Sayers ◽  
Tania Pizzari

Context: An imbalance between shoulder internal rotation (IR) and external rotation (ER) strength in athletes is proposed to increase the risk of sustaining a shoulder injury. Hand-held (HHD) and externally fixed dynamometry are reliable forms of assessing shoulder IR and ER strength. A new externally fixed device with an attachable fixed upper-limb mold (The ForceFrame) exists; however, its reliability in measuring shoulder strength is yet to be investigated. Objective: To determine the test–retest reliability of the ForceFrame, with and without the fixed upper-limb mold, in the assessment of shoulder IR and ER strength, as compared with HHD. Design: Test–retest reliability study. Setting: Laboratory, clinical. Participants: Twenty-two healthy and active individuals were recruited from the university community and a private physiotherapy practice. Main Outcome Measures: Maximal isometric shoulder IR and ER strength was measured using the ForceFrame and traditional HHD in neutral and at 90° shoulder abduction. Mean (SD) strength measures were calculated. Test–retest reliability was analyzed using intraclass correlation coefficients (3, 1). The SEM and minimal detectable change were calculated. Results: Good to excellent test–retest reliability was found for all shoulder strength tests across HHD and externally fixed dynamometry (intraclass correlation coefficients [3, 1] = .854–.916). The minimal detectable changes ranged between 25.61 and 41.84 N across tests. Test–retest reliability was not affected by the dynamometer or testing position. Conclusions: The results from this study indicate that both the ForceFrame and HHD are suitable for measuring shoulder strength in clinical practice. The use of the fixed upper-limb mold with the ForceFrame does not improve reliability.
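The "(3, 1)" label used here can be read, under the Shrout and Fleiss convention, as a two-way mixed-effects, consistency-type, single-measurement coefficient. The sketch below assumes that convention; the function name and the force values are hypothetical. It also shows a property worth keeping in mind when comparing ICC forms: adding a constant offset to one session leaves ICC(3,1) unchanged, because the between-session variance is excluded from the formula.

```python
from statistics import mean


def icc_3_1(scores):
    """ICC(3,1): two-way mixed effects, consistency, single measurement."""
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    msr = ss_rows / (n - 1)
    mse = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    # The session (column) mean square does not enter the formula
    return (msr - mse) / (msr + (k - 1) * mse)


# Hypothetical isometric strength values in newtons: columns = test, retest
strength = [[150.0, 158.0], [210.0, 205.0], [180.0, 186.0], [240.0, 231.0]]
# Same data with a constant 20 N offset added to the retest session
shifted = [[test, retest + 20.0] for test, retest in strength]

print(f"ICC(3,1) original: {icc_3_1(strength):.3f}")
print(f"ICC(3,1) shifted:  {icc_3_1(shifted):.3f}")
```

An absolute-agreement form such as ICC(2,1) would drop under the same offset, which is one reason reliability reports should name the exact formula used.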


2020 ◽  
Author(s):  
Jennifer Y Yu ◽  
Talia Goldberg ◽  
Nicholas Lao ◽  
Brian M Feldman ◽  
Y. Ingrid Goh

Abstract BACKGROUND: Patient-reported outcome measures (PROMs) provide valuable information to the healthcare team regarding their patients' health status. Problems identified with PROMs completed on paper include missing answers, incorrectly scored answers, and missing questionnaires. Collecting PROMs on an electronic platform can potentially avoid these challenges. The objectives of this project were to a) determine the equivalence of the paper and electronic forms (e-form) of the Childhood Health Assessment Questionnaire (CHAQ) and Quality of My Life (QoML) questionnaires; b) identify potential benefits and barriers associated with using an e-form to capture PROMs; and c) gather feedback on user experience. METHODS: Participants completed both a paper and an e-form of the questionnaires in a randomized order, following which they completed a feedback survey. Agreement of the scores between the forms was statistically analyzed using the intraclass correlation coefficient (95% CI) and bias was assessed using a Bland-Altman plot. Completion and processing times of the forms were compared using mean and median measures. Quantitative analysis was performed to assess user experience ratings, while comments were qualitatively analyzed to identify important themes. RESULTS: 196 patients participated in this project. Scores on the two forms showed high agreement (intraclass correlation coefficient [ICC] > 0.9). New patients took longer than returning patients to complete the forms. Overall, the e-form was completed and processed in a shorter amount of time than the paper form. 83% of survey respondents indicated that they either preferred the e-form or had no preference. Approximately 10% of respondents suggested improvements to the user interface. CONCLUSIONS: E-forms collect information comparable to paper forms in an efficient manner. Given that patients and caregivers indicated they preferred completing PROMs in this manner, we will implement their suggested changes and incorporate e-forms as standard practice for PROMs collection in our pediatric rheumatology clinic.
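The numbers behind a Bland-Altman plot can be sketched in a few lines: the bias is the mean of the paired differences, and the conventional 95% limits of agreement are the bias plus or minus 1.96 standard deviations of those differences. This is a generic illustration rather than the authors' analysis; the function name and the paired scores are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(paper_scores, eform_scores):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(paper_scores) != len(eform_scores):
        raise ValueError("both forms must have one score per participant")
    diffs = [p - e for p, e in zip(paper_scores, eform_scores)]
    bias = mean(diffs)       # systematic difference between the two forms
    spread = stdev(diffs)    # sample SD of the paired differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical questionnaire scores for five participants on each form
paper = [0.50, 1.25, 0.00, 2.00, 0.75]
eform = [0.50, 1.125, 0.125, 2.00, 0.625]

bias, lower, upper = bland_altman(paper, eform)
print(f"bias = {bias:.3f}, limits of agreement = ({lower:.3f}, {upper:.3f})")
```

A bias near zero with narrow limits would support treating the two forms as interchangeable; the plot itself draws each difference against the mean of the pair.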


2021 ◽  
Author(s):  
Jennifer Y Yu ◽  
Talia Goldberg ◽  
Nicholas Lao ◽  
Brian M Feldman ◽  
Y. Ingrid Goh

Abstract BACKGROUND: Patient-reported outcome measures (PROMs) provide valuable insight into patients’ well-being and facilitate communication between healthcare providers and their patients. The increased integration of technology within the healthcare setting presents the opportunity to collect PROMs electronically, rather than on paper. The Childhood Health Assessment Questionnaire (CHAQ) and Quality of My Life (QoML) are common PROMs collected from pediatric rheumatology patients. The objectives of this study are to a) determine the equivalence of the paper and electronic forms (e-form) of the CHAQ and QoML questionnaires; b) identify potential benefits and barriers associated with using an e-form to capture PROMs; and c) gather feedback on user experience. METHODS: Participants completed both a paper and an e-form of the questionnaires in a randomized order, following which they completed a feedback survey. Agreement of the scores between the forms was statistically analyzed using the intraclass correlation coefficient (ICC) (95% Confidence Interval (CI)) and bias was assessed using a Bland-Altman plot. Completion and processing times of the forms were compared using mean and median measures. Quantitative analysis was performed to assess user experience ratings, while comments were qualitatively analyzed to identify important themes. RESULTS: 196 patients participated in this project. Scores on the two forms showed high agreement (ICC > 0.9). New patients took longer than returning patients to complete the forms. Overall, the e-form was completed and processed in a shorter amount of time than the paper form. 83% of survey respondents indicated that they either preferred the e-form or had no preference. Approximately 10% of respondents suggested improvements to the user interface. CONCLUSIONS: E-forms collect information comparable to paper forms in an efficient manner. Given that patients and caregivers indicated they preferred completing PROMs in this manner, we will implement their suggested changes and incorporate e-forms as standard practice for PROMs collection in our pediatric rheumatology clinic.

