A reliable and valid measure of COVID-19 patient-reported symptoms in outpatients: the Symptoms Evolution of COVID-19 (SE-C19) instrument

Mapping Intimacies ◽

10.1101/2021.12.16.21267708 ◽

2021 ◽

Author(s):

Diana Rofail ◽

Pip Griffiths ◽

Giulio Flore ◽

Mohamed Hussein ◽

Sumathi Sivapalasingam ◽

...

Keyword(s):

Intraclass Correlation ◽

Good Test ◽

Symptom Resolution ◽

Patient Global Impression ◽

New Instrument ◽

Patient Reported ◽

Item Scores ◽

Definition Of ◽

Test Retest Reliability ◽

Low Prevalence

Background: There is no valid and reliable patient self-reported measure assessing symptomatology among outpatients with COVID-19. The Symptoms Evolution of COVID-19 (SE-C19) is a new self-administered instrument that includes 23 symptoms, each rated for severity at its worst moment within the last 24 hours. We studied the psychometric properties of SE-C19. Methods: Reliability, validity, and sensitivity to change of the SE-C19 were assessed in 657 outpatients with confirmed COVID-19 enrolled in NCT04425629. SE-C19 and Patient Global Impression of Severity (PGIS) were administered daily from baseline (predose at Day 1) to end of study (Day 29). Findings: Most patients (70.0%) were aged ≤50 years and white (85.5%). At baseline, patients reported an average (SD) of 6.6 (3.9) symptoms (ie, rated as at least Mild) with 3.8 (3.3) of these symptoms being rated as Moderate or Severe. By Day 29, most symptoms had resolved; 74.4% of patients reported no symptoms and on average, only 0.6 (SD 1.5) symptoms were reported as at least Mild. Stable patients according to the PGIS showed scores with intraclass correlation values indicating moderate-to-good test-retest reliability (ie, 0.50-0.90). At baseline, 20 item scores (87%) varied significantly across PGIS-defined groups, supporting the validity of SE-C19. A symptom resolution endpoint was defined after excluding the item 'Sneezing', due to its low ability to discriminate severity levels, and 'Confusion', 'Rash', and 'Vomiting', due to their low prevalence in this population. Symptom resolution required complete absence of all remaining items, except 'Cough', 'Fatigue', and 'Headache', which could be Mild or Moderate in severity. Interpretation: We identified 19 items that are valid and reliable to measure disease-related symptoms in COVID-19 outpatients and propose a definition of symptom resolution that could be used in future clinical trials and potentially also in clinical practice.
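The symptom resolution rule described in this abstract can be read as a simple per-item check. The following is a minimal sketch under stated assumptions, not the authors' scoring algorithm: the "None" label and the 0-3 ordinal coding are assumed, the function name `is_resolved` is hypothetical, and "Fever" in the usage note is a placeholder item name because the abstract names only seven of the 23 items.

```python
# Items dropped from the endpoint, as named in the abstract.
EXCLUDED = {"Sneezing", "Confusion", "Rash", "Vomiting"}
# Items that may persist at Mild or Moderate severity.
TOLERATED = {"Cough", "Fatigue", "Headache"}
# Assumed ordinal coding; the abstract names Mild, Moderate, and Severe only.
SEVERITY = {"None": 0, "Mild": 1, "Moderate": 2, "Severe": 3}


def is_resolved(ratings):
    """Return True if one day's item ratings satisfy the resolution rule.

    ratings maps an item name to a severity label from SEVERITY.
    """
    for item, label in ratings.items():
        if item in EXCLUDED:
            continue  # excluded items do not count toward the endpoint
        # Tolerated items may reach Moderate; all others must be absent.
        limit = SEVERITY["Moderate"] if item in TOLERATED else SEVERITY["None"]
        if SEVERITY[label] > limit:
            return False
    return True
```

For example, `is_resolved({"Cough": "Moderate", "Fever": "None", "Sneezing": "Severe"})` evaluates to `True` under these assumptions, whereas any Severe rating for 'Cough', or any rating above None for a non-tolerated item, yields `False`.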


Validation of a menstrual pictogram and a daily bleeding diary for assessment of uterine fibroid treatment efficacy in clinical studies

Journal of Patient-Reported Outcomes ◽

10.1186/s41687-020-00263-0 ◽

2020 ◽

Vol 4 (1) ◽

Author(s):

Claudia Haberland ◽

Anna Filonenko ◽

Christian Seitz ◽

Matthias Börner ◽

Christoph Gerlinger ◽

...

Keyword(s):

Uterine Fibroid ◽

Full Range ◽

Intraclass Correlation ◽

Phase Iii ◽

Measurement Properties ◽

Retest Reliability ◽

Response Options ◽

Patient Global Impression ◽

Patient Reported ◽

Test Retest Reliability

Background: To evaluate the psychometric and measurement properties of two patient-reported outcome instruments, the menstrual pictogram superabsorbent polymer-containing version 3 (MP SAP-c v3) and Uterine Fibroid Daily Bleeding Diary (UF-DBD). Test-retest reliability, criterion, construct validity, responsiveness, missingness and comparability of the MP SAP-c v3 and UF-DBD versus the alkaline hematin (AH) method and a patient global impression of severity (PGI-S) were analyzed in post hoc trial analyses. Results: Analyses were based on data from up to 756 patients. The full range of MP SAP-c v3 and UF-DBD response options were used, with score distributions reflecting the cyclic character of the disease. Test-retest reliability of MP SAP-c v3 and UF-DBD scores was supported by acceptable intraclass correlation coefficients when stability was defined by the AH method and PGI-S scores (0.80–0.96 and 0.42–0.94, respectively). MP SAP-c v3 and UF-DBD scores demonstrated strong and moderate-to-strong correlations with menstrual blood loss assessed by the AH method. Scores increased in monotonic fashion with greater disease severities, defined by the AH method and PGI-S scores; differences between groups were mostly statistically significant (P < 0.05). MP SAP-c v3 and UF-DBD were sensitive to changes in disease severity, defined by the AH method and PGI-S. MP SAP-c v3 and UF-DBD showed a lower frequency of missing patient data versus the AH method, and good agreement with the AH method. Conclusions: This evidence supports the use of the MP SAP-c v3 and UF-DBD to assess clinical efficacy endpoints in UF phase III studies, replacing the AH method.


Cross-cultural translation and validation of the Spanish version of the patellofemoral pain and osteoarthritis subscale of the KOOS (KOOS-PF)

BMC Research Notes ◽

10.1186/s13104-021-05619-3 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Juan Pablo Martinez-Cano ◽

Daniel Vernaza-Obando ◽

Julián Chica ◽

Andrés Mauricio Castro

Keyword(s):

Intraclass Correlation ◽

Patellofemoral Pain ◽

Spanish Version ◽

Minimal Detectable Change ◽

Global Rating ◽

Minimal Important Change ◽

Specific Patient ◽

Retest Reliability ◽

Patient Reported ◽

Test Retest Reliability

Objective: The aim of this study was to translate into Spanish the patellofemoral pain and osteoarthritis subscale of the knee injury and osteoarthritis outcome score (KOOS-PF) and to validate this Spanish version of a disease-specific patient-reported outcome measure (PROM) for patellofemoral pain. Results: The KOOS-PF was translated into Spanish, and sixty patients with patellofemoral pain and/or osteoarthritis agreed to complete the questionnaire. One week later, 58 patients answered the questions again for the test–retest reliability assessment, and 55 patients completed the questionnaire one month later for the responsiveness assessment. The Spanish version showed very good internal consistency (Cronbach’s alpha: 0.93) and test–retest reliability (intraclass correlation coefficient: 0.82). Responsiveness was confirmed, showing a strong correlation with the global rating of change (GROC) score (r = 0.64). The minimal detectable change was 11.1 points, the minimal important change was 17.2 points, and there were no floor or ceiling effects for the score.


Assessing test–retest reliability of patient-reported outcome measures using intraclass correlation coefficients: recommendations for selecting and documenting the analytical formula

Quality of Life Research ◽

10.1007/s11136-018-2076-0 ◽

2018 ◽

Vol 28 (4) ◽

pp. 1029-1033 ◽

Cited By ~ 18

Author(s):

Shanshan Qin ◽

Lauren Nelson ◽

Lori McLeod ◽

Sonya Eremenco ◽

Stephen Joel Coons

Keyword(s):

Outcome Measures ◽

Intraclass Correlation ◽

Analytical Formula ◽

Correlation Coefficients ◽

Patient Reported Outcome Measures ◽

Patient Reported Outcome ◽

Retest Reliability ◽

Intraclass Correlation Coefficients ◽

Patient Reported ◽

Test Retest Reliability


Reliability and Validity of Turkish Myasthenia Gravis-Activities of Daily Living Scale

OTJR Occupation Participation and Health ◽

10.1177/1539449220961077 ◽

2020 ◽

pp. 153944922096107

Author(s):

Ecem Karanfil ◽

Yeliz Salcı ◽

Ayla Fil-Balkan ◽

Can Ebru Bekircan-Kurt ◽

Sevim Erdem Özdamar ◽

...

Keyword(s):

Construct Validity ◽

Myasthenia Gravis ◽

Activities Of Daily Living ◽

Daily Living ◽

Intraclass Correlation ◽

Reliability And Validity ◽

Life Questionnaire ◽

Good Test ◽

Retest Reliability ◽

Test Retest Reliability

Linguistically adapted, reliable, and valid secondary efficacy measures are important in clinical settings and studies. The aim of the study was to report the test–retest reliability and construct validity of the Turkish version of the Myasthenia Gravis-Activities of Daily Living Scale (MG-ADL-T) in patients with Myasthenia Gravis (MG). Fifty-two individuals with ocular or generalized MG who presented to a rehabilitation center were included in the study. MG-ADL-T, MG quality-of-life questionnaire (MG-QoL), MG composite (MGC), quantitative MG score (QMGS), and pulmonary function test were administered. Reliability was assessed with the intraclass correlation coefficient (ICC) and Cronbach’s alpha. Spearman correlation test and receiver operating characteristic (ROC) analysis were performed for construct validity. MG-ADL-T had fair internal consistency (Cronbach’s α = 0.67), excellent test–retest reliability (ICC = 0.96) and moderate construct validity (MG-QoL, r = 0.59; QMGS, r = 0.58; MGC, r = 0.68). MG-ADL, a unique scale that evaluates activities of daily living (ADL), has good test–retest reliability and construct validity in Turkish MG patients.
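Internal consistency in studies like this one is typically summarized with Cronbach's alpha, which compares the sum of the item variances with the variance of the total score. The sketch below shows the standard formula on a respondents-by-items table. The function name `cronbach_alpha` and the sample-variance convention are choices made for illustration, and nothing here reproduces the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When all items move together perfectly, alpha equals 1. It falls as the items share less variance, which is how a value such as 0.67 comes to be labeled "fair" by common rules of thumb.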


Reliability of the Polar Vantage M Sports Watch when Measuring Heart Rate at Different Treadmill Exercise Intensities

Sports ◽

10.3390/sports8090117 ◽

2020 ◽

Vol 8 (9) ◽

pp. 117

Author(s):

Mike Climstein ◽

Jessica L. Alder ◽

Alyce M. Brooker ◽

Elissa J. Cartwright ◽

Kevin Kemp-Smith ◽

...

Keyword(s):

Heart Rate ◽

Treadmill Exercise ◽

Intraclass Correlation ◽

Good Test ◽

Good Reliability ◽

Measurement Variability ◽

Retest Reliability ◽

Activity Monitors ◽

Test Retest Reliability ◽

Standard Error Of Measurement

Background: Usage of wrist-worn activity monitors has rapidly increased in recent years, and these devices are being used by both fitness enthusiasts and clinical populations. We, therefore, assessed the test–retest reliability of the Polar Vantage M (PVM) watch when measuring heart rate (HR) during various treadmill exercise intensities. Methods: HR was measured every 30 s (simultaneous electrocardiography (ECG) and PVM). Test–retest reliability was determined using an intraclass correlation coefficient (ICC) with 95% confidence intervals (CIs). Standard error of measurement (SEM) and smallest real difference (SRD) were used to determine measurement variability. Results: A total of 29 participants completed the trials. ICC values for PVM during stages 1, 2 and 5 demonstrated good to excellent test–retest reliability (0.78, 0.78 and 0.92; 95% CI (0.54–0.90, 0.54–0.90, 0.79–0.97)). For PVM during stages 0 (rest), 3 and 4, the ICC values indicated poor to good reliability (0.42, 0.68 and 0.58; 95% CI (−0.27–0.73, 0.32–0.85, 0.14–0.80)). Conclusion: This study identified that the test–retest reliability of the PVM was comparable at low and high exercise intensities; however, it revealed a poor to good test–retest reliability at moderate intensities. The PVM should not be used in a clinical setting where monitoring of an accurate HR is crucial to the patients’ safety.
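The abstract does not say which ICC form was used, so the following is only an illustrative sketch of one common choice for test–retest data: the two-way random-effects, absolute-agreement, single-measurement coefficient, often written ICC(2,1). The function name `icc_2_1` is hypothetical, and the input is assumed to be a complete subjects-by-sessions table with no missing values.

```python
def icc_2_1(data):
    """ICC(2,1) for a table of n subjects (rows) by k sessions (columns)."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    # Absolute agreement: systematic session differences lower the ICC.
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Because this form measures absolute agreement, a constant shift between test and retest lowers the coefficient even when subjects keep the same rank order. A consistency-type ICC would ignore such a shift, which is one reason the chosen form should be reported alongside the value.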


Test-Retest Reliability of the StepWatch Activity Monitor Outputs in Healthy Adults

Journal of Physical Activity and Health ◽

10.1123/jpah.7.5.671 ◽

2010 ◽

Vol 7 (5) ◽

pp. 671-676 ◽

Cited By ~ 27

Author(s):

Suzie Mudge ◽

Denise Taylor ◽

Oliver Chang ◽

Rosita Wong

Keyword(s):

Activity Index ◽

Intraclass Correlation ◽

Objective Measure ◽

Activity Monitor ◽

Healthy Adults ◽

Peak Activity ◽

Good Test ◽

Retest Reliability ◽

Step Rate ◽

Test Retest Reliability

Background: Activity monitors give an objective measure of usual walking performance. This study aimed to examine the test-retest reliability of the StepWatch Activity Monitor outputs (mean steps/day; peak activity index; sustained activity indices of 1, 5, 20, 30, 60 minutes; steps at high, medium, and low stepping rates). Methods: Thirty healthy adults age 18 to 49 years wore the StepWatch for two 3-day periods at least 1 week apart. Results: The intraclass correlation coefficients of the StepWatch outputs ranged from 0.44 to 0.91 over 3 days. The coefficient of variation ranged from 3.0% to 51.3% over the monitoring periods, with higher variation shown for shorter monitoring periods. The most reliable 5 outputs had 95% limits of agreement between 3-day periods that were less than 40%. These were mean steps/day (±39.1%), highest step rate in 1 (±17.3%) and 5 (±37.4%) minutes, peak activity index (±25.6%), and percentage of inactive time (±9.52%). Conclusions: Mean steps/day, highest step rate in 1 and 5 minutes, peak activity index, and percentage of inactive time have good test-retest reliability over a 3-day monitoring period, with lower reliability shown by the other StepWatch outputs. Monitoring over 1 or 2 days is less reliable.


Reliability of Subjective Pain Ratings and Nociceptive Flexion Reflex Responses as Measures of Conditioned Pain Modulation

Pain Research and Management ◽

10.1155/2014/698246 ◽

2014 ◽

Vol 19 (2) ◽

pp. 93-96 ◽

Cited By ~ 22

Author(s):

Carlo Jurth ◽

Benno Rehberg ◽

Falk von Dincklage

Keyword(s):

Intraclass Correlation ◽

Pain Modulation ◽

Hot Water ◽

Conditioned Pain Modulation ◽

Good Test ◽

Retest Reliability ◽

Reflex Responses ◽

Pain Ratings ◽

Subjective Pain ◽

Test Retest Reliability

BACKGROUND: The endogenous modulation of pain can be assessed through conditioned pain modulation (CPM), which can be quantified using subjective pain ratings or nociceptive flexion reflexes. However, to date, the test-retest reliability has only been investigated for subjective pain ratings. OBJECTIVE: To compare the test-retest reliability of CPM-induced changes, measured using subjective pain ratings and nociceptive flexion reflexes, to provide a reliable scoring parameter for future studies. METHOD: A total of 40 healthy volunteers each received painful electrical stimuli to the sural nerve to elicit nociceptive flexion reflexes. Reflex sizes and subjective pain ratings were recorded before and during the immersion of the contralateral hand in hot water to induce CPM as well as innocuous water as control. Measurements were repeated in a retest 28 days later. RESULTS: Intraclass correlation coefficients showed good test-retest reliabilities of CPM during the hot water stimulus for both scoring parameters. Subjective pain ratings also correlated between test and retest during the control stimulus. CONCLUSIONS: Subjective pain ratings and nociceptive flexion reflexes show comparable test-retest reliabilities, but they reflect different components of CPM. While subjective pain ratings appear to incorporate cognitive influences to a larger degree, reflex responses appear to reflect spinal nociception more purely.


Test-Retest and Interrater Reliability of the Functional Movement Screen

Journal of Athletic Training ◽

10.4085/1062-6050-48.2.11 ◽

2013 ◽

Vol 48 (3) ◽

pp. 331-336 ◽

Cited By ~ 51

Author(s):

Rebecca Shultz ◽

Scott C. Anderson ◽

Gordon O. Matheson ◽

Brandon Marcello ◽

Thor Besier

Keyword(s):

Interrater Reliability ◽

Intraclass Correlation ◽

Video Recording ◽

Functional Movement Screen ◽

Good Test ◽

Good Reliability ◽

Functional Movement ◽

Retest Reliability ◽

Test Retest Reliability ◽

First Session

Context: The Functional Movement Screen (FMS) is a popular test to evaluate the degree of painful, dysfunctional, and asymmetric movement patterns. Despite great interest in the FMS, test-retest reliability data have not been published. Objective: To assess the test-retest and interrater reliability of the FMS and to compare the scoring by 1 rater during a live session and the same session on video. Design: Cross-sectional study. Setting: Human performance laboratory in the sports medicine center. Patients or Other Participants: A total of 21 female (age = 19.6 ± 1.5 years, height = 1.7 ± 0.1 m, mass = 64.4 ± 5.1 kg) and 18 male (age = 19.7 ± 1.0 years, height = 1.9 ± 0.1 m, mass = 80.1 ± 9.9 kg) National Collegiate Athletic Association Division IA varsity athletes volunteered. Intervention(s): Each athlete was tested and retested 1 week later by the same rater who also scored the athlete's first session from a video recording. Five other raters scored the video from the first session. Main Outcome Measure(s): The Krippendorff α (K α) was used to assess the interrater reliability, whereas intraclass correlation coefficients (ICCs) were used to assess the test-retest reliability and reliability of live-versus-video scoring. Results: Good reliability was found for the test-retest (ICC = 0.6), and excellent reliability was found for the live-versus-video sessions (ICC = 0.92). Poor reliability was found for the interrater reliability (K α = .38). Conclusions: The good test-retest and high live-versus-video session reliability show that the FMS is a usable tool within 1 rater. However, the low interrater K α values suggest that the FMS within the limits of generalization should not be used indiscriminately to detect deficiencies that place the athlete at greater risk for injury. The FMS interrater reliability may be improved with better training for the rater.


Reliability of the performance-based measure of executive functions in people with schizophrenia

BMC Psychiatry ◽

10.1186/s12888-021-03562-y ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

En-Chi Chiu ◽

Ya-Chen Lee ◽

Shu-Chun Lee ◽

I-Ping Hsueh

Keyword(s):

Executive Functions ◽

Intraclass Correlation ◽

Minimal Detectable Change ◽

Rater Agreement ◽

Good Test ◽

Retest Reliability ◽

Purposive Action ◽

And Performance ◽

Agreement Study ◽

Test Retest Reliability

Background: The Performance-based measure of Executive Functions (PEF), which has four domains, is designed to assess executive functions in people with schizophrenia. The purpose of this study was to examine the test-retest reliability of the PEF administered by the same rater (intra-rater agreement) and by different raters (inter-rater agreement) in people with schizophrenia and to estimate the values of minimal detectable change (MDC) and MDC%. Methods: Two convenience samples of people with schizophrenia (n = 60 each) each completed two assessments two weeks apart. The intraclass correlation coefficient (ICC) was used to examine intra-rater and inter-rater agreement in the test-retest reliability of the PEF. The MDC was calculated from the standard error of measurement. Results: For the intra-rater agreement study, the ICC values of the four domains were 0.88–0.92. The MDC (MDC%) values of the four domains (volition, planning, purposive action, and performance effective) were 13.0 (13.0%), 12.2 (16.4%), 16.2 (16.2%), and 16.3 (18.8%), respectively. For the inter-rater agreement study, the ICC values of the four domains were 0.82–0.89. The MDC (MDC%) values were 15.8 (15.8%), 17.4 (20.0%), 20.9 (20.9%), and 18.6 (18.6%) for the volition, planning, purposive action, and performance effective domains, respectively. Conclusions: The PEF has good test-retest reliability, including intra-rater and inter-rater agreement, for people with schizophrenia. Clinicians and researchers can use the MDC values to verify whether an individual with schizophrenia shows any real change (improvement or deterioration) between repeated PEF assessments by the same or different raters.
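The abstract states that the MDC was derived from the standard error of measurement but gives no formula. A commonly used convention, assumed here rather than taken from the paper, is SEM = SD × √(1 − ICC) and MDC at 95% confidence = 1.96 × √2 × SEM. The sketch below applies that convention; the function names and the `reference` argument for MDC% (for example a mean or maximum score) are hypothetical, and the numbers in the usage note are illustrative, not study data.

```python
import math


def sem(sd, icc):
    """Standard error of measurement from a score SD and a reliability ICC."""
    return sd * math.sqrt(1 - icc)


def mdc95(sd, icc):
    """Minimal detectable change at 95% confidence.

    The sqrt(2) factor accounts for error in both of the two measurements.
    """
    return 1.96 * math.sqrt(2) * sem(sd, icc)


def mdc_percent(mdc, reference):
    """MDC as a percentage of an assumed reference score."""
    return 100 * mdc / reference
```

With an illustrative SD of 10 and ICC of 0.75, `sem(10, 0.75)` is 5.0 and `mdc95(10, 0.75)` is roughly 13.86, so a change smaller than about 14 points could not be distinguished from measurement error under these assumptions.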


The Mandarin Voice Handicap Index for Laryngectomees With Pneumatic Artificial Laryngeal and Esophageal Speech

American Journal of Speech-Language Pathology ◽

10.1044/2021_ajslp-20-00241 ◽

2021 ◽

pp. 1-12

Author(s):

Chen-Chi Wang ◽

Jia-Shiou Liao ◽

Hsiu-Chin Lai ◽

Yi-Hsuan Lo

Keyword(s):

Internal Consistency ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Voice Handicap Index ◽

Good Test ◽

Retest Reliability ◽

Related Quality ◽

Valid Instrument ◽

Test Retest Reliability ◽

The Impact

Purpose: The Voice Handicap Index (VHI) questionnaire assesses the impact of total laryngectomy on voice-related quality of life. This study evaluates the Mandarin VHI, including its internal consistency, test–retest reliability, content validity, and differences in scores for Mandarin alaryngeal patients with pneumatic artificial laryngeal (PA) and esophageal (ES) speech. Method: Translation and validation of the VHI questionnaire was performed through the forward–backward translation technique. This study used a sample of 78 PA and 23 ES participants from Taiwan who completed the Mandarin VHI. Forty-two of the alaryngeal participants completed the Mandarin VHI twice over a period of 7–63 days. Results: The measurement of the internal consistency of the Mandarin VHI showed a high Cronbach's alpha coefficient for the total score (.975) and the functional (.930), physical (.939), and emotional (.938) subscales. Based on the results of the intraclass correlation coefficients, good test–retest reliability for the total and domain scores was found (intraclass correlation coefficient = .827–.863). Conclusion: The Mandarin VHI was validated as an instrument with proper internal consistency and reliability, which supports the Mandarin VHI as a valid instrument for the self-evaluation of handicaps related to voice problems in PA and ES speakers.
