Reliability of radiographic measurements of knee motion following knee arthroplasty for use in a virtual knee clinic

INTRODUCTION We sought to validate radiographic measurements of range of motion of the knee after arthroplasty as part of a new system of virtual clinics. METHODS The range of motion of 52 knees in 45 patients was obtained by 2 clinicians using standardised techniques and goniometers. Inter-rater reliability and intraclass correlation coefficients (ICCs) were calculated. Radiographs of these patients’ knees in full active flexion and extension were also used to calculate intra and inter-rater reliability compared with clinical measurements using four different methods for plotting angles on the radiographs. RESULTS The ICC for inter-rater reliability using the goniometer was very high. The ICC was 0.91 in extension and 0.85 in flexion while repeatability was 8.49° (-8.03–8.99°) in extension and 5.23° (-4.54–5.74°) in flexion. The best ICC for radiographic measurement in extension was 0.86, indicating ‘near perfect’ agreement, and repeatability was 5.43° (-4.04–6.12°). The best ICC in flexion was 0.95 and repeatability was 5.82° (-3.38–6.55°). The ICC for intrarater reliability was 0.98 for extension and 0.99 for flexion on radiographic measurements. CONCLUSIONS Validating the use of radiographs to reliably measure range of motion following knee arthroplasty has allowed us to set up a ‘virtual knee clinic’. Combining validated questionnaires and radiographic measurement of range of motion, we aim to maintain high quality patient surveillance following knee arthroplasty, reduce our ratio for new to follow-up patients in line with Department of Health guidelines and improve patient satisfaction through reduced travel to hospital outpatients.


Reliability and concurrent validity of an alternative method of lateral lumbar range of motion in athletes

South African Journal of Sports Medicine ◽

10.17159/2078-516x/2016/v28i1a1414 ◽

2016 ◽

Vol 28 (1) ◽

pp. 23-26 ◽

Cited By ~ 1

Author(s):

M Hecimovich

Keyword(s):

Lumbar Spine ◽

Range Of Motion ◽

Concurrent Validity ◽

Pearson Correlation ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Index Test ◽

Low Back ◽

Detectable Change ◽

Rater Reliability

Background: Cricket bowling involves combined spinal movements of side bending and rotation and, consequently, injury to the low back is a common problem. Therefore the assessment of lumbar spine kinematics has become a routine component in preseason screening. This includes static measurement of lateral spinal flexion, as asymmetrical range of motion may predispose an athlete to low back injury. Objectives: This study examined intra-rater reliability and concurrent validity of the fingertip-to-floor distance test (FFD) when compared to a criterion range of motion measure. Methods: Thirty-four junior-level cricket players aged 13–16 years were recruited. Lumbar spine lateral flexion was measured simultaneously with the fingertip-to-floor distance test and digital inclinometry methods. Relative and absolute intra-rater reliability were investigated with intraclass correlation coefficients (ICC3,1) of agreement, standard error of measurement (SEM) estimates, Bland and Altman bias estimates and 95% limits of agreement, respectively. The concurrent validity of the fingertip-to-floor distance test, compared to digital inclinometry measures, was examined with Pearson correlation coefficients. Results: Intra-rater reliability demonstrated substantial agreement for both measures (ICC3,1 > 0.84). The fingertip-to-floor distance test SEM values ranged from 1.71 to 2.01 cm with an estimated minimum detectable change (MDC) threshold of 4.73–5.55 cm. The inclinometry SEM values ranged from 1.00 to 1.09° with minimal detectable change estimates of 2.77–3.01°. There were strong correlations between the index test and criterion measure outcomes (r > 0.84, p < 0.001). Conclusions: This study’s results support the intra-rater reliability and concurrent validity of the finger-to-floor distance test, suggesting it to be a suitable surrogate measure for lumbar lateral flexion testing.
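As an illustration of the statistics named in this abstract, the following is a minimal Python sketch of ICC(3,1), SEM and MDC. It assumes the Shrout and Fleiss ICC(3,1) form (two-way mixed effects, single measurement), the common conventions SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM, and a complete subjects-by-trials table. The abstract does not state which formulas the authors used, and the function names `icc_3_1`, `sem_from_icc` and `mdc95` are hypothetical.

```python
import math


def icc_3_1(scores):
    """ICC(3,1) from a table with one row per subject and one column per trial.

    Assumes at least two subjects, at least two trials, and no missing values.
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of repeated measurements per subject
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), trials (columns), residual.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimum detectable change at 95% confidence, assuming 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical trial data (not from the study): a constant offset between
# trials leaves the subject ranking unchanged, so ICC(3,1) is 1.
example = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
print(icc_3_1(example))
print(round(mdc95(1.00), 2))
```

Under these assumed conventions, an SEM of 1.00° gives an MDC of about 2.77°, and the other reported SEM values give MDC estimates within about 0.02 of the reported thresholds, which would be consistent with rounding of the SEM.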


Intra and Inter-rater Reliability between Ultrasound Imaging and Caliper Measures to determine Spring Ligament Dimensions in Cadavers

Scientific Reports ◽

10.1038/s41598-019-51384-6 ◽

2019 ◽

Vol 9 (1) ◽

Author(s):

Fernando Santiago-Nuño ◽

Patricia Palomo-López ◽

Ricardo Becerro-de-Bengoa-Vallejo ◽

César Calvo-Lobo ◽

Marta Elena Losa-Iglesias ◽

...

Keyword(s):

Ultrasound Imaging ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Absolute Accuracy ◽

Strong Correlations ◽

Perfect Agreement ◽

Rater Reliability ◽

Intraclass Correlation Coefficients ◽

Spring Ligament ◽

Good Repeatability

Abstract The purpose was to evaluate intra and inter-rater reliability, repeatability and absolute accuracy between ultrasound imaging (US) and caliper measures to determine Spring ligament (SL) dimensions in cadavers. SLs were identified from 62 human feet from formaldehyde-embalmed cadavers. Intra and inter-observer reliability, repeatability and absolute accuracy of SL width, thickness and length between US and caliper measurements were determined at intra and inter-session by intraclass correlation coefficients, Pearson's correlation coefficients, Student t tests, standard errors of measurement, minimum detectable changes, values of normality, 95% limits of agreement, and Bland-Altman plots. Excellent inter-session and inter-rater reliability, adequate absolute accuracy, almost perfect agreement and strong correlations were shown for caliper, US and their comparison for all SL dimensions. US measurements presented higher absolute accuracy than caliper measures for SL length and thickness dimensions, while caliper displayed greater absolute accuracy for SL width dimensions. Good repeatability (P > 0.05) was shown for all SL dimensions by US, caliper and their comparison, except for SL width dimension measured with US (P = 0.019). Both US and caliper could be recommended for all SL dimensions evaluation due to their excellent reliability and absolute accuracy in cadavers, although width dimensions should be considered with caution due to US repeatability differences.


Adaptation and Validation of a Nutrition Environment Measures Survey for University Grab-and-Go Establishments

Canadian Journal of Dietetic Practice and Research ◽

10.3148/cjdpr-2015-036 ◽

2016 ◽

Vol 77 (1) ◽

pp. 17-24 ◽

Cited By ~ 6

Author(s):

Brian K.C. Lo ◽

Leia Minaker ◽

Alicia N.T. Chan ◽

Jessica Hrgetic ◽

Catherine L. Mah

Keyword(s):

Construct Validity ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Comparison Method ◽

Nutrition Information ◽

Perfect Agreement ◽

Cross Sectional ◽

Rater Reliability ◽

University Of Toronto ◽

Nutrition Environment

Purpose: To adapt and validate a survey instrument to assess the nutrition environment of grab-and-go establishments at a university campus. Methods: A version of the Nutrition Environment Measures Survey for grab-and-go establishments (NEMS-GG) was adapted from existing NEMS instruments and tested for reliability and validity through a cross-sectional assessment of the grab-and-go establishments at the University of Toronto. Product availability, price, and presence of nutrition information were evaluated. Cohen’s kappa coefficient and intra-class correlation coefficients (ICC) were assessed for inter-rater reliability, and construct validity was assessed using the known-groups comparison method (via store scores). Results: Fifteen grab-and-go establishments were assessed. Inter-rater reliability was high with an almost perfect agreement for availability (mean κ = 0.995) and store scores (ICC = 0.999). The tool demonstrated good face and construct validity. About half of the venues carried fruit and vegetables (46.7% and 53.3%, respectively). Regular and healthier entrée items were generally the same price. Healthier grains were cheaper than regular options. Six establishments displayed nutrition information. Establishments operated by the university’s Food Services consistently scored the highest across all food premise types for nutrition signage, availability, and cost of healthier options. Conclusions: Health promotion strategies are needed to address availability and variety of healthier grab-and-go options in university settings.


Validity and reliability of a wearable-controlled serious game and goniometer for telemonitoring of wrist fracture rehabilitation

European Journal of Trauma and Emergency Surgery ◽

10.1007/s00068-021-01657-5 ◽

2021 ◽

Author(s):

Henriëtte A. W. Meijer ◽

Maurits Graafland ◽

Miryam C. Obdeijn ◽

Marlies P. Schijven ◽

J. Carel Goslings

Keyword(s):

Range Of Motion ◽

Intraclass Correlation ◽

Internal Validity ◽

Wrist Fracture ◽

Serious Game ◽

Reference Standard ◽

Validity And Reliability ◽

Good Reliability ◽

Rater Reliability ◽

Flexion Extension

Abstract Purpose To determine the validity of wrist range of motion (ROM) measurements by the wearable-controlled ReValidate! wrist-rehabilitation game, which simultaneously acts as a digital goniometer. Furthermore, to establish the reliability of the game by contrasting ROM measurements to those found by medical experts using a universal goniometer. Methods As the universal goniometer is considered the reference standard, inter-rater reliability between surgeons was first determined. Internal validity of the game ROM measurements was determined in a test–retest setting with healthy volunteers. The reliability of the game was tested in 34 patients with a restricted range of motion, in whom the ROM was measured by experts as well as digitally. Intraclass-correlation coefficients (ICCs) were determined and outcomes were analyzed using Bland–Altman plots. Results Inter-rater reliability between experts using a universal goniometer was poor, with ICCs of 0.002, 0.160 and 0.520. Internal validity testing of the game found ICCs of −0.693, 0.376 and 0.863, thus ranging from poor to good. Reliability testing of the game compared to medical expert measurements found that mean differences were small for the flexion–extension arc and the radial deviation-ulnar deviation arc. Conclusion The ReValidate! game is a reliable home-monitoring device digitally measuring ROM in the wrist. Interestingly, the test–retest reliability of the serious game was found to be considerably higher than the inter-rater reliability of the reference standard, being healthcare professionals using a universal goniometer. Trial registration number (internal hospital registration only) MEC-AMC W17_003 #17.015.
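The Bland–Altman analysis mentioned in this abstract can be sketched as follows. This is a minimal Python illustration that assumes the usual definition of bias (mean of the paired differences) and 95% limits of agreement (bias ± 1.96 × the sample standard deviation of the differences). The function name `bland_altman` and the readings are hypothetical and are not taken from the study.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)   # systematic difference between methods
    sd = statistics.stdev(diffs)    # sample SD of the differences
    half_width = 1.96 * sd          # assumes roughly normal differences
    return bias, bias - half_width, bias + half_width


# Hypothetical wrist flexion readings in degrees (not data from the study).
goniometer = [60.0, 55.0, 70.0, 48.0]
game = [59.0, 56.0, 68.0, 50.0]
bias, lower, upper = bland_altman(goniometer, game)
print(bias, lower, upper)
```

A small bias with wide limits of agreement means the two methods agree on average but can differ noticeably for an individual patient, which is why the limits are usually reported alongside the mean difference.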


Adaptation and Psychometric Evaluation of the Chinese Counseling Competencies Scale-Revised

Frontiers in Psychology ◽

10.3389/fpsyg.2021.688539 ◽

2021 ◽

Vol 12 ◽

Author(s):

Wei Xia ◽

William Ho Cheung Li ◽

Tingna Liang ◽

Yuanhui Luo ◽

Laurie Long Kwan Ho ◽

...

Keyword(s):

Concurrent Validity ◽

Convergent Validity ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Psychometric Evaluation ◽

Counseling Competencies ◽

Counselors In Training ◽

Rater Reliability ◽

Retest Reliability ◽

Test Retest Reliability

Objectives: This study conducted a linguistic and psychometric evaluation of the Chinese Counseling Competencies Scale-Revised (CCS-R). Methods: The Chinese CCS-R was created from the original English version using a standard forward-backward translation process. The psychometric properties of the Chinese CCS-R were examined in a cohort of 208 counselors-in-training by two independent raters. Fifty-three counselors-in-training were asked to undergo another counseling performance evaluation for the test-retest. The confirmatory factor analysis (CFA) was conducted for the Chinese CCS-R, followed by internal consistency, test-retest reliability, inter-rater reliability, convergent validity, and concurrent validity. Results: The results of the CFA supported the factorial validity of the Chinese CCS-R, with adequate construct replicability. The scale had a McDonald's omega of 0.876, and intraclass correlation coefficients of 0.63 and 0.90 for test-retest reliability and inter-rater reliability, respectively. Significantly positive correlations were observed between the Chinese CCS-R score and scores of performance checklist (Pearson's γ = 0.781), indicating a large convergent validity, and knowledge on drug abuse (Pearson's γ = 0.833), indicating a moderate concurrent validity. Conclusion: The results support that the Chinese CCS-R is a valid and reliable measure of the counseling competencies. Practice implication: The CCS-R provides trainers with a reliable tool to evaluate counseling students' competencies and to facilitate discussions with trainees about their areas for growth.


Validity and reliability of inertial sensors for elbow and wrist range of motion assessment

PeerJ ◽

10.7717/peerj.9687 ◽

2020 ◽

Vol 8 ◽

pp. e9687

Author(s):

Vanina Costa ◽

Óscar Ramírez ◽

Abraham Otero ◽

Daniel Muñoz-García ◽

Sandra Uribarri ◽

...

Keyword(s):

Range Of Motion ◽

Inertial Sensors ◽

High Reliability ◽

Intraclass Correlation ◽

Active Movement ◽

Neutral Position ◽

Maximum Difference ◽

Validity And Reliability ◽

Rater Reliability ◽

Flexion Extension

Background Elbow and wrist chronic conditions are very common among musculoskeletal problems. These painful conditions affect muscle function, which ultimately leads to a decrease in the joint’s Range Of Motion (ROM). Due to their portability and ease of use, goniometers are still the most widespread tool for measuring ROM. Inertial sensors are emerging as a digital, low-cost and accurate alternative. However, whereas inertial sensors are commonly used in research studies, due to the lack of information about their validity and reliability, they are not widely used in clinical practice. The goal of this study is to assess the validity and the intra- and inter-rater reliability of inertial sensors for measuring active ROM of the elbow and wrist. Materials and Methods Measures were taken simultaneously with inertial sensors (Werium™ system) and a universal goniometer. The process involved two physiotherapists (“rater A” and “rater B”) and an engineer responsible for the technical issues. Twenty-nine asymptomatic subjects were assessed individually in two sessions separated by 48 h. The procedure was repeated by rater A followed by rater B with random order. Three repetitions of each active movement (elbow flexion, pronation, and supination; and wrist flexion, extension, radial deviation and ulnar deviation) were executed starting from the neutral position until the ROM end-feel; that is, until ROM reached its maximum because the movement was stopped by the anatomy. The coefficient of determination (r²) and the Intraclass Correlation Coefficient (ICC) were calculated to assess the intra-rater and inter-rater reliability. The Standard Error of the Measurement and the Minimum Detectable Change were also calculated, and Bland–Altman plots were produced. Results Similar ROM values when measured with both instruments were obtained for the elbow (maximum difference of 3° for all the movements) and wrist (maximum difference of 1° for all the movements). These values were within the normal range when compared to literature studies. The concurrent validity analysis for all the movements yielded ICC values ≥0.78 for the elbow and ≥0.95 for the wrist. Concerning reliability, the ICC values denoted a high reliability of inertial sensors for all the different movements. In the case of the elbow, intra-rater and inter-rater reliability ICC values range from 0.83 to 0.96 and from 0.94 to 0.97, respectively. Intra-rater analysis of the wrist yielded ICC values between 0.81 and 0.93, while the ICC values for the inter-rater analysis range from 0.93 to 0.99. Conclusions Inertial sensors are a valid and reliable tool for measuring elbow and wrist active ROM. Particularly noteworthy is their high inter-rater reliability, often questioned in measurement tools. The lowest reliability is observed in elbow prono-supination, probably due to skin artifacts. Based on these results and their advantages, inertial sensors can be considered a valid assessment tool for wrist and elbow ROM.


Is the location of the signal intensity weighted centroid a reliable measurement of fluid displacement within the disc?

Biomedical Engineering / Biomedizinische Technik ◽

10.1515/bmt-2016-0178 ◽

2018 ◽

Vol 63 (4) ◽

pp. 453-460 ◽

Cited By ~ 7

Author(s):

Vahid Abdollah ◽

Eric C. Parent ◽

Michele C. Battié

Keyword(s):

Signal Intensity ◽

Water Distribution ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Region Of Interest ◽

Rater Reliability ◽

Fluid Displacement ◽

Intraclass Correlation Coefficients ◽

The Mean ◽

Standard Error Of Measurement

Abstract Degenerated discs have shorter T2-relaxation time and lower MR signal. The location of the signal-intensity-weighted-centroid reflects the water distribution within a region-of-interest (ROI). This study compared the reliability of the location of the signal-intensity-weighted-centroid to mean signal intensity and area measurements. L4-L5 and L5-S1 discs were measured on 43 mid-sagittal T2-weighted 3T MRI images in adults with back pain. One rater analysed images twice and another once, blinded to measurements. Discs were semi-automatically segmented into a whole disc, nucleus, anterior and posterior annulus. The coordinates of the signal-intensity-weighted-centroid for all regions demonstrated excellent intraclass-correlation-coefficients for intra- (0.99–1.00) and inter-rater reliability (0.97–1.00). The standard error of measurement for the Y-coordinates of the signal-intensity-weighted-centroid for all ROIs were 0 at both levels and 0 to 2.7 mm for X-coordinates. The mean signal intensity and area for the whole disc and nucleus presented excellent intra-rater reliability with intraclass-correlation-coefficients from 0.93 to 1.00, and 0.92 to 1.00 for inter-rater reliability. The mean signal intensity and area had lower reliability for annulus ROIs, with intra-rater intraclass-correlation-coefficient from 0.5 to 0.76 and inter-rater from 0.33 to 0.58. The location of the signal-intensity-weighted-centroid is a reliable biomarker for investigating the effects of disc interventions.


Development and assessment of the inter-rater and intra-rater reproducibility of a self-administration version of the ALSFRS-R

Journal of Neurology Neurosurgery & Psychiatry ◽

10.1136/jnnp-2019-321138 ◽

2019 ◽

Vol 91 (1) ◽

pp. 75-81 ◽

Cited By ~ 7

Author(s):

Leonhard A Bakker ◽

Carin D Schröder ◽

Harold H G Tan ◽

Simone M A G Vugts ◽

Ruben P A van Eijk ◽

...

Keyword(s):

Rating Scale ◽

Clinical Care ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

The Self ◽

Coefficient Alpha ◽

Rater Agreement ◽

Self Administration ◽

Limits Of Agreement ◽

Rater Reliability

Objective: The Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised (ALSFRS-R) is widely applied to assess disease severity and progression in patients with motor neuron disease (MND). The objective of the study is to assess the inter-rater and intra-rater reproducibility, i.e., the inter-rater and intra-rater reliability and agreement, of a self-administration version of the ALSFRS-R for use in apps, online platforms, clinical care and trials. Methods: The self-administration version of the ALSFRS-R was developed based on both patient and expert feedback. To assess the inter-rater reproducibility, 59 patients with MND filled out the ALSFRS-R online and were subsequently assessed on the ALSFRS-R by three raters. To assess the intra-rater reproducibility, patients were invited on two occasions to complete the ALSFRS-R online. Reliability was assessed with intraclass correlation coefficients, agreement was assessed with Bland-Altman plots and paired samples t-tests, and internal consistency was examined with Cronbach’s coefficient alpha. Results: The self-administration version of the ALSFRS-R demonstrated excellent inter-rater and intra-rater reliability. The assessment of inter-rater agreement demonstrated small systematic differences between patients and raters and acceptable limits of agreement. The assessment of intra-rater agreement demonstrated no systematic changes between time points; limits of agreement were 4.3 points for the total score and ranged from 1.6 to 2.4 points for the domain scores. Coefficient alpha values were acceptable. Discussion: The self-administration version of the ALSFRS-R demonstrates high reproducibility and can be used in apps and online portals for both individual comparisons, facilitating the management of clinical care and group comparisons in clinical trials.


A Comparison of Reliability Coefficients for Ordinal Rating Scales

Journal of Classification ◽

10.1007/s00357-021-09386-5 ◽

2021 ◽

Author(s):

Alexandra de Raadt ◽

Matthijs J. Warrens ◽

Roel J. Bosker ◽

Henk A. L. Kiers

Keyword(s):

Empirical Data ◽

Rating Scales ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Weighted Kappa ◽

Rater Reliability ◽

Intraclass Correlations ◽

Applied Researcher ◽

Highly Correlated ◽

Reliability Coefficients

Abstract Kappa coefficients are commonly used for quantifying reliability on a categorical scale, whereas correlation coefficients are commonly applied to assess reliability on an interval scale. Both types of coefficients can be used to assess the reliability of ordinal rating scales. In this study, we compare seven reliability coefficients for ordinal rating scales: the kappa coefficients included are Cohen’s kappa, linearly weighted kappa, and quadratically weighted kappa; the correlation coefficients included are intraclass correlation ICC(3,1), Pearson’s correlation, Spearman’s rho, and Kendall’s tau-b. The primary goal is to provide a thorough understanding of these coefficients such that the applied researcher can make a sensible choice for ordinal rating scales. A second aim is to find out whether the choice of the coefficient matters. We studied to what extent we reach the same conclusions about inter-rater reliability with different coefficients, and to what extent the coefficients measure agreement in a similar way, using analytic methods, and simulated and empirical data. Using analytical methods, it is shown that differences between quadratic kappa and the Pearson and intraclass correlations increase if agreement becomes larger. Differences between the three coefficients are generally small if differences between rater means and variances are small. Furthermore, using simulated and empirical data, it is shown that differences between all reliability coefficients tend to increase if agreement between the raters increases. Moreover, for the data in this study, the same conclusion about inter-rater reliability was reached in virtually all cases with the four correlation coefficients. In addition, using quadratically weighted kappa, we reached a similar conclusion as with any correlation coefficient a great number of times. Hence, for the data in this study, it does not really matter which of these five coefficients is used. Moreover, the four correlation coefficients and quadratically weighted kappa tend to measure agreement in a similar way: their values are very highly correlated for the data in this study.
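To make the three kappa coefficients compared in this abstract concrete, here is a minimal Python sketch. It assumes the standard definition kappa = 1 − (observed weighted disagreement) / (chance-expected weighted disagreement), with disagreement weights of 0/1 for Cohen's kappa, |i − j| / (k − 1) for linear weights and ((i − j) / (k − 1))² for quadratic weights. The function name `weighted_kappa` and the ratings are hypothetical, and this is not the authors' code.

```python
def weighted_kappa(rater1, rater2, categories, weights="none"):
    """Cohen's kappa for two raters on an ordered scale.

    `categories` lists the scale values in order and must hold at least two.
    `weights` is "none", "linear" or "quadratic".
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)

    # Observed joint proportions of (rater1 category, rater2 category).
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater1, rater2):
        obs[index[a]][index[b]] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weights == "none":
            return 0.0 if i == j else 1.0
        if weights == "linear":
            return abs(i - j) / (k - 1)
        if weights == "quadratic":
            return ((i - j) / (k - 1)) ** 2
        raise ValueError("weights must be 'none', 'linear' or 'quadratic'")

    pairs = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(disagreement(i, j) * obs[i][j] for i, j in pairs)
    expected = sum(disagreement(i, j) * row[i] * col[j] for i, j in pairs)
    return 1.0 - observed / expected


# Hypothetical ratings on a 3-point scale with one adjacent-category disagreement.
r1 = [1, 2, 3, 3]
r2 = [1, 2, 3, 2]
for scheme in ("none", "linear", "quadratic"):
    print(scheme, round(weighted_kappa(r1, r2, [1, 2, 3], scheme), 3))
```

In this toy example the only disagreement is between adjacent categories, so the weighted coefficients penalise it less than unweighted kappa does; with disagreements spread across distant categories the ordering can differ.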


The Brazilian version of the Neuropsychiatric Inventory-Clinician rating scale (NPI-C): reliability and validity in dementia

International Psychogeriatrics ◽

10.1017/s1041610213000811 ◽

2013 ◽

Vol 25 (9) ◽

pp. 1503-1511 ◽

Cited By ~ 16

Author(s):

Florindo Stella ◽

Orestes Vicente Forlenza ◽

Jerson Laks ◽

Larissa Pires de Andrade ◽

Michelle A. Ljubetic Avendaño ◽

...

Keyword(s):

Concurrent Validity ◽

Convergent Validity ◽

Rating Scale ◽

Neuropsychiatric Symptoms ◽

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Neuropsychiatric Inventory ◽

Rater Reliability ◽

Clinician Rating

Abstract Background: Patients with dementia may be unable to describe their symptoms, and caregivers frequently suffer emotional burden that can interfere with judgment of the patient's behavior. The Neuropsychiatric Inventory-Clinician rating scale (NPI-C) was therefore developed as a comprehensive and versatile instrument to assess and accurately measure neuropsychiatric symptoms (NPS) in dementia, thereby using information from caregiver and patient interviews, and any other relevant available data. The present study is a follow-up to the original, cross-national NPI-C validation, evaluating the reliability and concurrent validity of the NPI-C in quantifying psychopathological symptoms in dementia in a large Brazilian cohort. Methods: Two blinded raters evaluated 312 participants (156 patient-knowledgeable informant dyads) using the NPI-C for a total of 624 observations in five Brazilian centers. Inter-rater reliability was determined through intraclass correlation coefficients for the NPI-C domains and the traditional NPI. Convergent validity included correlations of specific domains of the NPI-C with the Brief Psychiatric Rating Scale (BPRS), the Cohen-Mansfield Agitation Index (CMAI), the Cornell Scale for Depression in Dementia (CSDD), and the Apathy Inventory (AI). Results: Inter-rater reliability was strong for all NPI-C domains. There were high correlations between NPI-C/delusions and BPRS, NPI-C/apathy-indifference with the AI, NPI-C/depression-dysphoria with the CSDD, NPI-C/agitation with the CMAI, and NPI-C/aggression with the CMAI. There was moderate correlation between the NPI-C/aberrant vocalizations and CMAI and the NPI-C/hallucinations with the BPRS. Conclusion: The NPI-C is a comprehensive tool that provides accurate measurement of NPS in dementia with high concurrent validity and inter-rater reliability in the Brazilian setting. In addition to universal assessment, the NPI-C can be completed by individual domains.
