Intra-rater reliability of lower leg subcutaneous skin fold thickness derived from hand-held A-mode ultrasound

BACKGROUND: Ultrasound is an important tool to diagnose many clinical conditions. Yet hand-held devices may be prone to more data variability, in part from the greater likelihood of human error. OBJECTIVE: Quantify intra-rater reliability of subcutaneous skin fold thickness from a hand-held ultrasound device. PARTICIPANTS: College-age subjects (18 men, 14 women) submitted to two sets of ultrasound subcutaneous skin fold measurements spaced (mean ± SEM) 10.6 ± 2.2 days apart. Per measurement, they stood relaxed as ultrasound measured the subcutaneous skin fold thickness that covered their left leg’s calf muscle group. Measurements occurred with a hand-held device (BodyMetrix Pro System BX2000; Livermore, CA) used in accordance with the manufacturer’s guidelines. Four subcutaneous measurements were made 90° apart (anterior, medial, posterior, lateral) at the portion of the left calf with the largest circumference. To assess intra-rater reliability, we used intraclass correlation coefficients, limits of agreement, coefficient of variation and the smallest real difference. RESULTS: Intra-rater reliability was high for most of our statistical tests. CONCLUSION: Despite the relatively long period between measurements, our hand-held ultrasound device exhibited a high degree of intra-rater reliability. Given our results, ultrasound measurements may be a useful tool to quantify skin fold thickness.
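
The abstract does not say which form of the intraclass correlation coefficient was used. As a minimal sketch, assuming the two-way mixed, consistency, single-measurement form ICC(3,1), the coefficient for two sessions per subject could be computed as follows. The function name `icc_3_1` and the skin fold values are hypothetical and are not data from the study.

```python
def icc_3_1(scores):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    scores: list of rows, one row per subject, each row holding the k
    repeated measurements of that subject (same k for every subject).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least 2 subjects")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every subject needs the same k >= 2 measurements")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), sessions (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical skin fold thickness (mm): [session 1, session 2] per subject.
sessions = [[6.1, 6.3], [8.4, 8.2], [5.0, 5.1], [9.7, 9.9], [7.2, 7.0]]
print(f"ICC(3,1) = {icc_3_1(sessions):.3f}")
```

Because this form measures consistency, a constant offset between sessions does not lower the coefficient; an absolute-agreement form would penalize it.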

Adaptation and Psychometric Evaluation of the Chinese Counseling Competencies Scale-Revised

Frontiers in Psychology ◽

10.3389/fpsyg.2021.688539 ◽

2021 ◽

Vol 12 ◽

Author(s):

Wei Xia ◽

William Ho Cheung Li ◽

Tingna Liang ◽

Yuanhui Luo ◽

Laurie Long Kwan Ho ◽

...

Keyword(s):

Concurrent Validity ◽

Convergent Validity ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Psychometric Evaluation ◽

Counseling Competencies ◽

Counselors In Training ◽

Rater Reliability ◽

Retest Reliability ◽

Test Retest Reliability

Objectives: This study conducted a linguistic and psychometric evaluation of the Chinese Counseling Competencies Scale-Revised (CCS-R). Methods: The Chinese CCS-R was created from the original English version using a standard forward-backward translation process. The psychometric properties of the Chinese CCS-R were examined in a cohort of 208 counselors-in-training by two independent raters. Fifty-three counselors-in-training were asked to undergo another counseling performance evaluation for the test-retest. The confirmatory factor analysis (CFA) was conducted for the Chinese CCS-R, followed by internal consistency, test-retest reliability, inter-rater reliability, convergent validity, and concurrent validity. Results: The results of the CFA supported the factorial validity of the Chinese CCS-R, with adequate construct replicability. The scale had a McDonald's omega of 0.876, and intraclass correlation coefficients of 0.63 and 0.90 for test-retest reliability and inter-rater reliability, respectively. Significantly positive correlations were observed between the Chinese CCS-R score and scores of performance checklist (Pearson's γ = 0.781), indicating a large convergent validity, and knowledge on drug abuse (Pearson's γ = 0.833), indicating a moderate concurrent validity. Conclusion: The results support that the Chinese CCS-R is a valid and reliable measure of the counseling competencies. Practice implication: The CCS-R provides trainers with a reliable tool to evaluate counseling students' competencies and to facilitate discussions with trainees about their areas for growth.

Is the location of the signal intensity weighted centroid a reliable measurement of fluid displacement within the disc?

Biomedical Engineering / Biomedizinische Technik ◽

10.1515/bmt-2016-0178 ◽

2018 ◽

Vol 63 (4) ◽

pp. 453-460 ◽

Cited By ~ 7

Author(s):

Vahid Abdollah ◽

Eric C. Parent ◽

Michele C. Battié

Keyword(s):

Signal Intensity ◽

Water Distribution ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Region Of Interest ◽

Rater Reliability ◽

Fluid Displacement ◽

Intraclass Correlation Coefficients ◽

The Mean ◽

Standard Error Of Measurement

Degenerated discs have shorter T2-relaxation time and lower MR signal. The location of the signal-intensity-weighted-centroid reflects the water distribution within a region-of-interest (ROI). This study compared the reliability of the location of the signal-intensity-weighted-centroid to mean signal intensity and area measurements. L4-L5 and L5-S1 discs were measured on 43 mid-sagittal T2-weighted 3T MRI images in adults with back pain. One rater analysed images twice and another once, blinded to measurements. Discs were semi-automatically segmented into a whole disc, nucleus, anterior and posterior annulus. The coordinates of the signal-intensity-weighted-centroid for all regions demonstrated excellent intraclass-correlation-coefficients for intra- (0.99–1.00) and inter-rater reliability (0.97–1.00). The standard error of measurement for the Y-coordinates of the signal-intensity-weighted-centroid for all ROIs was 0 at both levels and 0 to 2.7 mm for X-coordinates. The mean signal intensity and area for the whole disc and nucleus presented excellent intra-rater reliability with intraclass-correlation-coefficients from 0.93 to 1.00, and 0.92 to 1.00 for inter-rater reliability. The mean signal intensity and area had lower reliability for annulus ROIs, with intra-rater intraclass-correlation-coefficients from 0.5 to 0.76 and inter-rater from 0.33 to 0.58. The location of the signal-intensity-weighted-centroid is a reliable biomarker for investigating the effects of disc interventions.

Development and assessment of the inter-rater and intra-rater reproducibility of a self-administration version of the ALSFRS-R

Journal of Neurology Neurosurgery & Psychiatry ◽

10.1136/jnnp-2019-321138 ◽

2019 ◽

Vol 91 (1) ◽

pp. 75-81 ◽

Cited By ~ 7

Author(s):

Leonhard A Bakker ◽

Carin D Schröder ◽

Harold H G Tan ◽

Simone M A G Vugts ◽

Ruben P A van Eijk ◽

...

Keyword(s):

Rating Scale ◽

Clinical Care ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

The Self ◽

Coefficient Alpha ◽

Rater Agreement ◽

Self Administration ◽

Limits Of Agreement ◽

Rater Reliability

Objective: The Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised (ALSFRS-R) is widely applied to assess disease severity and progression in patients with motor neuron disease (MND). The objective of the study is to assess the inter-rater and intra-rater reproducibility, i.e., the inter-rater and intra-rater reliability and agreement, of a self-administration version of the ALSFRS-R for use in apps, online platforms, clinical care and trials. Methods: The self-administration version of the ALSFRS-R was developed based on both patient and expert feedback. To assess the inter-rater reproducibility, 59 patients with MND filled out the ALSFRS-R online and were subsequently assessed on the ALSFRS-R by three raters. To assess the intra-rater reproducibility, patients were invited on two occasions to complete the ALSFRS-R online. Reliability was assessed with intraclass correlation coefficients, agreement was assessed with Bland-Altman plots and paired samples t-tests, and internal consistency was examined with Cronbach’s coefficient alpha. Results: The self-administration version of the ALSFRS-R demonstrated excellent inter-rater and intra-rater reliability. The assessment of inter-rater agreement demonstrated small systematic differences between patients and raters and acceptable limits of agreement. The assessment of intra-rater agreement demonstrated no systematic changes between time points; limits of agreement were 4.3 points for the total score and ranged from 1.6 to 2.4 points for the domain scores. Coefficient alpha values were acceptable. Discussion: The self-administration version of the ALSFRS-R demonstrates high reproducibility and can be used in apps and online portals for both individual comparisons, facilitating the management of clinical care, and group comparisons in clinical trials.
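
The agreement analysis described above rests on Bland-Altman limits of agreement. As a rough sketch of the usual calculation (bias plus or minus 1.96 standard deviations of the paired differences), the following may help; the function name `bland_altman` and the scores are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences; the 95% limits of
    agreement are bias -/+ 1.96 * sample SD of those differences.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical total scores from two occasions for six respondents.
occasion_1 = [40, 35, 28, 44, 31, 38]
occasion_2 = [41, 33, 29, 44, 30, 39]
bias, lower, upper = bland_altman(occasion_1, occasion_2)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

A bias near zero corresponds to "no systematic changes between time points", while the width of the limits shows how far two scores from the same respondent can plausibly differ.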

A Comparison of Reliability Coefficients for Ordinal Rating Scales

Journal of Classification ◽

10.1007/s00357-021-09386-5 ◽

2021 ◽

Author(s):

Alexandra de Raadt ◽

Matthijs J. Warrens ◽

Roel J. Bosker ◽

Henk A. L. Kiers

Keyword(s):

Empirical Data ◽

Rating Scales ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Weighted Kappa ◽

Rater Reliability ◽

Intraclass Correlations ◽

Applied Researcher ◽

Highly Correlated ◽

Reliability Coefficients

Kappa coefficients are commonly used for quantifying reliability on a categorical scale, whereas correlation coefficients are commonly applied to assess reliability on an interval scale. Both types of coefficients can be used to assess the reliability of ordinal rating scales. In this study, we compare seven reliability coefficients for ordinal rating scales: the kappa coefficients included are Cohen’s kappa, linearly weighted kappa, and quadratically weighted kappa; the correlation coefficients included are intraclass correlation ICC(3,1), Pearson’s correlation, Spearman’s rho, and Kendall’s tau-b. The primary goal is to provide a thorough understanding of these coefficients such that the applied researcher can make a sensible choice for ordinal rating scales. A second aim is to find out whether the choice of the coefficient matters. We studied to what extent we reach the same conclusions about inter-rater reliability with different coefficients, and to what extent the coefficients measure agreement in a similar way, using analytic methods, and simulated and empirical data. Using analytical methods, it is shown that differences between quadratic kappa and the Pearson and intraclass correlations increase if agreement becomes larger. Differences between the three coefficients are generally small if differences between rater means and variances are small. Furthermore, using simulated and empirical data, it is shown that differences between all reliability coefficients tend to increase if agreement between the raters increases. Moreover, for the data in this study, the same conclusion about inter-rater reliability was reached in virtually all cases with the four correlation coefficients. In addition, using quadratically weighted kappa, we reached a similar conclusion as with any correlation coefficient a great number of times. Hence, for the data in this study, it does not really matter which of these five coefficients is used. Moreover, the four correlation coefficients and quadratically weighted kappa tend to measure agreement in a similar way: their values are very highly correlated for the data in this study.
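
To make the three kappa variants named above concrete, here is a minimal sketch that computes unweighted, linearly weighted, and quadratically weighted kappa from two raters' ordinal scores using disagreement weights. The function name `weighted_kappa`, its `power` parameter, and the ratings are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, power=0):
    """Cohen's kappa with disagreement weights |i - j| ** power.

    power=0 gives unweighted kappa, power=1 linear weights,
    power=2 quadratic weights. categories must list the ordinal
    levels in order.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    index = {c: i for i, c in enumerate(categories)}
    observed = Counter(zip(rater_a, rater_b))
    marg_a, marg_b = Counter(rater_a), Counter(rater_b)

    def weight(i, j):
        if power == 0:
            return 0.0 if i == j else 1.0
        return float(abs(i - j) ** power)

    obs_dis = 0.0  # observed weighted disagreement
    exp_dis = 0.0  # weighted disagreement expected by chance
    for ca in categories:
        for cb in categories:
            w = weight(index[ca], index[cb])
            obs_dis += w * observed[(ca, cb)] / n
            exp_dis += w * marg_a[ca] * marg_b[cb] / (n * n)
    if exp_dis == 0:
        raise ValueError("kappa is undefined when chance disagreement is zero")
    return 1.0 - obs_dis / exp_dis


# Hypothetical ratings on a three-point ordinal scale.
a = [1, 1, 2, 2, 3, 3]
b = [1, 2, 2, 3, 3, 3]
for name, p in (("unweighted", 0), ("linear", 1), ("quadratic", 2)):
    print(name, round(weighted_kappa(a, b, [1, 2, 3], power=p), 3))
```

In this toy data every disagreement is by one category only, so the weighted variants penalize it less and return higher values than unweighted kappa.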

The Brazilian version of the Neuropsychiatric Inventory-Clinician rating scale (NPI-C): reliability and validity in dementia

International Psychogeriatrics ◽

10.1017/s1041610213000811 ◽

2013 ◽

Vol 25 (9) ◽

pp. 1503-1511 ◽

Cited By ~ 16

Author(s):

Florindo Stella ◽

Orestes Vicente Forlenza ◽

Jerson Laks ◽

Larissa Pires de Andrade ◽

Michelle A. Ljubetic Avendaño ◽

...

Keyword(s):

Concurrent Validity ◽

Convergent Validity ◽

Rating Scale ◽

Neuropsychiatric Symptoms ◽

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Neuropsychiatric Inventory ◽

Rater Reliability ◽

Clinician Rating

Background: Patients with dementia may be unable to describe their symptoms, and caregivers frequently suffer emotional burden that can interfere with judgment of the patient's behavior. The Neuropsychiatric Inventory-Clinician rating scale (NPI-C) was therefore developed as a comprehensive and versatile instrument to assess and accurately measure neuropsychiatric symptoms (NPS) in dementia, thereby using information from caregiver and patient interviews, and any other relevant available data. The present study is a follow-up to the original, cross-national NPI-C validation, evaluating the reliability and concurrent validity of the NPI-C in quantifying psychopathological symptoms in dementia in a large Brazilian cohort. Methods: Two blinded raters evaluated 312 participants (156 patient-knowledgeable informant dyads) using the NPI-C for a total of 624 observations in five Brazilian centers. Inter-rater reliability was determined through intraclass correlation coefficients for the NPI-C domains and the traditional NPI. Convergent validity included correlations of specific domains of the NPI-C with the Brief Psychiatric Rating Scale (BPRS), the Cohen-Mansfield Agitation Index (CMAI), the Cornell Scale for Depression in Dementia (CSDD), and the Apathy Inventory (AI). Results: Inter-rater reliability was strong for all NPI-C domains. There were high correlations between NPI-C/delusions and BPRS, NPI-C/apathy-indifference with the AI, NPI-C/depression-dysphoria with the CSDD, NPI-C/agitation with the CMAI, and NPI-C/aggression with the CMAI. There was moderate correlation between the NPI-C/aberrant vocalizations and CMAI and the NPI-C/hallucinations with the BPRS. Conclusion: The NPI-C is a comprehensive tool that provides accurate measurement of NPS in dementia with high concurrent validity and inter-rater reliability in the Brazilian setting. In addition to universal assessment, the NPI-C can be completed by individual domains.

Reliability and concurrent validity of an alternative method of lateral lumbar range of motion in athletes

South African Journal of Sports Medicine ◽

10.17159/2078-516x/2016/v28i1a1414 ◽

2016 ◽

Vol 28 (1) ◽

pp. 23-26 ◽

Cited By ~ 1

Author(s):

M Hecimovich

Keyword(s):

Lumbar Spine ◽

Range Of Motion ◽

Concurrent Validity ◽

Pearson Correlation ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Index Test ◽

Low Back ◽

Detectable Change ◽

Rater Reliability

Background: Cricket bowling involves combined spinal movements of side bending and rotation and, consequently, injury to the low back is a common problem. Therefore the assessment of lumbar spine kinematics has become a routine component in preseason screening. This includes static measurement of lateral spinal flexion as asymmetrical range of motion may predispose an athlete to low back injury. Objectives: This study examined intra-rater reliability and concurrent validity of the fingertip-to-floor distance test (FFD) when compared to a criterion range of motion measure. Methods: Thirty-four junior-level cricket players aged 13–16 years were recruited. Lumbar spine lateral flexion was measured simultaneously with the fingertip-to-floor distance test and digital inclinometry methods. Relative and absolute intra-rater reliability were investigated with intraclass correlation coefficients (ICC(3,1)) of agreement, standard error of measurement (SEM) estimates, Bland and Altman bias estimates and 95% limits of agreement, respectively. The concurrent validity of the fingertip-to-floor distance test, compared to digital inclinometry measures, was examined with Pearson correlation coefficients. Results: Intra-rater reliability demonstrated substantial agreement for both measures (ICC(3,1) > 0.84). The fingertip-to-floor distance test SEM values ranged from 1.71–2.01 cm with an estimated minimum detectable change (MDC) threshold of 4.73–5.55 cm. The inclinometry SEM values ranged from 1.00–1.09° with minimal detectable change estimates of 2.77–3.01°. There were strong correlations between the index test and criterion measure outcomes (r > 0.84, p < 0.001). Conclusions: This study’s results support the intra-rater reliability and concurrent validity of the finger-to-floor distance test, suggesting it to be a suitable surrogate measure for lumbar lateral flexion testing.
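
The abstract does not state how the minimum detectable change was derived from the SEM. A common convention is MDC95 = 1.96 × √2 × SEM, with SEM = SD × √(1 − ICC), and the reported SEM and MDC values are approximately consistent with it. The sketch below assumes that convention; the helper names `sem_from_icc` and `mdc95` are hypothetical, and small mismatches with the reported MDC values are expected because the published SEM values are rounded.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a sample SD and a reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimum detectable change at 95% confidence for a test-retest difference."""
    return 1.96 * math.sqrt(2.0) * sem


# SEM values as reported in the abstract (cm for FFD, degrees for inclinometry).
for label, sem in (("FFD low", 1.71), ("FFD high", 2.01),
                   ("inclinometer low", 1.00), ("inclinometer high", 1.09)):
    print(f"{label}: SEM = {sem:.2f}, MDC95 = {mdc95(sem):.2f}")
```

The √2 factor reflects that a change score combines the measurement error of two sessions, so a difference smaller than the MDC cannot be distinguished from noise at the 95% level.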

Genetic effect on apgar score

Genetics and Molecular Biology ◽

10.1590/s1415-47571999000100003 ◽

1999 ◽

Vol 22 (1) ◽

pp. 13-16 ◽

Cited By ~ 3

Author(s):

Carla Franchi-Pinto ◽

Glória Maria Duccini Dal Colletto ◽

Henrique Krieger ◽

Bernardo Beiguelman

Keyword(s):

Apgar Score ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Genetic Effect ◽

Lower Proportion ◽

Apgar Scores ◽

Intraclass Correlation Coefficients ◽

Clinical Conditions ◽

Age And Sex ◽

Genetic Hypothesis

Intraclass correlation coefficients for one- and five-min Apgar scores of 604 twin pairs born at a southeastern Brazilian hospital were calculated, after adjusting these scores for gestational age and sex. The data support a genetic hypothesis only for 1-min Apgar score, probably because it is less affected by the environment than 4 min later, after the newborns have been under the care of a neonatology team. First-born twins exhibited, on average, better clinical conditions than second-born twins. The former showed a significantly lower proportion of Apgar scores under seven than second-born twins, both at 1 min (17.5% vs. 29.8%) and at 5 min (7.2% vs. 11.9%). The proportion of children born with "good" Apgar scores was significantly smaller among twins than among 1,522 singletons born at the same hospital. Among the latter, 1- and 5-min Apgar scores under seven were exhibited by 9.2% and 3.4% newborns, respectively.

Validation of the pediatric Radboud dysarthria assessment

Journal of Pediatric Rehabilitation Medicine ◽

10.3233/prm-190671 ◽

2021 ◽

pp. 1-12

Author(s):

Marieke Ruessink ◽

Lenie van den Engel-Hoek ◽

Marjo van Gerven ◽

Bea Spek ◽

Bert de Swart ◽

...

Keyword(s):

Correlation Coefficient ◽

Intraclass Correlation ◽

Self Care ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Activity Level ◽

Clinical Use ◽

Spearman Correlation ◽

Rater Reliability ◽

Video Recordings

PURPOSE: The Radboud Dysarthria Assessment (RDA) was published in 2014. Adaptation into a pediatric version (p-RDA) was required because of relevant differences between children and adults. The purpose of this study was to assess the feasibility of the p-RDA and to test intra-rater and inter-rater reliability as well as the validity of the two severity scales (function and activity level). METHODS: Video recordings were made of 35 participants with (suspected) dysarthria (age 4 to 17 years) while being assessed using the p-RDA. Intra-rater reliability was assessed by one, and inter-rater reliability by two experiments using the Intraclass Correlation Coefficient (ICC). Validity of the severity scales was tested by correlating the consensus scores with the independently rated scores on four communication scales, three mobility scales, and one self-care scale using Spearman correlation coefficients (r s). RESULTS: The assessment was applicable for 89% of the tested sample, with good intra-rater and inter-rater reliability (ICC = 0.88–0.98 and 0.83–0.93). The p-RDA severity scales (function and activity level) correlated from substantially to strongly with the communication scales (r s = 0.69–0.82 and 0.77–0.92) and self-care scale (r s = 0.76–0.71) and correlated substantially with the mobility scales (r s = 0.49–0.60). CONCLUSION: The feasibility, reliability and validity of the p-RDA are sufficient for clinical use.

Reliability and validity of the iSense optical scanner for measuring volume of transtibial residual limb models

Prosthetics and Orthotics International ◽

10.1177/0309364618806038 ◽

2018 ◽

Vol 43 (2) ◽

pp. 213-220 ◽

Cited By ~ 1

Author(s):

Lucy Armitage ◽

Li Khim Kwah ◽

Lauren Kark

Keyword(s):

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Criterion Validity ◽

Residual Limb ◽

Limb Volume ◽

Rater Reliability ◽

Measuring Volume ◽

Intraclass Correlation Coefficients ◽

Optical Scanner

Background: Residual limb volume is often measured as part of routine care for people with amputations. These measurements assist in the timing of prosthetic fitting or replacement. In order to make well informed decisions, clinicians need access to measurement tools that are valid and reliable. Objectives: To assess the reliability and criterion validity of the iSense optical scanner in measuring volume of transtibial residual limb models. Study Design: Three assessors performed two measurements each on 13 residual limb models with an iSense optical scanner (3D Systems, USA). Intra-rater and inter-rater reliability were calculated using intraclass correlation coefficients. Bland-Altman plots were inspected for agreement. Criterion validity was assessed using a steel rod of known dimensions. Ten repeated measurements were performed by one assessor. A t-test was used to determine differences between measured and true rod volume. Results: Intra-rater reliability was excellent (range of intraclass correlation coefficients: 0.991–0.997, all with narrow 95% confidence intervals). While the intraclass correlation coefficients suggest excellent inter-rater reliability between all three assessors (range of intraclass correlation coefficients: 0.952–0.986), the 95% confidence intervals were wide between assessor 3 and the other two assessors. Poor agreement with assessor 3 was also seen in the Bland-Altman plots. Criterion validity was very poor with a significant difference between the mean iSense measurement and the true rod volume (difference: 221.18 mL; p < 0.001). Conclusions: Although intra-rater reliability was excellent for the iSense scanner, we did not find similar results for inter-rater reliability and validity. These results suggest that further testing of the iSense scanner is required prior to use in clinical practice. Clinical relevance: The iSense offers a low cost scanning option for residual limb volume measurement. Intra-rater reliability was excellent, but inter-rater reliability and validity were such that clinical adoption is not indicated at present.

Inter-Rater Reliability: Intraclass Correlation Coefficients

Educational and Psychological Measurement ◽

10.1177/001316448104100127 ◽

1981 ◽

Vol 41 (1) ◽

pp. 223-226 ◽

Cited By ~ 6

Author(s):

Dong Won Cho

Keyword(s):

Intraclass Correlation ◽

Correlation Coefficients ◽

Rater Reliability ◽

Intraclass Correlation Coefficients
