Sport and exercise medicine consultants are reliable in assessing tendon neovascularity using ultrasound Doppler

Objective: Several lower limb tendinopathy treatment modalities involve identification of pathological paratendinous or intratendinous neovascularisation to target proposed co-location of painful neoneuralisation. The ability to reliably locate and assess the degree of neovascularity is therefore clinically important. The Modified Ohberg Score (MOS) is frequently used to determine degree of neovascularity, but reliability has yet to be established among Sport and Exercise Medicine (SEM) consultants. This study aims to determine inter-rater and intra-rater reliability of an SEM consultant cohort when assessing neovascularity using the 5-point MOS. Method: Eleven participants (7 male and 4 female) provided 16 symptomatic Achilles and patella tendons. These were sequentially examined using power Doppler (PD) enabled ultrasound (US) imaging by 6 SEM consultants who rated neovascular changes seen using the MOS. Representative digital scan images were saved for rescoring 3 weeks later. Inter-rater and intra-rater reliability of the MOS was examined using intraclass correlation coefficient (ICC) and Kappa Agreement scores. Results: Neovascular changes were reported in 65.6% of 96 scans undertaken. ICC for inter-rater reliability was 0.86 and Fleiss Kappa 0.52. ICC for intra-rater reliability was 0.95 and Weighted Kappa 0.91. Conclusions: Neovascular changes were present in two-thirds of symptomatic tendons. Excellent SEM consultant inter-rater and intra-rater reliability was demonstrated. These findings support the use of PD-enabled US to assess neovascularity by appropriately experienced SEM consultants. Furthermore, future interventional research using a similarly experienced SEM consultant cohort can be undertaken with assurance that assessment of neovascularity will be reliable.

Validity and Reliability of the Persian Version of Language Screening Test (LAST) for Patients in the Acute Phase of Stroke

Function and Disability Journal ◽

10.32598/fdj.3.13 ◽

2020 ◽

Vol 3 (1) ◽

pp. 91-100

Author(s):

Seyyede Zohreh Mousavi ◽

Reyhaneh Jafari ◽

Saman Maroufizadeh ◽

Mohammad Moez Shahramnia ◽

...

Keyword(s):

Correlation Coefficient ◽

Acute Phase ◽

Screening Test ◽

Intraclass Correlation ◽

Chronic Phase ◽

Weighted Kappa ◽

Screening Tests ◽

Rater Reliability ◽

Persian Version ◽

Language Screening

Background & Objectives: Aphasia is one of the most common consequences of a stroke; thus, screening tests for early diagnosis of the problem are necessary when dealing with aphasia patients. One of these screening tests is the Language Screening Test (LAST). The purpose of this study was to translate, validate, and utilize this test in the Persian language for patients after stroke. Methods: The original version of LAST was translated into Persian and then administered to 100 patients in the acute phase by two examiners at the patient’s bedside in order to check the inter-rater reliability. To assess the agreement between the two forms (a and b) of the LAST, the Concordance Correlation Coefficient (CCC), weighted Kappa, and Intraclass Correlation Coefficient (ICC) were used. Also, the Persian version of LAST and the Western Aphasia Battery (WAB) were administered in the chronic phase by two independent examiners with blind scoring. Results: Inter-rater reliability between Rater 1 and Rater 2 on LAST-a and LAST-b scores was very good for both phases. The CCC values for LAST-a and LAST-b, respectively, were 0.874 and 0.865 for the acute phase and 0.923 and 0.927 for the chronic phase. The weighted Kappa values for LAST-a and LAST-b, respectively, were 0.750 and 0.740 for the acute phase, and 0.822 and 0.846 for the chronic phase. Conclusion: The obtained results showed that LAST is a very simple, fast, and valid test and can be used as a reliable tool in stroke patients. Its lack of cultural and language dependency is an advantage of this test.

Development of the Huddle Observation Tool for structured case management discussions to improve situation awareness on inpatient clinical wards

BMJ Quality & Safety ◽

10.1136/bmjqs-2017-006513 ◽

2017 ◽

Vol 27 (5) ◽

pp. 365-372 ◽

Cited By ~ 8

Author(s):

Julian Edbrooke-Childs ◽

Jacqueline Hayes ◽

Evelyn Sharples ◽

Dawid Gondek ◽

Emily Stapley ◽

...

Keyword(s):

Case Management ◽

Correlation Coefficient ◽

Intraclass Correlation Coefficient ◽

Situation Awareness ◽

Assessment Tool ◽

Intraclass Correlation ◽

Weighted Kappa ◽

Observational Assessment ◽

Rater Reliability ◽

Team Processes

Background: ‘Situation Awareness For Everyone’ (SAFE) was a 3-year project which aimed to improve situation awareness in clinical teams in order to detect potential deterioration and other potential risks to children on hospital wards. The key intervention was the ‘huddle’, a structured case management discussion which is central to facilitating situation awareness. This study aimed to develop an observational assessment tool to assess the team processes occurring during huddles, including the effectiveness of the huddle. Methods: A cross-sectional observational design was used to psychometrically develop the ‘Huddle Observation Tool’ (HOT) over three phases using standardised psychometric methodology. Huddles were observed across four NHS paediatric wards participating in SAFE by five researchers; two wards within specialist children hospitals and two within district general hospitals, with location, number of beds and length of stay considered to make the sample as heterogeneous as possible. Inter-rater reliability was calculated using the weighted kappa and intraclass correlation coefficient. Results: Inter-rater reliability was acceptable for the collaborative culture (weighted kappa=0.32, 95% CI 0.17 to 0.42), environment items (weighted kappa=0.78, 95% CI 0.52 to 1) and total score (intraclass correlation coefficient=0.87, 95% CI 0.68 to 0.95). It was lower for the structure and risk management items, suggesting that these were more variable in how observers rated them. However, agreement on the global score for huddles was acceptable. Conclusion: We developed an observational assessment tool to assess the team processes occurring during huddles, including the effectiveness of the huddle.
Future research should examine whether observational evaluations of huddles are associated with other indicators of safety on clinical wards (eg, safety climate and incidents of patient harm), and whether scores on the HOT are associated with improved situation awareness and reductions in deterioration and adverse events in clinical settings, such as inpatient wards.

Development and Relative Validity of a Semiquantitative Food Frequency Questionnaire to Estimate Dietary Intake among a Multi-Ethnic Population in the Malaysian Cohort Project

Nutrients ◽

10.3390/nu13041163 ◽

2021 ◽

Vol 13 (4) ◽

pp. 1163

Author(s):

Suzana Shahar ◽

Mohd Razif Shahril ◽

Noraidatulakma Abdullah ◽

Boekhtiar Borhanuddin ◽

Mohd Arman Kamaruddin ◽

...

Keyword(s):

Vitamin C ◽

Correlation Coefficient ◽

Food Frequency Questionnaire ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Weighted Kappa ◽

Dietary Intakes ◽

Food Frequency ◽

Relative Validity ◽

Spearman’s Correlation

Measuring dietary intakes in a multi-ethnic and multicultural setting, such as Malaysia, remains a challenge due to its diversity. This study aims to develop and evaluate the relative validity of an interviewer-administered food frequency questionnaire (FFQ) in assessing the habitual dietary exposure of The Malaysian Cohort (TMC) participants. We developed a nutrient database (with 203 items) based on various food consumption tables, and 803 participants were involved in this study. The output of the FFQ was then validated against three-day 24-h dietary recalls (n = 64). We assessed the relative validity and its agreement using various methods, such as Spearman’s correlation, weighted Kappa, intraclass correlation coefficient (ICC), and Bland–Altman analysis. Spearman’s correlation coefficient ranged from 0.24 (vitamin C) to 0.46 (carbohydrate), and almost all nutrients had correlation coefficients above 0.3, except for vitamin C and sodium. Intraclass correlation coefficients ranged from −0.01 (calcium) to 0.59 (carbohydrates), and weighted Kappa exceeded 0.4 for 50% of nutrients. In short, TMC’s FFQ appears to have good relative validity for the assessment of nutrient intake among its participants, as compared to the three-day 24-h dietary recalls. However, estimates for iron, vitamin A, and vitamin C should be interpreted with caution.

Intra-rater reliability of transversus abdominis measurement by a novice examiner: Comparison of “freehand” to “probe force device” method of real-time ultrasound imaging

Ultrasound ◽

10.1177/1742271x19831720 ◽

2019 ◽

Vol 27 (3) ◽

pp. 156-166 ◽

Cited By ~ 1

Author(s):

Vanessa L Kennedy ◽

Carol A Flavell ◽

Kenji Doma

Keyword(s):

Measurement Error ◽

Real Time ◽

Correlation Coefficient ◽

Intraclass Correlation Coefficient ◽

Coefficient Of Variation ◽

Repeated Measures ◽

Intraclass Correlation ◽

Transversus Abdominis ◽

Rater Reliability ◽

Transverse Abdominis

A “free hand” real-time-ultrasound method is commonly applied to measure transversus abdominis. Potentially, this increases transversus abdominis measurement error due to uncontrolled variability in probe to skin force, inclination, and roll, particularly for novice examiners. This single-group repeated-measures reliability study compared the intra-rater reliability of transversus abdominis thickness and activation measurement by a novice examiner between free hand and a standardized probe force device method. The examiner captured ultrasound videos of transversus abdominis in a single session in healthy participants (n = 33). Free hand ultrasound featured uncontrolled probe force, inclination, and roll, while probe force device method ultrasound standardized these parameters. Images of transversus abdominis at rest and contracted were measured and transversus abdominis activation calculated. Intraclass correlation coefficient, coefficient of variation, standard error of measurement, and worthwhile differences were calculated. The probe force device method resulted in greater reliability (intraclass correlation coefficient = 0.75–0.96) and lower measurement error (coefficient of variation = 8.89–28.7%) compared to free hand (intraclass correlation coefficient = 0.63–0.93; coefficient of variation = 6.52–29.4%). Reliability was good for all measurements except free hand TrA-C, which was moderate. TrA-C had the lowest reliability, followed by contracted thickness of the transverse abdominis, with resting thickness of the transverse abdominis being highest. Worthwhile differences were lower using a probe force device method versus free hand for resting thickness of the transverse abdominis and contracted thickness of the transverse abdominis and similar for TrA-C. Standardization using probe force device method ultrasound to measure transversus abdominis improved intra-rater reliability in a novice examiner.
Use of a probe force device method is recommended to improve reliability through reduced sources of measurement error. Probe force device method intra- and inter-rater reliability in examiners of varying experience, in clinical populations, and to visualize other structures merits exploration.

Comparison of Kristjansson Respiratory Score and Wang Respiratory Score in infants with bronchiolitis in a hospital emergency department

Hong Kong Physiotherapy Journal ◽

10.1142/s1013702520500146 ◽

2020 ◽

Vol 40 (02) ◽

pp. 145-153

Author(s):

Frederico Ramos Pinto ◽

Liane Correia-Costa ◽

Inês Azevedo

Keyword(s):

Emergency Department ◽

Emergency Room ◽

Internal Consistency ◽

Health Professionals ◽

Prospective Observational Study ◽

Intraclass Correlation ◽

Weighted Kappa ◽

Hospital Emergency ◽

Rater Reliability ◽

Hospital Emergency Room

Objective: Several respiratory scores have been created to evaluate the severity of bronchiolitis, but it is still not clear which score is best. The aim of this study is to compare the Wang Respiratory Score (WRS) and the Kristjansson Respiratory Score (KRS) in the setting of an emergency room. Methods: We performed a prospective observational study with 60 infants with bronchiolitis admitted to a paediatric emergency department. For both scores, we assessed inter-rater reliability between two different health professionals (physician and physiotherapist), internal consistency, and correlation with SpO2, using the intraclass correlation coefficient (ICC), weighted kappa, Cronbach’s α coefficient and Spearman tests, respectively. Results: The inter-rater reliability was higher in KRS (ICC 0.79), and the Cronbach’s α and weighted kappa had similar values in KRS versus WRS. The correlation between the KRS/WRS and SpO2 was poor/moderate upon admission and discharge for the first observer and the second observer. Conclusions: While the internal consistency was similar in both scores, inter-rater reliability of KRS was higher than that of WRS, which allows us to conclude that it would give more consistent results when used by health personnel to assess the severity of bronchiolitis in a busy hospital emergency room.

Validity and reliability of assessing diaphragmatic mobility by area on X-rays of healthy subjects

Jornal Brasileiro de Pneumologia ◽

10.1590/s1806-37562016000000131 ◽

2018 ◽

Vol 44 (3) ◽

pp. 220-226 ◽

Cited By ~ 1

Author(s):

Aline Pedrini ◽

Márcia Aparecida Gonçalves ◽

Bruna Estima Leal ◽

Michelle Gonçalves de Souza Tavares ◽

Wellington Pereira Yamaguti ◽

...

Keyword(s):

Correlation Coefficient ◽

Intraclass Correlation ◽

X Rays ◽

Altman Analysis ◽

Validity And Reliability ◽

Anthropometric Parameters ◽

Rater Reliability ◽

Bland Altman Analysis ◽

Left Hemidiaphragm ◽

The Right

Objective: To investigate the concurrent validity, as well as the intra- and inter-rater reliability, of assessing diaphragmatic mobility by area (DMarea) on chest X-rays of healthy adults. Methods: We evaluated anthropometric parameters, pulmonary function, and diaphragmatic mobility in 43 participants. Two observers (rater A and rater B) determined diaphragmatic mobility at two time points. We used Pearson’s correlation coefficient to evaluate the correlation between DMarea and the assessment of diaphragmatic mobility by distance (DMdist). To evaluate intra- and inter-rater reliability, we used the intraclass correlation coefficient (ICC [2,1]), 95% CI, and Bland-Altman analysis. Results: A significant correlation was found between the DMarea and DMdist methods (r = 0.743; p < 0.0001). For DMarea, the intra-rater reliability was found to be quite high for the right hemidiaphragm (RHD), with ICC (2,1) = 0.92 (95% CI: 0.86-0.95) for rater A and ICC (2,1) = 0.90 (95% CI: 0.84-0.94) for rater B, and for the left hemidiaphragm (LHD), with ICC (2,1) = 0.96 (95% CI: 0.93-0.97) for rater A and ICC (2,1) = 0.91 (95% CI: 0.81-0.95) for rater B (p < 0.0001 for all). Also for DMarea, the inter-rater reliability was found to be quite high for the first and second evaluations of the RHD, with ICC (2,1) = 0.99 (95% CI: 0.98-0.99) and ICC (2,1) = 0.95 (95% CI: 0.86-0.97), respectively, and of the LHD, with ICC (2,1) = 0.99 (95% CI: 0.98-0.99) and ICC (2,1) = 0.94 (95% CI: 0.87-0.97) (p < 0.0001 for both). The Bland-Altman analysis showed good agreement between the mobility of the RHD and that of the LHD. Conclusions: The DMarea method proved to be a valid, reliable measure of diaphragmatic mobility.
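
For readers unfamiliar with the ICC (2,1) reported above, the following is a minimal sketch of how a two-way random-effects, absolute-agreement, single-rater ICC can be computed from the mean squares of a two-way ANOVA. The function name `icc_2_1` and the ratings table are illustrative assumptions, not the authors' code or data.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows: one row per subject, one column per rater.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Sums of squares of a two-way ANOVA without replication.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical ratings: 6 subjects (rows) scored by 4 raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))  # 0.29
```

Because the between-rater mean square appears in the denominator, systematic offsets between raters lower ICC (2,1), which is why the hypothetical table above scores poorly even though the raters rank subjects similarly.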

A Comparison of Reliability Coefficients for Ordinal Rating Scales

Journal of Classification ◽

10.1007/s00357-021-09386-5 ◽

2021 ◽

Author(s):

Alexandra de Raadt ◽

Matthijs J. Warrens ◽

Roel J. Bosker ◽

Henk A. L. Kiers

Keyword(s):

Empirical Data ◽

Rating Scales ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Weighted Kappa ◽

Rater Reliability ◽

Intraclass Correlations ◽

Applied Researcher ◽

Highly Correlated ◽

Reliability Coefficients

Kappa coefficients are commonly used for quantifying reliability on a categorical scale, whereas correlation coefficients are commonly applied to assess reliability on an interval scale. Both types of coefficients can be used to assess the reliability of ordinal rating scales. In this study, we compare seven reliability coefficients for ordinal rating scales: the kappa coefficients included are Cohen’s kappa, linearly weighted kappa, and quadratically weighted kappa; the correlation coefficients included are intraclass correlation ICC(3,1), Pearson’s correlation, Spearman’s rho, and Kendall’s tau-b. The primary goal is to provide a thorough understanding of these coefficients such that the applied researcher can make a sensible choice for ordinal rating scales. A second aim is to find out whether the choice of the coefficient matters. We studied to what extent we reach the same conclusions about inter-rater reliability with different coefficients, and to what extent the coefficients measure agreement in a similar way, using analytic methods, and simulated and empirical data. Using analytical methods, it is shown that differences between quadratic kappa and the Pearson and intraclass correlations increase if agreement becomes larger. Differences between the three coefficients are generally small if differences between rater means and variances are small. Furthermore, using simulated and empirical data, it is shown that differences between all reliability coefficients tend to increase if agreement between the raters increases. Moreover, for the data in this study, the same conclusion about inter-rater reliability was reached in virtually all cases with the four correlation coefficients. In addition, using quadratically weighted kappa, we reached a similar conclusion as with any correlation coefficient a great number of times. Hence, for the data in this study, it does not really matter which of these five coefficients is used.
Moreover, the four correlation coefficients and quadratically weighted kappa tend to measure agreement in a similar way: their values are very highly correlated for the data in this study.
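
As a rough illustration of how the kappa variants named above relate to Pearson's correlation, here is a minimal sketch that computes unweighted, linearly weighted, and quadratically weighted kappa for two raters. The function names and the ten ratings on a 0 to 4 scale are hypothetical and are not taken from the paper. The ratings are chosen so that both raters have identical marginal distributions, a case in which quadratically weighted kappa coincides with Pearson's correlation.

```python
from math import sqrt


def weighted_kappa(r1, r2, categories, weighting="quadratic"):
    """Kappa for two raters; weighting is "unweighted", "linear" or "quadratic".

    Assumes at least two categories and that the raters do not both use a
    single category for every subject (otherwise the denominator is zero).
    """
    c = len(categories)
    index = {cat: i for i, cat in enumerate(categories)}
    n = len(r1)

    # Observed joint proportions and each rater's marginal proportions.
    obs = [[0.0] * c for _ in range(c)]
    for a, b in zip(r1, r2):
        obs[index[a]][index[b]] += 1 / n
    p1 = [sum(obs[i]) for i in range(c)]
    p2 = [sum(obs[i][j] for i in range(c)) for j in range(c)]

    def disagreement(i, j):
        if weighting == "unweighted":
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (c - 1)
        return d if weighting == "linear" else d * d

    cells = [(i, j) for i in range(c) for j in range(c)]
    observed = sum(disagreement(i, j) * obs[i][j] for i, j in cells)
    expected = sum(disagreement(i, j) * p1[i] * p2[j] for i, j in cells)
    return 1 - observed / expected


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


# Hypothetical ratings on a 5-point ordinal scale (0..4).
rater_a = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
rater_b = [0, 1, 2, 3, 4, 1, 0, 3, 2, 4]
scale = [0, 1, 2, 3, 4]

for w in ("unweighted", "linear", "quadratic"):
    print(w, round(weighted_kappa(rater_a, rater_b, scale, w), 3))
# unweighted 0.5, linear 0.75, quadratic 0.9
print("pearson", round(pearson(rater_a, rater_b), 3))  # 0.9
```

All four disagreements in this toy data set are off by a single category, so the weighted variants penalise them less than Cohen's unweighted kappa does.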

Learning how to Differ: Agreement and Reliability Statistics in Psychiatry

The Canadian Journal of Psychiatry ◽

10.1177/070674379504000202 ◽

1995 ◽

Vol 40 (2) ◽

pp. 60-66 ◽

Cited By ~ 46

Author(s):

David L. Streiner

Keyword(s):

Correlation Coefficient ◽

Intraclass Correlation Coefficient ◽

Intraclass Correlation ◽

Weighted Kappa ◽

The Other ◽

Cohen's Kappa ◽

The Subject

Whenever two or more raters evaluate a patient or student, it may be necessary to determine the degree to which they assign the same label or rating to the subject. The major problem in deciding which statistic to use is the plethora of different techniques which are available. This paper reviews some of the more commonly used techniques, such as Raw Agreement, Cohen's kappa and weighted kappa, and shows that, in most circumstances, they can all be replaced by the intraclass correlation coefficient (ICC). This paper also shows how the ICC can be used in situations where the other statistics cannot be used and how to select the best subset of raters.
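
To make the difference between raw agreement and Cohen's kappa concrete, the sketch below computes both for two raters assigning a binary label. The function names and the twenty labels are hypothetical, chosen so that one category dominates. It is a simplified illustration, not the paper's own worked example.

```python
from collections import Counter


def raw_agreement(r1, r2):
    """Proportion of subjects given the same label by both raters."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohens_kappa(r1, r2):
    """Agreement corrected for the agreement expected by chance."""
    n = len(r1)
    p_observed = raw_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over categories (a Counter returns 0 for unseen labels).
    p_chance = sum(c1[cat] * c2[cat] for cat in c1) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical labels: both raters say "no" for 18 of 20 subjects,
# but never agree on which subjects are "yes".
rater_1 = ["no"] * 18 + ["yes"] * 2
rater_2 = ["no"] * 16 + ["yes"] * 2 + ["no"] * 2

print(raw_agreement(rater_1, rater_2))           # 0.8
print(round(cohens_kappa(rater_1, rater_2), 3))  # -0.111
```

Raw agreement looks high here only because "no" is so common; once chance agreement (0.82) is subtracted, kappa is slightly negative.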

Interrater Reliability of the Berg Balance Scale When Used by Clinicians of Various Experience Levels to Assess People With Lower Limb Amputations

Physical Therapy ◽

10.2522/ptj.20130182 ◽

2014 ◽

Vol 94 (3) ◽

pp. 371-378 ◽

Cited By ~ 21

Author(s):

Christopher K. Wong

Keyword(s):

Lower Limb ◽

Interrater Reliability ◽

Clinical Training ◽

Intraclass Correlation ◽

Berg Balance Scale ◽

Intrarater Reliability ◽

Rater Reliability ◽

Study Objective ◽

Balance Scale ◽

Scale Scores

Background People with lower limb amputations frequently have impaired balance ability. The Berg Balance Scale (BBS) has excellent psychometric properties for people with neurologic disorders and elderly people dwelling in the community. A Rasch analysis demonstrated the validity of the BBS for people with lower limb amputations of all ability strata, but rater reliability has not been tested. Objective The study objective was to determine the interrater reliability and intrarater reliability of BBS scores and the differences in scores assigned by testers with various levels of experience when assessing people with lower limb amputations. Design This reliability study of video-recorded single-session BBS assessments had a cross-sectional design. Methods From a larger study of people with lower limb amputations, 5 consecutively recruited participants using prostheses were video recorded during an in-person BBS assessment. Sixteen testers independently rated the video-recorded assessments. Testers were 3 physical therapists, 1 occupational therapist, 3 third-year and 4 second-year doctor of physical therapy (DPT) students, and 5 first-year DPT students without clinical training. Rater reliability was calculated using intraclass correlation coefficients (ICC [2,k]). Differences in scores assigned by testers with various levels of experience were determined by use of an analysis of variance with Tukey post hoc tests. Results The average age of the participants was 53.0 years (SD=15.7). Amputations had occurred at the ankle disarticulation, transtibial, and transfemoral levels because of vascular, trauma, and medical etiologies an average of 8.2 years earlier (SD=7.9). Berg Balance Scale scores spanned all ability strata. Interrater reliability (ICC [2,k]=.99) and intrarater reliability of scores determined in person and through video-recorded assessments by the same testers (ICC [2,k]=.99) were excellent. 
For participants with the lowest levels of ability, licensed professionals assigned lower scores than did DPT students without clinical training. Limitations Intrarater reliability calculations were based on 2 testers. Conclusions Berg Balance Scale scores assigned to people using prostheses by testers with various levels of clinical experience had excellent interrater reliability and intrarater reliability.

Validity and reliability of smartphone-based application for chronic ankle instability

International Journal of Therapy and Rehabilitation ◽

10.12968/ijtr.2021.0007 ◽

2021 ◽

Vol 28 (9) ◽

pp. 1-10

Author(s):

Taelim Yoon ◽

Jihyun Lee

Keyword(s):

Medical Device ◽

Correlation Coefficient ◽

Intraclass Correlation Coefficient ◽

Ankle Instability ◽

Intraclass Correlation ◽

Validity And Reliability ◽

Rater Reliability ◽

Eyes Closed ◽

Eyes Open ◽

Cumberland Ankle Instability Tool

Background/aims Ankle instability is one of the most common injuries that can occur during everyday life, sports and exercise. Recently, smartphone accelerometers have been used to measure single leg balance associated with ankle instability, because they are easy to use, inexpensive and can be used in small spaces. Thus, the purpose of this study was to introduce and investigate the intra- and inter-rater reliability of the smartphone accelerometer when assessing ankle instability. Methods A total of 26 individuals who had ankle instability were recruited. The single leg stance balance was measured using a smartphone accelerometer (Accelerometer application) and a force platform (I-Balance) for 5 seconds with their eyes open or their eyes closed. Results In the eyes open position, intra-rater reliability of the smartphone accelerometer was excellent for both raters (intraclass correlation coefficient: 0.87–0.90); and the inter-rater reliability was moderate (intraclass correlation coefficient: 0.71). In the eyes closed position, the intra-rater reliability of the smartphone accelerometer was excellent for both raters (intraclass correlation coefficient: 0.90–0.93); the inter-rater reliability was good (intraclass correlation coefficient: 0.82). Additionally, there were fair positive correlations between the smartphone accelerometer and the Cumberland Ankle Instability Tool, and between the smartphone accelerometer and I-Balance (r=0.33, 0.30 respectively). Conclusions The present study demonstrated excellent intra-rater reliabilities of two raters and moderate to good inter-rater reliabilities. The smartphone accelerometer offers several important advantages as a potential portable medical device to assess ankle instability accurately. Although there was a positive correlation, the relationships between the smartphone accelerometer and Cumberland Ankle Instability Tool and that between the smartphone accelerometer and I-Balance were fair. 
Future studies should investigate the validity of the smartphone accelerometer as a portable medical device for determining ankle instability.
