Koos Classification of Vestibular Schwannomas: A Reliability Study

Abstract BACKGROUND The Koos classification of vestibular schwannomas is designed to stratify tumors based on extrameatal extension and compression of the brainstem. While this classification system is widely reported in the literature, to date no study has assessed its reliability. OBJECTIVE To assess the intra- and inter-rater reliability of the Koos classification system. METHODS After institutional review board approval was obtained, magnetic resonance imaging scans from a cross-sectional group of 40 patients with vestibular schwannomas of varying size comprised the study sample. Four raters were selected to assign a Koos grade to 50 total scans. Inter- and intrarater reliability were calculated and reported using Fleiss' kappa, Kendall's W, and the intraclass correlation coefficient (ICC). RESULTS Inter-rater reliability was found to be substantial when measured using Fleiss' kappa (.71), extremely strong using Kendall's W (.92), and excellent as calculated by ICC (.88). Intrarater reliability was perfect for 3 out of 4 raters as assessed using weighted kappa, Kendall's W, and ICC, with the intrarater agreement for the fourth rater measured as extremely high. CONCLUSION We have demonstrated that the Koos classification system for vestibular schwannoma is a reliable method for tumor classification. This study lends further support to the results of current literature using the Koos grading system. Further studies are required to evaluate its validity and utility in counseling patients with regard to outcomes.
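As a rough illustration of how a multi-rater statistic such as Fleiss' kappa is obtained from a table of grades, the sketch below computes it in plain Python. The ratings are hypothetical and are not data from the study, the function name `fleiss_kappa` is illustrative, and the abstract does not say which software the authors used.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for nominal ratings.

    ratings: list of rows, one row per subject (scan), each row holding
    one category label per rater. Every row must have the same length.
    Raises ZeroDivisionError if every rating falls in a single category.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for row in ratings for label in row})

    # How many raters chose each category, per subject.
    counts = [Counter(row) for row in ratings]

    # Observed agreement: mean proportion of agreeing rater pairs.
    p_obs = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_subjects

    # Chance agreement from the pooled category proportions.
    total = n_subjects * n_raters
    p_exp = sum(
        (sum(c[cat] for c in counts) / total) ** 2 for cat in categories
    )

    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical Koos grades (1 to 4) assigned by four raters to five scans.
example = [
    [1, 1, 1, 1],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [4, 4, 4, 3],
    [2, 2, 2, 2],
]
print(round(fleiss_kappa(example), 3))
```

Fleiss' kappa treats the grades as unordered categories, which is one reason a study of an ordinal scale may also report Kendall's W and the ICC.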


Interrater and Intrarater Reliability of the Colloid Cyst Risk Score

Neurosurgery ◽

10.1093/neuros/nyz399 ◽

2019 ◽

Vol 86 (1) ◽

pp. E47-E53

Author(s):

Elizabeth N Alford ◽

Lauren E Rotman ◽

Jacob R Lepard ◽

Bonita S Agee ◽

James M Markert

Keyword(s):

Risk Score ◽

Third Ventricle ◽

Intraclass Correlation ◽

Weighted Kappa ◽

Colloid Cyst ◽

Intrarater Reliability ◽

Perfect Agreement ◽

Risk Zone ◽

Axial Diameter ◽

Kendall’s W

Abstract BACKGROUND The Colloid Cyst Risk Score (CCRS) was developed to identify symptomatic patients and stratify risk of hydrocephalus among patients with colloid cysts. Its components consider patient age, cyst diameter, presence/absence of headache, fluid-attenuated inversion recovery (FLAIR) hyperintensity, and location within the third ventricle. OBJECTIVE To independently evaluate the inter- and intrarater reliability of the CCRS. METHODS Patients with a colloid cyst were identified from billing records and radiology archives. Three independent raters reviewed electronic medical records to determine age, presence/absence of headache, cyst diameter (mm), FLAIR hyperintensity, and risk zone location. Raters made 53 observations, including 5 repeat observations. Fleiss’ generalized kappa (κ) was calculated for all of the nominal criteria, whereas Kendall's coefficient of concordance (W) and the intraclass correlation coefficient (ICC) were calculated for the overall score. RESULTS Total CCRS score demonstrated extremely strong agreement (W = 0.83) using Kendall's W coefficient and good agreement (ICC = 0.74) using the ICC (P < .001). For interrater reliability of individual criteria, age (κ = 1.00) and FLAIR hyperintensity (κ = 0.89) demonstrated near perfect agreement. Axial diameter (κ = 0.63) demonstrated substantial agreement, whereas agreement was moderate for risk zone (κ = 0.51) and fair for headache (κ = 0.26). Intrarater reliability for total CCRS score was extremely strong using Kendall's W, good to excellent using ICC, and fair to substantial using weighted kappa. CONCLUSION The CCRS has good inter- and intrarater reliability when tested in an independent sample of patients, though strength of agreement varies among individual criteria. The validity of the CCRS requires independent evaluation.
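The verbal labels attached to the kappa values above are consistent with the widely used Landis and Koch benchmarks. Whether the authors applied exactly these cut-offs is an assumption, and the function name below is illustrative. A minimal sketch of that mapping:

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch descriptive label."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Kappa values reported in the abstract for the individual criteria.
reported = {
    "age": 1.00,
    "FLAIR hyperintensity": 0.89,
    "axial diameter": 0.63,
    "risk zone": 0.51,
    "headache": 0.26,
}
for criterion, kappa in reported.items():
    print(f"{criterion}: {kappa:.2f} ({landis_koch_label(kappa)})")
```

These bands are conventions rather than statistical tests, so a label should be read alongside the coefficient itself.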


The Interrater and Intrarater Reliability of the Preterm Infant Oral Feeding Readiness Assessment Scale

Journal of Modern Rehabilitation ◽

10.18502/jmr.v13i1.2144 ◽

2019 ◽

Author(s):

Faride Kamran ◽

Setare Sagheb ◽

Mashid Aghajanzade ◽

Abbas Ebadi ◽

Yaser Faryadras ◽

...

Keyword(s):

Preterm Infant ◽

Intraclass Correlation ◽

Sudden Change ◽

Tube Feeding ◽

Weighted Kappa ◽

Oral Feeding ◽

Assessment Scale ◽

Intrarater Reliability ◽

Readiness Assessment ◽

Cross Sectional

Introduction: One of the most challenging decisions is to assess the preterm infant’s transition from tube feeding to oral feeding. Thus, we require a reliable tool for determining the time to start oral feeding. This study aimed to measure the interrater and intrarater reliability of the Preterm Infant Oral Feeding Readiness Assessment (PIOFRA) scale. Materials and Methods: This was an observational, cross-sectional study. The study participants were preterm infants who had been hospitalized in the Neonatal Intensive Care Unit of Shariati Hospital, affiliated with Tehran University of Medical Sciences, between December 2017 and February 2018. The inclusion criteria were absence of neurological and gastrointestinal disorders or major congenital anomalies, and an Apgar score of 3 or more in the first 1 minute and 5 or more in the first 5 minutes. The exclusion criteria included the family’s unwillingness to participate in the study, the infant’s death, or a sudden change that affects the neonate’s nutritional status, such as cerebral hemorrhage or intestinal problems. Results: The interrater and intrarater reliability of the total PIOFRA scale was good (intraclass correlation coefficient [ICC] > 0.75). The interrater and intrarater reliability of most items was good to excellent, with weighted kappa above 0.50, with the exception of lip posture and especially stress sign, with weighted kappa below 0.40. Conclusion: Generally, most items of the PIOFRA scale had acceptable interrater and intrarater reliability. Also, the interrater and intrarater reliability of the total PIOFRA score was good.


Comparison of Kristjansson Respiratory Score and Wang Respiratory Score in infants with bronchiolitis in a hospital emergency department

Hong Kong Physiotherapy Journal ◽

10.1142/s1013702520500146 ◽

2020 ◽

Vol 40 (02) ◽

pp. 145-153

Author(s):

Frederico Ramos Pinto ◽

Liane Correia-Costa ◽

Inês Azevedo

Keyword(s):

Emergency Department ◽

Emergency Room ◽

Internal Consistency ◽

Health Professionals ◽

Prospective Observational Study ◽

Intraclass Correlation ◽

Weighted Kappa ◽

Hospital Emergency ◽

Rater Reliability ◽

Hospital Emergency Room

Objective: Several respiratory scores have been created to evaluate the severity of bronchiolitis, but it is still not clear which is the best score. The aim of this study is to compare the Wang Respiratory Score (WRS) and the Kristjansson Respiratory Score (KRS) in the setting of an emergency room. Methods: We performed a prospective observational study with 60 infants with bronchiolitis admitted to a paediatric emergency department. For both scores, we assessed inter-rater reliability between two different health professionals (physician and physiotherapist), internal consistency, and correlation with SpO2, using the intraclass correlation coefficient (ICC) and weighted kappa, the Cronbach α coefficient, and Spearman tests, respectively. Results: The inter-rater reliability was higher in KRS (ICC 0.79) and the Cronbach α and weighted kappa had similar values in KRS versus WRS. The correlation between the KRS/WRS and SpO2 was poor/moderate upon admission and discharge for the first observer and the second observer. Conclusions: While the internal consistency was similar in both scores, inter-rater reliability of KRS was higher than WRS, which allows us to conclude that it would have more consistent results when used to assess the severity of bronchiolitis by health personnel in a busy hospital emergency room.
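Internal consistency of a multi-item score is usually summarized with Cronbach's alpha. The sketch below shows the standard formula on hypothetical item scores. The item layout and the function name `cronbach_alpha` are assumptions for illustration and do not reproduce the actual KRS or WRS items.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from a subjects-by-items table of scores.

    item_scores: list of rows, one row per subject, one column per item.
    Uses sample variances (n - 1 in the denominator) throughout.
    """
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # one tuple per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical scores for five infants on four items scored 0 to 3.
scores = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [3, 2, 3, 3],
    [1, 0, 1, 1],
]
print(round(cronbach_alpha(scores), 3))
```

Alpha rises when items vary together across subjects, so it reflects how consistently the items of one score move as a set, not how well two raters agree.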


A Novel and Reproducible Classification of the Vertebral Artery in the Subaxial Cervical Spine

Operative Neurosurgery ◽

10.1093/ons/opz310 ◽

2019 ◽

Vol 18 (6) ◽

pp. 676-683

Author(s):

Fabian Winter ◽

Ichiro Okano ◽

Stephan N Salzmann ◽

Colleen Rentenberger ◽

Jennifer Shue ◽

...

Keyword(s):

At Risk ◽

Cervical Spine ◽

High Risk ◽

Vertebral Artery ◽

Intrarater Reliability ◽

Cross Sectional ◽

Cervical Spine Surgery ◽

Subaxial Cervical Spine ◽

Anterior Posterior

Abstract BACKGROUND An injury of the vertebral artery (VA) is one of the most catastrophic complications in the setting of cervical spine surgery. Anatomic variations of the VA can increase the risk of iatrogenic lacerations. OBJECTIVE To propose a novel and reproducible classification system that describes the position of the VA based on a 2-dimensional map on computed tomography angiographs (CTA). METHODS This cross-sectional retrospective study reviewed 248 consecutive CTAs of the cervical spine at a single academic institution between 2007 and 2018. The classification consists of a number that characterizes the location of the VA from the medio-lateral (ML) aspect of the vertebral body. In addition, a letter describes the VA location from the anterior-posterior (AP) aspect. The reliability and reproducibility were assessed by 2 independent raters on 200 VAs. RESULTS The inter- and intrarater reliability values showed the classification's reproducibility. The inter-rater reliability weighted κ-value for the ML aspect was 0.93 (95% CI: 0.93-0.93). The unweighted κ-value was 0.93 (95% CI: 0.86-1.00) for “at-risk” positions (ML grade ≥1), and 0.87 (95% CI: 0.75-1.00) for “high-risk” positions (ML grade ≥2). The weighted κ-value for the intrarater reliability was 0.94 (95% CI: 0.95-0.95). The unweighted κ-values for the intrarater reliability were 0.95 (95% CI: 0.91-0.99) for “at-risk” positions, and 0.87 (95% CI: 0.78-0.96) for “high-risk” positions. CONCLUSION The proposed classification is reliable, reproducible, and independent of individual anatomic size variations. The use of this novel grading system could improve the understanding and interdisciplinary communication about VA anomalies.
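The unweighted κ-values for “at-risk” and “high-risk” positions correspond to collapsing the ML grade into a yes/no label at a threshold and then computing Cohen's kappa on the two raters' labels. A minimal sketch under that reading, using hypothetical grades and illustrative function names (`cohen_kappa`, `at_or_above`):

```python
def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters' labels on the same items."""
    n = len(a)
    labels = set(a) | set(b)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's own marginal proportions.
    p_exp = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    return (p_obs - p_exp) / (1 - p_exp)


def at_or_above(grades, threshold):
    """Dichotomize ordinal grades: True where grade >= threshold."""
    return [g >= threshold for g in grades]


# Hypothetical ML grades from two raters for six vertebral arteries.
rater_a = [0, 1, 2, 0, 3, 1]
rater_b = [0, 2, 2, 1, 3, 0]

at_risk = cohen_kappa(at_or_above(rater_a, 1), at_or_above(rater_b, 1))
high_risk = cohen_kappa(at_or_above(rater_a, 2), at_or_above(rater_b, 2))
print(round(at_risk, 3), round(high_risk, 3))
```

Dichotomizing discards the size of a disagreement, which is why the abstract reports a weighted κ for the full ML grade alongside the unweighted values.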


Visual assessment of movement quality in the single leg squat test: a review and meta-analysis of inter-rater and intrarater reliability

BMJ Open Sport & Exercise Medicine ◽

10.1136/bmjsem-2019-000541 ◽

2019 ◽

Vol 5 (1) ◽

pp. e000541 ◽

Cited By ~ 3

Author(s):

John Ressman ◽

Wilhelmus Johannes Andreas Grooten ◽

Eva Rasmussen Barr

Keyword(s):

Rating Scales ◽

Rating Scale ◽

Meta Analysis ◽

Intraclass Correlation ◽

Cochrane Library ◽

Intrarater Reliability ◽

Rater Reliability ◽

Movement Quality ◽

Step Down ◽

Single Leg Squat

Single leg squat (SLS) is a common tool used in clinical examination to set and evaluate rehabilitation goals, but also to assess lower extremity function in active people. Objectives To conduct a review and meta-analysis on the inter-rater and intrarater reliability of the SLS, including the lateral step-down (LSD) and forward step-down (FSD) tests. Design Review with meta-analysis. Data sources CINAHL, Cochrane Library, Embase, Medline (OVID) and Web of Science were searched up until December 2018. Eligibility criteria Studies were eligible for inclusion if they were methodological studies which assessed the inter-rater and/or intrarater reliability of the SLS, FSD and LSD through observation of movement quality. Results Thirty-one studies were included. The reliability varied largely between studies (inter-rater: kappa/intraclass correlation coefficients (ICC) = 0.00–0.95; intrarater: kappa/ICC = 0.13–1.00), but most of the studies reached ‘moderate’ measures of agreement. The pooled results of ICC/kappa showed a ‘moderate’ agreement for inter-rater reliability, 0.58 (95% CI 0.50 to 0.65), and a ‘substantial’ agreement for intrarater reliability, 0.68 (95% CI 0.60 to 0.74). Subgroup analyses showed a higher pooled agreement for inter-rater reliability of ≤3-point rating scales while no difference was found for different numbers of segmental assessments. Conclusion Our findings indicate that the SLS test including the FSD and LSD tests can be suitable for clinical use regardless of number of observed segments and particularly with a ≤3-point rating scale. Since most of the included studies were affected by some form of methodological bias, our findings must be interpreted with caution. PROSPERO registration number CRD42018077822.


A Comparison of Reliability Coefficients for Ordinal Rating Scales

Journal of Classification ◽

10.1007/s00357-021-09386-5 ◽

2021 ◽

Author(s):

Alexandra de Raadt ◽

Matthijs J. Warrens ◽

Roel J. Bosker ◽

Henk A. L. Kiers

Keyword(s):

Empirical Data ◽

Rating Scales ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Weighted Kappa ◽

Rater Reliability ◽

Intraclass Correlations ◽

Applied Researcher ◽

Highly Correlated ◽

Reliability Coefficients

Abstract Kappa coefficients are commonly used for quantifying reliability on a categorical scale, whereas correlation coefficients are commonly applied to assess reliability on an interval scale. Both types of coefficients can be used to assess the reliability of ordinal rating scales. In this study, we compare seven reliability coefficients for ordinal rating scales: the kappa coefficients included are Cohen’s kappa, linearly weighted kappa, and quadratically weighted kappa; the correlation coefficients included are intraclass correlation ICC(3,1), Pearson’s correlation, Spearman’s rho, and Kendall’s tau-b. The primary goal is to provide a thorough understanding of these coefficients such that the applied researcher can make a sensible choice for ordinal rating scales. A second aim is to find out whether the choice of the coefficient matters. We studied to what extent we reach the same conclusions about inter-rater reliability with different coefficients, and to what extent the coefficients measure agreement in a similar way, using analytic methods, and simulated and empirical data. Using analytical methods, it is shown that differences between quadratic kappa and the Pearson and intraclass correlations increase if agreement becomes larger. Differences between the three coefficients are generally small if differences between rater means and variances are small. Furthermore, using simulated and empirical data, it is shown that differences between all reliability coefficients tend to increase if agreement between the raters increases. Moreover, for the data in this study, the same conclusion about inter-rater reliability was reached in virtually all cases with the four correlation coefficients. In addition, using quadratically weighted kappa, we reached a similar conclusion as with any correlation coefficient a great number of times. Hence, for the data in this study, it does not really matter which of these five coefficients is used.
Moreover, the four correlation coefficients and quadratically weighted kappa tend to measure agreement in a similar way: their values are very highly correlated for the data in this study.
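To make the difference between the three kappa coefficients concrete, the following sketch computes Cohen's kappa and its linearly and quadratically weighted variants from the same pair of ratings. The ratings are hypothetical, the categories are assumed to be equally spaced, and the function name `weighted_kappa` is illustrative, not taken from the paper.

```python
def weighted_kappa(a, b, categories, weights="linear"):
    """Cohen's kappa with optional disagreement weights.

    a, b: ratings from two raters on the same items.
    categories: ordered list of all possible categories.
    weights: "none" (Cohen's kappa), "linear", or "quadratic".
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Observed joint proportions of (rater a, rater b) category pairs.
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[index[x]][index[y]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weights == "none":
            return float(i != j)
        distance = abs(i - j) / (k - 1)
        return distance if weights == "linear" else distance ** 2

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed_cost = sum(disagreement(i, j) * observed[i][j] for i, j in cells)
    expected_cost = sum(disagreement(i, j) * row[i] * col[j] for i, j in cells)
    return 1 - observed_cost / expected_cost


# Hypothetical ratings on a 4-point ordinal scale.
rater_1 = [1, 2, 2, 3, 4, 4, 1, 3]
rater_2 = [1, 2, 3, 3, 4, 3, 2, 3]
for scheme in ("none", "linear", "quadratic"):
    print(scheme, round(weighted_kappa(rater_1, rater_2, [1, 2, 3, 4], scheme), 3))
```

Quadratic weights penalize distant disagreements much more heavily than adjacent ones, which is what brings quadratically weighted kappa close to the correlation-type coefficients discussed in the abstract.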


Interrater Reliability of the Berg Balance Scale When Used by Clinicians of Various Experience Levels to Assess People With Lower Limb Amputations

Physical Therapy ◽

10.2522/ptj.20130182 ◽

2014 ◽

Vol 94 (3) ◽

pp. 371-378 ◽

Cited By ~ 21

Author(s):

Christopher K. Wong

Keyword(s):

Lower Limb ◽

Interrater Reliability ◽

Clinical Training ◽

Intraclass Correlation ◽

Berg Balance Scale ◽

Intrarater Reliability ◽

Rater Reliability ◽

Study Objective ◽

Balance Scale ◽

Scale Scores

Background People with lower limb amputations frequently have impaired balance ability. The Berg Balance Scale (BBS) has excellent psychometric properties for people with neurologic disorders and elderly people dwelling in the community. A Rasch analysis demonstrated the validity of the BBS for people with lower limb amputations of all ability strata, but rater reliability has not been tested. Objective The study objective was to determine the interrater reliability and intrarater reliability of BBS scores and the differences in scores assigned by testers with various levels of experience when assessing people with lower limb amputations. Design This reliability study of video-recorded single-session BBS assessments had a cross-sectional design. Methods From a larger study of people with lower limb amputations, 5 consecutively recruited participants using prostheses were video recorded during an in-person BBS assessment. Sixteen testers independently rated the video-recorded assessments. Testers were 3 physical therapists, 1 occupational therapist, 3 third-year and 4 second-year doctor of physical therapy (DPT) students, and 5 first-year DPT students without clinical training. Rater reliability was calculated using intraclass correlation coefficients (ICC [2,k]). Differences in scores assigned by testers with various levels of experience were determined by use of an analysis of variance with Tukey post hoc tests. Results The average age of the participants was 53.0 years (SD=15.7). Amputations had occurred at the ankle disarticulation, transtibial, and transfemoral levels because of vascular, trauma, and medical etiologies an average of 8.2 years earlier (SD=7.9). Berg Balance Scale scores spanned all ability strata. Interrater reliability (ICC [2,k]=.99) and intrarater reliability of scores determined in person and through video-recorded assessments by the same testers (ICC [2,k]=.99) were excellent. 
For participants with the lowest levels of ability, licensed professionals assigned lower scores than did DPT students without clinical training. Limitations Intrarater reliability calculations were based on 2 testers. Conclusions Berg Balance Scale scores assigned to people using prostheses by testers with various levels of clinical experience had excellent interrater reliability and intrarater reliability.
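The ICC [2,k] notation refers to the Shrout and Fleiss two-way random-effects model, average of k raters. As a sketch of how that coefficient is assembled from ANOVA mean squares, the code below uses hypothetical BBS totals. The function name `icc_2k` is illustrative and the abstract does not state which software was used.

```python
def icc_2k(scores):
    """ICC(2,k): two-way random effects, absolute agreement, mean of k raters.

    scores: list of rows, one row per subject, one column per rater.
    """
    n = len(scores)          # subjects
    k = len(scores[0])       # raters
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_subjects = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (ms_raters - ms_error) / n
    )


# Hypothetical BBS totals (0 to 56) for five participants from three testers.
bbs = [
    [12, 14, 13],
    [30, 29, 31],
    [41, 40, 42],
    [50, 51, 50],
    [55, 54, 56],
]
print(round(icc_2k(bbs), 3))
```

Because ICC(2,k) describes the reliability of the mean across k raters, it is typically higher than the single-rater ICC(2,1) computed from the same data.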


Interrater and Intrarater Reliability of the Kuntz et al New Deformity Classification System

Neurosurgery ◽

10.1227/neu.0b013e31824f4e58 ◽

2012 ◽

Vol 71 (1) ◽

pp. 47-57 ◽

Cited By ~ 6

Author(s):

Martin Thaler ◽

Ricarda Lechner ◽

Michaela Gstöttner ◽

Matthias Luegmair ◽

Michael Liebensteiner ◽

...

Keyword(s):

Classification System ◽

Interobserver Agreement ◽

Intraobserver Agreement ◽

Spinal Alignment ◽

Intrarater Reliability ◽

Surgical Interventions ◽

Apical Vertebral Rotation ◽

Age Dependent ◽

The Mean

Abstract BACKGROUND: Kuntz et al recently introduced a new system for classifying spinal deformities. This classification of spinal deformity was developed from age-dependent deviations from the neutral upright spinal alignment. OBJECTIVE: To determine the interobserver and intraobserver reliabilities of the new Kuntz et al system for classifying scoliosis. METHODS: Fifty consecutive patients were evaluated. Three observers independently assigned a major structural curve, minor structural curve, curve type, apical vertebral rotation, spinal balance, and pelvic alignment to each curve following the guidelines described by Kuntz et al. Assignment of the curves was repeated 4 weeks later, with the curves presented in a different blinded order. The Kendall W and Holsti agreement coefficients were used to determine the interobserver and intraobserver agreement. RESULTS: The intraobserver value of agreement for all parameters was 0.85 (range, 0.28-1.0), and the mean Kendall W coefficient was 0.89 (range, 0.5-0.97), demonstrating perfect reliability. The interobserver agreement averaged 0.7 (range, 0.251-1.0). The mean Kendall W coefficient was 0.67 (range, 0.19-1.0), demonstrating substantial reliability. The average time for classification of 1 curve was approximately 8.4 minutes. CONCLUSION: The new Kuntz et al deformity classification system is comparable to the Lenke et al system in terms of reliability. However, the Kuntz et al classification system provides no recommendations for surgical interventions. It is more complex and time-consuming and therefore may be of limited value in daily clinical practice.
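The two agreement measures named in the methods can be sketched briefly. Kendall's W is computed here with average ranks and the usual correction for ties, and Holsti's coefficient as 2M / (N1 + N2), where M is the number of matching decisions. The tie correction is a choice made for this sketch (the abstract does not say whether one was applied), and the data and function names are hypothetical.

```python
def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(ratings_by_rater):
    """Kendall's coefficient of concordance with tie correction.

    ratings_by_rater: one list per rater, each scoring the same n items.
    Raises ZeroDivisionError if every rater gives all items the same score.
    """
    m = len(ratings_by_rater)
    n = len(ratings_by_rater[0])
    ranked = [average_ranks(r) for r in ratings_by_rater]
    rank_sums = [sum(r[i] for r in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    # Tie term: sum of (t^3 - t) over groups of tied scores, per rater.
    ties = sum(
        r.count(v) ** 3 - r.count(v) for r in ratings_by_rater for v in set(r)
    )
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)


def holsti_agreement(a, b):
    """Holsti's coefficient: 2 * matches / (decisions by a + decisions by b)."""
    matches = sum(x == y for x, y in zip(a, b))
    return 2 * matches / (len(a) + len(b))


# Hypothetical ordinal grades from three observers for six curves.
observers = [
    [1, 2, 2, 3, 4, 4],
    [1, 2, 3, 3, 4, 4],
    [1, 1, 2, 3, 3, 4],
]
print(round(kendalls_w(observers), 3))
print(round(holsti_agreement(observers[0], observers[1]), 3))
```

Holsti's coefficient is a raw proportion of matching decisions with no chance correction, whereas Kendall's W measures how similarly the observers order the cases, so the two can diverge on the same data.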


The revision partial knee classification system: understanding the causative pathology and magnitude of further surgery following partial knee arthroplasty

Bone & Joint Open ◽

10.1302/2633-1462.28.bjo-2021-0086.r1 ◽

2021 ◽

Vol 2 (8) ◽

pp. 638-645

Author(s):

Amy J. Garner ◽

Thomas C. Edwards ◽

Alexander D. Liddle ◽

Gareth G. Jones ◽

Justin P. Cobb

Keyword(s):

Health Economics ◽

Knee Arthroplasty ◽

Classification System ◽

Knee Prosthesis ◽

Rater Reliability ◽

Partial Knee Arthroplasty ◽

Consensus Group ◽

Interclass Correlation ◽

Total Knee ◽

Kendall’s W

Aims Joint registries classify all further arthroplasty procedures to a knee with an existing partial arthroplasty as revision surgery, regardless of the actual procedure performed. Relatively minor procedures, including bearing exchanges, are classified in the same way as major operations requiring augments and stems. A new classification system is proposed to acknowledge and describe the detail of these procedures, which has implications for risk, recovery, and health economics. Methods Classification categories were proposed by a surgical consensus group, then ranked by patients, according to perceived invasiveness and implications for recovery. In round one, 26 revision cases were classified by the consensus group. Results were tested for inter-rater reliability. In round two, four additional cases were added for clarity. Round three repeated the survey one month later, subject to inter- and intrarater reliability testing. In round four, five additional expert partial knee arthroplasty surgeons were asked to classify the 30 cases according to the proposed revision partial knee classification (RPKC) system. Results Four classes were proposed: PR1, where no bone-implant interfaces are affected; PR2, where surgery does not include conversion to total knee arthroplasty, for example, a second partial arthroplasty to a native compartment; PR3, when a standard primary total knee prosthesis is used; and PR4 when revision components are necessary. Round one resulted in 92% inter-rater agreement (Kendall’s W 0.97; p < 0.005), rising to 93% in round two (Kendall’s W 0.98; p < 0.001). Round three demonstrated 97% agreement (Kendall’s W 0.98; p < 0.001), with high intra-rater reliability (intraclass correlation coefficient (ICC) 0.99; 95% confidence interval 0.98 to 0.99). Round four resulted in 80% agreement (Kendall’s W 0.92; p < 0.001). Conclusion The RPKC system accounts for all procedures which may be appropriate following partial knee arthroplasty.
It has been shown to be reliable, repeatable and pragmatic. The implications for patient care and health economics are discussed. Cite this article: Bone Jt Open 2021;2(8):638–645.
