The Equivalence of Multiple Rater Kappa Statistics and Intraclass Correlation Coefficients

1988 ◽  
Vol 48 (2) ◽  
pp. 367-374 ◽  
Author(s):  
Gordon Rae

2005 ◽  
Vol 17 (1) ◽  
pp. 29-35 ◽  
Author(s):  
M. Zhang ◽  
CW Binns ◽  
AH Lee

This study describes the development and reproducibility of a 128-item quantitative food frequency questionnaire (FFQ) to measure usual food consumption for women in southeast China. The FFQ was pre-tested using 51 Chinese women who recently migrated to Australia. Cronbach's alpha coefficient was 0.81 for internal consistency. The reliability of the FFQ was then assessed by another test-retest study. A sample of 41 women residing in southeast China was interviewed twice within 12 weeks. Intraclass correlation coefficients were moderate to high for mean food group consumption (0.43-0.96) and mean daily nutrient intakes (0.47-0.89). Kappa statistics for eating habits ranged from 0.27 to 0.89 in the test-retest. The mean ratio of energy intake to basal metabolic rate was 1.73 (S.D. 0.39) in both test and retest samples. The study confirmed that the FFQ method using standard containers is appropriate to assess dietary intake for women in southeast China. Asia Pac J Public Health 2005: 17(1): 29-35.


2014 ◽  
Vol 40 (3) ◽  
pp. 250-258 ◽  
Author(s):  
Jéssica Gonçalves ◽  
Emilio Pizzichini ◽  
Marcia Margaret Menezes Pizzichini ◽  
Leila John Marques Steidle ◽  
Cristiane Cinara Rocha ◽  
...  

Objective: To determine the reliability of a rapid hematology stain for the cytological analysis of induced sputum samples. Methods: This was a cross-sectional study comparing the standard technique (May-Grünwald-Giemsa stain) with a rapid hematology stain (Diff-Quik). Of the 50 subjects included in the study, 21 had asthma, 19 had COPD, and 10 were healthy (controls). From the induced sputum samples collected, we prepared four slides: two were stained with May-Grünwald-Giemsa, and two were stained with Diff-Quik. The slides were read independently by two trained researchers blinded to the identification of the slides. The reliability for cell counting using the two techniques was evaluated by determining the intraclass correlation coefficients (ICCs) for intraobserver and interobserver agreement. Agreement in the identification of neutrophilic and eosinophilic sputum between the observers and between the stains was evaluated with kappa statistics. Results: In our comparison of the two staining techniques, the ICCs indicated almost perfect interobserver agreement for neutrophil, eosinophil, and macrophage counts (ICC: 0.98-1.00), as well as substantial agreement for lymphocyte counts (ICC: 0.76-0.83). Intraobserver agreement was almost perfect for neutrophil, eosinophil, and macrophage counts (ICC: 0.96-0.99), whereas it was moderate to substantial for lymphocyte counts (ICC = 0.65 and 0.75 for the two observers, respectively). Interobserver agreement for the identification of eosinophilic and neutrophilic sputum using the two techniques ranged from substantial to almost perfect (kappa range: 0.91-1.00). Conclusions: The use of Diff-Quik can be considered a reliable alternative for the processing of sputum samples.
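
As a rough illustration of the kind of kappa statistic mentioned above for agreement between two observers on a categorical call (for example, eosinophilic versus non-eosinophilic sputum), the sketch below computes an unweighted Cohen's kappa. The abstract does not say which kappa variant or software was used, so the function name `cohen_kappa` and the ratings are hypothetical and not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal proportions per label.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical calls (1 = eosinophilic sputum, 0 = not) for ten slides.
observer_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
observer_2 = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
print(round(cohen_kappa(observer_1, observer_2), 3))
```

With these made-up ratings the observers agree on 8 of 10 slides, chance agreement is 0.52, and kappa comes out at about 0.58.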


2009 ◽  
Vol 12 (12) ◽  
pp. 2359-2365 ◽  
Author(s):  
Torukiri I Ibiebele ◽  
Sanjoti Parekh ◽  
Kylie-ann Mallitt ◽  
Maria Celia Hughes ◽  
Peter K O’Rourke ◽  
...  

Objective: To assess the reproducibility of a 135-item self-administered semi-quantitative FFQ. Design: Control subjects who had previously completed an FFQ relating to usual dietary intake in a nationwide case–control study of cancer between November 2003 and April 2004 were randomly selected, re-contacted, and invited to complete the same FFQ a second time approximately one year later (between January and April 2005). Agreement between the two FFQ was compared using weighted kappa statistics and intraclass correlation coefficients (ICC) for food groups and nutrients. Summary questions, included in the FFQ, were used to assess overall intakes of vegetables, fruits and meat. Setting: General community in Australia. Subjects: One hundred men and women aged 22–79 years, randomly selected from the previous control population. Results: The weighted κ and ICC measures of agreement for food groups were moderate to substantial for seventeen of the eighteen food groups. For nutrients, weighted κ ranged from 0·44 for starch to 0·83 for alcohol while ICC ranged from 0·51 to 0·91 for the same nutrients. Estimates of meat, fruit and vegetable intake using summary questions were similar for both survey periods, but were significantly lower than estimates from summed individual food items. Conclusions: The FFQ produced reproducible results and is reasonable in assessing the usual intake of various foods and nutrients among an Australian adult population.


2018 ◽  
Vol 30 (3) ◽  
pp. 377-385 ◽  
Author(s):  
Patti K. Kiser ◽  
Christiane V. Löhr ◽  
Danielle Meritet ◽  
Sean T. Spagnoli ◽  
Milan Milovancev ◽  
...  

Although quantitative assessment of margins is recommended for describing excision of cutaneous malignancies, there is poor understanding of limitations associated with this technique. We described and quantified histologic artifacts in inked margins and determined the association between artifacts and variance in histologic tumor-free margin (HTFM) measurements based on a novel grading scheme applied to 50 sections of normal canine skin and 56 radial margins taken from 15 different canine mast cell tumors (MCTs). Three broad categories of artifact were 1) tissue deformation at inked edges, 2) ink-associated artifacts, and 3) sectioning-associated artifacts. The most common artifacts in MCT margins were ink-associated artifacts, specifically ink absent from an edge (mean prevalence: 50%) and inappropriate ink coloring (mean: 45%). The prevalence of other artifacts in MCT skin was 4–50%. In MCT margins, frequency-adjusted kappa statistics found fair or better inter-rater reliability for 9 of 10 artifacts; intra-rater reliability was moderate or better in 9 of 10 artifacts. Digital HTFM measurements by 5 blinded pathologists had a median standard deviation (SD) of 1.9 mm (interquartile range: 0.8–3.6 mm; range: 0–6.2 mm). Intraclass correlation coefficients demonstrated good inter-pathologist reliability in HTFM measurement (κ = 0.81). Spearman rank correlation coefficients found negligible correlation between artifacts and HTFM SDs (r ≤ 0.3). These data confirm that although histologic artifacts commonly occur in inked margin specimens, artifacts are not meaningfully associated with variation in HTFM measurements. Investigators can use the grading scheme presented herein to identify artifacts associated with tissue processing.


2018 ◽  
Vol 21 (3) ◽  
pp. 410-420 ◽  
Author(s):  
Marianna S. Wetherill ◽  
Mary B. Williams ◽  
Tori Taniguchi ◽  
Alicia L. Salvatore ◽  
Tvli Jacob ◽  
...  

In rural American Indian (AI) communities, where supermarkets are rare, tribally owned and operated convenience stores are an important food source. Food environment measures for these settings are needed to understand and address the significant diet-related disparities among AIs. Through a tribal-university partnership that included tribal health and commerce representatives from two Native Nations in rural southeastern Oklahoma, we developed the Nutrition Environment Measures Survey for Tribal Convenience Stores (NEMS-TCS) to inform the development and evaluation of a healthy food retail intervention. The NEMS-TCS assessed four scored domains of the rural convenience store food environment—food availability, pricing, quality, and placement—and included 11 food categories that emphasized ready-to-eat food items. Trained raters administered the NEMS-TCS using a sample of 18 rural convenience stores (primarily ranging between 2,400 and 3,600 square feet). We assessed interrater reliability with kappa statistics for dichotomized variables and intraclass correlation coefficients (ICC) for continuous variables. NEMS-TCS demonstrated high inter-rater reliability for all food categories (>85% agreement), subscores (ICC = 0.73-1.00), and the total score (ICC = 0.99). The NEMS-TCS responds to recent calls for reliable measures for rural food environments and may be valuable for studying food environments of large convenience stores in other Native Nations as well as other rural settings.


2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Ana Queralt ◽  
Javier Molina-García ◽  
Marta Terrón-Pérez ◽  
Ester Cerin ◽  
Anthony Barnett ◽  
...  

Background: Microscale environmental features are usually evaluated using direct on-street observations. This study assessed inter-rater reliability of the Microscale Audit of Pedestrian Streetscapes, Global version (MAPS-Global), in an international context, comparing on-street with more efficient online observation methods in five countries with varying levels of walkability. Methods: Data were collected along likely walking routes of study participants, from residential starting points toward commercial clusters in Melbourne (Australia), Ghent (Belgium), Curitiba (Brazil), Hong Kong (China), and Valencia (Spain). In-person on-street audits and online audits using Google Street View were carried out by two independent trained raters in each city. The final sample included 349 routes, 1228 street segments, 799 crossings, and 16 cul-de-sacs. Inter-rater reliability analyses were performed using Kappa statistics or Intraclass Correlation Coefficients (ICC). Results: Overall mean assessment times were the same for on-street and online evaluations (22 ± 12 min). Only a few subscales had Kappa or ICC values < 0.70, with aesthetic and social environment variables having the lowest overall reliability values, though still in the “good to excellent” category. Overall scores for each section (route, segment, crossing) showed good to excellent reliability (ICCs: 0.813, 0.929 and 0.885, respectively), and the MAPS-Global grand score had excellent reliability (ICC: 0.861) between the two methods. Conclusions: MAPS-Global is a feasible and reliable instrument that can be used both on-street and online to analyze microscale environmental characteristics in diverse international urban settings.


1991 ◽  
Vol 34 (5) ◽  
pp. 989-999 ◽  
Author(s):  
Stephanie Shaw ◽  
Truman E. Coggins

This study examines whether observers reliably categorize selected speech production behaviors in hearing-impaired children. A group of experienced speech-language pathologists was trained to score the elicited imitations of 5 profoundly and 5 severely hearing-impaired subjects using the Phonetic Level Evaluation (Ling, 1976). Interrater reliability was calculated using intraclass correlation coefficients. Overall, the magnitude of the coefficients was found to be considerably below what would be accepted in published behavioral research. Failure to obtain acceptably high levels of reliability suggests that the Phonetic Level Evaluation may not yet be an accurate and objective speech assessment measure for hearing-impaired children.
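
For readers unfamiliar with how an interrater intraclass correlation coefficient is obtained, the sketch below computes one common form, the two-way random-effects, absolute-agreement, single-rater coefficient often written ICC(2,1), from a subjects-by-raters table. The abstract does not state which ICC form was used, so this choice is an assumption, and the function name `icc_2_1` and the ratings are purely illustrative.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per subject, one column per rater.
    """
    n = len(ratings)          # number of subjects
    k = len(ratings[0])       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    rater_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand_mean) ** 2 for m in rater_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


# Illustrative scores: 6 subjects (rows) rated by 4 raters (columns).
scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(scores), 3))
```

In this made-up table the raters rank the subjects fairly consistently but differ a lot in overall level, which the absolute-agreement form penalises, so the coefficient is low (about 0.29).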


Author(s):  
Marcos A Soriano ◽  
G Gregory Haff ◽  
Paul Comfort ◽  
Francisco J Amaro-Gahete ◽  
Antonio Torres-González ◽  
...  

The aims of this study were to (I) determine the differences and relationship between the overhead press and split jerk performance in athletes involved in weightlifting training, and (II) explore the magnitude of these differences in one-repetition maximum (1RM) performances between sexes. Sixty-one men (age: 30.4 ± 6.7 years; height: 1.8 ± 0.5 m; body mass 82.5 ± 8.5 kg; weightlifting training experience: 3.7 ± 3.5 yrs) and 21 women (age: 29.5 ± 5.2 yrs; height: 1.7 ± 0.5 m; body mass: 62.6 ± 5.7 kg; weightlifting training experience: 3.0 ± 1.5 yrs) participated. The 1RM performance of the overhead press and split jerk were assessed for all participants, with the overhead press assessed on two occasions to determine between-session reliability. The intraclass correlation coefficients (ICC) and 95% confidence intervals showed a high reliability for the overhead press ICC = 0.98 (0.97 – 0.99). A very strong correlation and significant differences were found between the overhead press and split jerk 1RM performances for all participants (r = 0.90 [0.93 – 0.85], 60.2 ± 18.3 kg, 95.7 ± 29.3 kg, p ≤ 0.001). Men demonstrated stronger correlations between the overhead press and split jerk 1RM performances (r = 0.83 [0.73-0.90], p ≤ 0.001) compared with women (r = 0.56 [0.17-0.80], p = 0.008). These results provide evidence that 1RM performance of the overhead press and split jerk performance are highly related, highlighting the importance of upper-limb strength in the split jerk maximum performance.


Dysphagia ◽  
2021 ◽  
Author(s):  
Sofie Albinsson ◽  
Lisa Tuomi ◽  
Christine Wennerås ◽  
Helen Larsson

The lack of a Swedish patient-reported outcome instrument for eosinophilic esophagitis (EoE) has limited the assessment of the disease. The aims of the study were to translate and validate the Eosinophilic Esophagitis Activity Index (EEsAI) to Swedish and to assess the symptom severity of patients with EoE compared to a nondysphagia control group. The EEsAI was translated and adapted to a Swedish cultural context (S-EEsAI) based on international guidelines. The S-EEsAI was validated using adult Swedish patients with EoE (n = 97) and an age- and sex-matched nondysphagia control group (n = 97). All participants completed the S-EEsAI, the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire-Oesophageal Module 18 (EORTC QLQ-OES18), and supplementary questions regarding feasibility and demographics. Reliability and validity of the S-EEsAI were evaluated by Cronbach’s alpha and Spearman correlation coefficients between the domains of the S-EEsAI and the EORTC QLQ-OES18. A test–retest analysis of 29 patients was evaluated through intraclass correlation coefficients. The S-EEsAI had sufficient reliability with Cronbach’s alpha values of 0.83 and 0.85 for the “visual dysphagia question” and the “avoidance, modification and slow eating score” domains, respectively. The test–retest reliability was sufficient, with good to excellent intraclass correlation coefficients (0.60–0.89). The S-EEsAI domains showed moderate correlation to 6/10 EORTC QLQ-OES18 domains, indicating adequate validity. The patient S-EEsAI results differed significantly from those of the nondysphagia controls (p < 0.001). The S-EEsAI appears to be a valid and reliable instrument for monitoring adult patients with EoE in Sweden.
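
Cronbach's alpha, used above as the internal-consistency measure, can be sketched from a respondents-by-items score table as follows. This is a minimal illustration using the standard formula with sample variances; the function name `cronbach_alpha` and the scores are hypothetical and unrelated to the S-EEsAI data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per respondent, one column per item."""
    k = len(scores[0])  # number of items
    if k < 2 or len(scores) < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 respondents and >= 2 items")

    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from five respondents to three items on a 1-5 scale.
answers = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
print(round(cronbach_alpha(answers), 3))
```

Here the three items move together closely across respondents, so alpha is high (about 0.93).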


Author(s):  
Jens Sörensen ◽  
Jonny Nordström ◽  
Tomasz Baron ◽  
Stellan Mörner ◽  
Sven-Olof Granstam ◽  
...  

Aim: To develop a method for diagnosing left ventricular (LV) hypertrophy from cardiac perfusion 15O-water positron emission tomography (PET). Methods: We retrospectively pooled data from 139 subjects in four research cohorts. LV remodeling patterns ranged from normal to severe eccentric and concentric hypertrophy. 15O-water PET scans (n = 197) were performed with three different PET devices. A low-end scanner (66 scans) was used for method development, and remaining scans with newer devices for a blinded evaluation. Dynamic data were converted into parametric images of perfusable tissue fraction for semi-automatic delineation of the LV wall and calculation of LV mass (LVM) and septal wall thickness (WT). LVM and WT from PET were compared to cardiac magnetic resonance (CMR, n = 47) and WT to 2D-echocardiography (2DE, n = 36). PET accuracy was tested using linear regression, Bland–Altman plots, and ROC curves. Observer reproducibility was evaluated using intraclass correlation coefficients. Results: High correlations were found in the blinded analyses (r ≥ 0.87, P < 0.0001 for all). AUC for detecting increased LVM and WT (> 12 mm and > 15 mm) was ≥ 0.95 (P < 0.0001 for all). Reproducibility was excellent (ICC ≥ 0.93, P < 0.0001). Conclusion: 15O-water PET might detect LV hypertrophy with high accuracy and precision.

