Weighted Specific-Category Kappa Measure of Interobserver Agreement

2003 ◽  
Vol 93 (3_suppl) ◽  
pp. 1283-1290 ◽  
Author(s):  
Tarald O. Kvålseth

When two observers classify a sample of items using the same categorical scale, and when different disagreements are differentially weighted, the weighted Kappa (Kw) by Cohen may serve as a measure of interobserver agreement. We propose a Kappa-based weighted measure (Kws) of agreement on some specific category s, with Kw being a weighted average of all the Kws values. Therefore, while Cohen's Kw is a summary measure of the overall agreement, the proposed Kws provides a measure of the extent to which the observers agree on the specific categories, with both measures being suitable for ordinal categories because of the weights being used. Statistical inferences for Kws and its unweighted counterpart are also discussed. A numerical example is provided.
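As an illustration of the overall measure only, Cohen's weighted kappa can be sketched in a few lines. This is a minimal sketch, not the paper's method: the function name `weighted_kappa`, the choice of linear or quadratic agreement weights, and the example table are assumptions made here, and the category-specific Kws is not reproduced because the abstract does not give its formula.

```python
def weighted_kappa(table, scheme="linear"):
    """Cohen's weighted kappa for a square contingency table of counts.

    table[i][j] is the number of items that observer A placed in
    category i and observer B placed in category j. The categories are
    assumed to be ordinal and listed in order.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Marginal proportions for each observer.
    row_p = [sum(table[i]) / n for i in range(k)]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]

    def weight(i, j):
        # Agreement weights: 1 on the diagonal, shrinking with distance.
        d = abs(i - j) / (k - 1)
        return 1 - d if scheme == "linear" else 1 - d * d

    # Weighted observed and chance-expected agreement.
    p_obs = sum(weight(i, j) * table[i][j] / n
                for i in range(k) for j in range(k))
    p_exp = sum(weight(i, j) * row_p[i] * col_p[j]
                for i in range(k) for j in range(k))

    return (p_obs - p_exp) / (1 - p_exp)


if __name__ == "__main__":
    # Hypothetical 3-category ratings from two observers.
    ratings = [[20, 5, 1],
               [4, 15, 6],
               [1, 3, 18]]
    print(round(weighted_kappa(ratings, "linear"), 3))
    print(round(weighted_kappa(ratings, "quadratic"), 3))
```

With only two categories the weights reduce to 1 on the diagonal and 0 off it, so the function returns the ordinary unweighted kappa.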

2018 ◽  
Vol 11 (3) ◽  
pp. 204-209 ◽  
Author(s):  
Paul J Cagle ◽  
Birgit Werner ◽  
Dave R Shukla ◽  
Daniel A London ◽  
Bradford O Parsons ◽  
...  

Background: Glenoid morphology, glenoid version and humeral head subluxation represent important parameters for the treating physician. The most common method of assessing glenoid morphology is the Walch classification, which has only been validated with computed tomography (CT). Methods: CT images and magnetic resonance imaging (MRI) images of 25 patients were de-identified and randomized. Three reviewers assessed the images for each parameter twice. The Walch classification was assessed with a weighted kappa value. Glenoid version and humeral head subluxation were compared with a reproducibility coefficient. Results: The Walch classification demonstrated almost perfect intraobserver agreement for MRI and CT images (k = 0.87). Weighted interobserver agreement values for the Walch classification were fair for CT and MRI (k = 0.34). The weighted reproducibility coefficient for glenoid version measured 9.13 (CI 7.16–12.60) degrees for CT and 13.44 (CI 10.54–18.55) degrees for MRI images. The weighted reproducibility coefficient for percentage of humeral head subluxation was 17.43% (CI 13.67–24.06) for CT and 18.49% (CI 14.5–25.52) for MRI images. Discussion: CT and MRI images demonstrated similar efficacy in classifying glenoid morphology, measuring glenoid version and measuring posterior humeral head subluxation. MRI can be used as an alternative to CT for measuring these parameters.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Carsyn Kranz ◽  
Omer Saeed ◽  
Vitalis Osuji ◽  
Evan Fogel ◽  
Nicholas Zyromski ◽  
...  

Background and Hypothesis: Diagnosis of chronic pancreatitis (CP) is challenging and controversial. Magnetic Resonance Imaging (MRI) offers a noninvasive modality to diagnose CP, but its findings have rarely been correlated with histopathology. We aimed to assess the correlation of the T1 signal intensity ratio of pancreas to spleen (T1 SIRp/s) and the arterio-venous ratio (AVR) of the parenchyma on MRI, and of the Cambridge score on MRCP, with surgical histopathology in patients who underwent pancreatic resection. Methods: We identified 160 pancreatic resections performed in adults between 2017 and 2019 by searching our institution’s surgery database. Seventy-one of them had surgical pathology specimens available and 59 of them had MRI/MRCP within 3 months prior to the surgery. Histologic grading was performed by a gastrointestinal pathologist using Ammann’s fibrosis score. Two image analysts blinded to the clinical information and fibrosis score measured T1 SIRp/s from unenhanced T1-weighted fat-saturated gradient-echo images and AVR from the post-contrast dynamic phase. The Cambridge score was also recorded from MRCP. Statistical analysis included Pearson’s correlation coefficient of the T1 SIRp/s, AVR, and Cambridge score with the fibrosis score, and weighted kappa for interobserver agreement. Results: The correlations of T1 SIRp/s and AVR with the fibrosis score were r = -0.30 (p = 0.02, 95% CI: -0.51 to -0.04) and r = -0.36 (p = 0.01, 95% CI: -0.58 to -0.09), respectively. In comparison, the Cambridge grade correlated less strongly with the fibrosis score (r = 0.17, p = 0.15, 95% CI: -0.07 to 0.39). Interobserver agreement was good (kappa = 0.80). Conclusion: There is moderate correlation between the T1 signal intensity and enhancement ratio of the pancreas with pancreatic fibrosis. This is higher than the correlation between the Cambridge grade and fibrosis. Multi-institutional, prospective studies are needed to verify T1 SIR and AVR as potential imaging biomarkers of pancreatic fibrosis.


2014 ◽  
Vol 2014 ◽  
pp. 1-9 ◽  
Author(s):  
Jae Wook Han ◽  
Soon Young Cho ◽  
Kui Dong Kang

Purpose. To compare stereometric parameters obtained by three-dimensional (3D) optic disc photography and optical coherence tomography (OCT) and assess interobserver agreement on the disc damage likelihood scale (DDLS). Methods. This retrospective study included 190 eyes from 190 patients classified as normal, glaucoma suspect, or glaucomatous. Residents at different levels of training completed the DDLS for each patient before and after attending a training module. 3D optic disc photography and OCT were performed on each eye, and correlations between the DDLS and various parameters obtained by each device were calculated. Results. We found moderate agreement (weighted kappa value, 0.59 ± 0.03) between DDLS scores obtained by 3D optic disc photography and the glaucoma specialist. The weighted kappa values for agreement and interobserver concordance increased among residents after the training module. Interobserver concordance was the poorest at DDLS stages 5 and 6. The DDLS scored by the glaucoma specialist had the highest predictability value (0.941). Conclusions. The DDLS obtained by 3D optic disc photography is a useful diagnostic tool for glaucoma. A supervised teaching program increased trainee interobserver agreement on the DDLS. DDLS stages 5 and 6 showed the poorest interobserver agreement, suggesting that caution is required when recording these stages.


2019 ◽  
pp. 216847901987405 ◽  
Author(s):  
Russell Reeve ◽  
Klaus Gottlieb

Background: Cohen's kappa is a statistic that estimates interobserver agreement. It was originally introduced to help develop diagnostic tests. Interpretative readings of 2 observers, for example, of a mammogram or other imaging, were compared at a single point in time. It is known that kappa depends on the prevalence of disease and that, therefore, kappas across different settings are hard to compare. Methods: Using simulation, we examine an analogous situation, not previously described, that occurs in clinical trials where sequential measurements are obtained to evaluate disease progression or clinical improvement over time. Results: We show that weighted kappa, used for multilevel outcomes, changes during the trial even if we keep the performance of the observer constant. Conclusions: Kappa and closely related measures can therefore only be used with great difficulty, if at all, in quality assurance in clinical trials.
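The prevalence dependence mentioned in the Background can be illustrated with a small calculation. This is a minimal sketch under assumptions that are not taken from the paper: a binary outcome, two observers with identical sensitivity and specificity, and errors that are independent given the true disease status. The function name `expected_kappa` and the value 0.9 are hypothetical, and the sketch covers unweighted kappa only, not the authors' simulation of weighted kappa over the course of a trial.

```python
def expected_kappa(prevalence, sensitivity=0.9, specificity=0.9):
    """Expected Cohen's kappa for two equally accurate observers.

    Assumes a binary outcome and that the two observers err
    independently of each other given the true status.
    """
    p = prevalence
    # Probability that both observers call an item positive / negative.
    both_pos = p * sensitivity ** 2 + (1 - p) * (1 - specificity) ** 2
    both_neg = p * (1 - sensitivity) ** 2 + (1 - p) * specificity ** 2
    p_obs = both_pos + both_neg

    # Each observer's marginal probability of a positive call.
    marg_pos = p * sensitivity + (1 - p) * (1 - specificity)
    p_exp = marg_pos ** 2 + (1 - marg_pos) ** 2

    return (p_obs - p_exp) / (1 - p_exp)


if __name__ == "__main__":
    # Observer performance is held fixed; only prevalence changes.
    for prev in (0.05, 0.20, 0.50, 0.80, 0.95):
        print(f"prevalence={prev:.2f}  kappa={expected_kappa(prev):.3f}")
```

Under these assumptions the observed agreement is the same at every prevalence, but the chance-expected agreement rises as prevalence moves away from 0.5, so kappa falls even though the observers have not changed.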


Author(s):  
XIAO-JUN YANG ◽  
LUAN ZENG ◽  
RAN ZHANG

Group decision making is an important category of problem-solving techniques for complicated problems, among which the Delphi method has been widely applied. In this paper, an improved Delphi method based on the Cloud model is proposed to deal with the fuzziness and uncertainty in experts' subjective judgments. The proposed Cloud Delphi Method (CDM) describes experts' opinions with the Cloud model and aggregates the experts' Cloud opinions with a synthetic algorithm and a weighted average algorithm. Another key point of CDM is that it stabilizes and accommodates the individual fuzzy estimates through defined stability rules rather than forcing them to converge or reduce. The Cloud opinions and aggregation results can be displayed graphically, which lets experts judge intuitively and can decrease the number of repetitive surveys and/or interviews. Moreover, it is more scientific and easier to represent experts' opinions based on the Cloud model, which can combine fuzziness and uncertainty well. A numerical example is examined to demonstrate the applicability and implementation process of CDM.

