Kappa as a Parameter of a Symmetry Model for Rater Agreement

2001 ◽  
Vol 26 (3) ◽  
pp. 331-342 ◽  
Author(s):  
Christof Schuster

If two raters assign targets to categories, the ratings can be arranged in a two-dimensional contingency table. A model for the frequencies in such a contingency table is presented for which Cohen’s kappa is a parameter. The model is based on two assumptions. First, the joint classification probabilities for the raters satisfy symmetry; second, the ratio of observed agreement to chance agreement is constant across categories. The model is illustrated using data from a study of the psychobiology of depression.
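As an illustration of the quantities named in this abstract, the sketch below computes Cohen's kappa from a square table of counts, checks the symmetry assumption, and reports the per-category ratio of observed to chance agreement. The counts and the function names are hypothetical and not taken from the article. It computes descriptive quantities only and does not fit the symmetry model itself.

```python
import numpy as np

def _proportions(table):
    """Convert a square table of counts into joint classification proportions."""
    p = np.asarray(table, dtype=float)
    return p / p.sum()

def cohens_kappa(table):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    p = _proportions(table)
    p_obs = np.trace(p)
    p_chance = float(p.sum(axis=1) @ p.sum(axis=0))
    return (p_obs - p_chance) / (1.0 - p_chance)

def is_symmetric(table):
    """First assumption: the joint proportions satisfy p_ij == p_ji."""
    p = _proportions(table)
    return bool(np.allclose(p, p.T))

def agreement_ratios(table):
    """Per-category ratio of observed agreement p_ii to chance agreement p_i+ * p_+i."""
    p = _proportions(table)
    return np.diag(p) / (p.sum(axis=1) * p.sum(axis=0))

# Hypothetical counts: rows are rater A's categories, columns are rater B's.
table = np.array([[20, 5, 2],
                  [5, 30, 3],
                  [2, 3, 30]])

kappa = cohens_kappa(table)
ratios = agreement_ratios(table)
print(f"kappa = {kappa:.3f}, symmetric = {is_symmetric(table)}")
print("observed/chance ratio per category:", np.round(ratios, 2))
```

In this made-up table the ratios differ across categories, so the second assumption would not hold exactly; the model described in the abstract concerns tables where that ratio is constant.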

Sensors ◽  
2019 ◽  
Vol 19 (13) ◽  
pp. 2845 ◽  
Author(s):  
Michael B. Del Rosario ◽  
Nigel H. Lovell ◽  
Stephen J. Redmond

Features were developed which accounted for the changing orientation of the inertial measurement unit (IMU) relative to the body, and demonstrably improved the performance of models for human activity recognition (HAR). The method is proficient at separating periods of standing and sedentary activity (i.e., sitting and/or lying) using only one IMU, even if it is arbitrarily oriented or subsequently re-oriented relative to the body; since the body is upright during walking, learning the IMU orientation during walking provides a reference orientation against which sitting and/or lying can be inferred. Thus, the two activities can be identified (irrespective of the cohort) by analyzing the magnitude of the angle of shortest rotation which would be required to bring the upright direction into coincidence with the average orientation from the most recent 2.5 s of IMU data. Models for HAR were trained using data obtained from a cohort of 37 older adults (83.9 ± 3.4 years) or 20 younger adults (21.9 ± 1.7 years). Test data were generated from the training data by virtually re-orienting the IMU so that it is representative of carrying the phone in five different orientations (relative to the thigh). The overall performance of the model for HAR was consistent whether the model was trained with the data from the younger cohort, and tested with the data from the older cohort after it had been virtually re-oriented (Cohen’s Kappa 95% confidence interval [0.782, 0.793]; total class sensitivity 95% confidence interval [84.9%, 85.6%]), or the reciprocal scenario in which the model was trained with the data from the older cohort, and tested with the data from the younger cohort after it had been virtually re-oriented (Cohen’s Kappa 95% confidence interval [0.765, 0.784]; total class sensitivity 95% confidence interval [82.3%, 83.7%]).
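The angle feature described above can be sketched as follows. This is a minimal illustration, assuming the orientation is summarised as a 3D direction vector in the sensor frame and that the "average orientation" is the arithmetic mean of recent direction vectors; the article may use a different representation (for example quaternions). The function name, the 100 Hz sampling rate and the sample values are hypothetical.

```python
import numpy as np

def shortest_rotation_angle(upright, recent_vectors):
    """Angle in degrees of the shortest rotation taking `upright`
    onto the mean of `recent_vectors` (one 3D vector per row)."""
    u = np.asarray(upright, dtype=float)
    m = np.asarray(recent_vectors, dtype=float).mean(axis=0)
    u = u / np.linalg.norm(u)
    m = m / np.linalg.norm(m)
    # atan2 of |u x m| and u . m is numerically stable near 0 and 180 degrees.
    angle = np.arctan2(np.linalg.norm(np.cross(u, m)), np.dot(u, m))
    return float(np.degrees(angle))

# Hypothetical reference direction learned while walking (body upright).
upright = [0.0, 0.0, 1.0]

# Hypothetical window: 2.5 s of samples at an assumed 100 Hz, all pointing along x.
window = np.tile([1.0, 0.0, 0.0], (250, 1))

angle = shortest_rotation_angle(upright, window)
print(f"angle to upright: {angle:.1f} degrees")
```

A small angle would suggest the current orientation matches the upright reference (standing), and a large angle would suggest sitting or lying; any decision threshold would have to come from the trained model, not from this sketch.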


2021 ◽  
Vol 8 (Supplement_1) ◽  
pp. S297-S297
Author(s):  
Gabrielle Gussin ◽  
Raveena Singh ◽  
Izabela Coimbra Ibraim ◽  
Raheeb Saavedra ◽  
Thomas Tjoa ◽  
...  

Abstract Background Federal mandate requires NHs to perform weekly COVID-19 testing of staff. Testing is effective due to barriers to disclosing mild illness, but it is unclear how long the mandate will last. We explored if environmental samples can be used to signal staff COVID-19 cases as an alternative screening tool in NHs. Methods We conducted a cross sectional study to assess the value of environmental sampling as a trigger for COVID-19 testing of NH staff using data from currently performed weekly staff sweeps. We performed 35 sampling sweeps across 21 NHs from 6/2020-2/2021. For each sweep, we sampled up to 24 high touch objects in NH breakrooms (N=226), entryways (N=216), and nursing stations (N=194) assuming that positive samples were due to contamination from infected staff. Total staff and positive staff counts were tallied for the staff testing sweeps performed the week of and week prior to environmental sampling. Object samples were processed for SARS-CoV-2 using PCR (StepOnePlus) with a 1 copy/mL limit of detection. We evaluated concordance between object and staff positivity using Cohen’s kappa and calculated the positive and negative predictive value (PPV, NPV) of environmental sweeps for staff positivity, including the attributable capture of positive staff. We tested the association between the proportion of staff positivity and object contamination by room type in a linear regression model when clustering by NH. Results Among 35 environmental sweeps, 49% had SARS-CoV-2 positive objects and 69% had positive staff in the same or prior week. Mean positivity was 16% (range 0-83%) among objects and 4% (range 0-22%) among staff. Overall, NPV was 61% and Cohen’s kappa was 0.60. PPV of object sampling as an indicator of positive staff was 100% for every room type, with an attributable capture of positive staff of 76%, with values varying by room type (Table). Breakroom samples were the strongest indicator of any staff cases. 
Each percent increase in object positivity was associated with an increase in staff positivity in entryways (7.2% increased staff positivity, P=0.01) and nursing stations (5.7% increased staff positivity, P=0.05). Conclusion If mandatory weekly staff testing ends in NHs, environmental sampling may serve as an effective tool to trigger targeted COVID-19 testing sweeps of NH staff.
Disclosures Gabrielle Gussin, MS: Medline (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic and cleaning products); Stryker (Sage) (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic products); Xttrium (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic products). Raveena Singh, MA: Medline (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic and cleaning products); Stryker (Sage) (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic products); Xttrium (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic products). Raheeb Saavedra, AS: Medline (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic and cleaning products); Stryker (Sage) (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic products); Xttrium (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic products). Susan S. Huang, MD, MPH: Medline (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic and cleaning products); Molnlycke (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic and cleaning products); Stryker (Sage) (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic and cleaning products); Xttrium (Other Financial or Material Support, Conducted studies in which participating hospitals and nursing homes received contributed antiseptic and cleaning products).
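To make the sweep-level statistics in this abstract concrete, the sketch below computes PPV, NPV and Cohen's kappa from a 2×2 table. The cell counts are not reported in the abstract; they are one reconstruction that is consistent with the reported percentages (35 sweeps, 49% with positive objects, 69% with positive staff, PPV 100%, NPV 61%, kappa 0.60) and should be read as an assumption. The function name is hypothetical.

```python
def screening_metrics(tp, fp, fn, tn):
    """PPV, NPV and Cohen's kappa for a 2x2 table.

    tp: object-positive sweeps with positive staff
    fp: object-positive sweeps without positive staff
    fn: object-negative sweeps with positive staff
    tn: object-negative sweeps without positive staff
    """
    n = tp + fp + fn + tn
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    p_obs = (tp + tn) / n
    # Chance agreement from the two sets of marginal totals.
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (p_obs - p_chance) / (1 - p_chance)
    return {"ppv": ppv, "npv": npv, "kappa": kappa}

# Assumed counts, inferred from the reported percentages (not stated in the abstract).
metrics = screening_metrics(tp=17, fp=0, fn=7, tn=11)
print({name: round(value, 2) for name, value in metrics.items()})
```

The attributable capture of positive staff (76%) depends on staff-level counts that the abstract does not give, so it is not reproduced here.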


2019 ◽  
Vol 14 ◽  
Author(s):  
Andrea Smargiassi ◽  
Riccardo Inchingolo ◽  
Marco Chiappetta ◽  
Leonardo Petracca Ciavarella ◽  
Stefania Lopatriello ◽  
...  

Background: Chest ultrasonography (chest US) has shown good sensitivity in detecting pneumothorax, pleural effusions and peripheral consolidations, and it can be performed at the bedside. Objectives: The aim of the study was to analyze agreement between chest US and chest X-ray in patients who have undergone thoracic surgery and to discuss cases of discordance. Methods: Patients undergoing thoracic surgery were retrospectively selected. Patients routinely underwent chest X-ray (CXR) during the first 48 h after surgery. Chest US was routinely performed in all selected patients on the same date as CXR. Chest US operators were blinded to both the reports and the images of CXR. Ultrasonographic findings regarding pneumothorax (PNX), subcutaneous emphysema (SCE), lung consolidations (LC), pleural effusions (PE) and hemi-diaphragm position were collected and compared to the corresponding CXR findings. Inter-rater agreement between the two techniques was determined by Cohen’s kappa coefficient. Results: Twenty-four patients were selected. Inter-rater agreement showed a moderate magnitude for PNX (Cohen’s Kappa 0.5), a slight/fair magnitude for SCE (Cohen’s Kappa 0.21), a fair magnitude for PE (Cohen’s Kappa 0.39), no agreement for LC (Cohen’s Kappa 0.06), and a high level of agreement for the position of the hemi-diaphragm (Cohen’s Kappa 0.7). Conclusion: Analysis of agreement between chest X-ray and chest US showed that ultrasonography is able to detect important findings for surgeons. Limitations and advantages were found for both chest X-ray and chest US. Knowing the limits of each is important to justify and optimize the use of ionizing radiation.


2019 ◽  
Vol 27 (2) ◽  
pp. 152-161
Author(s):  
Shirley Moore Waugh ◽  
Jianghua He

Background and Purpose: Efforts to establish support for the reliability of quality indicator data are ongoing. Most patients typically receive recommended care; the resulting high prevalence of events makes statistical analysis challenging. This article presents a novel statistical approach recently used to estimate inter-rater agreement for the National Database for Nursing Quality Indicator pressure injury risk and prevention data. Methods: Inter-rater agreement was estimated by prevalence-adjusted kappa values. Data modifications were also made to overcome convergence issues caused by sparse cross-tables. Results: Cohen's kappa values suggested low reliability despite high levels of agreement between raters. Conclusion: Prevalence-adjusted kappa values should be presented alongside Cohen's kappa values in order to evaluate inter-rater agreement when the majority of patients receive recommended care.
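The contrast this abstract describes, low Cohen's kappa despite high raw agreement, can be reproduced with a small sketch. It assumes that "prevalence-adjusted kappa" refers to the common prevalence-adjusted, bias-adjusted form, which replaces chance agreement with 1/k for k categories (2·p_obs − 1 for binary ratings); the article's exact estimator may differ. The counts and function names are hypothetical.

```python
import numpy as np

def observed_agreement(table):
    """Proportion of targets on which the two raters agree."""
    p = np.asarray(table, dtype=float)
    return float(np.trace(p) / p.sum())

def cohens_kappa(table):
    """Cohen's kappa, with chance agreement taken from the marginal proportions."""
    p = np.asarray(table, dtype=float)
    p = p / p.sum()
    p_chance = float(p.sum(axis=1) @ p.sum(axis=0))
    return (np.trace(p) - p_chance) / (1.0 - p_chance)

def prevalence_adjusted_kappa(table):
    """Assumed form: chance agreement fixed at 1/k (equals 2 * p_obs - 1 when k = 2)."""
    k = np.asarray(table).shape[0]
    return (k * observed_agreement(table) - 1.0) / (k - 1.0)

# Hypothetical high-prevalence data: nearly all patients receive recommended care.
# Rows: rater A (care given, not given); columns: rater B.
table = [[90, 4],
         [4, 2]]

p_obs = observed_agreement(table)
kappa = cohens_kappa(table)
adjusted = prevalence_adjusted_kappa(table)
print(f"agreement = {p_obs:.2f}, kappa = {kappa:.2f}, adjusted kappa = {adjusted:.2f}")
```

With these made-up counts the raters agree on 92% of patients, yet Cohen's kappa is low because the skewed marginals push chance agreement close to the observed agreement; the adjusted value stays high.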


2020 ◽  
Author(s):  
Wagner Diniz de Paula ◽  
Marcelo Palmeira Rodrigues ◽  
Nathali Mireise Costa Ferreira ◽  
Viviane Vieira Passini ◽  
César Augusto Melo e Silva

Abstract Background: High-resolution chest computed tomography (HRCT) signs of interstitial lung disease (ILD) are varied, some corresponding to irreparable parenchymal destruction and fibrosis, others representing potentially reversible changes, such as fine reticulation and ground-glass opacities (GGO). GGO frequently correspond to sites of active inflammation that may be responsive to steroids or immunosuppressive agents, but they might also represent intralobular interstitial fibrosis not resolved by current HRCT technique. Our aim was to investigate the ability of lung MRI to predict treatment response in individuals with ILD presenting with predominant GGO. Methods: In this prospective cohort, 15 participants (4 male and 11 female) aged 38–84 years, presenting with ILD manifested as predominant GGO and referred for a new treatment regimen with a systemic glucocorticoid and/or an immunosuppressive agent, underwent 1.5 T lung MRI with breath-hold (SSFSE) and respiratory-gated (PROPELLER) T2-weighted pulse sequences, and with dynamic contrast-enhanced fat-suppressed T1-weighted pulse sequence (LAVA). Relative signal intensity on T2-weighted images and relative enhancement of lung lesions were compared to functional response in a dichotomous fashion (response versus non-response) with t test for independent samples. SSFSE/PROPELLER T2 mismatch was compared to response with Fisher’s exact test. Inter-rater agreement was evaluated with Cohen’s kappa coefficient. The primary endpoint for response was a greater than 10% increase in forced vital capacity in 10 weeks. Results: Responders (4/15, 27%) and non-responders (11/15, 73%) showed similar relative signal intensity on T2-weighted images and relative enhancement measurements. 
SSFSE/PROPELLER T2 mismatch was able to discriminate responders from non-responders in 12 of 15 participants (80% accuracy, p = 0.026) for readers 1 and 2, and in 13 of 15 participants (87% accuracy, p = 0.011) for reader 3, with inter-rater agreement of 87% between readers 1 and 2 (Cohen’s kappa coefficient of 0.732) and 93% between readers 1/2 and 3 (Cohen’s kappa coefficient of 0.865). Conclusions: SSFSE/PROPELLER T2 mismatch was predictive of lack of response to treatment in this small group of ILD patients presenting with predominant GGO at HRCT. Key Point: SSFSE/PROPELLER T2 mismatch may help predict lack of response to anti-inflammatory/immunosuppressive treatment in interstitial lung disease, with high accuracy and high inter-rater agreement.


2018 ◽  
Vol 7 (2) ◽  
pp. 205846011875460 ◽  
Author(s):  
Fredrik Jäderling ◽  
Tommy Nyberg ◽  
Michael Öberg ◽  
Stefan Carlsson ◽  
Mikael Skorpil ◽  
...  

Background The evidence supporting the use of magnetic resonance imaging (MRI) in prostate cancer detection has been established, but its accuracy in local staging is questioned. Purpose To investigate the additional value of multi-planar radial reconstructions of a three-dimensional (3D) T2-weighted (T2W) MRI sequence, intercepting the prostate capsule perpendicularly, for improving local staging of prostate cancer. Material and Methods Preoperative, bi-parametric prostate MRI examinations in 94 patients operated on between June 2014 and January 2015 were retrospectively reviewed by two experienced abdominal radiologists. Each patient was presented in two separate sets including diffusion-weighted imaging, without and with the 3D T2W set that included radial reconstructions. The two sets were read at least two months apart. Extraprostatic tumor extension (EPE) was assessed according to a 5-point grading scale. Sensitivity and specificity for EPE were calculated and presented as receiver operating characteristics (ROC) with area under the curve (AUC), using histology from whole-mount prostate specimens as the gold standard. Inter-rater agreement was calculated for the two different reading modes using Cohen’s kappa. Results The AUC for detection of EPE for Readers 1 and 2 in the two-dimensional (2D) set was 0.70 and 0.68, respectively, and for the 2D + 3D set 0.62 and 0.65, respectively. Inter-rater agreement (Reader 1 vs. Reader 2) on EPE using Cohen’s kappa for the 2D and 2D + 3D set, respectively, was 0.42 and 0.17 (i.e. moderate and poor agreement, respectively). Conclusion The addition of 3D T2W MRI with radial reconstructions did not improve local staging in prostate cancer.


Author(s):  
Miriam Athmann ◽  
Roya Bornhütter ◽  
Nicolaas Busscher ◽  
Paul Doesburg ◽  
Uwe Geier ◽  
...  

Abstract In the image forming methods, copper chloride crystallization (CCCryst), capillary dynamolysis (CapDyn), and circular chromatography (CChrom), characteristic patterns emerge in response to different food extracts. These patterns reflect the resistance to decomposition as an aspect of resilience and are therefore used in product quality assessment complementary to chemical analyses. In the presented study, rocket lettuce from a field trial with different radiation intensities, nitrogen supply, biodynamic, organic and mineral fertilization, and with or without horn silica application was investigated with all three image forming methods. The main objective was to compare two different evaluation approaches, differing in the type of image forming method leading the evaluation, the number of factors analyzed, and the deployed perceptual strategy: Firstly, image evaluation of samples from all four experimental factors simultaneously by two individual evaluators was based mainly on analyzing structural features in CapDyn (analytical perception). Secondly, a panel of eight evaluators applied a Gestalt evaluation imbued with a kinesthetic engagement of CCCryst patterns from either fertilization treatments or horn silica treatments, followed by a confirmatory analysis of individual structural features. With the analytical approach, samples from different radiation intensities and N supply levels were identified correctly in two out of two sample sets with groups of five samples per treatment each (Cohen’s kappa, p = 0.0079), and the two organic fertilizer treatments were differentiated from the mineral fertilizer treatment in eight out of eight sample sets with groups of three manure and two minerally fertilized samples each (Cohen’s kappa, p = 0.0048). 
With the panel approach based on Gestalt evaluation, biodynamic fertilization was differentiated from organic and mineral fertilization in two out of two exams with 16 comparisons each (Friedman test, p < 0.001), and samples with horn silica application were successfully identified in two out of two exams with 32 comparisons each (Friedman test, p < 0.001). Further research will show which properties of the food decisive for resistance to decomposition are reflected by analytical and Gestalt criteria, respectively, in CCCryst and CapDyn images.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Alexandre Maciel-Guerra ◽  
Necati Esener ◽  
Katharina Giebel ◽  
Daniel Lea ◽  
Martin J. Green ◽  
...  

Abstract Streptococcus uberis is one of the leading pathogens causing mastitis worldwide. Identification of S. uberis strains that fail to respond to treatment with antibiotics is essential for better decision making and treatment selection. We demonstrate that the combination of supervised machine learning and matrix-assisted laser desorption ionization/time of flight (MALDI-TOF) mass spectrometry can discriminate strains of S. uberis causing clinical mastitis that are likely to be responsive or unresponsive to treatment. Diagnostics prediction systems trained on 90 individuals from 26 different farms achieved up to 86.2% and 71.5% in terms of accuracy and Cohen’s kappa. The performance was further increased by adding metadata (parity, somatic cell count of previous lactation and count of positive mastitis cases) to encoded MALDI-TOF spectra, which increased accuracy and Cohen’s kappa to 92.2% and 84.1% respectively. A computational framework integrating protein–protein networks and structural protein information to the machine learning results unveiled the molecular determinants underlying the responsive and unresponsive phenotypes.


Author(s):  
Maximilian Lutz ◽  
Martin Möckel ◽  
Tobias Lindner ◽  
Christoph J. Ploner ◽  
Mischa Braun ◽  
...  

Abstract Background Management of patients with coma of unknown etiology (CUE) is a major challenge in most emergency departments (EDs). CUE is associated with a high mortality and a wide variety of pathologies that require differential therapies. A suspected diagnosis issued by pre-hospital emergency care providers often drives the first approach to these patients. We aim to determine the accuracy and value of the initial diagnostic hypothesis in patients with CUE. Methods Consecutive ED patients presenting with CUE were prospectively enrolled. We obtained the suspected diagnoses or working hypotheses from standardized reports given by prehospital emergency care providers, both paramedics and emergency physicians. Suspected and final diagnoses were classified into I) acute primary brain lesions, II) primary brain pathologies without acute lesions and III) pathologies that affected the brain secondarily. We compared suspected and final diagnosis with percent agreement and Cohen’s Kappa including sub-group analyses for paramedics and physicians. Furthermore, we tested the value of suspected and final diagnoses as predictors for mortality with binary logistic regression models. Results Overall, suspected and final diagnoses matched in 62% of 835 enrolled patients. Cohen’s Kappa showed a value of κ = .415 (95% CI .361–.469, p < .005). There was no relevant difference in diagnostic accuracy between paramedics and physicians. Suspected diagnoses did not significantly interact with in-hospital mortality (e.g., suspected class I: OR .982, 95% CI .518–1.836) while final diagnoses interacted strongly (e.g., final class I: OR 5.425, 95% CI 3.409–8.633). Conclusion In cases of CUE, the suspected diagnosis is unreliable, regardless of different pre-hospital care providers’ qualifications. It is not an appropriate decision-making tool as it neither sufficiently predicts the final diagnosis nor detects the especially critical comatose patient. 
To avoid the risk of mistriage and unnecessarily delayed therapy, we advocate for a standardized diagnostic work-up for all CUE patients that should be triggered by the emergency symptom alone and not by any suspected diagnosis.

