measures of agreement
Recently Published Documents



2022 ◽  
Vol 29 (1) ◽  
pp. 1-70
Author(s):  
Radu-Daniel Vatavu ◽  
Jacob O. Wobbrock

We clarify fundamental aspects of end-user elicitation, enabling such studies to be run and analyzed with confidence, correctness, and scientific rigor. To this end, our contributions are multifold. We introduce a formal model of end-user elicitation in HCI and identify three types of agreement analysis: expert, codebook, and computer. We show that agreement is a mathematical tolerance relation generating a tolerance space over the set of elicited proposals. We review current measures of agreement and show that all can be computed from an agreement graph. In response to recent criticisms, we show that chance agreement represents an issue solely for inter-rater reliability studies and not for end-user elicitation, where it is opposed by chance disagreement. We conduct extensive simulations of 16 statistical tests for agreement rates, and report Type I errors and power. Based on our findings, we provide recommendations for practitioners and introduce a five-level hierarchy for elicitation studies.
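
To make the notion of an agreement rate concrete, the following is a minimal sketch of a pairwise agreement rate for a single referent. It is not taken from the abstract: it assumes proposals have already been coded to labels, so that two participants agree exactly when their labels are equal (a simplification of the tolerance relation described above). The function name `agreement_rate` is hypothetical.

```python
from collections import Counter

def agreement_rate(proposals):
    """Fraction of participant pairs whose coded proposals match.

    Assumption: agreement is label equality; the paper's tolerance
    relation is more general and need not be transitive.
    """
    n = len(proposals)
    if n < 2:
        raise ValueError("need at least two proposals")
    # A group of k identical proposals contributes k*(k-1) ordered agreeing pairs.
    agreeing = sum(k * (k - 1) for k in Counter(proposals).values())
    return agreeing / (n * (n - 1))
```

Under this definition the rate is 1 when every participant proposes the same label and 0 when all proposals differ.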


PLoS ONE ◽  
2021 ◽  
Vol 16 (12) ◽  
pp. e0261546
Author(s):  
Sam D. Hutchings ◽  
Jim Watchorn ◽  
Rory McDonald ◽  
Su Jeffreys ◽  
Mark Bates ◽  
...  

Introduction: Haemorrhage is a leading cause of death following traumatic injury, and the early detection of hypovolaemia is critical to effective management. However, accurate assessment of circulating blood volume is challenging when using traditional vital signs such as blood pressure. We conducted a study to compare the stroke volume (SV) recorded using two devices, trans-thoracic electrical bioimpedance (TEB) and supra-sternal Doppler (SSD), against a reference standard using trans-thoracic echocardiography (TTE). Methods: A lower body negative pressure (LBNP) model was used to simulate hypovolaemia, and in half of the study sessions lower limb tourniquets were applied, as these are common in military practice and can potentially affect some haemodynamic monitoring systems. To provide a clinically relevant comparison, we constructed an error grid alongside more traditional measures of agreement. Results: 21 healthy volunteers aged 18–40 were enrolled and underwent 2 sessions of LBNP, with and without lower limb tourniquets. With respect to absolute SV values, Bland-Altman analysis showed significant bias in both non-tourniquet and tourniquet strands for TEB (-42.5 / -49.6 ml), rendering further analysis impossible. For SSD, bias was minimal but percentage error was unacceptably high (35% / 48%). Degree of agreement for dynamic change in SV, assessed using four-quadrant plots, showed a seemingly acceptable concordance rate for both TEB (86% / 93%) and SSD (90% / 91%). However, when results were plotted on an error grid, constructed based on expert clinical opinion, a significant minority of measurement errors were identified that had the potential to lead to moderate or severe patient harm. Conclusion: Thoracic bioimpedance and suprasternal Doppler both demonstrated measurement errors that had the potential to lead to clinical harm, and caution should be applied in interpreting their results in the detection of early hypovolaemia following traumatic injury.
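
As an illustration of the Bland-Altman quantities mentioned above, the sketch below computes bias, 95% limits of agreement, and a percentage error from paired readings. It is a hedged example, not the study's analysis code: the function name `bland_altman` is hypothetical, and dividing by the mean of the reference method is an assumption (some authors divide by the mean of both methods).

```python
from statistics import mean, stdev

def bland_altman(test, reference):
    """Return (bias, (lower_loa, upper_loa), percentage_error) for paired readings."""
    if len(test) != len(reference) or len(test) < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    diffs = [t - r for t, r in zip(test, reference)]
    bias = mean(diffs)                       # mean difference, test minus reference
    sd = stdev(diffs)                        # sample SD of the differences
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)
    # Assumption: percentage error = 1.96 * SD / mean of the reference method.
    pct_error = 100 * 1.96 * sd / mean(reference)
    return bias, loa, pct_error
```

A small bias with a large percentage error, as reported for SSD, corresponds to differences that average out near zero but scatter widely around it.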


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Maja Mrevlje ◽  
Manca Oblak ◽  
Gregor Mlinšek ◽  
Jelka Lindič ◽  
Jadranka Buturović-Ponikvar ◽  
...  

Background: Quantification of proteinuria in kidney transplant recipients is important for diagnostic and prognostic purposes. Apart from correlation tests, there have been few evaluations of spot urine protein measurements in kidney transplantation. Methods: In this cross-sectional study involving 151 transplanted patients, we investigated measures of agreement (bias and accuracy) between the estimated protein excretion rate (ePER), determined from the protein-to-creatinine ratio in the first and second morning urine, and 24-h proteinuria, and studied their performance at different levels of proteinuria. Measures of agreement were reanalyzed in relation to allograft histology in 76 patients with kidney biopsies performed for cause before enrolment in the study. Results: For ePER in the first morning urine, percent bias ranged from 1 to 28% and accuracy (within 30% of 24-h collection) ranged from 56 to 73%. For the second morning urine, percent bias ranged from 2 to 11%, and accuracy ranged from 71 to 78%. The accuracy of ePER (within 30%) in first and second morning urine progressively increased from 56 and 71% for low-grade proteinuria (150–299 mg/day) to 60 and 74% for moderate proteinuria (300–999 mg/day), and to 73 and 78% for high-grade proteinuria (≥1000 mg/day). Measures of agreement were similar across histologic phenotypes of allograft injury. Conclusions: The ability of ePER to accurately predict 24-h proteinuria in kidney transplant recipients is modest. However, accuracy improves with an increase in proteinuria. Given the similar accuracy of ePER measurements in first and second morning urine, second morning urine can be used to monitor protein excretion.
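
The two agreement measures named above can be sketched as follows. This is an illustrative example only: the abstract does not give its exact formulas, so the use of the median percent difference for bias is an assumption, and the function names `percent_bias` and `accuracy_within` are hypothetical. "Accuracy within 30%" is taken to mean the share of estimates lying within ±30% of the 24-h reference value.

```python
from statistics import median

def percent_bias(estimates, references):
    """Median percent difference between estimate and reference (assumed definition)."""
    return median([100 * (e - r) / r for e, r in zip(estimates, references)])

def accuracy_within(estimates, references, tolerance=0.30):
    """Percentage of estimates within +/- tolerance of the reference value."""
    hits = sum(abs(e - r) <= tolerance * r for e, r in zip(estimates, references))
    return 100 * hits / len(references)
```

Reporting both numbers matters because a low overall bias can coexist with many individual estimates falling outside the tolerance band.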


2021 ◽  
Author(s):  
Maja Mrevlje ◽  
Manca Oblak ◽  
Gregor Mlinšek ◽  
Jadranka Buturović-Ponikvar ◽  
Jelka Lindič ◽  
...  

Background. Quantification of proteinuria in kidney transplant recipients is important for diagnostic and prognostic purposes. Apart from correlation tests, there have been few evaluations of spot urine protein measurements in kidney transplantation. Methods. In this cross-sectional study involving 151 transplanted patients, we investigated measures of agreement (bias and accuracy) between the estimated protein excretion rate (ePER), determined from the protein-to-creatinine ratio in the first and second morning urine, and 24-hour proteinuria, and studied their performance at different levels of proteinuria. Measures of agreement were reanalyzed in relation to allograft histology in 76 patients with kidney biopsies performed for cause before enrolment in the study. Results. For ePER in the first morning urine, percent bias ranged from 1% to 28% and accuracy (within 30% of 24-hour collection) ranged from 56% to 73%. For the second morning urine, percent bias ranged from 2% to 11%, and accuracy ranged from 71% to 78%. The accuracy of ePER (within 30%) in first and second morning urine progressively increased from 56% and 71% for low-grade proteinuria (150-299 mg/day) to 60% and 74% for moderate proteinuria (300-999 mg/day), and to 73% and 78% for high-grade proteinuria (≥1000 mg/day). Measures of agreement were similar across histologic phenotypes of allograft injury. Conclusions. The ability of ePER to accurately predict 24-hour proteinuria in kidney transplant recipients is modest. However, accuracy improves with an increase in proteinuria. Given the similar accuracy of ePER measurements in first and second morning urine, second morning urine can be used to monitor protein excretion.


Animals ◽  
2020 ◽  
Vol 10 (9) ◽  
pp. 1702
Author(s):  
Betty McGuire ◽  
Destiny Orantes ◽  
Stephanie Xue ◽  
Stephen Parry

Some shelters in the United States consider dogs identified as food aggressive during behavioral evaluations to be unadoptable. We surveyed adopters of dogs from a New York shelter to examine predictive abilities of shelter behavioral evaluations and owner surrender profiles. Twenty of 139 dogs (14.4%) were assessed as resource guarding in the shelter. We found statistically significant associations between shelter assessment as resource guarding and guarding reported in the adoptive home for three situations: taking away toys, bones or other valued objects; taking away food; and retrieving items or food taken by the dog. Similarly, owner descriptions of resource guarding on surrender profiles significantly predicted guarding in adoptive homes. However, positive predictive values for all analyses were low, and more than half of dogs assessed as resource guarding either in the shelter or by surrendering owners did not show guarding post adoption. All three sources of information regarding resource guarding status (surrender profile, shelter behavioral evaluation, and adopter report) were available for 44 dogs; measures of agreement were in the fair range. Thus, reports of resource guarding by surrendering owners and detection of guarding during shelter behavioral evaluations should be interpreted with caution because neither source of information consistently signaled guarding would occur in adoptive homes.
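
For readers unfamiliar with the two statistics involved, here is a minimal sketch of positive predictive value and a chance-corrected agreement coefficient for a 2x2 table. The abstract does not say which agreement measure was used; Cohen's kappa is assumed here because "fair" is the conventional Landis and Koch label for values between 0.21 and 0.40. The function names are hypothetical, and the counts in the usage comment are invented toy values, not study data.

```python
def positive_predictive_value(tp, fp):
    """Share of positive assessments that were confirmed later."""
    return tp / (tp + fp)

def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for two binary ratings summarised as a 2x2 table."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance from the marginal totals of each source.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)

# Toy counts (hypothetical): 10 flagged and confirmed, 10 flagged only,
# 10 reported only, 70 neither.
ppv = positive_predictive_value(10, 10)
kappa = cohens_kappa(10, 10, 10, 70)
```

With these toy counts, half of the flagged cases are confirmed and kappa falls in the "fair" band, which shows how a statistically significant association can still coexist with weak predictive value.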


2019 ◽  
Vol 29 (3) ◽  
pp. 837-853 ◽  
Author(s):  
Miguel de Carvalho ◽  
Bradley J Barney ◽  
Garritt L Page

We propose new summary measures of biomarker accuracy which can be used as companions to existing diagnostic accuracy measures. Conceptually, our summary measures are tantamount to the so-called Hellinger affinity and we show that they can be regarded as measures of agreement constructed from similar geometrical principles as Pearson correlation. We develop a covariate-specific version of our summary index, which practitioners can use to assess the discrimination performance of a biomarker, conditionally on the value of a predictor. We devise nonparametric Bayes estimators for the proposed indexes, derive theoretical properties of the corresponding priors, and assess the performance of our methods through a simulation study. The proposed methods are illustrated using data from a prostate cancer diagnosis study.
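
As a rough illustration of the quantity the abstract refers to, the Hellinger affinity between two discrete distributions is the sum of the square roots of the products of matching probabilities. The sketch below is not the authors' estimator (which is nonparametric Bayes and covariate-specific); it assumes biomarker values have been binned into the same categories for both groups, and the name `hellinger_affinity` is hypothetical.

```python
from math import sqrt

def hellinger_affinity(p, q):
    """Sum of sqrt(p_i * q_i) over matching bins of two probability vectors."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same bins")
    if abs(sum(p) - 1) > 1e-9 or abs(sum(q) - 1) > 1e-9:
        raise ValueError("each distribution must sum to 1")
    return sum(sqrt(a * b) for a, b in zip(p, q))
```

The affinity is 1 when the two distributions coincide and 0 when they do not overlap, so lower values indicate a biomarker that separates the two groups more cleanly.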

