AN ALTERNATIVE CHOICE FOR THE CRITICAL VALUE OF LIMITS OF AGREEMENT AND SIMULATION-BASED SAMPLE SIZE CALCULATION IN BLAND ALTMAN ANALYSIS

2019 ◽  
Vol 21 (2) ◽  
pp. 119-137
Author(s):  
Steven B. Kim ◽  
Diagnostics ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 2134
Author(s):  
Jörg Philipps ◽  
Hannah Mork ◽  
Maria Katz ◽  
Mark Knaup ◽  
Kira Beyer ◽  
...  

Currently, there is no standardized method to evaluate operator reliability in nerve ultrasound. A short prospective protocol using Bland–Altman analysis was developed to assess the level of agreement between operators with different levels of expertise. A control rater without experience in nerve ultrasound, three novices after two months of training, an experienced rater with two years of experience, and a reference rater performed blinded ultrasound examinations of the left median and ulnar nerves at 42 nerve sites in healthy volunteers. The precision of the Bland–Altman agreement analysis was tested using the Preiss–Fisher procedure. Intraclass correlation coefficients (ICC), coefficients of variation, and Bland–Altman limits of agreement were calculated. The sample size calculation and the Preiss–Fisher procedure indicated sufficient precision of the Bland–Altman agreement analysis. Limits of agreement of all trained novices ranged from 2.0 to 2.9 mm² and were within the test’s maximum tolerated difference. Ninety-five percent confidence intervals of the limits of agreement revealed higher precision in the experienced rater’s measurements. Operator reliability in nerve ultrasound of the median and ulnar nerves of the arm can be evaluated with a short prospective controlled protocol using Bland–Altman statistics, allowing a clear distinction between an untrained rater, trained novices after two months of training, and an experienced rater.
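The quantities named in this abstract (bias, limits of agreement, and confidence intervals of the limits) can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `bland_altman`, the multiplier `z = 1.96`, the large-sample standard error sqrt(3·s²/n) for each limit, the normal-quantile interval, and the example values are all assumptions made for illustration.

```python
import math
import statistics


def bland_altman(a, b, z=1.96):
    """Bias, SD of differences, limits of agreement, and approximate CIs of the limits."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    lower = bias - z * sd
    upper = bias + z * sd
    # Assumption: large-sample approximation for the standard error of a limit.
    se_limit = math.sqrt(3.0 * sd ** 2 / n)
    # Assumption: normal quantile; a t quantile would give wider intervals for small n.
    half = z * se_limit
    return {
        "bias": bias,
        "sd": sd,
        "lower": lower,
        "upper": upper,
        "lower_ci": (lower - half, lower + half),
        "upper_ci": (upper - half, upper + half),
    }


# Hypothetical cross-sectional areas (mm²) from two raters; not study data.
rater_a = [10.0, 12.0, 9.0, 11.0]
rater_b = [9.0, 11.5, 9.5, 10.0]
result = bland_altman(rater_a, rater_b)
print(result)
```

Narrower intervals around each limit correspond to the "higher precision" the abstract attributes to the experienced rater's measurements.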


2017 ◽  
Vol 30 (2) ◽  
pp. 233-237 ◽  
Author(s):  
Heidi E. Banse ◽  
Nichol Schultz ◽  
Molly McCue ◽  
Ray Geor ◽  
Dianne McFarlane

Accurate measurement of equine adrenocorticotropin (ACTH) is important for the diagnosis of equine pituitary pars intermedia dysfunction (PPID). Several radioimmunoassays (RIAs) and chemiluminescent immunoassays (CIAs) are used to measure ACTH concentration in horses; whether these methods yield similar results across a range of concentrations has not been determined. We evaluated agreement between a commercial RIA and a commercial CIA. Archived plasma samples (n = 633) were measured with both assays. Correlation between the 2 methods was moderate (r = 0.49, p < 0.001). Bland–Altman analysis revealed poor agreement, with a proportional bias and limits of agreement that widened with increasing values. Poor agreement between assays was also observed when evaluating plasma samples with concentrations at or below the recommended diagnostic cutoff value for PPID testing. The lack of agreement suggests that measurements obtained with the two methods should not be considered interchangeable.
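A proportional bias of the kind described here is commonly checked by regressing the paired differences on the pairwise means; a slope away from zero suggests that the disagreement grows with the measured value. The sketch below is one way to do this with ordinary least squares and is not taken from the study. The function name `proportional_bias_fit` and the example values are hypothetical.

```python
import statistics


def proportional_bias_fit(a, b):
    """Least-squares slope and intercept of (a - b) regressed on (a + b) / 2."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    means = [(x + y) / 2.0 for x, y in zip(a, b)]
    diffs = [x - y for x, y in zip(a, b)]
    m_bar = statistics.mean(means)
    d_bar = statistics.mean(diffs)
    sxx = sum((m - m_bar) ** 2 for m in means)
    if sxx == 0:
        raise ValueError("pairwise means are all identical; slope is undefined")
    sxy = sum((m - m_bar) * (d - d_bar) for m, d in zip(means, diffs))
    slope = sxy / sxx
    intercept = d_bar - slope * m_bar
    return slope, intercept


# Hypothetical assay readings in which the difference is 20% of the mean.
assay_1 = [11.0, 22.0, 33.0, 44.0]
assay_2 = [9.0, 18.0, 27.0, 36.0]
slope, intercept = proportional_bias_fit(assay_1, assay_2)
print(slope, intercept)
```

When the slope is clearly nonzero, fixed limits of agreement computed from a single SD can be misleading, which is consistent with the widening limits reported in the abstract.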


Author(s):  
Mera Usman Muhammed ◽  
Mayaki Abubakar Musa ◽  
Gambo Abdulrahman Abdullahi

This study was carried out to compare digital rectal (DR) thermometer measurements with non-contact infrared thermometer (IRT) measurements at two locations on the face in some large animal species. Two hundred and forty (240) animals comprising equal numbers of three species (cattle, camels and horses), of varying age and either sex, were used. The IR temperature was taken from two sites on the animal's face: the frontal (FIRT) and temporal (TIRT) regions. The mean IR temperatures (FIRT and TIRT) were higher than the rectal temperature (RT) in all the animal species. The two thermometers correlated poorly in all the animal species. Bland-Altman analysis showed high biases and limits of agreement that were not acceptable for clinical purposes. In conclusion, IRT seems to offer a quick and easy way to determine an animal's temperature, but at present it cannot be used clinically as a substitute for the DR thermometer for body temperature measurement in these animal species.


Blood ◽  
2014 ◽  
Vol 124 (21) ◽  
pp. 1605-1605
Author(s):  
Fernanda Gutierrez-Rodrigues ◽  
Bárbara A Santana-Lemos ◽  
Priscila Santos Scheucher ◽  
Raquel M Alves-Paiva ◽  
Rodrigo T. Calado

Abstract Excessive telomere erosion is the molecular etiology of a group of disorders (dyskeratosis congenita, aplastic anemia, idiopathic pulmonary fibrosis) collectively called telomeropathies. Telomere length measurement is an essential diagnostic test for these diseases. The most commonly used methods are terminal restriction fragment (TRF) analysis by Southern blotting (the gold-standard method), flow cytometry combined with fluorescence in situ hybridization (flow-FISH), and quantitative PCR (qPCR). Although the clinical use of these methods has been reported, their utility and characteristics have not been widely compared. Measurement techniques and coefficients of variation often differ among diagnostic services. Here, we directly compared the accuracy, reproducibility, sensitivity, and specificity of flow-FISH and qPCR against TRF for measuring peripheral blood leukocyte telomere length in healthy individuals and patients with telomeropathies. TRF analysis and flow-FISH showed good correlation in samples from healthy subjects (R²=0.60; p<0.0001) and patients (R²=0.51; p<0.0001). Bland-Altman analyses also displayed very good agreement between these methods for both healthy individuals (bias±SD = 0.17±1.03; limits of agreement ranging from 2.24 to -1.88) and patients (bias±SD = 0.0±1.21; limits of agreement ranging from 2.41 to -2.41). In contrast, the comparison between TRF and qPCR yielded modest correlation for samples from healthy individuals (R²=0.35; p<0.0001) and low correlation for patients (R²=0.20; p=0.001). Bland-Altman analysis indicated poor agreement between the two methods for both patients and controls. The mean differences were far from zero and the standard deviations were large. For patients, the bias±SD was 0.78±1.34 with limits of agreement ranging from 3.47 to -1.90, and for controls, the bias±SD was 1.15±1.49 with limits of agreement ranging from 4.14 to -1.84.
Finally, qPCR and flow-FISH also correlated modestly in samples from healthy individuals (R²=0.33; p<0.0001) and did not correlate in patients’ samples (R²=0.1, p=0.08). Bland-Altman analysis corroborated this finding. For controls, the bias±SD was very similar to that found in the comparison between qPCR and TRF analysis (-0.6±1.27; limits of agreement ranging from 1.94 to -3.16). For patients, the bias±SD was -1.15±1.65 with limits of agreement ranging from 2.15 to -4.45, indicating poor agreement between flow-FISH and qPCR in these samples. The intra-assay coefficient of variation (CV) was 10.8±7.1% for flow-FISH and 9.5±7.4% for qPCR (p=0.35). The inter-assay CV was lower for flow-FISH (9.6±7.6%) than for qPCR (16±19.5%; p=0.02). Flow-FISH and qPCR were sensitive (both 100%) and specific (93% and 89%, respectively) in distinguishing very short telomeres. However, the sensitivity (40%) and specificity (63%) of qPCR for detecting telomere length below the tenth percentile were lower than those of flow-FISH (80% sensitivity and 85% specificity). Taken together, these findings indicate that, in the clinical setting, flow-FISH is more accurate and reproducible than qPCR in measuring human leukocyte telomere length. Quantitative PCR exhibited low accuracy in samples from patients with short telomeres. In conclusion, flow-FISH appears to be a more appropriate method for diagnostic purposes. Studies that compare methodologies are helpful in selecting standard methods and in narrowing the differences among laboratories. Disclosures No relevant conflicts of interest to declare.
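The limits of agreement quoted in this abstract follow the usual form bias ± k·SD. The sketch below recomputes them from the reported bias and SD values for two candidate multipliers, k = 1.96 and k = 2. The abstract does not state which multiplier was used, so both values of k, the function name `limits_from_summary`, and the dictionary labels are assumptions made for illustration.

```python
def limits_from_summary(bias, sd, k=1.96):
    """Lower and upper limits of agreement from a reported bias and SD."""
    return bias - k * sd, bias + k * sd


# Bias and SD pairs as reported in the abstract; the labels are ours.
reported = {
    "TRF vs flow-FISH, healthy": (0.17, 1.03),
    "TRF vs flow-FISH, patients": (0.0, 1.21),
    "TRF vs qPCR, patients": (0.78, 1.34),
    "TRF vs qPCR, controls": (1.15, 1.49),
    "qPCR vs flow-FISH, controls": (-0.6, 1.27),
    "qPCR vs flow-FISH, patients": (-1.15, 1.65),
}

for label, (bias, sd) in reported.items():
    for k in (1.96, 2.0):  # assumption: the abstract does not state k
        lower, upper = limits_from_summary(bias, sd, k)
        print(f"{label}: k={k:.2f} -> {lower:.2f} to {upper:.2f}")
```

On these inputs the reported limits appear closer to k = 2 than to k = 1.96, although rounding of the published bias and SD could account for small discrepancies either way.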


2021 ◽  
Author(s):  
Yushui Han ◽  
Ahmed Ibrahim Ahmed ◽  
Chris Schwemmer ◽  
Myra Cocker ◽  
Talal S Alnabelsi ◽  
...  

Abstract Background: Advances in computed tomography (CT) and machine learning have enabled on-site non-invasive assessment of fractional flow reserve (FFRCT). Purpose: To assess the inter-operator variability of coronary CT angiography–derived FFRCT using a machine learning based post-processing prototype. Materials and Methods: We included 60 symptomatic patients who underwent coronary CT angiography. After training, 2 independent operators calculated FFRCT using a machine learning based on-site prototype. FFRCT was measured 1 cm distal to the coronary plaque or in the middle of the segment if no coronary lesions were present. Intraclass correlation coefficient (ICC) and Bland-Altman analysis were used to evaluate inter-operator variability in FFRCT estimates. Sensitivity analyses were performed by cardiac risk factors, degree of stenosis and image quality. Results: A total of 535 coronary segments in 60 patients were assessed. The overall ICC was 0.986 per patient (95% CI: 0.977 to 0.992) and 0.972 per segment (95% CI: 0.967 to 0.977). The absolute mean difference in FFRCT estimates was 0.012 per patient (95% CI for limits of agreement: -0.035 to 0.039) and 0.02 per segment (95% CI for limits of agreement: -0.077 to 0.080). Tight limits of agreement were seen on Bland-Altman analysis. Distal segments had greater variability compared to proximal/mid segments (absolute mean difference 0.011 vs 0.025, p<0.001). Results were similar on sensitivity analysis. Conclusion: A high degree of inter-operator reproducibility can be achieved by on-site machine learning based FFRCT assessment. Future research is required to evaluate the physiological relevance and prognostic value of FFRCT.


2012 ◽  
Vol 109 (3) ◽  
pp. 539-546 ◽  
Author(s):  
Michelle C. Carter ◽  
V. J. Burley ◽  
C. Nykjaer ◽  
J. E. Cade

Accurate dietary assessment is an essential foundation of research in nutritional epidemiology. Due to the weaknesses in current methodology, attention is turning to strategies that automate the dietary assessment process to improve accuracy and reduce the costs and burden to participants and researchers. ‘My Meal Mate’ (MMM) is a smartphone application designed to support weight loss. The present study aimed to validate the diet measures recorded on MMM against a reference measure of 24 h dietary recalls. A sample of fifty volunteers recorded their food and drink intake on MMM for 7 d. During this period, they were contacted twice at random to conduct 24 h telephone recalls. Daily totals for energy (kJ) and macronutrients recorded on MMM were compared against the corresponding day of recall using t tests for group means and Pearson's correlations. Bland–Altman analysis was used to assess the agreement between the methods. Energy (kJ) recorded on MMM correlated well with the recalls (day 1: r 0·77 (95 % CI 0·62, 0·86), day 2: r 0·85 (95 % CI 0·74, 0·91)) and had a small mean difference (day 1 (MMM − recall): −68 kJ/d (95 % CI −553, 418 kJ) (−16 kcal/d, 95 % CI −127, 100 kcal); day 2 (MMM − recall): −441 kJ/d (95 % CI −854, −29 kJ) (−105 kcal/d, 95 % CI −204, −7 kcal)). Bland–Altman analysis showed wide limits of agreement between the methods: −3378 to 3243 kJ/d (−807 to 775 kcal/d) on day 1. At the individual level, the limits of agreement between MMM and the 24 h recall were wide; however, at the group level, MMM appears to have potential as a dietary assessment tool.


2015 ◽  
Vol 18 (01) ◽  
pp. 1550003
Author(s):  
Travis M. Falconer ◽  
Julie Headford ◽  
Stephen Edmondston ◽  
Piers J. Yates

The Oxford Hip Score (OHS) and Oxford Knee Score (OKS) are validated, reliable and reproducible outcome measures; however, their retrospective use has not been examined. The aim of this prospective cohort study was to examine the accuracy and reliability of patients' retrospective recall of their OHS and OKS. A total of 137 patients undergoing primary hip (40) or primary knee (97) arthroplasty with a mean age of 70.8 years (range, 47–88) and a mean time to follow-up of 27.2 months (range, 6–46) were included in the study. The mean retrospective OHS and OKS decreased compared to the pre-operative score (OHS = 1.6 ± SD, p = 0.36, OKS = 4.7 ± SD, p < 0.001). There was only a weak positive relationship between the actual pre-operative scores and the retrospective scores (OHS: r² = 0.30, OKS: r² = 0.19). Bland–Altman analysis demonstrated 95% limits of agreement between scores of -19.9 to 23.1 for the OHS and -15.3 to 24.8 for the OKS. This study shows that patients are poor at retrospectively recalling their pre-operative OHS and OKS, and therefore these scores should not be used in a retrospective manner.

