Bland–Altman Limits of Agreement from a Bayesian and Frequentist Perspective

Stats ◽  
2021 ◽  
Vol 4 (4) ◽  
pp. 1080-1090
Author(s):  
Oke Gerke ◽  
Sören Möller

Bland–Altman agreement analysis has gained widespread application across disciplines, not least in the health sciences, since its inception in the 1980s. Bayesian analysis has been on the rise due to increased computational power over time, and Alari, Kim, and Wand have put Bland–Altman Limits of Agreement in a Bayesian framework (Meas. Phys. Educ. Exerc. Sci. 2021, 25, 137–148). We contrasted the prediction of a single future observation and the estimation of the Limits of Agreement from a frequentist and a Bayesian perspective by analyzing interrater data of two sequentially conducted, preclinical studies. The estimation of the Limits of Agreement θ1 and θ2 has wider applicability than the prediction of single future differences. While a frequentist confidence interval represents a range of nonrejectable values for null hypothesis significance testing of H0: θ1 ≤ −δ or θ2 ≥ δ against H1: θ1 > −δ and θ2 < δ, with a predefined benchmark value δ, Bayesian analysis allows for direct interpretation of both the posterior probability of the alternative hypothesis and the likelihood of parameter values. We briefly discuss group-sequential testing and nonparametric alternatives. Given improved computational resources, frequentist simplicity no longer outweighs Bayesian interpretability, but the elicitation and implementation of prior information demand caution. Accounting for clustered data (e.g., repeated measurements per subject) is well-established in frequentist, but not yet in Bayesian, Bland–Altman analysis.
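The frequentist side of this comparison can be illustrated with a minimal sketch. It assumes paired measurements from two raters, normally distributed differences, and a large-sample normal approximation for the confidence intervals around θ1 and θ2. The function names `limits_of_agreement` and `agreement_within`, and the decision rule based on the outer confidence bounds, are illustrative assumptions and not taken from the paper.

```python
import math
from statistics import NormalDist, mean, stdev


def limits_of_agreement(x, y, coverage=0.95, ci_level=0.95):
    """Bland-Altman limits of agreement with approximate confidence intervals."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the differences (n - 1 denominator)
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    theta1, theta2 = bias - z * sd, bias + z * sd
    # Large-sample standard error of each limit:
    # Var(bias +/- z*sd) ~ sd^2 * (1/n + z^2 / (2 * (n - 1)))
    se_limit = sd * math.sqrt(1 / n + z**2 / (2 * (n - 1)))
    z_ci = NormalDist().inv_cdf(0.5 + ci_level / 2)
    return {
        "bias": bias,
        "sd": sd,
        "theta1": theta1,
        "theta2": theta2,
        "theta1_ci": (theta1 - z_ci * se_limit, theta1 + z_ci * se_limit),
        "theta2_ci": (theta2 - z_ci * se_limit, theta2 + z_ci * se_limit),
    }


def agreement_within(result, delta):
    """Assumed decision rule: claim H1 (theta1 > -delta and theta2 < delta)
    only if the outer confidence bounds lie inside (-delta, delta)."""
    return result["theta1_ci"][0] > -delta and result["theta2_ci"][1] < delta
```

With small samples, a t-based or exact interval would be preferable to the normal quantiles used here; the sketch also ignores clustering such as repeated measurements per subject.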

2021 ◽  
Author(s):  
A Wallin ◽  
M Kierkegaard ◽  
E Franzén ◽  
S Johansson

Abstract Objective: The mini-BESTest is a balance measure for assessment of the underlying physiological systems for balance control in adults. Evaluations of test–retest reliability of the mini-BESTest in larger samples of people with multiple sclerosis (MS) are lacking. The purpose of this study was to investigate test–retest reliability of the mini-BESTest total and section sum scores and individual items in people with mild to moderate overall MS disability. Methods: This study used a test–retest design in a movement laboratory setting. Fifty-four people with mild to moderate overall MS disability according to the Expanded Disability Status Scale (EDSS) were included, with 28 in the mild subgroup (EDSS 2.0–3.5) and 26 in the moderate subgroup (EDSS 4.0–5.5). Test–retest reliability of the mini-BESTest was evaluated by repeated measurements taken 1 week apart. Reliability and measurement error were analyzed. Results: Test–retest reliability for the total scores was considered good to excellent, with intraclass correlation coefficients of .88 for the whole sample, .83 for the mild MS subgroup, and .80 for the moderate MS subgroup. Measurement errors were small, with standard error of measurement and minimal detectable change of 1.3 and 3.5, respectively, in mild MS, and 1.7 and 4.7, respectively, in moderate MS. The limits of agreement were −3.4 and 4.6. Test–retest reliability for the section scores was fair to good or excellent; weighted kappa values ranged from .62 to .83. All items but 1 showed fair to good or excellent test–retest reliability, and percentage agreement ranged from 61% to 100%. Conclusions: The mini-BESTest demonstrated good to excellent test–retest reliability and small measurement errors and is recommended for use in people with mild to moderate MS. Impact: Knowledge of limits of agreement and minimal detectable change contributes to interpretability of the mini-BESTest total score. The findings of this study enhance the clinical usefulness of the test for evaluation of balance control and for designing individually customized balance training with high precision and accuracy in people with MS.
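The relationship between the standard error of measurement (SEM) and the minimal detectable change (MDC) reported above can be sketched as follows. The formulas SEM = SD·√(1 − ICC) and MDC = z·√2·SEM are the commonly used ones; that the authors applied exactly these is an assumption, and the function names are illustrative.

```python
import math
from statistics import NormalDist


def sem_from_icc(sd, icc):
    """Standard error of measurement from a between-subject SD and an ICC."""
    return sd * math.sqrt(1 - icc)


def minimal_detectable_change(sem, level=0.95):
    """Smallest change exceeding measurement error at the given confidence level."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * math.sqrt(2) * sem  # sqrt(2): a change involves two measurements


# Rounded SEM values from the abstract (mild and moderate MS subgroups)
mdc_mild = minimal_detectable_change(1.3)
mdc_moderate = minimal_detectable_change(1.7)
```

Applied to the rounded SEM values, this gives roughly 3.6 and 4.7, close to the reported 3.5 and 4.7; the small gap for the mild subgroup is presumably due to rounding of the SEM.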


Sports ◽  
2018 ◽  
Vol 6 (4) ◽  
pp. 150
Author(s):  
Nico Nitzsche ◽  
Lutz Baumgärtel ◽  
Christian Maiwald ◽  
Henry Schulz

(1) Background: Maximum isokinetic force loads show strongly increased post-load lactate concentrations and an increase in the maximum blood lactate concentration rate (V̇Lamax), depending on load duration. The reproducibility of V̇Lamax must be known to be able to better assess training-related adjustments of anaerobic performance using isokinetic force tests. (2) Methods: 32 subjects were assigned to two groups and completed two unilateral isokinetic force tests (210° s−1, Range of Motion 90°) within seven days. Group 1 (n = 16; age 24.0 ± 2.8 years, BMI 23.5 ± 2.6 kg m−2, training duration: 4.5 ± 2.4 h week−1) completed eight repetitions and group 2 (n = 16; age 23.7 ± 1.9 years, BMI 24.6 ± 2.4 kg m−2, training duration: 5.5 ± 2.1 h week−1) completed 16 repetitions. To determine V̇Lamax, capillary blood (20 µL) was taken before and immediately after loading, and up to the 9th minute post-load. Reproducibility was determined using Pearson and Spearman correlation analyses, and variability was determined using the within-subject standard deviation (Sw) and Limits of Agreement (LoA) from Bland–Altman plots. (3) Results: The correlation of V̇Lamax in group 1 was r = 0.721, and in group 2 r = 0.677. The Sw of V̇Lamax was 0.04 mmol L−1 s−1 in both groups. In group 1, V̇Lamax showed a systematic bias due to measurement repetition of 0.02 mmol L−1 s−1 in an interval (LoA) of ±0.11 mmol L−1 s−1. In group 2, a systematic bias of −0.008 mmol L−1 s−1 at an interval (LoA) of ±0.11 mmol L−1 s−1 was observed for repeated measurements of V̇Lamax. (4) Conclusions: Based on the existing variability, a reliable calculation of V̇Lamax seems to be possible with both short and longer isokinetic force loads. Changes in V̇Lamax above 0.11 mmol L−1 s−1 due to training can be described as a non-random increase or decrease in V̇Lamax.
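The link between the within-subject standard deviation and the ±0.11 mmol L−1 s−1 interval can be made explicit with a short sketch. It assumes two measurements per subject and the usual estimator Sw = √(Σd²/(2n)); whether the authors used this exact estimator is an assumption, and the function names are illustrative.

```python
import math


def within_subject_sd(test, retest):
    """Within-subject SD for two measurements per subject: sqrt(sum(d^2) / (2n))."""
    diffs = [a - b for a, b in zip(test, retest)]
    return math.sqrt(sum(d * d for d in diffs) / (2 * len(diffs)))


def repeatability_limit(sw, z=1.96):
    """Half-width of the interval expected to contain 95% of test-retest differences."""
    return z * math.sqrt(2) * sw


# Using the rounded Sw of 0.04 mmol L^-1 s^-1 reported for both groups
half_width = repeatability_limit(0.04)
```

With Sw = 0.04 this yields a half-width of about 0.11, consistent with the reported LoA interval. Note that this estimator does not remove a systematic bias between the two sessions, whereas Bland–Altman limits are centred on the mean difference.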


2014 ◽  
Vol 94 (1) ◽  
pp. 129-138 ◽  
Author(s):  
Li-ling Chuang ◽  
Ching-yi Wu ◽  
Keh-chung Lin ◽  
Ching-ju Hsieh

Background: Pain is a serious adverse complication after stroke. The combination of a vertical numerical pain rating scale (NPRS) and a faces pain scale (FPS) has been advocated to measure pain after stroke. Objective: This study was conducted to investigate whether an NPRS supplemented with an FPS (NPRS-FPS) would show good test-retest reliability in people with stroke. The relative and absolute reliability of the NPRS-FPS were examined. Design: A test-retest design was used for this study. Methods: Fifty people (>3 months after stroke) participating in an outpatient occupational therapy program were recruited through medical centers to rate current pain intensity twice, at a 1-week interval, with the NPRS-FPS (on a scale from 0 to 10). The relative reliability of the NPRS-FPS was analyzed with the intraclass correlation coefficient for determining the degree of consistency and agreement between 2 measures. The standard error of measurement, the smallest real difference, and Bland-Altman limits of agreement were the absolute reliability indexes used to quantify measurement errors and determine systematic biases of repeated measurements. Results: The relative reliability of the NPRS-FPS was substantial (intraclass correlation coefficient = .82). The standard error of measurement and the smallest real difference at the 90% confidence interval of the NPRS-FPS were 0.81 and 1.87, respectively. The Bland-Altman analyses revealed no significant systematic bias between repeated measurements for the NPRS-FPS. The range of the limits of agreement for the NPRS-FPS was narrow (−2.50 to 1.90), indicating a high level of stability and little variation over time. Limitations: The pain intensity of the participants ranged from no pain to a moderate level of pain. Conclusions: These findings suggest that the NPRS-FPS is a reliable measure of pain in people with stroke, with good relative and absolute reliability.


2009 ◽  
Vol 19 (5) ◽  
pp. 494-500 ◽  
Author(s):  
Miranda J. J. Geelhoed ◽  
Sonja P. E. Snijders ◽  
Veronica E. Kleyburg-Linkers ◽  
Eric A. P. Steegers ◽  
Lennie van Osch-Gevers ◽  
...  

Abstract Background: Echocardiographic measurements are widely used as outcomes of different studies. The aim of this study was to assess intraobserver and interobserver reliability of echocardiographic measurements in healthy children. Materials and methods: We studied 28 children, with a median age of 7.5 years, and inter-quartile range from 3 to 11 years. Intraobserver and interobserver reliability were assessed by repeated measurements of the diameters of the aortic root, the left atrium, and left ventricular end-diastolic structure. We also measured the ventricular end-diastolic septal thickness and the end-diastolic thickness of the left ventricular posterior wall. We calculated intraclass correlation coefficients, with corresponding 95% confidence intervals, and computed Bland and Altman plots, permitting us to derive limits of agreement plus or minus 2 standard deviations for the mean differences in cardiac measurements. Results: We found high intraobserver and interobserver intraclass correlation coefficients, ranging from 0.91 for ventricular septal thickness, with 95% confidence intervals from 0.78 to 0.96, to 0.99 for the diameter of the aortic root, 95% confidence interval from 0.97 to 1.00. Limits of agreement in the Bland and Altman plots ranged from zero millimetres for left ventricular end-diastolic posterior wall thickness to 1.60 millimetres (6.3%) for left atrial diameter. Conclusions: Our study demonstrated good repeatability and reproducibility for ultrasonic measurements of left cardiac structures in children, showing that values obtained for measurement of these structures in both clinical and epidemiological research projects can be confidently accepted.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Siti Hanum Mohd Ali ◽  
Normaliza Omar ◽  
Mohamed Swarhib Shafie ◽  
Nik Azuan Nik Ismail ◽  
Helmi Hadi ◽  
...  

Abstract Background: Sex estimation using the subpubic angle of the pelvis is highly accurate for identification of unknown skeletonized remains. This study compared two methods for measuring the subpubic angle from reconstructed three-dimensional (3D) pelvic models. The aims were to quantify the differences in the subpubic angle measurement by Checkpoint (Method 1) and MeshLab + OnScreenProtractor (Method 2), to determine the 95% limits of agreement and to identify any measurement bias. Multislice computed tomography (MSCT) scans of 85 individuals were used in this study. The MSCT scans were performed on a Siemens SOMATOM Sensation 64 scanner (Siemens Germany Ltd.). Segmentation of the MSCT scans was performed using 3D Slicer to reconstruct 3D pelvic models. Subpubic angle was measured on Checkpoint using four landmarks (Method 1), and with OnScreenProtractor on MeshLab (Method 2). Results: The intraclass correlation coefficient (ICC) showed a high correlation between repeated measurements in both methods. Subpubic angle measurements by Method 1 and Method 2 were significantly different (p < 0.05). Method 2 (M = 82.2°, SD = 13.5°) consistently showed a larger subpubic angle measurement than Method 1 (M = 77.3°, SD = 12.3°) (consistent bias). More than 95% of the differences (82/85) between Checkpoint and MeshLab fell within the 95% limits of agreement (−1.4° and 11.4°). Conclusion: Checkpoint and MeshLab displayed significantly different subpubic angle measurements on a 3D pelvic model, but within the 95% limits of agreement. MeshLab tended to give a larger measurement (5°) across the magnitude of the subpubic angle. The decision to use the two methods interchangeably depended on the clinical judgment of the observer.


Author(s):  
Sirkka-Liisa Lauronen ◽  
Maija-Liisa Kalliomäki ◽  
Jarkko Kalliovalkama ◽  
Antti Aho ◽  
Heini Huhtala ◽  
...  

Abstract Because of the difficulties involved in the invasive monitoring of conscious patients, core temperature monitoring is frequently neglected during neuraxial anaesthesia. Zero heat flux (ZHF) and double sensor (DS) are non-invasive methods that measure core temperature from the forehead skin. Here, we compare these methods in patients under spinal anaesthesia. Sixty patients scheduled for elective unilateral knee arthroplasty were recruited and divided into two groups. Of these, thirty patients were fitted with bilateral ZHF sensors (ZHF group), and thirty patients were fitted with both a ZHF sensor and a DS sensor (DS group). Temperatures were saved at 5-min intervals from the beginning of prewarming up to one hour postoperatively. Bland–Altman analysis for repeated measurements was performed, and the proportion of differences within 0.5 °C was calculated as well as Lin's concordance correlation coefficient (LCCC). A total of 1261 and 1129 measurement pairs were obtained. The mean difference between ZHF sensors was 0.05 °C with 95% limits of agreement −0.36 to 0.47 °C, 99% of the readings were within 0.5 °C and LCCC was 0.88. The mean difference between ZHF and DS sensors was 0.33 °C with 95% limits of agreement −0.55 to 1.21 °C, 66% of readings were within 0.5 °C and LCCC was 0.59. Bilaterally measured ZHF temperatures were almost identical. DS temperatures were mostly lower than ZHF temperatures. The mean difference between ZHF and DS temperatures increased when the core temperature decreased. Trial registration: The study was registered in ClinicalTrials.gov on 13th May 2019, Code NCT03408197.
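The two summary measures named in this abstract, the proportion of differences within 0.5 °C and Lin's concordance correlation coefficient, can be sketched as below. The sketch treats all measurement pairs as independent, which is a simplifying assumption: the study itself used a Bland–Altman analysis for repeated measurements, and this code does not model that clustering. Function names and the sample readings are illustrative.

```python
from statistics import fmean


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient using 1/n (population) moments."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Penalises both poor correlation and a shift between the two means
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def proportion_within(x, y, tolerance=0.5):
    """Share of paired readings whose absolute difference is at most `tolerance`."""
    return sum(abs(a - b) <= tolerance for a, b in zip(x, y)) / len(x)


# Hypothetical paired temperature readings in degrees Celsius
sensor_a = [36.5, 36.8, 37.0]
sensor_b = [36.6, 36.2, 37.4]
share = proportion_within(sensor_a, sensor_b)
ccc = lins_ccc(sensor_a, sensor_b)
```

Unlike Pearson's r, the coefficient drops below 1 when one sensor reads systematically lower than the other, even if the readings are perfectly correlated.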


Author(s):  
Innocent Boyle Eraikhuemen ◽  
Olateju Alao Bamigbala ◽  
Umar Alhaji Magaji ◽  
Bassa Shiwaye Yakura ◽  
Kabiru Ahmed Manju

In the present paper, a three-parameter Weibull-Lindley distribution is considered for Bayesian analysis. A shape parameter of the Weibull-Lindley distribution is estimated using both classical and Bayesian methods. Bayesian estimators are obtained by using Jeffreys' prior, a uniform prior and a gamma prior under the squared error loss function, the quadratic loss function and the precautionary loss function. Estimation by the method of maximum likelihood is also discussed. These methods are compared by mean squared error in a simulation study with varying parameter values and sample sizes.

