Assessing bias, precision, and agreement in method comparison studies

2019 ◽  
Vol 29 (3) ◽  
pp. 778-796 ◽  
Author(s):  
Patrick Taffé

Recently, a new estimation procedure has been developed to assess bias and precision of a new measurement method, relative to a reference standard. However, the author did not develop confidence bands around the bias and standard deviation curves. Therefore, the goal in this paper is to extend this methodology in several important directions. First, we develop simultaneous confidence bands for the various estimated parameters, to allow formal comparisons between different measurement methods. Second, we propose a new index of agreement. Third, we provide a series of new graphs to help the investigator assess bias, precision, and agreement between the two measurement methods. The methodology requires repeated measurements on each individual for at least one of the two measurement methods. It works very well to estimate the differential and proportional biases, even with as few as two to three measurements by one of the two methods and only one by the other. The repeated measurements need not come from the reference standard; they may come from either measurement method. This is a great advantage, as it may sometimes be more feasible to gather repeated measurements with the new measurement method.

Author(s):  
Patrick Taffé ◽  
Mingkai Peng ◽  
Vicki Stagg ◽  
Tyler Williamson

Bland and Altman's (1986, Lancet 327: 307–310) limits of agreement have been used in many clinical research settings to assess agreement between two methods of measuring a quantitative characteristic. However, when the variances of the measurement errors of the two methods differ, limits of agreement can be misleading. biasplot implements a new statistical methodology that Taffé (Forthcoming, Statistical Methods in Medical Research) recently developed to circumvent this issue and assess bias and precision of the two measurement methods (one is the reference standard, and the other is the new measurement method to be evaluated). biasplot produces three new plots introduced by Taffé: the “bias plot”, “precision plot”, and “comparison plot”. These help the investigator visually evaluate the performance of the new measurement method. In this article, we introduce the user-written command biasplot and present worked examples using simulated data included with the package. Note that the Taffé method assumes there are several measurements from the reference standard and possibly as few as one measurement from the new method for each individual.


2016 ◽  
Vol 27 (6) ◽  
pp. 1650-1660 ◽  
Author(s):  
Patrick Taffé

Bland and Altman’s limits of agreement have traditionally been used in clinical research to assess the agreement between different methods of measurement for quantitative variables. However, when the variances of the measurement errors of the two methods are different, Bland and Altman’s plot may be misleading: there are settings where the regression line shows an upward or downward trend although there is no bias, and settings where the slope is zero although there is a bias. Therefore, the goal of this paper is to clearly illustrate why and when a bias arises, particularly when heteroscedastic measurement errors are expected, and to propose two new plots, the “bias plot” and the “precision plot,” to help the investigator visually and clinically appraise the performance of the new method. These plots do not have the above-mentioned defect and are still easy to interpret, in the spirit of Bland and Altman’s limits of agreement. To achieve this goal, we rely on the modeling framework recently developed by Nawarathna and Choudhary, which allows the measurement errors to be heteroscedastic and depend on the underlying latent trait. Their estimation procedure, however, is complex and rather daunting to implement. We have, therefore, developed a new estimation procedure, which is much simpler to implement and, yet, performs very well, as illustrated by our simulations. The methodology requires several measurements with the reference standard and possibly only one with the new method for each individual.
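The defect described in this abstract can be reproduced with a small simulation. The following is a minimal sketch, not the estimation procedure of the paper: it assumes homoscedastic errors, one reading per method, and no bias at all, and the function names and parameter values are illustrative. It computes the classical limits of agreement (mean difference ± 1.96 standard deviations of the differences) and the slope of the regression of differences on averages.

```python
import random
import statistics


def simulate_pairs(n, sd_true, sd_ref, sd_new, seed=0):
    """Simulate one reading per method with NO bias; only the error SDs differ."""
    rng = random.Random(seed)
    y_ref, y_new = [], []
    for _ in range(n):
        x = rng.gauss(50.0, sd_true)              # latent true value (assumed normal)
        y_ref.append(x + rng.gauss(0.0, sd_ref))  # reference standard reading
        y_new.append(x + rng.gauss(0.0, sd_new))  # new method reading
    return y_ref, y_new


def limits_of_agreement(y_ref, y_new):
    """Return (lower limit, mean difference, upper limit)."""
    diffs = [b - a for a, b in zip(y_ref, y_new)]
    mean_diff = statistics.fmean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)
    return mean_diff - half_width, mean_diff, mean_diff + half_width


def bland_altman_slope(y_ref, y_new):
    """Least-squares slope of the differences regressed on the pairwise averages."""
    diffs = [b - a for a, b in zip(y_ref, y_new)]
    means = [(a + b) / 2 for a, b in zip(y_ref, y_new)]
    m_bar = statistics.fmean(means)
    d_bar = statistics.fmean(diffs)
    sxy = sum((m - m_bar) * (d - d_bar) for m, d in zip(means, diffs))
    sxx = sum((m - m_bar) ** 2 for m in means)
    return sxy / sxx


y_ref, y_new = simulate_pairs(20000, sd_true=10.0, sd_ref=2.0, sd_new=8.0)
lower, mean_diff, upper = limits_of_agreement(y_ref, y_new)
slope = bland_altman_slope(y_ref, y_new)
print(f"mean difference: {mean_diff:.3f}, limits: ({lower:.2f}, {upper:.2f})")
print(f"slope of differences on averages: {slope:.3f}")
```

Under these assumed settings the mean difference is close to zero, yet the slope is clearly positive. The covariance between the difference and the average equals half the difference of the two error variances, so the trend appears whenever those variances differ, even with no bias.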


2018 ◽  
Vol 28 (8) ◽  
pp. 2557-2565 ◽  
Author(s):  
Patrick Taffé ◽  
Mingkai Peng ◽  
Victoria Stagg ◽  
Tyler Williamson

Bland and Altman’s limits of agreement have been used in many clinical research settings to assess agreement between two methods of measuring a quantitative trait. However, when the variances of the measurement errors of the two methods are different, limits of agreement can be misleading. MethodCompare is an R package that implements a new statistical methodology, developed by Taffé in 2016. MethodCompare produces three new plots, the “bias plot”, the “precision plot”, and the “comparison plot”, to visually evaluate the performance of the new measurement method against the reference method. The method is illustrated on three simulated examples. Note that the Taffé method assumes that there are several measurements from the reference standard and possibly as few as one measurement from the new method for each individual.


It is now generally recognised that future definitions of the units of length will probably be based on the length of a wave of visible light. At present the wave-length of the red radiation of cadmium serves as the basis of all measurements of the lengths of electro-magnetic waves which are perceptible by optical means, and provisional sanction has been given to measurements of length on the same basis, as an alternative to direct reference to the metre. Whether the cadmium red radiation provides the best reference standard for all measurements of length has not yet been definitely established. Two international committees, one representing spectroscopists and the other metrologists, have sanctioned standard specifications for cadmium lamps of the Michelson type from which the red radiation may be produced. The two specifications differ from one another in certain details, but both are subject to the same objections. These objections are directed partly against the high temperature at which it is necessary to run the lamp and partly against the high voltage required to excite the radiation. Therefore, such hyperfine structure and asymmetry as may be present in the red line of cadmium is likely to be masked in the Michelson lamp by a combination of two phenomena —the enhanced Doppler effect due to the high temperature of the radiating cadmium atoms, and the effect of the moderately high intensity of the electric field. Were this not so, it might be somewhat surprising that no definite evidence of fine structure or asymmetry had so far been observed in the red line from the Michelson lamp, notwithstanding the many careful examinations, with the aid of the most sensitive interferometers, to which this line has been subjected, in view of its importance as the reference standard for all other wave-lengths. 
Recently Nagaoka and Sugiura have recorded that they observed slight evidence of structure in the red radiation when excited under special conditions in which great precautions were taken to ensure extreme sharpness of the line. It is believed, however, that no subsequent confirmation of this effect has yet been published.


2013 ◽  
Vol 347-350 ◽  
pp. 197-200
Author(s):  
Yu Gong ◽  
Jing Cai Zhang ◽  
Hong Qi Liu

This paper investigates methods for measuring holes during online inspection of parts. Both the diameter and the position of the hole are to be measured in the same measurement system. To obtain higher accuracy and efficiency, a comparative test was carried out using contact probes, an inductive sensor, a laser sensor, and CCD imaging with front and back lighting. The results show that contact measurement using an inductive sensor is more suitable for the system because it has higher reliability and efficiency.


1997 ◽  
Vol 119 (2) ◽  
pp. 236-242 ◽  
Author(s):  
K. Peleg

The classical calibration problem is primarily concerned with comparing an approximate measurement method with a very precise one. Frequently, both measurement methods are very noisy, so we cannot regard either method as giving the true value of the quantity being measured. Sometimes, it is desired to replace a destructive or slow measurement method by a noninvasive, faster, or less expensive one. The simplest solution is to cross calibrate one measurement method in terms of the other. The common practice is to use regression models as cross calibration formulas. However, such models do not attempt to discriminate between the clutter and the true functional relationship between the cross calibrated measurement methods. A new approach is proposed, based on minimizing the sum of squares of the differences between the absolute values of the Fast Fourier Transform (FFT) series derived from the readings of the cross calibrated measurement methods. The approach is illustrated by cross calibration examples of simulated linear and nonlinear measurement systems, with various levels of additive noise, in which the new method is compared to the classical regression techniques. It is shown that the new method can better discover the true functional relationship between two measurement systems, which is occluded by the noise.
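One possible reading of the spectral criterion, restricted to a linear calibration formula, can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of the paper: the model y ≈ a + b·x, the restriction b ≥ 0, the positive zero-frequency term, and the function names are all hypothetical choices made here. For a linear model the constant a affects only the zero-frequency term, so the least-squares match of the magnitudes has a closed form.

```python
import cmath
import math


def dft_magnitudes(values):
    """Absolute values of the discrete Fourier transform (plain O(n^2) DFT)."""
    n = len(values)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values)))
        for k in range(n)
    ]


def fit_linear_by_spectrum(x, y):
    """Fit y ~ a + b*x by matching DFT magnitudes (assumes b >= 0)."""
    n = len(x)
    mag_x = dft_magnitudes(x)
    mag_y = dft_magnitudes(y)
    # For k >= 1 the constant a drops out, so |DFT(a + b*x)_k| = b * |X_k|.
    # Minimizing sum_k (|Y_k| - b*|X_k|)^2 over b gives the ratio below.
    b = sum(p * q for p, q in zip(mag_x[1:], mag_y[1:])) / sum(p * p for p in mag_x[1:])
    # Zero-frequency term: DFT(a + b*x)_0 = n*a + b*sum(x), assumed positive here,
    # so a can be chosen to match |Y_0| exactly.
    a = (mag_y[0] - b * sum(x)) / n
    return a, b


# Noise-free illustrative readings from two hypothetical instruments.
x = [1.0 + 0.5 * math.sin(2 * math.pi * t / 16) + 0.05 * t for t in range(32)]
y = [2.0 + 3.0 * v for v in x]
a, b = fit_linear_by_spectrum(x, y)
print(f"a = {a:.4f}, b = {b:.4f}")
```

With noise-free readings the sketch recovers the generating intercept and slope. How the criterion behaves with noisy or nonlinear systems, which is the subject of the abstract, is not addressed by this simplified version.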


2017 ◽  
Author(s):  
Carolyn Judge ◽  
Bill Beaver ◽  
John Zseleczk

The resistance of a planing hull is known to be highly dependent on trim angle. For several reasons, trim is difficult to measure to the level of accuracy normally attained with other towing tank measurements such as resistance or speed. In a recent study intended to validate CFD methods for planing hulls, 4’ and 8’ long geosim models of the Generic Prismatic Planing Hull (GPPH) were built and tested at USNA. Significant differences were found between the trim of the two models so a separate test program was conducted which focused specifically on the trim measurement of these two models in calm water. Five different trim measurement methods were used simultaneously on one model and then used again on the other model. Trim angles were compared between measurement methods and between models. Trim measurements with the same model agreed well and are the basis for an evaluation of measurement methods. The trim measured on the two different size models did not agree well even though the same instruments were used in most cases. The paper discusses reasons for the confirmed differences in calm water running trim of the two models and suggests ways to take advantage of this knowledge to make the best use of towing tank tests for planing boat performance prediction.

