Analysis of surveillance data: A rationale for statistical tests with comments on confidence intervals and statistical models

1989 ◽  
Vol 8 (3) ◽  
pp. 273-278 ◽  
Author(s):  
David B. Hall

2009 ◽  
Vol 9 (23) ◽  
pp. 9101-9110 ◽  
Author(s):  
V. Grewe ◽  
R. Sausen

Abstract. This comment focuses on the statistical limitations of a model grading, as applied by D. Waugh and V. Eyring (2008) (WE08). The grade g is calculated for a specific diagnostic and essentially relates the difference between the means of the model and observational data to the standard deviation of the observational dataset. We performed Monte Carlo simulations, which show that this method can lead to large 95% confidence intervals for the grade. Moreover, the difference between two model grades often has to be very large to become statistically significant. Since the confidence intervals were not considered in detail for all diagnostics, the grading in WE08 cannot be interpreted without further analysis. The results of the statistical tests performed in WE08 agree with our findings. However, most of those tests are based on special cases, which implicitly assume that observations are available without any errors and that the interannual variability of the observational data and the model data are equal. Without these assumptions, the 95% confidence intervals become even larger. Hence, the case in which we assumed perfect observations (ignored errors) provides a good estimate for an upper boundary of the threshold below which a grade becomes statistically significant. Examples have shown that the 95% confidence interval may even span the whole grading interval [0, 1]. Without considering confidence intervals, the grades presented in WE08 do not allow one to decide whether a model result deviates significantly from reality. Neither WE08 nor our comment identifies which of the grades presented in WE08 exhibit such a significant deviation. However, our analysis of the grading method demonstrates the unacceptably high potential for these grades to be insignificant. This implies that the grades given by WE08 cannot be interpreted by the reader.
We further show that the inclusion of confidence intervals into the grading approach is necessary, since otherwise even a perfect model may get a low grade.
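
The Monte Carlo argument in this abstract can be illustrated with a minimal sketch. It assumes that the grade has the form g = max(0, 1 - |mean(model) - mean(obs)| / (n_g * sd(obs))) with n_g = 3, which the abstract itself does not state. The Gaussian data, the sample size of 10 years, and the function names are likewise hypothetical choices made for illustration, not the authors' setup.

```python
import random
import statistics


def grade(model, obs, n_g=3.0):
    """Assumed grade: 1 - |difference of means| / (n_g * obs std), floored at 0."""
    diff = abs(statistics.fmean(model) - statistics.fmean(obs))
    return max(0.0, 1.0 - diff / (n_g * statistics.stdev(obs)))


def grade_interval(true_bias, n_years=10, n_trials=5000, seed=0):
    """Monte Carlo 95% interval of the grade for a model with a given mean bias.

    Model and observations are drawn as independent unit-variance Gaussians
    (an illustrative assumption), so true_bias is expressed in units of sigma.
    """
    rng = random.Random(seed)
    grades = []
    for _ in range(n_trials):
        obs = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        model = [rng.gauss(true_bias, 1.0) for _ in range(n_years)]
        grades.append(grade(model, obs))
    grades.sort()
    lower = grades[int(0.025 * n_trials)]
    upper = grades[int(0.975 * n_trials) - 1]
    return lower, upper


if __name__ == "__main__":
    for bias in (0.0, 1.0, 2.0):
        lower, upper = grade_interval(bias)
        print(f"bias = {bias:.1f} sigma: 95% of grades lie in [{lower:.2f}, {upper:.2f}]")
```

With a bias of zero the simulated model is perfect by construction, so any spread in its grade comes from sampling noise alone. Comparing the intervals for different biases shows how much two grades must differ before the difference can be distinguished from that noise.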



2019 ◽  
Author(s):  
Aba Szollosi ◽  
David Kellen ◽  
Danielle Navarro ◽  
Rich Shiffrin ◽  
Iris van Rooij ◽  
...  

Proponents of preregistration argue that, among other benefits, it improves the diagnosticity of statistical tests [1]. In the strong version of this argument, preregistration does this by solving statistical problems, such as family-wise error rates. In the weak version, it nudges people to think more deeply about their theories, methods, and analyses. We argue against both: the diagnosticity of statistical tests depends entirely on how well statistical models map onto underlying theories, and so improving statistical techniques does little to improve theories when the mapping is weak. There is also little reason to expect that preregistration will spontaneously help researchers to develop better theories (and, hence, better methods and analyses).



Author(s):  
Marianne Jonker ◽  
Aad Van der Vaart

Abstract. In practice, nuisance parameters in statistical models are often replaced by estimates based on an external source, for instance if estimates were published before or a second dataset is available. Next, these estimates are assumed to be known when the parameter of interest is estimated, a hypothesis is tested or confidence intervals are constructed. By this assumption, the level of the test is, in general, higher than supposed and the coverage of the confidence interval is too low. In this article, we derive the asymptotic distribution of the likelihood ratio statistic if the nuisance parameters are estimated based on a dataset that is independent of the data used for estimating the parameter of interest. This distribution can be used for correctly testing hypotheses and constructing confidence intervals. Four theoretical and practical examples are given as illustration.
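
The loss of coverage described in this abstract can be seen in a toy simulation. The sketch below is not one of the article's four examples and does not reproduce its asymptotic result. It assumes a hypothetical Gaussian model X_i ~ N(theta + eta, 1), in which the nuisance parameter eta is estimated from an independent sample of size m and then treated as known.

```python
import random
from statistics import NormalDist


def naive_coverage(n, m, n_trials=20000, seed=1):
    """Fraction of naive 95% intervals for theta that contain the true value.

    Toy model (an assumption for illustration): X_i ~ N(theta + eta, 1) for
    i = 1..n, with eta estimated from an independent sample of size m.
    """
    rng = random.Random(seed)
    theta, eta = 0.5, 2.0
    z = NormalDist().inv_cdf(0.975)
    half_width = z / n ** 0.5  # ignores the uncertainty in eta_hat
    hits = 0
    for _ in range(n_trials):
        x_bar = rng.gauss(theta + eta, 1.0 / n ** 0.5)  # mean of the main sample
        eta_hat = rng.gauss(eta, 1.0 / m ** 0.5)        # external estimate of eta
        theta_hat = x_bar - eta_hat
        hits += abs(theta_hat - theta) <= half_width
    return hits / n_trials


def exact_coverage(n, m):
    """Closed-form coverage of the naive interval in the same toy model."""
    z = NormalDist().inv_cdf(0.975)
    ratio = (1.0 / n) ** 0.5 / (1.0 / n + 1.0 / m) ** 0.5
    return 2.0 * NormalDist().cdf(z * ratio) - 1.0


if __name__ == "__main__":
    for m in (50, 500, 5000):
        print(m, round(naive_coverage(50, m), 3), round(exact_coverage(50, m), 3))
```

In this toy model the true variance of the estimate is 1/n + 1/m, while the naive interval uses only 1/n, so its coverage falls below the nominal 95% and approaches it only as the external sample grows.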



1992 ◽  
Vol 13 (9) ◽  
pp. 553-555 ◽  
Author(s):  
Leon F. Burmeister ◽  
David Birnbaum ◽  
Samuel B. Sheps

A variety of statistical tests of a null hypothesis commonly are used in biomedical studies. While these tests are the mainstay for justifying inferences drawn from data, they have important limitations. This report discusses the relative merits of two different approaches to data analysis and display, and recommends the use of confidence intervals rather than classic hypothesis testing. Formulae for a confidence interval surrounding the point estimate of an average value take the form: d = ±zσ/√n, where “d” represents the average difference between central and extreme values, “z” is derived from the density function of a known distribution, and “σ/√n” represents the magnitude of sampling variability. Transposition of terms yields the familiar formula for hypothesis testing of normally distributed data (without applying the finite population correction factor): z = d/(σ/√n).
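
The two formulae in this abstract, d = ±zσ/√n and z = d/(σ/√n), can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the example values (σ = 10, n = 25) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def half_width(sigma, n, level=0.95):
    """Half-width d = z * sigma / sqrt(n) of a normal-theory confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z * sigma / sqrt(n)


def z_statistic(d, sigma, n):
    """Test statistic z = d / (sigma / sqrt(n)) for an observed difference d."""
    return d / (sigma / sqrt(n))


def interval_excludes_null(estimate, null_value, sigma, n, level=0.95):
    """True if the confidence interval around the estimate excludes the null value."""
    return abs(estimate - null_value) > half_width(sigma, n, level)


if __name__ == "__main__":
    d = half_width(sigma=10.0, n=25)
    print(f"half-width d = {d:.2f}")
    print(f"z recovered from d = {z_statistic(d, 10.0, 25):.2f}")
```

Feeding the half-width back into z_statistic recovers the critical value of the normal distribution, which is the transposition the abstract describes. An interval excludes the null value exactly when |z| exceeds that critical value, so the two approaches agree on significance while the interval also conveys the magnitude of sampling variability.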



2020 ◽  
Vol 4 (3) ◽  
Author(s):  
Tengteng Wang ◽  
Hazel B Nichols ◽  
Sarah J Nyante ◽  
Patrick T Bradshaw ◽  
Patricia G Moorman ◽  
...  

Abstract Background Estrogen metabolite concentrations of 2-hydroxyestrone (2-OHE1) and 16-hydroxyestrone (16-OHE1) may be associated with breast carcinogenesis. However, no study has investigated their possible impact on mortality after breast cancer. Methods This population-based study was initiated in 1996–1997 with spot urine samples obtained shortly after diagnosis (mean = 96 days) from 683 women newly diagnosed with first primary breast cancer and 434 age-matched women without breast cancer. We measured urinary concentrations of 2-OHE1 and 16-OHE1 using an enzyme-linked immunoassay. Vital status was determined via the National Death Index (n = 244 deaths after a median of 17.7 years of follow-up). We used multivariable-adjusted Cox proportional hazards to estimate hazard ratios (HRs) and 95% confidence intervals (CIs) for the estrogen metabolites-mortality association. We evaluated effect modification using likelihood ratio tests. All statistical tests were two-sided. Results Urinary concentrations of the 2-OHE1 to 16-OHE1 ratio (>median of 1.8 vs ≤median) were inversely associated with all-cause mortality (HR = 0.74, 95% CI = 0.56 to 0.98) among women with breast cancer. Reduced hazard was also observed for breast cancer mortality (HR = 0.73, 95% CI = 0.45 to 1.17) and cardiovascular diseases mortality (HR = 0.76, 95% CI = 0.47 to 1.23), although the 95% confidence intervals included the null. Similar findings were also observed for women without breast cancer. The association with all-cause mortality was more pronounced among breast cancer participants who began chemotherapy before urine collection (n = 118, HR = 0.42, 95% CI = 0.22 to 0.81) than among those who had not (n = 559, HR = 0.98, 95% CI = 0.72 to 1.34; P_interaction = .008). Conclusions The urinary 2-OHE1 to 16-OHE1 ratio may be inversely associated with long-term all-cause mortality, which may depend on cancer treatment status at the time of urine collection.
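
The link between a reported hazard ratio interval and a two-sided test can be sketched as follows. This is an approximation that assumes the interval is a symmetric Wald interval on the log scale, which the abstract does not state. The function name is hypothetical, and the resulting P values are back-calculated approximations, not figures reported by the authors.

```python
from math import log
from statistics import NormalDist


def wald_from_ci(hr, lower, upper, level=0.95):
    """Approximate standard error, z, and two-sided P value from an HR and its CI.

    Assumes the interval was built as exp(log(HR) +/- z_crit * SE).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    se = (log(upper) - log(lower)) / (2.0 * z_crit)
    z = log(hr) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return se, z, p


if __name__ == "__main__":
    # Values taken from the Results above: all-cause and breast cancer mortality.
    for label, hr, lower, upper in (
        ("all-cause", 0.74, 0.56, 0.98),
        ("breast cancer", 0.73, 0.45, 1.17),
    ):
        se, z, p = wald_from_ci(hr, lower, upper)
        print(f"{label}: SE = {se:.3f}, z = {z:.2f}, approximate P = {p:.3f}")
```

Under this assumption, an interval that excludes 1 corresponds to an approximate P value below .05 and an interval that includes 1 to a value above it, which matches how the abstract describes the all-cause and cause-specific results.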


