Statistical and Clinical Interpretation of Research Results

2007 ◽  
Vol 97 (2) ◽  
pp. 165-170 ◽  
Author(s):  
Garry T. Allison

There is a well-known phenomenon of publication bias toward manuscripts that report statistically significant differences. The clinical implications of these statistically significant differences are not always clear because the magnitude of the changes may be clinically meaningless. This article relates the critical P value threshold to the magnitude of the actual observed change and provides a rationale for reporting confidence intervals in clinical studies. Strategies for improving statistical power and narrowing confidence intervals in clinical trials are also described. (J Am Podiatr Med Assoc 97(2): 165–170, 2007)
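
As a sketch of the interval-narrowing point the abstract makes, the following Python snippet (an editorial illustration assuming numpy and scipy; the function name ci_for_mean_diff and all numeric values are hypothetical, not from the paper) shows how the same observed change yields a much tighter 95% confidence interval as the sample size grows.

    # A sketch, assuming numpy and scipy: 95% CI for a mean difference
    # between two equal groups. All values are hypothetical.
    import numpy as np
    from scipy import stats

    def ci_for_mean_diff(diff, sd, n, level=0.95):
        """Two-sided CI for a mean difference between two groups of size n."""
        se = sd * np.sqrt(2.0 / n)     # standard error of the difference
        df = 2 * n - 2                 # degrees of freedom (pooled SD)
        t_crit = stats.t.ppf(1 - (1 - level) / 2, df)
        return diff - t_crit * se, diff + t_crit * se

    # The same observed 3-point change: wide CI with n=10, narrow with n=100.
    print(ci_for_mean_diff(diff=3.0, sd=8.0, n=10))   # about (-4.5, 10.5)
    print(ci_for_mean_diff(diff=3.0, sd=8.0, n=100))  # about (0.8, 5.2)

With n = 10 the interval crosses zero; with n = 100 it does not, even though the observed change is identical, which is exactly why the interval, not the P value alone, conveys clinical relevance.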

1998 ◽  
Vol 26 (2) ◽  
pp. 57-65 ◽  
Author(s):  
R Kay

If a trial is to be well designed, and the conclusions drawn from it valid, a thorough understanding of the benefits and pitfalls of basic statistical principles is required. When setting up a trial, an appropriate sample-size calculation is vital; if the initial calculations are inaccurate, the trial results will be unreliable. The principle of intent-to-treat in comparative trials is examined. Randomization as a method of allocating patients to treatment is essential to ensure that the treatment groups are comparable and that the mix of patients within groups is free of allocation bias. Once trial results are available, the correct calculation and interpretation of the P-value is important. Its limitations are examined, and the use of the confidence interval to help draw valid conclusions regarding the clinical value of treatments is explored.
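
A minimal sketch of the kind of sample-size calculation the abstract calls vital, assuming the statsmodels library is available; the effect size of 0.5 is an illustrative assumption, not a value from the paper.

    # Sample size per group for a two-sample t-test, via statsmodels.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    n_per_group = analysis.solve_power(effect_size=0.5,  # assumed Cohen's d
                                       alpha=0.05,       # two-sided type I error
                                       power=0.80)       # target power
    print(round(n_per_group))  # about 64 patients per group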


2009 ◽  
Vol 33 (2) ◽  
pp. 87-90 ◽  
Author(s):  
Douglas Curran-Everett

Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This third installment of Explorations in Statistics investigates confidence intervals. A confidence interval is a range that we expect, with some level of confidence, to include the true value of a population parameter such as the mean. A confidence interval provides the same statistical information as the P value from a hypothesis test, but it circumvents the drawbacks of that hypothesis test. Even more important, a confidence interval focuses our attention on the scientific importance of some experimental result.
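
In the exploratory spirit of the installment, here is a short sketch assuming numpy and scipy; the simulated sample is hypothetical. It computes the 95% confidence interval that we expect, with 95% confidence, to cover the true population mean.

    # A 95% confidence interval for a sample mean, computed with scipy.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    sample = rng.normal(loc=10.0, scale=4.0, size=25)   # hypothetical data

    mean = sample.mean()
    sem = stats.sem(sample)
    ci = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
    print(mean, ci)  # the interval expected, with 95% confidence, to cover mu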


2019 ◽  
Author(s):  
Marshall A. Taylor

Coefficient plots are a popular tool for visualizing regression estimates. The appeal of these plots is that they visualize confidence intervals around the estimates and generally center the plot around zero, meaning that any estimate that crosses zero is statistically nonsignificant at least at the alpha level around which the confidence intervals are constructed. For models with statistical significance levels determined via randomization models of inference, and for which there is no standard error or confidence interval for the estimate itself, these plots appear less useful. In this paper, I illustrate a variant of the coefficient plot for regression models with p-values constructed using permutation tests. These visualizations plot each estimate's p-value and its associated confidence interval in relation to a specified alpha level. These plots can help the analyst interpret and report both the statistical and substantive significance of their models. Illustrations are provided using a nonprobability sample of activists and participants at a 1962 anti-Communism school.
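
The quantity such plots display can be sketched as follows, assuming numpy and statsmodels; the data, group labels, and permutation count are illustrative assumptions, not Taylor's. One natural construction of the interval around a permutation p-value, used here, treats it as a Monte Carlo proportion and wraps a binomial (Clopper-Pearson) interval around it.

    # Permutation p-value for a group difference, plus a Monte Carlo CI.
    import numpy as np
    from statsmodels.stats.proportion import proportion_confint

    rng = np.random.default_rng(7)
    y = rng.normal(size=40); y[:20] += 0.8          # hypothetical outcome
    groups = np.array([1] * 20 + [0] * 20)          # hypothetical labels

    observed = y[groups == 1].mean() - y[groups == 0].mean()
    n_perm = 5000
    perm_stats = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(groups)           # break the group link
        perm_stats[i] = y[shuffled == 1].mean() - y[shuffled == 0].mean()

    extreme = int(np.sum(np.abs(perm_stats) >= abs(observed)))
    p_value = extreme / n_perm
    ci = proportion_confint(extreme, n_perm, alpha=0.05, method="beta")
    print(p_value, ci)  # plot these against the chosen alpha line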


Author(s):  
S. Nassir Ghaemi

P-values are overused and abused in clinical research. An important clinical consequence is that side effects are not interpreted correctly in clinical trials. P-values can be used legitimately for the primary outcome of clinical efficacy in most clinical trials, but the many potential side effects are better assessed with effect estimates and confidence intervals. The use of p-values for side effects leads both to underestimation (high p-values based on insufficient statistical power) and to overestimation (low p-values based on multiple comparisons), producing false-negative and false-positive results. A more valid approach is provided. The generalizability of clinical trials is also examined, with the conclusion that clinical trials provide average results for the average patient and should not be interpreted as applying wholesale to each patient.
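
A minimal sketch of the alternative favoured here for side effects, assuming numpy and scipy; the event counts are hypothetical, and the interval is a standard Wald interval for a log risk ratio (one common choice, not necessarily the author's exact method).

    # Report an effect estimate with a CI for an adverse event,
    # instead of a bare p-value.
    import numpy as np
    from scipy import stats

    events_drug, n_drug = 12, 200    # assumed adverse events on drug
    events_plac, n_plac = 5, 200     # assumed adverse events on placebo

    rr = (events_drug / n_drug) / (events_plac / n_plac)
    se_log_rr = np.sqrt(1/events_drug - 1/n_drug + 1/events_plac - 1/n_plac)
    z = stats.norm.ppf(0.975)
    ci = np.exp(np.log(rr) + np.array([-z, z]) * se_log_rr)
    print(rr, ci)  # e.g. RR 2.4, 95% CI about 0.9 to 6.7: wide, so uncertain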


BMJ Open ◽  
2019 ◽  
Vol 9 (11) ◽  
pp. e032506 ◽  
Author(s):  
Adrian Gerard Barnett ◽  
Jonathan D Wren

Objectives: Previous research has shown clear biases in the distribution of published p values, with an excess below the 0.05 threshold due to a combination of p-hacking and publication bias. We aimed to examine the bias for statistical significance using published confidence intervals.
Design: Observational study.
Setting: Papers published in Medline since 1976.
Participants: Over 968 000 confidence intervals extracted from abstracts and over 350 000 intervals extracted from the full text.
Outcome measures: Cumulative distributions of lower and upper confidence interval limits for ratio estimates.
Results: We found an excess of statistically significant results, with a glut of lower interval limits just above one and upper interval limits just below one. These excesses have not improved in recent years. The excesses did not appear in a set of over 100 000 confidence intervals that were not subject to p-hacking or publication bias.
Conclusions: The huge excesses of published confidence intervals that sit just inside the statistical significance threshold are not statistically plausible. Large improvements in research practice are needed to provide more results that better reflect the truth.
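
The core tabulation can be sketched as follows, assuming numpy; the simulated lower limits are a hypothetical stand-in for the extracted intervals, so the snippet shows the computation rather than the reported glut.

    # Compare how many lower CI limits for ratio estimates fall just above
    # 1.00 versus just below it; a sharp imbalance at 1 suggests selection
    # for statistical significance.
    import numpy as np

    rng = np.random.default_rng(3)
    lower_limits = np.exp(rng.normal(-0.3, 0.4, size=100_000))  # hypothetical

    window = 0.05
    just_below = np.mean((lower_limits > 1 - window) & (lower_limits <= 1))
    just_above = np.mean((lower_limits > 1) & (lower_limits <= 1 + window))
    print(just_below, just_above)  # a large above/below gap is the 'glut'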


2020 ◽  
Vol 14 ◽  
Author(s):  
Xiao-Kang Lyu ◽  
Yuepei Xu ◽  
Xiao-Fan Zhao ◽  
Xi-Nian Zuo ◽  
Chuan-Peng Hu

P values and confidence intervals (CIs) are the most widely used statistical indices in the scientific literature. Several surveys have revealed that these two indices are generally misunderstood. However, existing surveys on this subject have been confined to psychology and biomedical research, and data from other disciplines are rare. Moreover, how confident researchers are in their judgements remains unclear. To fill this research gap, we surveyed 1,479 researchers and students from different fields in China. Results reveal that for significant (i.e., p < .05, CI does not include zero) and non-significant (i.e., p > .05, CI includes zero) conditions, most respondents, regardless of academic degree, research field, and stage of career, could not interpret p values and CIs accurately. Moreover, the majority were confident about their (inaccurate) judgements (see osf.io/mcu9q/ for raw data, materials, and supplementary analyses). Therefore, as misinterpretations of p values and CIs prevail throughout the scientific community, there is a need for better statistical training in science.
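
The duality the survey items rest on can be checked directly; a small sketch assuming numpy and scipy, with simulated data. For a one-sample t-test at matched levels, the 95% CI excludes zero exactly when the two-sided p-value falls below .05.

    # Verify the test/interval duality on one hypothetical sample.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    x = rng.normal(loc=0.4, scale=1.0, size=30)    # hypothetical sample

    t_stat, p = stats.ttest_1samp(x, popmean=0.0)
    ci = stats.t.interval(0.95, df=len(x) - 1, loc=x.mean(), scale=stats.sem(x))
    print(p < 0.05, not (ci[0] <= 0.0 <= ci[1]))   # these always agree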


2017 ◽  
Author(s):  
Herm J. Lamberink ◽  
Willem M. Otte ◽  
Michel R.T. Sinke ◽  
Daniël Lakens ◽  
Paul P. Glasziou ◽  
...  

Background: Biomedical studies with low statistical power are a major concern in the scientific community and are one of the underlying reasons for the reproducibility crisis in science. If randomized clinical trials, which are considered the backbone of evidence-based medicine, also suffer from low power, this could affect medical practice.
Methods: We analysed the statistical power of 137 032 clinical trials published between 1975 and 2017, extracted from meta-analyses in the Cochrane Database of Systematic Reviews. We determined study power to detect standardized effect sizes according to Cohen, and in meta-analyses with a p-value below 0.05 we based power on the meta-analysed effect size. Average power, effect size, and temporal patterns were examined.
Results: The number of trials with power ≥80% was low but increased over time: from 9% in 1975-1979 to 15% in 2010-2014. This increase was mainly due to increasing sample sizes, whilst effect sizes remained stable, with a median Cohen's h of 0.21 (IQR 0.12-0.36) and a median Cohen's d of 0.31 (IQR 0.19-0.51). The proportion of trials with power of at least 80% to detect a standardized effect size of 0.2 (small), 0.5 (moderate), or 0.8 (large) was 7%, 48%, and 81%, respectively.
Conclusions: This study demonstrates that insufficient power in clinical trials remains a problem, although the situation is slowly improving. Our data encourage further efforts to increase statistical power in clinical trials to guarantee rigorous and reproducible evidence-based medicine.
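
The per-trial power computation the study repeats can be sketched with statsmodels; the group size of 50 is an illustrative assumption, not a value from the dataset.

    # Power to detect Cohen's small, moderate, and large effects
    # at a fixed group size.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    for d in (0.2, 0.5, 0.8):   # small, moderate, large standardized effects
        power = analysis.power(effect_size=d, nobs1=50, alpha=0.05)
        print(d, round(power, 2))   # roughly 0.17, 0.70, 0.98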


2019 ◽  
Vol 227 (4) ◽  
pp. 261-279 ◽  
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
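
One simulated "literature" with publication bias, plus one widely used detection tool of the kind the authors evaluate (Egger's regression test), can be sketched as follows with numpy and statsmodels; all simulation settings are illustrative assumptions, not the paper's design.

    # Simulate selective publication under a true null effect, then apply
    # Egger's regression test for small-study effects.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)

    effects, ses = [], []
    while len(effects) < 30:                    # collect 30 published studies
        n = rng.integers(20, 100)               # per-group sample size
        se = np.sqrt(2.0 / n)                   # approximate SE of Cohen's d
        d = rng.normal(0.0, se)                 # true effect size is zero
        significant = abs(d / se) > 1.96
        if significant or rng.random() < 0.10:  # non-significant results are
            effects.append(d); ses.append(se)   # published only 10% of the time

    d, se = np.array(effects), np.array(ses)
    # Egger's test: regress the z-score (d/se) on precision (1/se); an
    # intercept far from zero signals small-study effects such as bias.
    fit = sm.OLS(d / se, sm.add_constant(1.0 / se)).fit()
    print(fit.params[0], fit.pvalues[0])        # intercept and its p-value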

