True and false positive rates for different criteria of evaluating statistical evidence from clinical trials

2019 ◽  
Vol 19 (1) ◽  
Author(s):  
Don van Ravenzwaaij ◽  
John P. A. Ioannidis

Abstract Background: Until recently a typical rule that has often been used for the endorsement of new medications by the Food and Drug Administration has been the existence of at least two statistically significant clinical trials favoring the new medication. This rule has consequences for the true positive (endorsement of an effective treatment) and false positive rates (endorsement of an ineffective treatment). Methods: In this paper, we compare true positive and false positive rates for different evaluation criteria through simulations that rely on (1) conventional p-values; (2) confidence intervals based on meta-analyses assuming fixed or random effects; and (3) Bayes factors. We varied threshold levels for statistical evidence, thresholds for what constitutes a clinically meaningful treatment effect, and number of trials conducted. Results: Our results show that Bayes factors, meta-analytic confidence intervals, and p-values often have similar performance. Bayes factors may perform better when the number of trials conducted is high and when trials have small sample sizes and clinically meaningful effects are not small, particularly in fields where the number of non-zero effects is relatively large. Conclusions: Thinking about realistic effect sizes in conjunction with desirable levels of statistical evidence, as well as quantifying statistical evidence with Bayes factors may help improve decision-making in some circumstances.
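The "at least two statistically significant trials" rule described above can be explored with a small simulation. What follows is a minimal sketch under simplifying assumptions that are not taken from the paper: outcomes have unit variance, each trial is analysed with a one-sided-in-direction z-test at a two-sided alpha of 0.05, and the function name `simulate_endorsement_rate` and all parameter values are hypothetical. It is not the authors' simulation design.

```python
import random
from statistics import NormalDist


def simulate_endorsement_rate(true_effect, n_per_arm, n_trials, n_sims=2000,
                              alpha=0.05, min_significant=2, seed=1):
    """Share of simulated trial programmes endorsed when at least
    `min_significant` trials are significant in favour of the treatment.

    Assumption (not from the source): unit-variance outcomes, z-test.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two arm means (variance 1 each).
    se = (2 / n_per_arm) ** 0.5
    endorsed = 0
    for _ in range(n_sims):
        significant = 0
        for _ in range(n_trials):
            # Draw the observed mean difference from its sampling distribution.
            diff = rng.gauss(true_effect, se)
            if diff / se > z_crit:  # significant and favouring the new treatment
                significant += 1
        if significant >= min_significant:
            endorsed += 1
    return endorsed / n_sims


# Hypothetical settings: two trials, 50 patients per arm.
false_positive_rate = simulate_endorsement_rate(0.0, n_per_arm=50, n_trials=2)
true_positive_rate = simulate_endorsement_rate(0.5, n_per_arm=50, n_trials=2)
print(false_positive_rate, true_positive_rate)
```

Raising `n_trials` while keeping `min_significant=2` gives a rough feel for how the number of trials conducted changes both rates under this simplified rule.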


2019 ◽  
Author(s):  
Don van Ravenzwaaij ◽  
John P A Ioannidis

Abstract Background: Until recently a typical rule that has often been used for the endorsement of new medications by the Food and Drug Administration has been the existence of at least two statistically significant clinical trials favoring the new medication. This rule has consequences for the true positive (endorsement of an effective treatment) and false positive (endorsement of an ineffective treatment) rates. Methods: In this paper, we compare true positive and false positive rates for different evaluation criteria through simulations that rely on (1) conventional p-values; (2) confidence intervals based on meta-analyses assuming fixed or random effects; and (3) Bayes factors. We varied threshold levels for statistical evidence, and thresholds for what constitutes a clinically meaningful treatment effect. Results: Our results show that Bayes factors, meta-analytic confidence intervals, and p-values often have similar performance. Bayes factors may perform better when trials have small sample sizes and clinically meaningful effects are not small, particularly in fields where the number of non-zero effects is relatively large. Conclusions: Thinking about realistic effect sizes in conjunction with desirable levels of statistical evidence, as well as quantifying statistical evidence with Bayes factors may help improve decision-making in some circumstances.


2018 ◽  
Author(s):  
Don van Ravenzwaaij ◽  
John Ioannidis

Background: Until recently a typical rule that has often been used for the endorsement of new medications by the Food and Drug Administration has been the existence of at least two statistically significant clinical trials favoring the new medication. This rule has consequences for the true positive (endorsement of an effective treatment) and false positive rates (endorsement of an ineffective treatment). Methods: In this paper, we compare true positive and false positive rates for different evaluation criteria through simulations that rely on (1) conventional p-values; (2) confidence intervals based on meta-analyses assuming fixed or random effects; and (3) Bayes factors. We varied threshold levels for statistical evidence, thresholds for what constitutes a clinically meaningful treatment effect, and number of trials conducted. Results: Our results show that Bayes factors, meta-analytic confidence intervals, and p-values often have similar performance. Bayes factors may perform better when the number of trials conducted is high and when trials have small sample sizes and clinically meaningful effects are not small, particularly in fields where the number of non-zero effects is relatively large. Conclusions: Thinking about realistic effect sizes in conjunction with desirable levels of statistical evidence, as well as quantifying statistical evidence with Bayes factors may help improve decision-making in some circumstances.


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e18600-e18600
Author(s):  
Maryam Alasfour ◽  
Salman Alawadi ◽  
Malak AlMojel ◽  
Philippos Apolinario Costa ◽  
Priscila Barreto Coelho ◽  
...  

e18600 Background: Patients with coronavirus disease 2019 (COVID-19) and cancer have worse clinical outcomes compared to those without cancer. Primary studies have examined this population, but most had small sample sizes and conflicting results. Prior meta-analyses exclude most US and European data or only examine mortality. The present meta-analysis evaluates the prevalence of several clinical outcomes in cancer patients with COVID-19, including new emerging data from Europe and the US. Methods: A systematic search of PubMed, medRxiv, JMIR and Embase by two independent investigators included peer-reviewed papers and preprints up to July 8, 2020. The primary outcome was mortality. Other outcomes were ICU and non-ICU admission, mild, moderate and severe complications, ARDS, invasive ventilation, stable, and clinically improved rates. Study quality was assessed through the Newcastle–Ottawa scale. A random effects model was used to derive prevalence rates, their 95% confidence intervals (CI) and 95% prediction intervals (PI). Results: Thirty-four studies (N = 4,371) were included in the analysis. The mortality prevalence rate was 25.2% (95% CI: 21.1-29.7; 95% PI: 9.8-51.1; I² = 85.4), with 11.9% ICU admissions (95% CI: 9.2-15.4; 95% PI: 4.3-28.9; I² = 77.8) and 25.2% clinically stable (95% CI: 21.1-29.7; 95% PI: 9.8-51.1; I² = 85.4). Furthermore, 42.5% developed severe complications (95% CI: 30.4-55.7; 95% PI: 8.2-85.9; I² = 94.3), with 22.7% developing ARDS (95% CI: 15.4-32.2; 95% PI: 5.8-58.6; I² = 82.4), and 11.3% needing invasive ventilation (95% CI: 6.7-18.4; 95% PI: 2.3-41.1; I² = 79.8). After follow-up, 49% clinically improved (95% CI: 35.6-62.6; 95% PI: 9.8-89.4; I² = 92.5). All outcomes had large I², suggesting high levels of heterogeneity among studies, and wide PIs, indicating high variability within outcomes. Despite this variability, the mortality rate in cancer patients with COVID-19, even at the lower end of the PI (9.8%), is higher than the 2% mortality rate of the population with COVID-19 but without cancer, though not as high as the roughly 25% that other meta-analyses conclude. Conclusions: Patients with cancer who develop COVID-19 have a higher probability of mortality compared to the general population with COVID-19, but possibly not as high as previous studies have shown. A large proportion of them developed severe complications, but a larger proportion recovered. Prior meta-analyses of mortality and other outcomes did not report prediction intervals, which compromises the clinical use of their results.
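The difference between a confidence interval and a prediction interval under a random effects model can be sketched as follows. This is an illustrative sketch only: it assumes DerSimonian-Laird pooling of logit-transformed proportions, which the abstract does not confirm as the method used, and the study counts are made up. The function `random_effects_summary` and its parameter `pi_crit` are hypothetical names. By default the normal quantile is used for both intervals; a t quantile with k - 2 degrees of freedom, which is commonly used for prediction intervals, can be passed through `pi_crit`.

```python
import math
from statistics import NormalDist


def expit(x):
    """Inverse of the logit transform."""
    return 1 / (1 + math.exp(-x))


def random_effects_summary(events, totals, pi_crit=None):
    """DerSimonian-Laird pooling of proportions on the logit scale.

    Assumes every study has 0 < events < total (no continuity correction).
    """
    k = len(events)
    z = NormalDist().inv_cdf(0.975)
    if pi_crit is None:
        pi_crit = z  # assumption: normal quantile unless a t quantile is supplied
    # Logit proportions and their approximate within-study variances.
    y = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    v = [1 / e + 1 / (n - e) for e, n in zip(events, totals)]
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q, between-study variance tau^2, and I^2 in percent.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    # Random effects weights add tau^2 to each within-study variance.
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    half_pi = pi_crit * math.sqrt(tau2 + se ** 2)
    return {
        "pooled": expit(mu),
        "ci": (expit(mu - z * se), expit(mu + z * se)),
        "pi": (expit(mu - half_pi), expit(mu + half_pi)),
        "tau2": tau2,
        "i2": i2,
    }


# Made-up counts for five hypothetical studies (deaths, patients).
summary = random_effects_summary([12, 30, 8, 45, 20], [60, 90, 50, 150, 110])
print(summary)
```

Because the prediction interval adds the between-study variance to the uncertainty of the pooled estimate, it can never be narrower than the confidence interval computed with the same critical value.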


2014 ◽  
Vol 26 (2) ◽  
pp. 598-614 ◽  
Author(s):  
Julia Poirier ◽  
GY Zou ◽  
John Koval

Cluster randomization trials, in which intact social units are randomized to different interventions, have become popular in the last 25 years. Outcomes from these trials in many cases are positively skewed, following approximately lognormal distributions. When inference is focused on the difference between treatment arm arithmetic means, existing confidence interval procedures either make restrictive assumptions or are complex to implement. We approach this problem by assuming log-transformed outcomes from each treatment arm follow a one-way random effects model. The treatment arm means are functions of multiple parameters for which separate confidence intervals are readily available, suggesting that the method of variance estimates recovery may be applied to obtain closed-form confidence intervals. A simulation study showed that this simple approach performs well in small sample sizes in terms of empirical coverage, relatively balanced tail errors, and interval widths as compared to existing methods. The methods are illustrated using data arising from a cluster randomization trial investigating a critical pathway for the treatment of community acquired pneumonia.
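The core step of the method of variance estimates recovery (MOVER), combining two separate confidence intervals into a closed-form interval for a difference, can be sketched as below. This shows only the generic difference step for two independent estimates; it is not the authors' full procedure for lognormal cluster means, and the function name `mover_difference` and the input numbers are hypothetical.

```python
import math


def mover_difference(est1, lower1, upper1, est2, lower2, upper2):
    """Closed-form interval for est1 - est2 from two separate intervals.

    Assumes the two estimates are independent. Each limit recovers the
    variance from the side of the input interval closest to that limit.
    """
    diff = est1 - est2
    lower = diff - math.sqrt((est1 - lower1) ** 2 + (upper2 - est2) ** 2)
    upper = diff + math.sqrt((upper1 - est1) ** 2 + (est2 - lower2) ** 2)
    return lower, upper


# Hypothetical arm means with asymmetric intervals, as skewed data would give.
mover_lower, mover_upper = mover_difference(10.0, 8.0, 13.5, 7.0, 6.0, 8.5)
print(mover_lower, mover_upper)
```

When both input intervals are symmetric, the result reduces to the usual interval for a difference built from the two standard errors; the asymmetric case is where the approach differs.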


1997 ◽  
Vol 54 (3) ◽  
pp. 616-630 ◽  
Author(s):  
S J Smith

Trawl surveys using stratified random designs are widely used on the east coast of North America to monitor groundfish populations. Statistical quantities estimated from these surveys are derived via a randomization basis and do not require that a probability model be postulated for the data. However, the large sample properties of these estimates may not be appropriate for the small sample sizes and skewed data characteristic of bottom trawl surveys. In this paper, three bootstrap resampling strategies that incorporate complex sampling designs are used to explore the properties of estimates for small sample situations. A new form for the bias-corrected and accelerated confidence intervals is introduced for stratified random surveys. Simulation results indicate that the bias-corrected and accelerated confidence limits may overcorrect for the trawl survey data and that percentile limits were closer to the expected values. Nonparametric density estimates were used to investigate the effects of unusually large catches of fish on the bootstrap estimates and confidence intervals. Bootstrap variance estimates decreased as increasingly smoother distributions were assumed for the observations in the stratum with the large catch. Lower confidence limits generally increased with increasing smoothness but the upper bound depended upon assumptions about the shape of the distribution.
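A percentile bootstrap for a stratified mean can be sketched as follows. This is a deliberately naive version that resamples with replacement independently within each stratum and ignores finite population corrections; it does not reproduce the three resampling strategies or the bias-corrected and accelerated limits studied in the paper. The catch data, stratum weights, and function names are made up for illustration.

```python
import random


def stratified_mean(samples, weights):
    """Weighted sum of stratum means; weights are stratum area proportions."""
    return sum(w * sum(s) / len(s) for s, w in zip(samples, weights))


def stratified_bootstrap_ci(samples, weights, n_boot=2000, level=0.95, seed=1):
    """Percentile interval from resampling tows within each stratum."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample each stratum separately so the design's strata are preserved.
        resampled = [rng.choices(s, k=len(s)) for s in samples]
        estimates.append(stratified_mean(resampled, weights))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[round(tail * n_boot)]
    upper = estimates[round((1 - tail) * n_boot) - 1]
    return lower, upper


# Made-up catches per tow in three strata; the second has one very large catch.
catches = [[3, 5, 0, 8, 4], [12, 150, 9, 20], [1, 0, 2, 3, 1, 0]]
stratum_weights = [0.5, 0.3, 0.2]
point_estimate = stratified_mean(catches, stratum_weights)
boot_lower, boot_upper = stratified_bootstrap_ci(catches, stratum_weights)
print(point_estimate, boot_lower, boot_upper)
```

With a single unusually large catch in a small stratum, the bootstrap distribution depends heavily on how often that tow is redrawn, which is the kind of sensitivity the abstract describes.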


Author(s):  
Tianye Jia ◽  
Congying Chu ◽  
Yun Liu ◽  
Jenny van Dongen ◽  
Evangelos Papastergios ◽  
...  

Abstract DNA methylation, which is modulated by both genetic factors and environmental exposures, may offer a unique opportunity to discover novel biomarkers of disease-related brain phenotypes, even when measured in tissues other than brain, such as blood. A few studies of small sample sizes have revealed associations between blood DNA methylation and neuropsychopathology; however, large-scale epigenome-wide association studies (EWAS) are needed to investigate the utility of DNA methylation profiling as a peripheral marker for the brain. Here, in an analysis of eleven international cohorts, totalling 3337 individuals, we report epigenome-wide meta-analyses of blood DNA methylation with volumes of the hippocampus, thalamus and nucleus accumbens (NAcc), three subcortical regions selected for their associations with disease and heritability and volumetric variability. Analyses of individual CpGs revealed genome-wide significant associations with hippocampal volume at two loci. No significant associations were found for analyses of thalamus and nucleus accumbens volumes. Cluster-based analyses revealed additional differentially methylated regions (DMRs) associated with hippocampal volume. DNA methylation at these loci affected expression of proximal genes involved in learning and memory, stem cell maintenance and differentiation, fatty acid metabolism and type-2 diabetes. These DNA methylation marks, their interaction with genetic variants and their impact on gene expression offer new insights into the relationship between epigenetic variation and brain structure and may provide the basis for biomarker discovery in neurodegeneration and neuropsychiatric conditions.


2006 ◽  
Vol 45 (04) ◽  
pp. 430-434 ◽  
Author(s):  
G. Dahmen ◽  
A. Ziegler

Summary Objectives: The application of independence estimating equations (IEE) for controlled clinical trials (CCTs) has recently been discussed, and recommendations for its use have been derived for testing hypotheses. The robust estimator of variance has been shown to be liberal for small sample sizes. Therefore a series of modifications has been proposed. In this paper we systematically compare confidence intervals (CIs) proposed in the literature for situations that are common in CCTs. Methods: Using Monte-Carlo simulation studies, we compared the coverage probabilities of CIs and non-convergence probabilities for the parameters of the mean structure for small samples using modifications of the variance estimator proposed by Mancl and de Rouen [7], Morel et al. [8] and Pan [3]. Results: None of the proposed modifications behaves well in every investigated situation. For parallel group designs with repeated measurements and binary response the method proposed by Pan maintains the nominal level. We observed non-convergence of the IEE algorithm in up to 10% of the replicates depending on response probabilities in the treatment groups. For comparing slopes with continuous responses, the approach of Morel et al. can be recommended. Conclusions: Results of non-convergence probabilities show that IEE should not be used in parallel group designs with binary endpoints and response probabilities close to 0 or 1. Modifications of the robust variance estimator should be used for sample sizes up to 100 clusters for CI estimation.


2019 ◽  
Vol 41 (5) ◽  
pp. 1011-1017 ◽  
Author(s):  
Florence Tilling ◽  
Andrea E. Cavanna

Abstract Background: Tourette syndrome (TS) is a neurodevelopmental condition characterized by the presence of multiple motor and phonic tics, often associated with co-morbid behavioural problems. Tics can be modulated by environmental factors and are characteristically exacerbated by psychological stress, among other factors. This observation has led to the development of specific behavioural treatment strategies, including relaxation therapy. Objective: This review aimed to assess the efficacy of relaxation therapy to control or reduce tic symptoms in patients with TS. Methods: We conducted a systematic literature review of original studies on the major scientific databases, including Medline, EMBASE, and PsycInfo, according to the standards outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Outcome measures included both tic severity and tic frequency. Results: Our literature search identified three controlled trials, with a total number of 40 participants (range: 6–18 participants). In all three studies, relaxation therapy decreased the severity and/or the frequency of tic symptoms. However, the only trial comparing relaxation therapy to two other behavioural techniques found relaxation therapy to be the least effective intervention, as it reduced the number of tics by 32% compared to 44% with self-monitoring and 55% with habit reversal. Discussion: The results of this systematic literature review provide initial evidence for the use of relaxation therapy as a behavioural treatment intervention for tics in patients with TS. Caution is needed in the interpretation of these findings, because the reviewed trials had small sample sizes and there was high heterogeneity across the study protocols.


2019 ◽  
Vol 71 (2) ◽  
pp. 433-439
Author(s):  
Adam G Stewart ◽  
Patrick N A Harris ◽  
Mark Chatfield ◽  
Scott R Evans ◽  
David van Duin ◽  
...  

Abstract Treatment options for multidrug-resistant (MDR) gram-negative infection are growing. However, postregistration, pragmatic, and clinician-led clinical trials in this field are few, recruit small sample sizes, and experience deficiencies in design and operations. MDR gram-negative therapeutic trials are often inefficient, only evaluating a single antibiotic or strategy at a time. Novel clinical trial designs offer potential solutions by attempting to obtain clinically meaningful conclusions at the end of or during a trial, for many treatment strategies simultaneously. An integrated, consensus approach to MDR gram-negative infection trial design is crucial.

