Comparing phase 3 “go” decisions (Ph3-GO) between single-arm trials with real-world external controls (SAT+rwEC) and randomized phase 2 trials (rPh2).

2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e13564-e13564
Author(s):  
Brian Hobbs ◽  
Thanh Ton ◽  
Xiao Li ◽  
David S. Hong ◽  
Rebecca A. Hubbard ◽  
...  

e13564 Background: Traditional rPh2 trials have limitations that may yield suboptimal Ph3-GO. Compared with an rPh2 of equivalent sample size, SAT+rwEC allows more patients to receive experimental therapies while preserving the ability to compare experimental and control groups. Bias arising from measurement error and confounding in the rwEC, however, poses challenges to statistical inference. Preliminary studies suggest higher response rates are observed in rwECs than in randomized controls. We compared Ph3-GO decisions between SAT+rwEC and rPh2. Methods: Ph3-GO probability was compared using simulation studies that resembled the oncology setting with an objective response rate (ORR) endpoint. Simulated rPh2 parameters were: sample size (60-120) with 1:1 randomization, ORR in rPh2 control (15%-50%), true treatment effect (ΔORR: 0-50). For each rPh2 of a given sample size, we evaluated an SAT+rwEC that re-allocated all rPh2 control patients to the experimental arm (i.e., doubling the sample size of the experimental arm) and added an rwEC. SAT+rwEC were simulated with assumptions for size (rwEC to SAT ratio: 0.5 to 2) and net bias (-10 to +10), which was simulated as a composite representing ORR measurement error plus residual confounding after multivariable adjustment. A positive net bias corresponds to higher ORR in the rwEC. Ph3-GO thresholds varied from 10% to 30%. Ph3-GO was considered “False-GO” when true treatment effect < threshold, and “True-GO” when true treatment effect ≥ threshold. Results: With a positive net bias of +10, SAT+rwEC had lower False-GO and True-GO probabilities than rPh2. With a negative net bias of -10, both False-GO and True-GO probabilities were higher for the SAT+rwEC. When net bias=0, the increased size of SAT+rwEC resulted in observable Ph3-GO improvements, with lower False-GO and higher True-GO than the corresponding rPh2. Conclusions: An interactive dashboard was developed for users. 
The magnitude and direction of net bias relative to the decision threshold affect the performance of SAT+rwEC. The relative sample size of rwEC to rPh2 may also impact performance. The dashboard can provide quantitative guidance for Ph3-GO if net bias can be estimated by independent studies. Further work to quantify net bias and refine Ph3-GO criteria can help reduce the currently high False-GO rates while increasing opportunities for patients to receive experimental therapies through the SAT+rwEC design. Ph3-GO probability for rPh2 vs. SAT+rwEC with threshold=15%, baseline ORR=20% (select scenarios). [Table: see text]
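The core of the simulation can be sketched with a simple binomial model. The decision rule below ("go" when the observed ORR difference meets the threshold) and all parameter values are illustrative assumptions, not the abstract's exact design:

```python
import numpy as np

rng = np.random.default_rng(0)

def go_probability(n_exp, n_ctrl, orr_ctrl, delta, net_bias=0.0,
                   threshold=0.15, n_sim=10_000):
    """Estimate Ph3-GO probability under a simple rule: 'go' when the
    observed ORR difference (experimental - control) >= threshold.
    net_bias shifts the control-arm ORR, mimicking an rwEC whose
    response rate differs systematically from a randomized control."""
    orr_exp = orr_ctrl + delta
    p_ctrl = min(max(orr_ctrl + net_bias, 0.0), 1.0)
    exp_hat = rng.binomial(n_exp, orr_exp, n_sim) / n_exp
    ctrl_hat = rng.binomial(n_ctrl, p_ctrl, n_sim) / n_ctrl
    return float(np.mean(exp_hat - ctrl_hat >= threshold))

# rPh2: 60 + 60 patients, no bias
p_rph2 = go_probability(60, 60, orr_ctrl=0.20, delta=0.15)
# SAT+rwEC: the 60 control patients re-allocated to the experimental
# arm; an rwEC of equal size carries +10 points of net bias
p_sat = go_probability(120, 120, orr_ctrl=0.20, delta=0.15, net_bias=0.10)
```

Consistent with the abstract, a positive net bias shrinks the apparent treatment effect in this setup, lowering the GO probability relative to the unbiased randomized comparison.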

2007 ◽  
Vol 25 (18_suppl) ◽  
pp. 6513-6513
Author(s):  
R. A. Wilcox ◽  
G. H. Guyatt ◽  
V. M. Montori

6513 Background: Investigators finding a large treatment effect in an interim analysis may terminate a randomized trial (RCT) earlier than planned. A systematic review (Montori et al., JAMA 2005; 294: 2203–2209) found that RCTs stopped early for benefit are poorly reported and may overestimate the true treatment effect. The extent to which RCTs in oncology stopped early for benefit share similar concerns remains unclear. Methods: We selected RCTs in oncology which had been reported in the original systematic review and reviewed the study characteristics, features related to the decision to monitor and stop the study early (sample size, interim analyses, monitoring and stopping rules), and the number of events and the estimated treatment effects. Results: We found 29 RCTs in malignant hematology (n=6) and oncology (n=23), 52% published in 2000–2004 and 41% in 3 high-impact medical journals (New England Journal of Medicine, Lancet, JAMA). The majority (79%) of trials reported a planned sample size and, on average, recruited 67% of the planned sample size (SD 31%). RCTs reported (1) the planned sample size (n=20), (2) the interim analysis at which the study was terminated (n=16), and (3) whether the decision to stop the study prematurely was informed by a stopping rule (n=16); only 13 reported all three. There was a highly significant correlation between the number of events and the treatment effect (r=0.68, p=0.0007). The odds of finding a large treatment effect (a relative risk < median of 0.54, IQR 0.3–0.7) when studies stopped after a few events (no. events < median of 54 events, IQR 22–125) were 6.2 times greater than when studies stopped later. Conclusions: RCTs in oncology stopped early for benefit tend to report large treatment effects that may overestimate the true treatment effect, particularly when the number of events driving study termination is small. Also, information pertinent to the decision to stop early was inconsistently reported. 
Clinicians and policymakers should interpret such studies with caution, especially when information about the decision to stop early is not provided and few events occurred. No significant financial relationships to disclose.
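The overestimation mechanism is easy to demonstrate with a toy sequential simulation. The unit-variance normal outcome, boundary, and look schedule below are all hypothetical choices, not taken from the abstract:

```python
import numpy as np

rng = np.random.default_rng(1)

def early_stop_bias(true_effect=0.2, n_per_look=100, n_looks=3,
                    z_boundary=2.5, n_sim=5000):
    """Accumulate data on a unit-variance normal outcome and stop at
    the first interim look where the z-statistic crosses the boundary.
    Trials stopped early select for random highs, so their effect
    estimates overshoot the true effect on average."""
    early_estimates = []
    for _ in range(n_sim):
        total, n = 0.0, 0
        for _ in range(n_looks):
            total += rng.normal(true_effect, 1.0, n_per_look).sum()
            n += n_per_look
            est = total / n
            if est * np.sqrt(n) > z_boundary:  # z-test, known variance
                early_estimates.append(est)
                break
    return float(np.mean(early_estimates))

mean_est = early_stop_bias()
# mean_est exceeds the true effect of 0.2
```

The earlier the stop (fewer events, in the abstract's terms), the larger the selection effect, matching the observed correlation between number of events and treatment effect.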


Author(s):  
David Aaby ◽  
Juned Siddique

Abstract Background Lifestyle intervention studies often use self-reported measures of diet as an outcome variable to measure changes in dietary intake. The presence of measurement error in self-reported diet due to participant failure to accurately report their diet is well known. Less familiar to researchers is differential measurement error, where the nature of measurement error differs by treatment group and/or time. Differential measurement error is often present in intervention studies and can result in biased estimates of the treatment effect and reduced power to detect treatment effects. Investigators need to be aware of the impact of differential measurement error when designing intervention studies that use self-reported measures. Methods We use simulation to assess the consequences of differential measurement error on the ability to estimate treatment effects in a two-arm randomized trial with two time points. We simulate data under a variety of scenarios, focusing on how different factors affect power to detect a treatment effect, bias of the treatment effect, and coverage of the 95% confidence interval of the treatment effect. Simulations use realistic scenarios based on data from the Trials of Hypertension Prevention Study. Simulated sample sizes ranged from 110 to 380 per group. Results Realistic differential measurement error seen in lifestyle intervention studies can require an increased sample size to achieve 80% power to detect a treatment effect and may result in a biased estimate of the treatment effect. Conclusions Investigators designing intervention studies that use self-reported measures should take differential measurement error into account by increasing their sample size, incorporating an internal validation study, and/or identifying statistical methods to correct for differential measurement error.
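A minimal sketch of the mechanism, assuming normal change scores and a hypothetical under-reporting of 3 units in the intervention arm only. All values are illustrative, not from the TOHP data:

```python
import numpy as np

rng = np.random.default_rng(2)

def mean_estimated_effect(n=200, true_effect=-5.0,
                          error_sd=(8.0, 8.0), error_mean=(0.0, -3.0),
                          n_sim=2000):
    """Two-arm trial on a self-reported change score. Measurement
    error has a different mean in each arm (differential error), so
    the average estimated treatment effect is shifted off the truth."""
    effects = []
    for _ in range(n_sim):
        ctrl = rng.normal(0.0, 10.0, n)          # true control change
        trt = rng.normal(true_effect, 10.0, n)   # true treated change
        ctrl_obs = ctrl + rng.normal(error_mean[0], error_sd[0], n)
        trt_obs = trt + rng.normal(error_mean[1], error_sd[1], n)
        effects.append(trt_obs.mean() - ctrl_obs.mean())
    return float(np.mean(effects))

est = mean_estimated_effect()
# average estimate ≈ true effect plus the differential bias: -5 + (-3) = -8
```

Non-differential error (equal `error_mean` in both arms) would only inflate variance and reduce power; it is the arm-specific mean shift that biases the estimate itself.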


Biometrika ◽  
2020 ◽  
Author(s):  
Oliver Dukes ◽  
Stijn Vansteelandt

Summary Eliminating the effect of confounding in observational studies typically involves fitting a model for an outcome adjusted for covariates. When, as often, these covariates are high-dimensional, this necessitates the use of sparse estimators, such as the lasso, or other regularization approaches. Naïve use of such estimators yields confidence intervals for the conditional treatment effect parameter that are not uniformly valid. Moreover, as the number of covariates grows with the sample size, correctly specifying a model for the outcome is nontrivial. In this article we deal with both of these concerns simultaneously, obtaining confidence intervals for conditional treatment effects that are uniformly valid, regardless of whether the outcome model is correct. This is done by incorporating an additional model for the treatment selection mechanism. When both models are correctly specified, we can weaken the standard conditions on model sparsity. Our procedure extends to multivariate treatment effect parameters and complex longitudinal settings.
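One standard instance of the idea of pairing an outcome model with a treatment-selection model is the partialling-out (Neyman-orthogonal) estimator, sketched below on simulated data with a known effect of 1.0. This is a generic double-machine-learning sketch, not the authors' exact procedure, and the data-generating values are invented:

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(4)

# simulated data: high-dimensional confounders X, treatment D, outcome Y
n, p, true_effect = 500, 100, 1.0
X = rng.normal(size=(n, p))
D = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
Y = true_effect * D + X[:, 0] - X[:, 2] + rng.normal(size=n)

# lasso both the outcome and the treatment on the confounders, then
# regress the outcome residuals on the treatment residuals; using
# both nuisance models is what confers robustness
res_y = Y - LassoCV(cv=5).fit(X, Y).predict(X)
res_d = D - LassoCV(cv=5).fit(X, D).predict(X)
effect = LinearRegression().fit(res_d.reshape(-1, 1), res_y).coef_[0]
```

Because the treatment is also modeled, first-order errors from the lasso's selection mistakes cancel, which is what makes confidence intervals built on such estimators uniformly valid.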


2017 ◽  
Vol 23 (5) ◽  
pp. 644-646 ◽  
Author(s):  
Maria Pia Sormani

The calculation of the sample size needed for a clinical study is the challenge most frequently put to statisticians, and it is one of the most relevant issues in study design. The correct size of the study sample optimizes the number of patients needed to get the result, that is, to detect the minimum treatment effect that is clinically relevant. Minimizing the sample size of a study has the advantage of reducing costs and enhancing feasibility, and also has ethical implications. In this brief report, I will explore the main concepts on which the sample size calculation is based.
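For two means compared with a two-sided z-test, the standard formula underlying such calculations can be sketched as follows (an illustrative helper, not code from the report):

```python
import math
from statistics import NormalDist

def sample_size_two_means(delta, sd, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided two-sample z-test:
    n = 2 * (z_{1-alpha/2} + z_{power})^2 * (sd / delta)^2,
    where delta is the minimum clinically relevant difference."""
    z = NormalDist().inv_cdf
    n = 2 * (z(1 - alpha / 2) + z(power)) ** 2 * (sd / delta) ** 2
    return math.ceil(n)

# detect a 0.5-SD difference with 80% power at two-sided alpha = 0.05
n_per_group = sample_size_two_means(delta=0.5, sd=1.0)  # → 63 per group
```

Note the quadratic dependence on `sd / delta`: halving the detectable difference quadruples the required sample size, which is why pinning down the minimum clinically relevant effect matters so much.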


2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Georgia Kourlaba ◽  
Eleni Kourkouni ◽  
Stefania Maistreli ◽  
Christina-Grammatiki Tsopela ◽  
Nafsika-Maria Molocha ◽  
...  

Abstract Background Epidemiological data indicate that a large part of the population needs to be vaccinated to achieve herd immunity. Hence, it is of high importance for public health officials to know whether people are going to get vaccinated against COVID-19. The objective of the present study was to examine the willingness of adult residents of Greece to receive a COVID-19 vaccine. Methods A cross-sectional survey was conducted among the adult general population of Greece between April 28 and May 3, 2020 (the last week of lockdown), using a mixed methodology for data collection: Computer Assisted Telephone Interviewing (CATI) and Computer Assisted Web Interviewing (CAWI). Using a sample size calculator, the target sample size was set at around 1000 respondents. To ensure a nationally representative sample of the urban/rural population according to the 2011 Greek census, a proportionate, stratified-by-region systematic sampling procedure was used to recruit participants. Data collection was guided by a structured questionnaire. Regarding willingness to receive COVID-19 vaccination, participants were asked the following question: “If there was a vaccine available for the novel coronavirus, would you do it?” Results Of the 1004 respondents, only 57.7% stated that they intended to get vaccinated against COVID-19. Respondents aged > 65 years, those who themselves or a member of their household belonged to a vulnerable group, those believing that the COVID-19 virus was not developed in laboratories by humans, those believing that the coronavirus is far more contagious and lethal than the H1N1 virus, and those believing that further waves are coming were statistically significantly more likely to be willing to get a COVID-19 vaccine. A higher knowledge score regarding symptoms, transmission routes, and prevention and control measures against COVID-19 was significantly associated with higher willingness of respondents to get vaccinated. 
Conclusion A significant proportion of individuals in the general population are unwilling to receive a COVID-19 vaccine, stressing the need for public health officials to take immediate awareness-raising measures.


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 467-468
Author(s):  
Igor M Ferreira ◽  
Iorrano A Cidrini ◽  
Karla Oliveira ◽  
Luciana Sousa ◽  
Laura F Prados ◽  
...  

Abstract The objective was to evaluate the effects of forage allowance on the performance and pregnancy rate of Nellore heifers submitted to fixed-time artificial insemination (FTAI) at 14±1 months. The experimental period was divided into a growing phase (GP; 173 d) and a breeding season (83 d). Ninety weaned Nellore heifers [169±19 kg of body weight (BW); 210±28 days of age] were blocked by initial BW and randomly assigned to one of two forage allowances: (1) high (HFA; 7.17 kg DM/kg BW) or (2) low (LFA; 3.27 kg DM/kg BW), and divided into 6 paddocks (2.15 to 2.4 ha each). The grazing method was continuous stocking with a variable stocking rate, with treatment conditions established by varying the stocking rate on pasture (put-and-take). The supplement consisted of corn, soybean meal, urea, monensin, sodium chloride, and a mineral mix (80% TDN and 24% CP) and was provided daily (ad libitum; expected intake around 1% of BW). The final pregnancy rate was determined after two FTAI (d 256). Heifer BW at the end of the GP was greater (P < 0.01) for HFA than LFA (268 vs. 255 kg). Overall average daily gain (ADG) was 68 g/d greater for animals maintained on HFA than LFA (P < 0.01; 0.713 vs. 0.645 kg/d). Supplement intake (kg/animal/d) did not differ between treatments (P = 0.117). On d 173, forage allowance affected (P < 0.01) the backfat thickness obtained by real-time carcass ultrasound (HFA = 3.08 vs. LFA = 2.85 mm). Pregnancy rate did not differ between treatments (P = 0.223; 61% on average). However, the number of pregnant heifers per hectare did differ (P = 0.036; HFA = 3.78 vs. LFA = 2.72 pregnant heifers/ha). In conclusion, low forage allowance reduced animal performance and affected carcass composition but provided a greater number of pregnant heifers per hectare.


2018 ◽  
Vol 53 (7) ◽  
pp. 716-719
Author(s):  
Monica R. Lininger ◽  
Bryan L. Riemann

Objective: To describe the concept of statistical power as related to comparative interventions and how various factors, including sample size, affect statistical power. Background: Having a sufficiently sized sample for a study is necessary for an investigation to demonstrate that an effective treatment is statistically superior. Many researchers fail to conduct and report a priori sample-size estimates, which then makes it difficult to interpret nonsignificant results and causes the clinician to question the planning of the research design. Description: Statistical power is the probability of statistically detecting a treatment effect when one truly exists. The α level, the effect size (a measure of the difference between groups), the variability of the data, and the sample size all affect statistical power. Recommendations: Authors should conduct and provide the results of a priori sample-size estimations in the literature. This will assist clinicians in determining whether the lack of a statistically significant treatment effect is due to an underpowered study or to a treatment's actually having no effect.
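The factors listed above map directly onto the analytic power formula for a two-sample test; a minimal sketch, where the z-test simplification and the parameter values are assumptions for illustration:

```python
from statistics import NormalDist

def power_two_means(n_per_group, delta, sd, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test:
    power ≈ Phi( |delta| / (sd * sqrt(2/n)) - z_{1-alpha/2} ).
    Power rises with n and effect size, falls with variability
    and with a stricter (smaller) alpha."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    ncp = abs(delta) / (sd * (2 / n_per_group) ** 0.5)
    return nd.cdf(ncp - z_crit)

# same 0.5-SD effect: a small study is badly underpowered,
# a larger one approaches the conventional 80% target
p_small = power_two_means(20, delta=0.5, sd=1.0)
p_large = power_two_means(80, delta=0.5, sd=1.0)
```

Running the two calls shows concretely why a nonsignificant result from the small study is uninterpretable: with power around 35%, a true effect would be missed most of the time.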


2020 ◽  
pp. 096228022098078
Author(s):  
Bosheng Li ◽  
Liwen Su ◽  
Jun Gao ◽  
Liyun Jiang ◽  
Fangrong Yan

A delayed treatment effect is often observed in confirmatory trials of immunotherapies and is reflected in a delayed separation of the survival curves of the immunotherapy groups versus the control groups. This phenomenon makes designs based on the log-rank test inapplicable, because the delayed separation violates the proportional hazards assumption and causes a loss of power. We therefore propose a group sequential design allowing early termination for efficacy, based on a more powerful piecewise weighted log-rank test, for an immunotherapy trial with a delayed treatment effect. We present an approach to group sequential monitoring in which the information time is defined by the number of events occurring after the delay time. Furthermore, we develop a one-dimensional search algorithm to determine the required maximum sample size for the proposed design, which uses an analytical estimate obtained from the inflation factor as the initial value and an empirical power function calculated by a simulation-based procedure as the objective function. Simulations demonstrate the instability of the analytical estimate, the consistent accuracy of the maximum sample size determined by the search algorithm, and the sample-size savings of the proposed design.
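The delayed-separation phenomenon, and why only post-delay events carry information about the effect, can be reproduced with a piecewise-exponential sketch. The hazards and delay below are hypothetical, not the paper's design parameters:

```python
import numpy as np

rng = np.random.default_rng(3)

def piecewise_exp(n, hazard_before, hazard_after, delay):
    """Event times with a hazard change at `delay` (piecewise
    exponential). By memorylessness, subjects surviving past the
    delay get a fresh exponential residual at the new hazard."""
    t = rng.exponential(1 / hazard_before, n)
    late = t > delay
    t[late] = delay + rng.exponential(1 / hazard_after, late.sum())
    return t

# control keeps a constant hazard; the treatment hazard halves
# only after the delay, so the survival curves separate late
ctrl = piecewise_exp(5000, 0.10, 0.10, delay=6.0)
trt = piecewise_exp(5000, 0.10, 0.05, delay=6.0)

# before the delay the arms are indistinguishable...
frac_ctrl_after = float(np.mean(ctrl > 6.0))
frac_trt_after = float(np.mean(trt > 6.0))
# ...so the standard log-rank test wastes weight on pre-delay events,
# which motivates defining information time from post-delay events
```

A weighted log-rank statistic that down-weights the pre-delay interval recovers the power that the unweighted test loses on this kind of data.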


2018 ◽  
Vol 61 (2) ◽  
pp. 210-222 ◽  
Author(s):  
Joseph M Matthes ◽  
A Dwayne Ball

Establishing discriminant validity has been a keystone of measurement validity in empirical marketing research for many decades. Without statistically showing that constructs have discriminant validity, contributions to marketing literature are likely to foster the proliferation of constructs that are operationally the same as other constructs already present in the literature, thus leading to confusion in the development of theory. This article addresses this concern by evaluating well-established methods for testing discriminant validity through the simulation of artificial datasets (containing varying levels of correlation between constructs, sample size, measurement error, and distribution skewness). The artificial data are applied to six commonly used approaches for testing the existence of discriminant validity. Results strongly suggest that several methods are much more likely than others to yield accurate assessments of whether discriminant validity exists, especially under specific conditions. Recommendations for practice in the assessment of discriminant validity are suggested.

