Power and sample size calculation for log-rank test with a time lag in treatment effect

2009 ◽  
Vol 28 (5) ◽  
pp. 864-879 ◽  
Author(s):  
Daowen Zhang ◽  
Hui Quan
Author(s):  
Patrick Royston

Most randomized controlled trials with a time-to-event outcome are designed and analyzed assuming proportional hazards of the treatment effect. The sample-size calculation is based on a log-rank test or the equivalent Cox test. Nonproportional hazards are seen increasingly in trials and are recognized as a potential threat to the power of the log-rank test. To address the issue, Royston and Parmar (2016, BMC Medical Research Methodology 16: 16) devised a new “combined test” of the global null hypothesis of identical survival curves in each trial arm. The test combines the conventional Cox test with a new test based on the maximal standardized difference in restricted mean survival time (RMST) between the arms, evaluated over several preselected time points. The combined test involves the minimum p-value across the Cox and RMST-based tests, appropriately standardized to have the correct null distribution. In this article, I outline the combined test and introduce a command, stctest, that implements it. I also point the way to additional tools currently under development for power and sample-size calculation for the combined test.
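
As a rough illustration of the ingredients of the combined test (not the stctest implementation itself), the Python sketch below computes the between-arm RMST difference at several preselected horizons with lifelines, alongside the conventional log-rank p-value. The column names, arm coding, and horizons are assumptions for the example.

```python
# Illustrative sketch (not Royston-Parmar's stctest): between-arm difference
# in restricted mean survival time (RMST) at several preselected horizons,
# alongside the conventional log-rank test.
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test
from lifelines.utils import restricted_mean_survival_time

def rmst_differences(df, horizons, time_col="time", event_col="event", arm_col="arm"):
    """Return RMST(treatment) - RMST(control) at each preselected horizon."""
    diffs = {}
    for t_star in horizons:
        rmst = {}
        for arm, grp in df.groupby(arm_col):
            km = KaplanMeierFitter().fit(grp[time_col], grp[event_col])
            rmst[arm] = restricted_mean_survival_time(km, t=t_star)
        diffs[t_star] = rmst[1] - rmst[0]   # assumes arm coded 0 = control, 1 = treatment
    return diffs

def cox_logrank_pvalue(df, time_col="time", event_col="event", arm_col="arm"):
    """Conventional two-sample log-rank p-value."""
    a, b = df[df[arm_col] == 0], df[df[arm_col] == 1]
    res = logrank_test(a[time_col], b[time_col], a[event_col], b[event_col])
    return res.p_value

# Example usage on a data frame with columns time, event, arm:
# diffs = rmst_differences(trial_df, horizons=[12, 24, 36])
# p_lr  = cox_logrank_pvalue(trial_df)
```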


Author(s):  
Patrick Royston

Randomized controlled trials with a time-to-event outcome are usually designed and analyzed assuming proportional hazards (PH) of the treatment effect. The sample-size calculation is based on a log-rank test or the nearly identical Cox test, henceforth called the Cox/log-rank test. Nonproportional hazards (non-PH) have become more common in trials and are recognized as a potential threat to the interpretation of the trial treatment effect and to the power of the log-rank test, and hence to the success of the trial. To address the issue, in 2016, Royston and Parmar (BMC Medical Research Methodology 16: 16) proposed a “combined test” of the global null hypothesis of identical survival curves in each trial arm. The Cox/log-rank test is combined with a new test derived from the maximal standardized difference in restricted mean survival time (RMST) between the trial arms. The test statistic is based on evaluations of the between-arm difference in RMST over several preselected time points. The combined test involves the minimum p-value across the Cox/log-rank and RMST-based tests, appropriately standardized to have the correct distribution under the global null hypothesis. In this article, I introduce a new command, power_ct, that uses simulation to implement power and sample-size calculations for the combined test. power_ct supports designs with PH or non-PH of the treatment effect. I provide examples in which the power of the combined test is compared with that of the Cox/log-rank test under PH and non-PH scenarios. I conclude by offering guidance on sample-size calculations in time-to-event trials that allow for possible non-PH.
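
A minimal Python sketch of the simulation-based power calculation the article describes, with the plain log-rank test standing in for the combined test; the event rates, delay, follow-up, and replication count are illustrative assumptions, not values from power_ct.

```python
# Simulation sketch of power under a delayed (non-proportional-hazards)
# treatment effect. The plain log-rank test stands in for the combined test.
import numpy as np
from lifelines.statistics import logrank_test

rng = np.random.default_rng(1)

def simulate_arm(n, base_rate, hr=1.0, delay=0.0):
    """Exponential control hazard; the treatment hazard equals base_rate until
    `delay`, then base_rate * hr afterwards (a delayed treatment effect)."""
    t = rng.exponential(1 / base_rate, n)               # draw under the pre-delay hazard
    late = t > delay
    t[late] = delay + rng.exponential(1 / (base_rate * hr), late.sum())
    return t

def empirical_power(n_per_arm, base_rate=0.1, hr=0.6, delay=6.0,
                    follow_up=36.0, alpha=0.05, reps=2000):
    """Proportion of simulated trials in which the test rejects at level alpha."""
    hits = 0
    for _ in range(reps):
        tc = simulate_arm(n_per_arm, base_rate)                     # control: no effect
        tt = simulate_arm(n_per_arm, base_rate, hr=hr, delay=delay)
        ec, et = tc < follow_up, tt < follow_up                     # administrative censoring
        res = logrank_test(np.minimum(tc, follow_up), np.minimum(tt, follow_up), ec, et)
        hits += res.p_value < alpha
    return hits / reps

# e.g. empirical_power(300) estimates the power of a 300-per-arm trial
```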


2014 ◽  
Vol 34 (6) ◽  
pp. 1031-1040 ◽  
Author(s):  
René Schmidt ◽  
Robert Kwiecien ◽  
Andreas Faldum ◽  
Frank Berthold ◽  
Barbara Hero ◽  
...  

2017 ◽  
Vol 23 (5) ◽  
pp. 644-646 ◽  
Author(s):  
Maria Pia Sormani

The calculation of the sample size needed for a clinical study is the question most frequently put to statisticians and one of the most important issues in study design. A correctly sized study uses the smallest number of patients needed to obtain the result, that is, to detect the minimum treatment effect that is clinically relevant. Minimizing the sample size of a study reduces costs, enhances feasibility, and also has ethical implications. In this brief report, I explore the main concepts on which the sample size calculation is based.
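
For the time-to-event trials discussed throughout this listing, the textbook starting point for such a calculation is Schoenfeld's approximation for the number of events required by the log-rank test. A short sketch of that arithmetic (not taken from the report above; the example values are illustrative):

```python
# Schoenfeld's approximation for the number of events needed by a two-arm
# log-rank test; allocation is the proportion randomized to one arm.
from math import log, ceil
from scipy.stats import norm

def events_required(hr, alpha=0.05, power=0.80, allocation=0.5):
    """Number of events d such that the log-rank test detects hazard ratio
    `hr` with the given two-sided alpha and power."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return ceil((z_a + z_b) ** 2 / (allocation * (1 - allocation) * log(hr) ** 2))

# e.g. events_required(0.75) -> about 380 events for HR 0.75, alpha 0.05, power 80%
```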


2020 ◽  
pp. 096228022098078
Author(s):  
Bosheng Li ◽  
Liwen Su ◽  
Jun Gao ◽  
Liyun Jiang ◽  
Fangrong Yan

A delayed treatment effect is often observed in confirmatory trials of immunotherapies and is reflected by a delayed separation of the survival curves of the immunotherapy groups versus the control groups. This phenomenon violates the proportional hazards assumption, so a design based on the log-rank test loses power and is no longer appropriate. Thus, we propose a group sequential design that allows early termination for efficacy and is based on a more powerful piecewise weighted log-rank test for an immunotherapy trial with a delayed treatment effect. We present an approach to group sequential monitoring in which the information time is defined based on the number of events occurring after the delay time. Furthermore, we develop a one-dimensional search algorithm to determine the required maximum sample size for the proposed design, which uses an analytical estimate obtained from an inflation factor as the initial value and an empirical power function calculated by a simulation-based procedure as the objective function. In simulations, we show that the accuracy of the analytical estimate is unstable, that the maximum sample size determined by the search algorithm is consistently accurate, and that the proposed design saves sample size.
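
The search the authors describe can be viewed as one-dimensional root finding on sample size: start from the analytical estimate inflated by a factor and move until the simulated power crosses the target. Below is a generic Python sketch of that idea; `empirical_power` is any callable mapping a per-arm sample size to a simulation-based power estimate (for example, the sketch earlier in this listing), and the piecewise weighted log-rank test and group sequential boundaries of the actual design are not reproduced.

```python
# Generic one-dimensional search for the maximum sample size: the initial
# value is an analytical estimate times an inflation factor, the objective
# is an empirical power function estimated by simulation.
def search_sample_size(empirical_power, n_initial, target_power=0.90, step=25, n_max=5000):
    """Bracket and then bisect to the smallest per-arm n whose simulated
    power reaches the target."""
    lo = hi = n_initial
    while empirical_power(hi) < target_power and hi < n_max:
        lo, hi = hi, hi + step                      # expand upward until power is reached
    while empirical_power(lo) >= target_power and lo > step:
        hi, lo = lo, lo - step                      # shrink if the initial guess was generous
    while hi - lo > 1:                              # bisection on the bracket [lo, hi]
        mid = (lo + hi) // 2
        if empirical_power(mid) >= target_power:
            hi = mid
        else:
            lo = mid
    return hi

# e.g. n = search_sample_size(empirical_power, n_initial=int(1.3 * 300))
```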


2019 ◽  
Vol 111 (11) ◽  
pp. 1186-1191 ◽  
Author(s):  
Julien Péron ◽  
Alexandre Lambert ◽  
Stephane Munier ◽  
Brice Ozenne ◽  
Joris Giai ◽  
...  

Background: The treatment effect in survival analysis is commonly quantified as the hazard ratio and tested statistically using the standard log-rank test. Modern anticancer immunotherapies are successful in a proportion of patients who remain alive even after long-term follow-up. This new phenomenon induces nonproportionality of the underlying hazards of death. Methods: The properties of the net survival benefit were illustrated using the dataset from a trial evaluating ipilimumab in metastatic melanoma. The net survival benefit was then investigated through simulated datasets under typical scenarios of proportional hazards, delayed treatment effect, and cure rate. The net survival benefit test was computed according to the value of the minimal survival difference considered clinically relevant. As comparators, the standard and the weighted log-rank tests were also performed. Results: In the illustrative dataset, the net survival benefit favored ipilimumab (Δ(0) = 15.8%, 95% confidence interval = 4.6% to 27.3%, P = .006). This favorable effect was maintained when the analysis was focused on long-term survival differences (e.g., >12 months): Δ(12) = 12.5% (95% confidence interval = 4.4% to 20.6%, P = .002). Under the scenarios of a delayed treatment effect and cure rate, the power of the net survival benefit test compared favorably with that of the standard log-rank test and was comparable to that of the weighted log-rank test for large values of the threshold of clinical relevance. Conclusion: The net long-term survival benefit is a measure of treatment effect that is meaningful whether or not hazards are proportional. The associated statistical test is more powerful than the standard log-rank test when a delayed treatment effect is anticipated.
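
The net survival benefit Δ(τ) is a generalized pairwise comparison statistic: each treated-versus-control pair of patients is scored as favorable, unfavorable, or neutral depending on whether the observed survival times differ by at least the clinically relevant threshold τ. A minimal Python sketch of that scoring follows; it treats uninformative (censored) pairs as neutral, which is a simplification relative to the published method, and the data and threshold are illustrative.

```python
# Minimal sketch of the net survival benefit Delta(tau) by generalized
# pairwise comparisons. Uninformative pairs count as neutral here; the
# published method handles censoring explicitly.
def net_survival_benefit(t_treat, d_treat, t_ctrl, d_ctrl, tau):
    """Proportion of favorable pairs minus proportion of unfavorable pairs,
    where 'favorable' means the treated patient survives at least tau longer."""
    favorable = unfavorable = 0
    n_pairs = len(t_treat) * len(t_ctrl)
    for ti, di in zip(t_treat, d_treat):
        for tj, dj in zip(t_ctrl, d_ctrl):
            if ti - tj >= tau and dj:        # control death observed; treated outlived by >= tau
                favorable += 1
            elif tj - ti >= tau and di:      # treated death observed; control outlived by >= tau
                unfavorable += 1
            # otherwise the pair is neutral or uninformative (censoring)
    return (favorable - unfavorable) / n_pairs

# e.g. delta_12 = net_survival_benefit(t_trt, event_trt, t_ctl, event_ctl, tau=12)
```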


2018 ◽  
Vol 7 (6) ◽  
pp. 81
Author(s):  
Fang Fang ◽  
Yong Lin ◽  
Weichung Joe Shih ◽  
Shou-En Lu ◽  
Guangrui Zhu

The accuracy of the treatment effect estimate is crucial to the success of Phase 3 studies. The sample size calculation relies on this estimate and cannot be changed during the trial in a fixed sample size design. Often, with limited efficacy data available from early phase studies and relevant historical studies, the sample size calculation may not accurately reflect the true treatment effect. Several adaptive designs have been proposed to address this uncertainty. These designs provide flexibility by allowing early trial stopping or sample size adjustment at interim look(s). The use of adaptive designs can optimize trial performance when the treatment effect is an assumed constant value; in practice, however, it may be more reasonable to consider the treatment effect within an interval rather than as a point estimate. Because proper selection of adaptive designs may decrease the failure rate of Phase 3 clinical trials and increase the chance of new drug approval, this paper proposes measures for and evaluates the performance of different adaptive designs based on treatment effect intervals, and identifies factors that may affect the performance of adaptive designs.
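
One way to see why a point assumption about the treatment effect is fragile is to evaluate the power of a fixed design across an interval of plausible effects. The short Python sketch below does this with the Schoenfeld approximation; the event count and hazard-ratio interval are illustrative and not taken from the paper.

```python
# Power of a fixed-events design across an interval of plausible hazard
# ratios, using the Schoenfeld approximation; numbers are illustrative.
import numpy as np
from scipy.stats import norm

def logrank_power(n_events, hr, alpha=0.05, allocation=0.5):
    """Approximate power of the log-rank test with a fixed number of events."""
    z_a = norm.ppf(1 - alpha / 2)
    ncp = abs(np.log(hr)) * np.sqrt(allocation * (1 - allocation) * n_events)
    return norm.cdf(ncp - z_a)

# A design sized for HR = 0.75 (about 380 events) evaluated over an interval:
for hr in np.linspace(0.70, 0.85, 4):
    print(f"HR {hr:.2f}: power {logrank_power(380, hr):.2f}")
```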

