Sample Size Justification in Phase III/IV Clinical Trials

1998 ◽  
Vol 17 (2) ◽  
pp. 63-66 ◽  
Author(s):  
DeJuran Richardson ◽  
Sue Leurgans
1996 ◽  
Vol 14 (4) ◽  
pp. 1364-1370 ◽  
Author(s):  
S L George

PURPOSE To discuss patient eligibility criteria in phase III cancer clinical trials in the larger setting of the complexity of these trials, to review the various reasons for imposing restrictive eligibility requirements, to discuss the problems caused by these requirements, to argue that these requirements should be greatly relaxed in most cancer clinical trials, to provide some guiding principles and practical suggestions to facilitate such a relaxation, and to give an example of how eligibility requirements were reduced in a recent clinical trial in acute lymphocytic leukemia. METHODS Implicit and explicit reasons for including eligibility criteria in clinical trials are reviewed. Safety concerns and sample size issues receive special attention. The types of problems restrictive eligibility criteria cause with respect to scientific interpretation, medical applicability, complexity, costs, and patient accrual are described. RESULTS A list of three items that each eligibility criterion should meet in order to be included is proposed and applied to a recent trial in acute lymphocytic leukemia. CONCLUSION Phase III clinical trials in cancer should have much broader eligibility criteria than the traditionally restrictive criteria commonly used. Adoption of less restrictive eligibility criteria for most studies would allow broader generalizations, better mimic medical practice, reduce complexity and costs, and permit more rapid accrual without compromising patient safety or requiring major increases in sample size.


2016 ◽  
Vol 14 (1) ◽  
pp. 48-58 ◽  
Author(s):  
Qiang Zhang ◽  
Boris Freidlin ◽  
Edward L Korn ◽  
Susan Halabi ◽  
Sumithra Mandrekar ◽  
...  

Background: Futility (inefficacy) interim monitoring is an important component in the conduct of phase III clinical trials, especially in life-threatening diseases. Desirable futility monitoring guidelines allow timely stopping if the new therapy is harmful or if it is unlikely to be shown to be sufficiently effective were the trial to continue to its final analysis. A number of analytical approaches are used to construct futility monitoring boundaries. The most common approaches are based on conditional power, sequential testing of the alternative hypothesis, or sequential confidence intervals. The resulting futility boundaries vary considerably with respect to the level of evidence required for recommending stopping the study. Purpose: We evaluate the performance of commonly used methods using event histories from completed phase III clinical trials of the Radiation Therapy Oncology Group, Cancer and Leukemia Group B, and North Central Cancer Treatment Group. Methods: We considered published superiority phase III trials with survival endpoints initiated after 1990. There are 52 studies available for this analysis from different disease sites. Total sample size and maximum number of events (statistical information) for each study were calculated using the protocol-specified effect size and type I and type II error rates. In addition to the common futility approaches, we considered a recently proposed linear inefficacy boundary approach with an early harm look followed by several lack-of-efficacy analyses. For each futility approach, interim test statistics were generated for three schedules with different analysis frequency, and early stopping was recommended if the interim result crossed a futility stopping boundary. For trials not demonstrating superiority, the impact of each rule is summarized as savings on sample size, study duration, and information time scales. 
Results: For negative studies, our results show that the futility approaches based on testing the alternative hypothesis and on repeated confidence interval rules yielded smaller savings than the other two rules. These boundaries are too conservative, especially during the first half of the study (<50% of information). The conditional power rules are too aggressive during the second half of the study (>50% of information) and may stop a trial even when there is a clinically meaningful treatment effect. The linear inefficacy boundary with three or more interim analyses provided the best results. For positive studies, we demonstrated that none of the futility rules would have stopped the trials. Conclusion: The linear inefficacy boundary futility approach is attractive from statistical, clinical, and logistical standpoints in clinical trials evaluating new anti-cancer agents.
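
To make two quantities from this abstract concrete, the following is a minimal sketch, not the authors' code. It assumes the Schoenfeld approximation for the required number of events with 1:1 allocation, and the standard B-value form of conditional power evaluated under the design alternative. The function names, default error rates, and the sign convention (positive Z favors the new therapy) are illustrative assumptions.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def required_events(hazard_ratio, alpha=0.05, power=0.90):
    """Events needed for a two-sided log-rank test with 1:1 allocation.

    Assumption: Schoenfeld approximation, d = 4 (z_a + z_b)^2 / ln(HR)^2.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return ceil(4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2)


def conditional_power(z_interim, info_fraction, alpha=0.05, power=0.90):
    """Chance of a significant final result given the interim Z statistic.

    Assumption: the remaining data follow the design alternative, so the
    expected final Z (the drift) equals z_alpha + z_beta.
    """
    if not 0 < info_fraction < 1:
        raise ValueError("info_fraction must lie strictly between 0 and 1")
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    drift = z_crit + _STD_NORMAL.inv_cdf(power)
    b_value = z_interim * sqrt(info_fraction)  # B(t) = Z(t) * sqrt(t)
    remaining = 1 - info_fraction
    shortfall = z_crit - b_value - drift * remaining
    return 1 - _STD_NORMAL.cdf(shortfall / sqrt(remaining))


if __name__ == "__main__":
    print(required_events(0.75))
    print(round(conditional_power(0.0, 0.5), 3))
```

A conditional power rule would recommend stopping when this value falls below a prespecified threshold. The threshold is a design choice that the abstract does not state.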


2015 ◽  
Author(s):  
William Meurer ◽  
Nicholas J. Seewald ◽  
Kelley Kidwell

Abstract Background: Modern clinical trials in stroke reperfusion fall into two categories: alternative systemic pharmacological regimens to alteplase and "rescue" endovascular approaches using targeted thrombectomy devices and/or medications delivered directly for persistent vessel occlusions. Clinical trials in stroke have not evaluated how initial pharmacological thrombolytic management might influence subsequent rescue strategy. A sequential multiple assignment randomized trial (SMART) is a novel trial design that can test these dynamic treatment regimens and lead to treatment guidelines which more closely mimic practice. Aim: To characterize a SMART design in comparison to traditional approaches for stroke reperfusion trials. Methods: We conducted a numerical simulation study that evaluated the performance of contrasting acute stroke clinical trial designs of both initial reperfusion and rescue therapy. We compare a SMART design, where the same patients are followed through initial reperfusion and rescue therapy within one trial, to a standard phase III design comparing two reperfusion treatments and a separate phase II futility design of rescue therapy in terms of sample size, power, and ability to address particular research questions. Results: Traditional trial designs can be well powered and have optimal design characteristics for independent treatment effects. When treatments, such as the reperfusion and rescue therapies, may interact, commonly used designs fail to detect this. A SMART design, with similar sample size to standard designs, can detect treatment interactions. Conclusions: The use of SMART designs to investigate effective and realistic dynamic treatment regimens is a promising way to accelerate the discovery of new, effective treatments for stroke.


2010 ◽  
Vol 28 (11) ◽  
pp. 1936-1941 ◽  
Author(s):  
Hui Tang ◽  
Nathan R. Foster ◽  
Axel Grothey ◽  
Stephen M. Ansell ◽  
Richard M. Goldberg ◽  
...  

Purpose: To improve the understanding of the appropriate design of phase II oncology clinical trials, we compared error rates in single-arm, historically controlled and randomized, concurrently controlled designs. Patients and Methods: We simulated error rates of both designs separately from individual patient data from a large colorectal cancer phase III trial and statistical models, which take into account random and systematic variation in historical control data. Results: In single-arm trials, false-positive error rates (type I error) were 2 to 4 times those projected when modest drift or patient selection effects (eg, 5% absolute shift in control response rate) were included in statistical models. The power of single-arm designs simulated using actual data was highly sensitive to the fraction of patients from treatment centers with high versus low patient volumes, the presence of patient selection effects or temporal drift in response rates, and random small-sample variation in historical controls. Increasing sample size did not correct the overoptimism of single-arm studies. The randomized two-arm design conformed to planned error rates. Conclusion: Variability in historical control success rates, outcome drifts in patient populations over time, and/or patient selection effects can result in inaccurate false-positive and false-negative error rates in single-arm designs, but leave performance of the randomized two-arm design largely unaffected at the cost of 2 to 4 times the sample size compared with single-arm designs. Given a large enough patient pool, the randomized phase II designs provide a more accurate decision for screening agents before phase III testing.
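
The effect of drift on a single-arm design can be illustrated with an exact binomial calculation. This is a simplified sketch under assumed inputs, not a reproduction of the authors' simulation: the sample size, historical response rate, and significance level in the example are hypothetical, and only the 5% absolute shift comes from the abstract.

```python
from math import comb


def upper_tail(n, p, r):
    """Exact P(X >= r) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def critical_count(n, p0, alpha):
    """Smallest response count r with P(X >= r | p0) <= alpha."""
    for r in range(n + 2):
        if upper_tail(n, p0, r) <= alpha:
            return r


def type_one_error_with_drift(n, p0, alpha, drift):
    """Return (nominal, actual) false-positive rates for a single-arm test.

    The test is calibrated against the historical rate p0, but the true
    rate without any treatment effect is assumed to be p0 + drift.
    """
    r = critical_count(n, p0, alpha)
    return upper_tail(n, p0, r), upper_tail(n, p0 + drift, r)


if __name__ == "__main__":
    # Hypothetical design: 50 patients, historical response rate 20%,
    # one-sided alpha 0.05, and a 5% absolute upward drift.
    nominal, actual = type_one_error_with_drift(50, 0.20, 0.05, 0.05)
    print(f"nominal={nominal:.3f} actual={actual:.3f}")
```

Because the rejection threshold is fixed by the historical rate, any upward shift in the true control rate raises the chance of a false positive, and a larger sample size does not remove this bias.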


Trials ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Kentaro Sakamaki ◽  
Yukari Uemura ◽  
Yosuke Shimizu

Abstract Background There are several challenges in designing clinical trials for the treatment of novel infectious diseases, such as COVID-19. In particular, the definition of endpoints related to the severity, time frame, and clinical course remains unclear. Therefore, we conducted a cross-sectional analysis of phase III randomized trials for COVID-19 registered at ClinicalTrials.gov. Methods We collected the data from ClinicalTrials.gov on March 31, 2021, by specifying the following search conditions under Advanced Search: Condition or disease: (COVID-19) OR (SARS-CoV-2); Study type: Interventional Studies; Study Results: All Studies; Recruitment: Not yet recruiting, Recruiting, Enrolling by invitation, Active, Not recruiting, Suspended, Completed; Sex: All; and Phase: Phase 3. From the downloaded search results, we selected trials that met the following criteria: Primary Purpose: Treatment; Allocation: Randomized. We manually transcribed information not included in the downloaded file, such as Primary Outcome Measures, Secondary Outcome Measures, Time Frame, and Inclusion Criteria. In the analysis, we examined primary and secondary endpoints in trials with severe and non-severe patients, including the types of endpoints, time frame, clinical course, and sample size. Results A total of 406 trials were included in the analysis. The median numbers of endpoints in trials with severe and non-severe patients were 9 and 7, respectively. Approximately 25% of the trials used multiple primary endpoints. Regardless of the type of endpoint, the time frames were longer in the trials with severe patients than in the trials with non-severe patients. In the evaluation of the clinical course, worsening was often considered in binary endpoints, and improvement was considered in time-to-event endpoints. The sample size was the largest in clinical trials using binary endpoints. 
Conclusions Endpoints can differ with respect to severity, and the clinical course and time frame are important for defining endpoints. This study provides information that can facilitate the achievement of a consensus for the endpoints in evaluating COVID-19 treatments.


1999 ◽  
Vol 25 (3) ◽  
pp. 244-250 ◽  
Author(s):  
D. Curran ◽  
R.J. Sylvester ◽  
G. Hoctin Boes

2016 ◽  
Vol 12 ◽  
pp. P1015-P1015
Author(s):  
Guoqiao Wang ◽  
Eric McDade ◽  
Randall Bateman ◽  
Jason Hassenstab ◽  
Martin R. Farlow ◽  
...  

2018 ◽  
Vol 31 (3) ◽  
pp. e100011
Author(s):  
Hongyue Wang ◽  
Bokai Wang ◽  
Xin M Tu ◽  
Jinyuan Liu ◽  
Changyong Feng

Sample size justification is a crucial part of the design of clinical trials. In this paper, the authors derive a new formula to calculate the sample size for a binary outcome given one of the three popular indices of risk difference (absolute difference, risk ratio, or OR). The sample size based on the absolute difference is the fundamental one, and it can easily be used to derive the sample size given the risk ratio or OR.
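
The relationship among the three indices can be sketched as follows. This uses the textbook unpooled-variance formula for comparing two proportions, not the new formula derived in the paper, and the function names and default error rates are illustrative assumptions. A risk ratio or OR is first converted to the second group's risk, after which the absolute-difference calculation applies.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.90):
    """Per-group sample size for a two-sided test of two proportions.

    Assumption: textbook normal approximation with unpooled variance,
    n = (z_a + z_b)^2 [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2.
    """
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(z_sum**2 * variance / (p1 - p2) ** 2)


def p2_from_risk_ratio(p1, risk_ratio):
    """Risk in group 2 implied by the risk ratio p2 / p1."""
    return p1 * risk_ratio


def p2_from_odds_ratio(p1, odds_ratio):
    """Risk in group 2 implied by the odds ratio of group 2 versus group 1."""
    return odds_ratio * p1 / (1 - p1 + odds_ratio * p1)


if __name__ == "__main__":
    p1 = 0.30
    print(n_per_group(p1, 0.50))                             # absolute difference 0.20
    print(n_per_group(p1, p2_from_risk_ratio(p1, 5 / 3)))    # same risks via risk ratio
    print(n_per_group(p1, p2_from_odds_ratio(p1, 7 / 3)))    # same risks via OR
```

All three calls describe the same pair of risks (0.30 versus 0.50), so they return the same sample size.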

