On sample size calculation in testing treatment efficacy in clinical trials

2021 ◽  
Vol 58 (2) ◽  
pp. 133-147
Author(s):  
Rownak Jahan Tamanna ◽  
M. Iftakhar Alam ◽  
Ahmed Hossain ◽  
Md Hasinur Rahaman Khan

Summary Sample size calculation is an integral part of any clinical trial design: determining the optimal sample size ensures adequate power to detect statistical significance. It is a critical step in designing a planned research protocol, since enrolling too many participants is expensive and exposes more subjects than necessary to the procedure, while an underpowered study will be statistically inconclusive and may cause the whole protocol to fail. Balancing the attempt to maximize power against the effort to minimize the budget, the joint optimization of the two has become a significant issue in sample size determination for clinical trials in recent decades. Although it is hard to generalize a single method for sample size calculation, this study attempts to offer a basis for resolving the contradictions of sample size determination, using simulation studies under simple random and cluster sampling schemes with different levels of power and type I error. The effective sample size is much higher when the design effect of the sampling method is smaller, particularly when it is less than 1, and the required sample size under cluster sampling increases as the number of clusters increases.
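The design-effect adjustment described above can be sketched numerically. The snippet below is a minimal illustration, not the paper's simulation code: it computes the per-group sample size for a two-sided two-sample comparison of means under simple random sampling, then inflates it by the usual design effect DEFF = 1 + (m - 1)·ICC for cluster sampling; the effect size, cluster size m, and intracluster correlation are all assumed values.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Per-group n for a two-sided two-sample comparison of means
    (normal-approximation formula)."""
    z = NormalDist().inv_cdf
    n = 2 * sigma**2 * (z(1 - alpha / 2) + z(power))**2 / delta**2
    return ceil(n)

def cluster_adjusted_n(n_srs, m, icc):
    """Inflate a simple-random-sampling n by the design effect
    DEFF = 1 + (m - 1) * icc for clusters of size m."""
    deff = 1 + (m - 1) * icc
    return ceil(n_srs * deff)

n0 = n_per_group(delta=0.5, sigma=1.0)       # n per group under SRS
n1 = cluster_adjusted_n(n0, m=10, icc=0.05)  # inflated for clustering
```

With DEFF below 1 (strongly negative intracluster correlation) the same logic deflates the required n, which is the regime the summary's last claim refers to.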

2018 ◽  
Vol 28 (7) ◽  
pp. 2179-2195 ◽  
Author(s):  
Chieh Chiang ◽  
Chin-Fu Hsiao

Multiregional clinical trials have been accepted in recent years as a useful means of accelerating the development of new drugs and abridging their approval time. The statistical properties of multiregional clinical trials are being widely discussed. In practice, the variance of a continuous response may differ from region to region, and the assessment of the efficacy response then falls into a Behrens–Fisher problem: there is no exact test or interval estimator for the mean difference with unequal variances. As a solution, this study applies interval estimations of the efficacy response based on Howe’s, Cochran–Cox’s, and Satterthwaite’s approximations, which have been shown to have well-controlled type I error rates. However, traditional sample size determination cannot be applied to these interval estimators, so a sample size determination that achieves a desired power based on them is presented. Moreover, the consistency criteria suggested by the Japanese Ministry of Health, Labour and Welfare guidance, used to decide whether the overall results from the multiregional clinical trial apply to a specific region, were also evaluated via the proposed interval estimation. A real example is used to illustrate the proposed method. The results of simulation studies indicate that the proposed method can correctly determine the required sample size and evaluate the assurance probability of the consistency criteria.
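The Behrens–Fisher issue above centres on the Satterthwaite approximation. As a hedged illustration (not the authors' procedure), the sketch below computes the Satterthwaite degrees of freedom for a mean difference with unequal variances, plus a large-sample interval that uses the normal quantile as a stand-in for the t quantile:

```python
from math import sqrt
from statistics import NormalDist

def satterthwaite_df(s1, n1, s2, n2):
    """Satterthwaite approximate degrees of freedom for the variance of
    a mean difference when the two group variances are unequal."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def approx_ci(xbar1, s1, n1, xbar2, s2, n2, alpha=0.05):
    """Large-sample CI for the mean difference; the normal quantile
    stands in for the t quantile (adequate when the df is large)."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    d = xbar1 - xbar2
    return d - z * se, d + z * se
```

When the two variances and group sizes are equal, the Satterthwaite df reduces to the usual pooled df, n1 + n2 - 2.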


2001 ◽  
Vol 95 (5) ◽  
pp. 1068-1073 ◽  
Author(s):  
Hwee Leng Pua ◽  
Jerrold Lerman ◽  
Mark W. Crawford ◽  
James G. Wright

Background The authors evaluated the quality of clinical trials published in four anesthesia journals during the 20-yr period from 1981-2000. Methods Trials published in four major anesthesia journals during the periods 1981-1985, 1991-1995, and the first 6 months of 2000 were grouped according to journal and year. Using random number tables, four trials were selected from all of the eligible clinical trials in each journal in each year for the periods 1981-1985 and 1991-1995, and five trials were selected from all of the trials in each journal in the first 6 months of 2000. Methods and results sections from the 160 trials from 1981-1985 and 1991-1995 were randomly ordered and distributed to three of the authors for blinded review of the quality of the study design according to 10 predetermined criteria (weighted equally, maximum score of 10): informed consent and ethics approval, eligibility criteria, sample size calculation, random allocation, method of randomization, blind assessment of outcome, adverse outcomes, statistical analysis, type I error, and type II error. After these trials were evaluated, 20 trials from the first 6 months of 2000 were randomly ordered, distributed, and evaluated as described. Results The mean (+/- SD) analysis scores pooled for the four journals increased from 5.5 +/- 1.4 in 1981-1985 to 7.0 +/- 1.1 in 1991-1995 (P < 0.00001) and to 7.8 +/- 1.5 in 2000. For 7 of the 10 criteria, the percentage of trials from the four journals that fulfilled the criteria increased significantly between 1981-1985 and 1991-1995. During the 20-yr period, the reporting of sample size calculation and method of randomization increased threefold to fourfold, whereas the frequency of type I statistical errors remained unchanged. Conclusion Although the quality of clinical trials in four major anesthesia journals has increased steadily during the past two decades, specific areas of trial methodology require further attention.


2016 ◽  
Vol 27 (7) ◽  
pp. 2132-2141 ◽  
Author(s):  
Guogen Shan

In an agreement test between two raters with binary endpoints, existing methods for sample size calculation are typically based on asymptotic approaches that use limiting distributions of a test statistic under the null and alternative hypotheses. The resulting sample sizes may not be reliable because asymptotic approaches control the type I error rate unsatisfactorily. We propose a new sample size calculation based on exact approaches that control the type I error rate. Two exact approaches are considered: one based on maximization and the other based on estimation and maximization. We found that the latter approach is generally more powerful, and we therefore present the sample size calculation based on estimation and maximization. A real example from a clinical trial to diagnose low back pain is used to illustrate the two exact testing procedures and the sample size determination.
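The motivation, that asymptotic tests can have poorly controlled type I error in small samples, can be seen in a simpler one-proportion setting (an illustrative sketch, not the paper's agreement-test procedure): enumerate the binomial outcomes to compute the exact size of the asymptotic z-test.

```python
from math import comb
from statistics import NormalDist

def exact_size_of_z_test(n, p0=0.5, alpha=0.05):
    """Exact type I error rate of the two-sided asymptotic z-test for
    H0: p = p0, obtained by enumerating Binomial(n, p0) outcomes."""
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)
    se = (p0 * (1 - p0) / n) ** 0.5
    size = 0.0
    for x in range(n + 1):
        if abs((x / n - p0) / se) > zcrit:  # asymptotic rejection region
            size += comb(n, x) * p0**x * (1 - p0)**(n - x)
    return size

exact_size_of_z_test(20)  # about 0.041, not the nominal 0.05
```

The actual size differs from the nominal level and jumps around as n changes, which is exactly why a sample size derived from the asymptotic distribution can be unreliable.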


2020 ◽  
Vol 6 (2) ◽  
pp. 106-113
Author(s):  
A. M. Grjibovski ◽  
M. A. Gorbatova ◽  
A. N. Narkevich ◽  
K. A. Vinogradov

Sample size calculation at the planning phase is still uncommon in Russian research practice. This threatens the validity of conclusions and may introduce Type II errors, in which a false null hypothesis is accepted because the study lacks the statistical power to detect an existing difference between the means. Comparing two means with the unpaired Student's t-test is the most common statistical procedure in the Russian biomedical literature, yet calculation of the minimal required sample size, or retrospective calculation of statistical power, appears in only very few publications. In this paper we demonstrate how to calculate the required sample size for comparing means in unpaired samples using the WinPepi and Stata software. In addition, we produced tables of the minimal required sample size for studies in which two means have to be compared and body mass index and blood pressure are the variables of interest. The tables were constructed for unpaired samples at different levels of statistical power, using standard deviations obtained from the literature.
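As a companion to the WinPepi/Stata calculations described above, a retrospective power calculation for an unpaired comparison of two means can be sketched as follows (normal approximation to the t-test; the BMI effect size and SD below are illustrative assumptions, not values from the paper's tables):

```python
from math import sqrt
from statistics import NormalDist

def power_two_means(n_per_group, delta, sigma, alpha=0.05):
    """Approximate power of a two-sided unpaired comparison of two means
    with n_per_group subjects per arm (normal approximation)."""
    z = NormalDist()
    za = z.inv_cdf(1 - alpha / 2)
    ncp = delta / (sigma * sqrt(2 / n_per_group))  # noncentrality
    return z.cdf(ncp - za)

# e.g. detecting a 2 kg/m^2 BMI difference with SD 4 (assumed values)
power_two_means(63, delta=2.0, sigma=4.0)  # roughly 0.80
```

Inverting this relation (solving for n at a target power) is what the sample size tables in the paper tabulate for each standard deviation.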


2019 ◽  
Author(s):  
Emma Wang ◽  
Bernard North ◽  
Peter Sasieni

Abstract Background Rare and uncommon diseases are difficult to study in clinical trials because recruitment is limited; if the incidence of the disease is very low, international collaboration can only solve the problem to a certain extent. A consequence is a disproportionately high number of deaths from rare diseases, owing to incomplete knowledge of the best way to treat patients suffering from them. Hypothesis testing with the conventional Type I error, combined with the number of patients who can realistically be enrolled for a rare disease, would leave a trial severely underpowered. Methods Our proposed method recognises these pragmatic limitations and suggests a new testing procedure, wherein a conclusion of efficacy of one arm is grounded in robust evidence of non-inferiority in the endpoint of interest, together with reasonable evidence of superiority, over the other arm. Results Simulations were conducted to illustrate the gains in statistical power compared with conventional hypothesis testing in several statistical settings, as well as in the example of clinical trials for Merkel cell carcinoma, a rare skin tumour. Conclusions Our proposed analysis method enables the conduct of clinical trials for rare diseases, potentially leading to a better standard of care for patients suffering from them.
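The combined rule, robust evidence of non-inferiority plus looser-than-conventional evidence of superiority, can be explored by simulation. The sketch below is a generic reconstruction under assumed normal endpoints, not the authors' exact procedure; the margin, alpha levels, and arm sizes are all illustrative assumptions.

```python
import random
from statistics import NormalDist, mean

def simulated_power(n, delta, margin, alpha_ni=0.025, alpha_sup=0.20,
                    sigma=1.0, sims=5000, seed=1):
    """Monte Carlo power of a combined rule: declare efficacy when the new
    arm is non-inferior at alpha_ni AND superior at the looser alpha_sup."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf
    z_ni, z_sup = z(1 - alpha_ni), z(1 - alpha_sup)
    se = sigma * (2 / n) ** 0.5
    hits = 0
    for _ in range(sims):
        diff = mean(rng.gauss(delta, sigma) for _ in range(n)) - \
               mean(rng.gauss(0.0, sigma) for _ in range(n))
        if (diff + margin) / se > z_ni and diff / se > z_sup:
            hits += 1
    return hits / sims
```

At sample sizes realistic for a rare disease, the combined rule rejects more often under a true treatment effect than a conventional superiority test run entirely at alpha_ni, which is the power gain the simulations in the paper quantify.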


2020 ◽  
Author(s):  
Guosheng Yin ◽  
Chenyang Zhang ◽  
Huaqing Jin

BACKGROUND Recently, three randomized clinical trials on coronavirus disease (COVID-19) treatments were completed: one for lopinavir-ritonavir and two for remdesivir. One trial reported that remdesivir was superior to placebo in shortening the time to recovery, while the other two showed no benefit of the treatment under investigation. OBJECTIVE The aim of this paper is to identify, from a statistical perspective, several key issues in the design and analysis of the three COVID-19 trials and to reanalyze the data from their cumulative incidence curves using more appropriate statistical methods. METHODS The lopinavir-ritonavir trial enrolled 39 additional patients after the planned sample size had been reached because the results were not significant, which inflated the type I error rate. The remdesivir trial of Wang et al failed to reach the planned sample size due to a lack of eligible patients, so the bootstrap method was used to predict the quantity of clinical interest, conditionally and unconditionally, had the trial continued to the originally planned sample size. Moreover, we used a terminal (or cure) rate model and a model-free metric, the restricted mean survival time or restricted mean time to improvement (RMTI), to analyze the reconstructed data. The remdesivir trial of Beigel et al reported the median recovery times of the remdesivir and placebo groups and the rate ratio for recovery, but both quantities depend on a particular time point and thus convey only local information. We used the restricted mean time to recovery (RMTR) as a global and robust measure of efficacy. RESULTS For the lopinavir-ritonavir trial, the increase in sample size from 160 to 199 inflated the type I error rate from 0.05 to 0.071. The difference in RMTIs between the two groups at day 28 was –1.67 days (95% CI –3.62 to 0.28; P=.09) in favor of lopinavir-ritonavir but not statistically significant.
For the remdesivir trial of Wang et al, the difference in RMTIs at day 28 was –0.89 days (95% CI –2.84 to 1.06; P=.37). The planned sample size was 453, yet only 236 patients were enrolled. The conditional prediction shows that the hazard ratio estimates would have reached statistical significance had the target sample size been maintained. For the remdesivir trial of Beigel et al, the difference in RMTRs between the remdesivir and placebo groups at day 30 was –2.7 days (95% CI –4.0 to –1.2; P<.001), confirming the superiority of remdesivir. The difference in recovery time at the 25th percentile (95% CI –3 to 0; P=.65) was insignificant, while the differences became more statistically significant at larger percentiles. CONCLUSIONS Based on the statistical issues and lessons learned from these three recent clinical trials of COVID-19 treatments, we suggest more appropriate approaches for the design and analysis of ongoing and future COVID-19 trials.


Author(s):  
Natcha Mahapoonyanont ◽  
Suwichaya Putuptim

The power of a test is the probability that the test rejects the null hypothesis (H0) when a specific alternative hypothesis (H1) is true. Medical research, such as the testing of new medicines, typically tries above all to avoid type I errors: the statistical significance level is set as small as possible, and the probability of a type II error is considered only later. Researchers in the behavioural and social sciences likewise tend to guard against type I errors by fixing a stringent level of statistical significance. There are arguments, however, that this choice can itself affect the errors in the findings: independent variables may have a real influence on the dependent variables that the researcher fails to detect because the significance level was set too low. Therefore, in some situations more attention should be paid to the occurrence of type II errors, and less to type I errors; this may yield more realistic and valid results. The objective of this research was to compare the power of the t-test under different sample sizes (n = 30, 60, 90), significance levels (.001, .01, .05), and types of data (real data, transformed data, and data simulated with the Monte Carlo technique). The findings provide information useful to researchers applying the t-test in further research, to improve the accuracy of their findings.
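The trade-off described above can be sketched with a small Monte Carlo experiment (illustrative only: the normal quantile stands in for the t critical value, and the data are simulated rather than the authors' real and transformed datasets):

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def mc_power(n, delta, sigma=1.0, alpha=0.05, sims=4000, seed=0):
    """Monte Carlo power of a two-sample comparison of means with n per
    group; normal critical value as a large-sample stand-in for t."""
    rng = random.Random(seed)
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(sims):
        a = [rng.gauss(delta, sigma) for _ in range(n)]
        b = [rng.gauss(0.0, sigma) for _ in range(n)]
        se = sqrt(stdev(a)**2 / n + stdev(b)**2 / n)
        if abs((mean(a) - mean(b)) / se) > zcrit:
            rejections += 1
    return rejections / sims

# stricter significance levels cost power at a fixed sample size
[mc_power(30, 0.8, alpha=a) for a in (0.05, 0.01, 0.001)]
```

Running the comprehension shows power falling as alpha shrinks, which is the type I / type II trade-off the abstract argues should sometimes be resolved in favour of power.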


2013 ◽  
Vol 31 (15_suppl) ◽  
pp. 6576-6576
Author(s):  
Satoshi Teramukai ◽  
Takashi Daimon ◽  
Sarah Zohar

Background: The aim of phase II trials is to determine whether a new treatment is promising enough for further testing in confirmatory clinical trials. Most phase II clinical trials are designed as single-arm trials with a binary outcome, with or without interim monitoring for early stopping. In this context, we propose a Bayesian adaptive design denoted PSSD, the predictive sample size selection design (Statistics in Medicine 2012;31:4243-4254). Methods: The design allows for sample size selection followed by any planned interim analyses for early stopping of a trial, together with sample size determination before the trial starts. In the PSSD, the sample size is determined using a predictive probability criterion with two kinds of prior distributions: an ‘analysis prior’ used to compute posterior probabilities and a ‘design prior’ used to obtain prior predictive distributions. The sample size determination provides two sample sizes, N and Nmax, using two types of design priors. At each interim analysis, we calculate the predictive probability of achieving a successful result at the end of the trial under the analysis prior, in order to stop the trial in case of low or high efficacy, and we select an optimal sample size, either N or Nmax as needed, on the basis of the predictive probabilities. Results: We investigated the operating characteristics through simulation studies, and the PSSD was retrospectively applied to a lung cancer clinical trial. As the number of interim looks increases, the probability of a type I error decreases slightly and that of a type II error increases. The type I error probabilities of the proposed PSSD are similar to those of the non-adaptive design, while its type II error probabilities lie between those of the two fixed-sample-size (N or Nmax) designs.
Conclusions: From a practical standpoint, the proposed design could be useful in phase II single-arm clinical trials with a binary endpoint. In the near future, this approach will be implemented in actual clinical trials to assess its usefulness and to extend it to more complicated clinical trials.
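The predictive probability at the heart of the PSSD can be sketched with a beta-binomial calculation: given x responses among n patients so far and a Beta(a, b) analysis prior, the predictive probability that the final response count reaches a success cutoff s_min is a tail sum of the beta-binomial distribution of the remaining outcomes. This is a generic sketch of the criterion, not the published design; the cutoffs and priors below are assumptions.

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    """log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binom_pmf(y, m, a, b):
    """P(Y = y) for Y ~ BetaBinomial(m, a, b)."""
    return comb(m, y) * exp(log_beta(a + y, b + m - y) - log_beta(a, b))

def predictive_prob(x, n, n_total, s_min, a=1.0, b=1.0):
    """Predictive probability that total responses reach s_min, given
    x responses in n patients and a Beta(a, b) prior on the rate."""
    m = n_total - n                    # patients still to be enrolled
    a_post, b_post = a + x, b + n - x  # posterior after n patients
    return sum(beta_binom_pmf(y, m, a_post, b_post)
               for y in range(max(0, s_min - x), m + 1))
```

In a PSSD-style design, this quantity would be compared with futility and efficacy thresholds at each interim look, and evaluated at both candidate final sizes N and Nmax when selecting between them.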

