Sample size calculation for agreement between two raters with binary endpoints using exact tests

2016 ◽  
Vol 27 (7) ◽  
pp. 2132-2141 ◽  
Author(s):  
Guogen Shan

In an agreement test between two raters with binary endpoints, existing methods for sample size calculation are always based on asymptotic approaches that use limiting distributions of a test statistic under the null and alternative hypotheses. These calculated sample sizes may not be reliable because asymptotic approaches have unsatisfactory type I error control. We propose a new sample size calculation based on exact approaches which control the type I error rate. Two exact approaches are considered: one based on maximization and the other based on estimation and maximization. We found that the latter is generally more powerful than the approach based on maximization alone. Therefore, we present the sample size calculation based on estimation and maximization. A real example from a clinical trial to diagnose low back pain in patients is used to illustrate the two exact testing procedures and the sample size determination.
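The abstract does not reproduce the underlying statistic. As background only, here is a minimal sketch of Cohen's kappa, the standard agreement measure for two raters with binary endpoints; the 2×2 cell names `a, b, c, d` are my notation, not the paper's:

```python
def cohens_kappa(a, b, c, d):
    """Cohen's kappa from a 2x2 agreement table:
    a = both raters positive, d = both raters negative,
    b, c = the two discordant cells."""
    n = a + b + c + d
    p_obs = (a + d) / n                      # observed agreement
    p_pos = ((a + b) / n) * ((a + c) / n)    # chance agreement on "positive"
    p_neg = ((c + d) / n) * ((b + d) / n)    # chance agreement on "negative"
    p_exp = p_pos + p_neg
    return (p_obs - p_exp) / (1 - p_exp)
```

An exact test in the sense of the paper would enumerate all possible 2×2 tables to bound the type I error, rather than relying on an asymptotic variance for a statistic like this.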

2021 ◽  
Vol 58 (2) ◽  
pp. 133-147
Author(s):  
Rownak Jahan Tamanna ◽  
M. Iftakhar Alam ◽  
Ahmed Hossain ◽  
Md Hasinur Rahaman Khan

Summary Sample size calculation is an integral part of any clinical trial design, and determining the optimal sample size for a study ensures adequate power to detect statistical significance. It is a critical step in designing a planned research protocol: enrolling too many participants is expensive and exposes more subjects than necessary to the procedure, while an underpowered study will be statistically inconclusive and may cause the whole protocol to fail. Balancing the attempt to maximize power against the effort to minimize the budget has become a significant issue in sample size determination for clinical trials in recent decades. Although it is hard to generalize a single method for sample size calculation, this study attempts to offer a basis for resolving the contradictions of sample size determination through simulation studies under simple random and cluster sampling schemes, with different levels of power and type I error. The effective sample size is much higher when the design effect of the sampling method is smaller, particularly when it is less than 1. The sample size for cluster sampling increases as the number of clusters increases.
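The design-effect relationship the abstract alludes to can be sketched with the standard textbook formula (this is generic, not code from the paper); `m` is the average cluster size and `icc` the intracluster correlation coefficient:

```python
# Design effect for cluster sampling: DEFF = 1 + (m - 1) * ICC.
def design_effect(m, icc):
    return 1 + (m - 1) * icc

# Effective sample size n_eff = n / DEFF: a DEFF above 1 (typical for
# cluster sampling) shrinks the effective sample, while a DEFF below 1
# enlarges it, matching the abstract's observation.
def effective_sample_size(n, deff):
    return n / deff
```

For example, 400 subjects in clusters of average size 21 with an ICC of 0.05 give a design effect of 2 and thus an effective sample size of only 200.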


2015 ◽  
Vol 2015 ◽  
pp. 1-7 ◽  
Author(s):  
Guogen Shan ◽  
Amei Amei ◽  
Daniel Young

Sensitivity and specificity are often used to assess the performance of a diagnostic test with binary outcomes. Wald-type test statistics have been proposed for testing sensitivity and specificity individually. In the presence of a gold standard, simultaneous comparison between two diagnostic tests for noninferiority of sensitivity and specificity based on an asymptotic approach has been studied by Chen et al. (2003). However, the asymptotic approach may suffer from unsatisfactory type I error control, as observed in many studies, especially in small-to-medium sample settings. In this paper, we compare three unconditional approaches for simultaneously testing sensitivity and specificity: approaches based on estimation, on maximization, and on a combination of estimation and maximization. Although the estimation approach does not guarantee type I error control, its type I error performance is satisfactory in practice. The other two unconditional approaches are exact. The approach based on estimation and maximization is generally more powerful than the approach based on maximization.
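A Wald-type noninferiority statistic of the kind referred to above can be sketched as follows for the sensitivity comparison. This is an illustrative unpaired version in my own notation; the cited work addresses the paired design, where both tests are applied to the same subjects:

```python
import math

# Test H0: Se1 - Se2 <= -delta vs. H1: Se1 - Se2 > -delta, where
# x1, x2 are true-positive counts of the two tests among n diseased
# subjects and delta is the noninferiority margin.
def wald_noninferiority_z(x1, x2, n, delta):
    p1, p2 = x1 / n, x2 / n
    # Unpaired Wald standard error; ignores the within-subject
    # correlation a paired analysis would exploit (illustration only).
    se = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)
    return (p1 - p2 + delta) / se
```

An unconditional exact approach would compare this statistic against its exact null distribution, maximizing (or estimating and maximizing) over the nuisance parameters instead of using the normal approximation.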


2020 ◽  
Vol 6 (2) ◽  
pp. 106-113
Author(s):  
A. M. Grjibovski ◽  
M. A. Gorbatova ◽  
A. N. Narkevich ◽  
K. A. Vinogradov

Sample size calculation in the planning phase is still uncommon in Russian research practice. This threatens the validity of conclusions and may introduce Type II error, in which a false null hypothesis is accepted because the study lacks the statistical power to detect an existing difference between the means. Comparing two means using unpaired Student's t-tests is the most common statistical procedure in the Russian biomedical literature. However, calculation of the minimal required sample size, or retrospective calculation of statistical power, appears in only a few publications. In this paper we demonstrate how to calculate the required sample size for comparing means in unpaired samples using WinPepi and Stata software. In addition, we provide tables of the minimal required sample size for studies comparing two means where body mass index and blood pressure are the variables of interest. The tables were constructed for unpaired samples at different levels of statistical power, with standard deviations obtained from the literature.
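Tables such as these typically follow the standard normal-approximation formula for comparing two independent means. A generic sketch (not the authors' WinPepi or Stata code):

```python
import math
from statistics import NormalDist

# n per group = 2 * (z_{1-alpha/2} + z_{1-beta})^2 * sigma^2 / delta^2,
# where delta is the mean difference to detect and sigma the common SD.
def n_per_group(sigma, delta, alpha=0.05, power=0.80):
    z = NormalDist().inv_cdf
    n = 2 * (z(1 - alpha / 2) + z(power)) ** 2 * (sigma / delta) ** 2
    return math.ceil(n)
```

For a standardized difference of 0.5 (e.g. delta = 0.5, sigma = 1) at 80% power and two-sided alpha of 0.05, this gives 63 subjects per group.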


2018 ◽  
Vol 28 (7) ◽  
pp. 2179-2195 ◽  
Author(s):  
Chieh Chiang ◽  
Chin-Fu Hsiao

Multiregional clinical trials have been accepted in recent years as a useful means of accelerating the development of new drugs and shortening their approval time. The statistical properties of multiregional clinical trials are being widely discussed. In practice, the variance of a continuous response may differ from region to region, so that assessment of the efficacy response becomes a Behrens–Fisher problem: there is no exact test or interval estimator for a mean difference with unequal variances. As a solution, this study applies interval estimations of the efficacy response based on Howe's, Cochran–Cox's, and Satterthwaite's approximations, which have been shown to have well-controlled type I error rates. However, traditional sample size determination cannot be applied to these interval estimators, so a sample size determination that achieves a desired power based on them is presented. Moreover, the consistency criteria suggested by the Japanese Ministry of Health, Labour and Welfare guidance are applied to decide whether the overall results obtained via the proposed interval estimation hold consistently across regions. A real example is used to illustrate the proposed method. The results of simulation studies indicate that the proposed method can correctly determine the required sample size and evaluate the assurance probability of the consistency criteria.
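Of the three approximations named in the abstract, Satterthwaite's is the simplest to sketch. The standard formula, in my own variable names, approximates the degrees of freedom for the variance of a mean difference when the two groups (here, regions) have unequal variances:

```python
# Satterthwaite's approximate degrees of freedom for Var(xbar1 - xbar2)
# with unequal group variances s1^2, s2^2 and group sizes n1, n2.
def satterthwaite_df(s1_sq, n1, s2_sq, n2):
    v1, v2 = s1_sq / n1, s2_sq / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
```

With equal variances and equal group sizes the formula recovers close to the pooled degrees of freedom (e.g. 18 for two groups of 10), and it shrinks toward the smaller group's degrees of freedom as the variances diverge.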


2020 ◽  
Vol 29 (10) ◽  
pp. 2958-2971 ◽  
Author(s):  
Maria Stark ◽  
Antonia Zapf

Introduction In a confirmatory diagnostic accuracy study, sensitivity and specificity are considered co-primary endpoints. For the sample size calculation, the prevalence in the target population must be taken into account to obtain a representative sample. Two general problems then arise. With a very low or very high prevalence, the study may be overpowered in one subpopulation; and with an incorrect assumption about the prevalence, the sample size will be over- or underestimated. Methods To obtain the desired power independent of the prevalence, we propose an optimal sample size calculation for comparing a diagnostic experimental test against a pre-specified minimum sensitivity and specificity. To address the problem of an incorrectly pre-specified prevalence, a blinded one-time re-estimation design and a blinded repeated re-estimation design, both re-estimating the sample size from the prevalence, are evaluated in a simulation study. Both designs are compared to a fixed design and to each other. Results The type I error rates of both blinded re-estimation designs are not inflated. Their empirical overall power equals the desired theoretical power, and both designs offer unbiased estimates of the prevalence. The repeated re-estimation design shows no advantage over the one-time re-estimation design in the mean squared error of the re-estimated prevalence or sample size. The appropriate size of the internal pilot study in the one-time re-estimation design is 50% of the initially calculated sample size. Conclusions A one-time re-estimation design of the prevalence based on the optimal sample size calculation is recommended in single-arm diagnostic accuracy studies.
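The role the prevalence plays can be sketched with a common simplified scheme (my notation and assumptions, not the paper's exact method): size each subgroup for a one-sided test of a single proportion against its minimum value, then scale by the assumed prevalence so both the diseased and non-diseased subgroups are large enough:

```python
import math
from statistics import NormalDist

# One-sided test of H0: p <= p0 vs. H1: p = p1 > p0 (normal approximation).
def n_one_proportion(p1, p0, alpha=0.025, power=0.80):
    z = NormalDist().inv_cdf
    num = (z(1 - alpha) * math.sqrt(p0 * (1 - p0))
           + z(power) * math.sqrt(p1 * (1 - p1))) ** 2
    return math.ceil(num / (p1 - p0) ** 2)

# Total n so that the diseased fraction (prev) covers the sensitivity
# requirement and the non-diseased fraction covers specificity.
def total_n(prev, se1, se0, sp1, sp0, alpha=0.025, power=0.80):
    n_se = n_one_proportion(se1, se0, alpha, power)   # diseased needed
    n_sp = n_one_proportion(sp1, sp0, alpha, power)   # non-diseased needed
    return max(math.ceil(n_se / prev), math.ceil(n_sp / (1 - prev)))
```

The "overpowered subpopulation" problem is visible here: at prevalence 0.3 the diseased-side requirement drives the total, leaving the non-diseased subgroup much larger than it needs to be.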


2001 ◽  
Vol 95 (5) ◽  
pp. 1068-1073 ◽  
Author(s):  
Hwee Leng Pua ◽  
Jerrold Lerman ◽  
Mark W. Crawford ◽  
James G. Wright

Background The authors evaluated the quality of clinical trials published in four anesthesia journals during the 20-yr period from 1981 to 2000. Methods Trials published in four major anesthesia journals during the periods 1981-1985, 1991-1995, and the first 6 months of 2000 were grouped according to journal and year. Using random number tables, four trials were selected from all of the eligible clinical trials in each journal in each year for the periods 1981-1985 and 1991-1995, and five trials were selected from all of the trials in each journal in the first 6 months of 2000. Methods and results sections from the 160 trials from 1981-1985 and 1991-1995 were randomly ordered and distributed to three of the authors for blinded review of the quality of the study design according to 10 predetermined criteria (weighted equally, maximum score of 10): informed consent and ethics approval, eligibility criteria, sample size calculation, random allocation, method of randomization, blind assessment of outcome, adverse outcomes, statistical analysis, type I error, and type II error. After these trials were evaluated, 20 trials from the first 6 months of 2000 were randomly ordered, distributed, and evaluated as described. Results The mean (± SD) analysis scores pooled for the four journals increased from 5.5 ± 1.4 in 1981-1985 to 7.0 ± 1.1 in 1991-1995 (P < 0.00001) and to 7.8 ± 1.5 in 2000. For 7 of the 10 criteria, the percentage of trials from the four journals that fulfilled the criteria increased significantly between 1981-1985 and 1991-1995. During the 20-yr period, the reporting of sample size calculation and method of randomization increased threefold to fourfold, whereas the frequency of type I statistical errors remained unchanged. Conclusion Although the quality of clinical trials in four major anesthesia journals has increased steadily during the past two decades, specific areas of trial methodology require further attention.


2020 ◽  
Vol 29 (10) ◽  
pp. 2814-2829
Author(s):  
Laura Kerschke ◽  
Andreas Faldum ◽  
Rene Schmidt

The one-sample log-rank test compares the survival of a single sample with a prefixed reference survival curve. It naturally applies in single-arm phase IIa trials with a time-to-event endpoint. Several authors have described that the original one-sample log-rank test is conservative when the sample size is small and have proposed strategies to correct this conservativeness. Here, we propose an alternative approach to improve the one-sample log-rank test. Our new one-sample log-rank statistic is based on the unique transformation of the underlying counting process martingale such that the moments of the limiting normal distribution have no shared parameters. Simulation results show that the new one-sample log-rank test gives type I error rates and power close to the nominal levels even when the sample size is small, while substantially reducing the required sample size to achieve the desired power, compared with current approaches for designing studies that compare the survival outcome of a sample with a reference.
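As background to the abstract, the classical one-sample log-rank statistic (the one described as conservative) can be sketched as follows, in my own notation: compare the observed number of deaths O with the expected number E under the reference curve, where E sums the reference cumulative hazard evaluated at each subject's follow-up time:

```python
import math

# Classical one-sample log-rank statistic (illustrative sketch).
# times: follow-up time per subject; events: 1 = death, 0 = censored;
# cum_hazard: reference cumulative hazard function H0(t).
def one_sample_logrank(times, events, cum_hazard):
    O = sum(events)                        # observed deaths
    E = sum(cum_hazard(t) for t in times)  # expected under the reference
    return (O - E) / math.sqrt(E)          # approx. N(0, 1) in large samples
```

The paper's contribution is a transformed statistic whose limiting distribution behaves better than this normal approximation in small samples.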


2021 ◽  
Vol 28 (2) ◽  
pp. 15-27
Author(s):  
Mohamad Adam Bujang

Determining the minimum sample size required for a study is a major consideration that confronts all researchers at the early stage of developing a research protocol, because it requires sound prerequisite knowledge of inferential statistics and a thorough understanding of the overall concept of a minimum sample size requirement and its estimation. Besides the type I error and the power of the study, estimates of effect sizes must also be determined in the process of calculating or estimating the sample size. Appropriately calculating or estimating the sample size enables researchers to plan their study better, especially with regard to recruitment of subjects. To facilitate this, this article provides recommendations on how to determine the appropriate sample size for a study. In addition, several issues related to sample size determination are discussed.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Alejandro Schuler

Abstract Trials enroll a large number of subjects in order to attain power, making them expensive and time-consuming. Sample size calculations are often performed with the assumption of an unadjusted analysis, even if the trial analysis plan specifies a more efficient estimator (e.g. ANCOVA). This leads to conservative estimates of required sample sizes and an opportunity for savings. Here we show that a relatively simple formula can be used to estimate the power of any two-arm, single-timepoint trial analyzed with a semiparametric efficient estimator, regardless of the domain of the outcome or kind of treatment effect (e.g. odds ratio, mean difference). Since an efficient estimator attains the minimum possible asymptotic variance, this allows for the design of trials that are as small as possible while still attaining design power and control of type I error. The required sample size calculation is parsimonious and requires the analyst to provide only a small number of population parameters. We verify in simulation that the large-sample properties of trials designed this way attain their nominal values. Lastly, we demonstrate how to use this formula in the “design” (and subsequent reanalysis) of a real randomized trial and show that fewer subjects are required to attain the same design power when a semiparametric efficient estimator is accounted for at the design stage.
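The kind of formula described can be illustrated generically (my notation, not the paper's exact expression): for an asymptotically normal estimator with variance V/n and true effect psi, the power of a two-sided level-alpha test is approximately Phi(sqrt(n)·|psi|/sqrt(V) − z_{1−alpha/2}). A smaller V, as attained by a semiparametric efficient estimator, directly lowers the n needed for a given power:

```python
from statistics import NormalDist

# Approximate power of a two-sided level-alpha Wald test of H0: psi = 0,
# for an estimator with asymptotic variance V / n and true effect psi.
def approx_power(n, psi, V, alpha=0.05):
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(abs(psi) * (n / V) ** 0.5 - z)
```

At psi = 0 the function returns roughly alpha/2 (the rejection probability in one tail), and power grows with n and shrinks with V, which is the mechanism behind the savings the abstract describes.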


Author(s):  
Hyun Kang

Appropriate sample size calculation and power analysis have become major issues in research and publication processes. However, calculating sample size and power requires broad statistical knowledge, personnel with programming skills are in short supply, and commercial programs are often too expensive for practical use. This review article explains the basic concepts of sample size calculation and power analysis, the process of sample estimation, and how to calculate sample size using the G*Power software (latest ver. 3.1.9.7; Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany), with 5 statistical examples. The null and alternative hypotheses, effect size, power, alpha (type I error), and type II error should be specified when calculating the sample size or power. G*Power is recommended for sample size and power calculations for various statistical methods (F, t, χ2, z, and exact tests) because it is easy to use and free. The process of sample estimation consists of establishing research goals and hypotheses, choosing appropriate statistical tests, choosing one of 5 possible power analysis methods, inputting the required variables for analysis, and selecting the “Calculate” button. This software helps researchers estimate the sample size and conduct power analysis.

