Randomized test-treatment studies with an outlook on adaptive designs

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Amra Hot ◽  
Patrick M. Bossuyt ◽  
Oke Gerke ◽  
Simone Wahl ◽  
Werner Vach ◽  
...  

Abstract Background: Diagnostic accuracy studies aim to examine the diagnostic accuracy of a new experimental test, but do not address the actual merit of the resulting diagnostic information to a patient in clinical practice. In order to assess the impact of diagnostic information on subsequent treatment strategies regarding patient-relevant outcomes, randomized test-treatment studies were introduced. Various designs for randomized test-treatment studies, including an evaluation of biomarkers as part of randomized biomarker-guided treatment studies, are suggested in the literature, but the nomenclature is not consistent. Methods: The aim was to provide a clear description of the different study designs within a pre-specified framework, considering their underlying assumptions, advantages and limitations, and the derivation of effect sizes required for sample size calculations. Furthermore, an outlook on adaptive designs within randomized test-treatment studies is given. Results: The need to integrate adaptive design procedures in randomized test-treatment studies is apparent. The derivation of effect sizes implies that sample size calculations will always be based on rather vague assumptions, resulting in over- or underpowered studies. Therefore, it might be advantageous to conduct a sample size re-estimation based on a nuisance parameter during the ongoing trial. Conclusions: Due to their increased complexity compared to common treatment trials, the implementation of randomized test-treatment studies poses practical challenges, including considerable uncertainty regarding study parameters such as the expected outcome in specific subgroups or the disease prevalence, which might affect the sample size calculation. Since research on adaptive designs within randomized test-treatment studies is limited so far, further research is recommended.

Author(s):  
Mohammad Azfar Qureshi ◽  
Jessica K Paulus ◽  
Felipe Fregni

In this chapter the basic principles of sample size calculation are discussed. The chapter also reviews the impact of sample size calculation on the study results, the parameters needed, and ways this calculation can be performed by researchers. Over- and underestimation of sample size can have significant effects for the study participants, so ensuring its adequacy is of critical importance. Setting values for alpha (level of significance) and beta (type II error rate; power is 1 − beta) should be informed by the specific research goals and study hypothesis. A priori effect size estimation is challenging and can be done in various ways, which are addressed in this chapter. The chapter concludes with examples and references of sources that can be used for sample size calculation.
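As a rough illustration of how alpha, power, and effect size interact, the sketch below uses the standard normal-approximation formula for a two-sided comparison of two group means. It is not taken from the chapter; the function name `n_per_group` and the default values are assumptions chosen for the example.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (difference / SD).
    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) / effect_size)^2.
    This is a normal approximation; a t-test based calculation
    typically asks for one or two more subjects per group.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(d, n_per_group(d))
```

Halving the assumed effect size roughly quadruples the required sample, which is why a vague a priori effect size estimate translates so directly into over- or underpowered studies.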


2012 ◽  
Vol 23 (5) ◽  
pp. 570-574 ◽  
Author(s):  
Nina Musurlieva ◽  
Maria Stoykova ◽  
Doychin Boyadjiev

The aim of the paper is to present the validation of a scale for assessing the impact of periodontal diseases on individuals' quality of life in Bulgaria. A pilot study was conducted among 30 patients diagnosed with periodontitis visiting the Department of Periodontology, Faculty of Dental Medicine, Medical University of Plovdiv, Bulgaria. The minimum sample size of 30 people was established based on a power analysis for sample size calculation. The mean age of participants was 48.95 ± 11.85 years, being 56.67 ± 9.05 years for males and 43.33 ± 9.05 years for females. Standard interviews were conducted using a specific instrument: a self-designed questionnaire with a 5-degree ranked scale, initially containing 11 questions. The interviews were repeated after 3 months with the same patients for retest analysis. The data were statistically processed using SPSS v.13 software. Results after the initial interviews: Cronbach's coefficient (α=0.882), Spearman-Brown coefficient (r_sb=0.998), average inter-item correlation coefficient (R=0.426), difficulty of the questions from 0.173 to 0.757 and discrimination power from 0.405 to 0.809. Results after the second interviews: α=0.883, r_sb=0.998, R=0.507, difficulty from 0.287 to 0.757 and discrimination power from 0.524 to 0.809. Two of the questions showed a low level of inter-item correlation with the rest of the items and were excluded. The final version of the questionnaire contained 9 questions. The validation showed that the developed scale is sufficiently reliable, and it will be used in the final research, the first to use such an instrument for measuring oral health-related quality of life in Bulgaria.
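For readers unfamiliar with the reliability coefficients reported above, the following sketch shows the textbook definitions of Cronbach's alpha (from item variances) and of the standardized alpha (from the mean inter-item correlation). It is not the authors' SPSS procedure; the function names and the toy scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha from raw scores.

    items: list of k lists, one per question, each holding the scores
    of the same n respondents in the same order.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sum
    item_var_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def standardized_alpha(k, mean_r):
    """Standardized alpha from k items and their mean inter-item correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)


if __name__ == "__main__":
    # Hypothetical scores: 3 questions answered by 5 respondents.
    toy = [[1, 2, 3, 4, 5], [2, 2, 3, 5, 5], [1, 3, 3, 4, 4]]
    print(round(cronbach_alpha(toy), 3))
    # 11 items with a mean inter-item correlation of 0.426, as reported above.
    print(round(standardized_alpha(11, 0.426), 3))
```

With 11 items and R=0.426 the standardized formula gives about 0.89, in the same range as the reported α=0.882; the two need not match exactly because the raw coefficient also depends on the item variances.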


2020 ◽  
Author(s):  
Michael W. Beets ◽  
R. Glenn Weaver ◽  
John P.A. Ioannidis ◽  
Alexis Jones ◽  
Lauren von Klinggraeff ◽  
...  

Abstract Background: Pilot/feasibility studies or studies with small sample sizes may be associated with inflated effects. This study explores the vibration of effect sizes (VoE) in meta-analyses when considering different inclusion criteria based upon sample size or pilot/feasibility status. Methods: Searches were conducted for meta-analyses of behavioral interventions on topics related to the prevention/treatment of childhood obesity from 01-2016 to 10-2019. The computed summary effect sizes (ES) were extracted from each meta-analysis. Individual studies included in the meta-analyses were classified into one of the following four categories: self-identified pilot/feasibility studies or based upon sample size (N≤100, N>100, and N>370, the upper 75th percentile of sample size). The VoE was defined as the absolute difference (ABS) between the re-estimations of summary ES restricted to study classifications compared to the originally reported summary ES. Concordance (kappa) of statistical significance between summary ES was assessed. Fixed and random effects models and meta-regressions were estimated. Three case studies are presented to illustrate the impact of including pilot/feasibility and N≤100 studies on the estimated summary ES. Results: A total of 1,602 effect sizes, representing 145 reported summary ES, were extracted from 48 meta-analyses containing 603 unique studies (avg. 22 per meta-analysis, range 2-108) and included 227,217 participants. Pilot/feasibility and N≤100 studies comprised 22% (0-58%) and 21% (0-83%) of studies. Meta-regression indicated that the ABS between the re-estimated and original summary ES was 0.29 where summary ES comprised ≥40% of N≤100 studies. The ABS ES was 0.46 when summary ES comprised >80% of both pilot/feasibility and N≤100 studies. Where ≤40% of the studies comprising a summary ES had N>370, the ABS ES ranged from 0.20-0.30. Concordance was low when removing both pilot/feasibility and N≤100 studies (kappa=0.53) and restricting analyses only to the largest studies (N>370, kappa=0.35), with 20% and 26% of the originally reported statistically significant ES rendered non-significant. Reanalysis of the three case study meta-analyses resulted in the re-estimated ES rendered either non-significant or half of the originally reported ES. Conclusions: When meta-analyses of behavioral interventions include a substantial proportion of both pilot/feasibility and N≤100 studies, summary ES can be affected markedly and should be interpreted with caution.


2019 ◽  
Vol 26 (01) ◽  
Author(s):  
Shakeel Ahmad ◽  
Muhammad Nazim ◽  
Rizwan Munir ◽  
Hafiz Muhammad Faiq Ilyas ◽  
Naeem Asghar ◽  
...  

Objectives: To assess the impact of myocardial infarction on quality of life in four-year survivors and to determine factors associated with a poor quality of life. Design: Descriptive study. Settings: Faisalabad Institute of Cardiology, Faisalabad. Duration of Study: 1st November 2017 to 30th April 2018. Sample Size: Sample size was 200 as calculated by the WHO sample size calculator. Sampling Technique: Non-probability consecutive sampling. Subjects: All patients diagnosed with acute myocardial infarction during 2013 and alive at a median of four years. Patients and Methods: 200 patients presenting to the outpatient department for routine follow-up who had an MI approximately four years earlier, in 2013, were included in the study. Results: 200 patients with an acute myocardial infarction in 2013 who were alive and capable of responding to a questionnaire in 2018 were included in the study. Physical functioning was normal in 63%, fair in 25% and disturbed in 12% of patients. Social life functioning was normal in 66%, fair in 26% and disturbed in 8% of patients. There were no angina episodes in 61.5%, 1 to 2 angina episodes per month in 25% and more than 3 episodes per month in 13.5% of patients. 59% of patients were doing routine jobs, 21.5% were working off and on and 19.5% were not working after MI. Conclusions: This study provides valuable information for practicing clinicians. Impaired quality of life was reported by patients unfit for work, those with angina and dyspnea, patients with coexistent lung disease, and those with anxiety, sleep disturbances and other co-morbid conditions. Improving quality of life after MI remains a challenge for practicing physicians.


2018 ◽  
Vol 7 (6) ◽  
pp. 81
Author(s):  
Fang Fang ◽  
Yong Lin ◽  
Weichung Joe Shih ◽  
Shou-En Lu ◽  
Guangrui Zhu

The accuracy of the treatment effect estimation is crucial to the success of Phase 3 studies. The calculation of sample size relies on the treatment effect estimation and cannot be changed during the trial in a fixed sample size design. Often, with limited efficacy data available from early phase studies and relevant historical studies, the sample size calculation may not accurately reflect the true treatment effect. Several adaptive designs have been proposed to address this uncertainty in the sample size calculation. These adaptive designs provide flexibility of sample size adjustment during the trial by allowing early trial stopping or sample size adjustment at interim look(s). The use of adaptive designs can optimize the trial performance when the treatment effect is an assumed constant value. However, in practice, it may be more reasonable to consider the treatment effect within an interval rather than as a point estimate. Because proper selection of adaptive designs may decrease the failure rate of Phase 3 clinical trials and increase the chance for new drug approval, this paper proposes measures and evaluates the performance of different adaptive designs based on treatment effect intervals, and identifies factors that may affect the performance of adaptive designs.


2017 ◽  
Author(s):  
Clarissa F. D. Carneiro ◽  
Thiago C. Moulin ◽  
Malcolm R. Macleod ◽  
Olavo B. Amaral

Abstract Proposals to increase research reproducibility frequently call for focusing on effect sizes instead of p values, as well as for increasing the statistical power of experiments. However, it is unclear to what extent these two concepts are indeed taken into account in basic biomedical science. To study this in a real-case scenario, we performed a systematic review of effect sizes and statistical power in studies on learning of rodent fear conditioning, a widely used behavioral task to evaluate memory. Our search criteria yielded 410 experiments comparing control and treated groups in 122 articles. Interventions had a mean effect size of 29.5%, and amnesia caused by memory-impairing interventions was nearly always partial. Mean statistical power to detect the average effect size observed in well-powered experiments with significant differences (37.2%) was 65%, and was lower among studies with non-significant results. Only one article reported a sample size calculation, and our estimated sample size to achieve 80% power considering typical effect sizes and variances (15 animals per group) was reached in only 12.2% of experiments. Actual effect sizes correlated with effect size inferences made by readers on the basis of textual descriptions of results only when findings were non-significant, and neither effect size nor power correlated with study quality indicators, number of citations or impact factor of the publishing journal. In summary, effect sizes and statistical power have a wide distribution in the rodent fear conditioning literature, but do not seem to have a large influence on how results are described or cited. Failure to take these concepts into consideration might limit attempts to improve reproducibility in this field of science.


2007 ◽  
Vol 46 (06) ◽  
pp. 655-661 ◽  
Author(s):  
H. Heinzl ◽  
A. Benner ◽  
C. Ittrich ◽  
M. Mittlböck

Summary Objectives: Numerous sample size calculation programs are available nowadays. They include commercial products as well as public domain and open source applications. We propose modifications for these programs in order to better support statistical consultation during the planning stage of a two-armed clinical trial. Methods: Directional two-sided tests are commonly used for two-armed clinical trials. This may lead to a non-negligible Type III error risk in a severely underpowered study. In the case of a reasonably sized study, the question of the so-called auxiliary alternative may arise. Results: We propose that sample size calculation programs should be able to compute i) Type III errors and the so-called q-values, ii) minimum sample sizes required to keep the q-values below pre-specified levels, and iii) detectable effect sizes of the so-called auxiliary alternatives. Conclusions: Proposals i and ii are intended to help prevent irresponsibly underpowered clinical trials, whereas proposal iii is meant as additional assistance for the planning of reasonably sized clinical trials.


2005 ◽  
Vol 47 (2) ◽  
pp. 140-150 ◽  
Author(s):  
Aiyi Liu ◽  
Enrique F. Schisterman ◽  
Madhu Mazumdar ◽  
Jiang Hu

2020 ◽  
Vol 29 (10) ◽  
pp. 2958-2971 ◽  
Author(s):  
Maria Stark ◽  
Antonia Zapf

Introduction In a confirmatory diagnostic accuracy study, sensitivity and specificity are considered as co-primary endpoints. For the sample size calculation, the prevalence of the target population must be taken into account to obtain a representative sample. In this context, a general problem arises. With a low or high prevalence, the study may be overpowered in one subpopulation. One further issue is the correct pre-specification of the true prevalence. With an incorrect assumption about the prevalence, an over- or underestimated sample size will result. Methods To obtain the desired power independent of the prevalence, a method for an optimal sample size calculation for the comparison of a diagnostic experimental test with a prespecified minimum sensitivity and specificity is proposed. To face the problem of an incorrectly pre-specified prevalence, a blinded one-time re-estimation design of the sample size based on the prevalence and a blinded repeated re-estimation design of the sample size based on the prevalence are evaluated by a simulation study. Both designs are compared to a fixed design and additionally among each other. Results The type I error rates of both blinded re-estimation designs are not inflated. Their empirical overall power equals the desired theoretical power and both designs offer unbiased estimates of the prevalence. The repeated re-estimation design reveals no advantages concerning the mean squared error of the re-estimated prevalence or sample size compared to the one-time re-estimation design. The appropriate size of the internal pilot study in the one-time re-estimation design is 50% of the initially calculated sample size. Conclusions A one-time re-estimation design of the prevalence based on the optimal sample size calculation is recommended in single-arm diagnostic accuracy studies.
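To make the role of prevalence concrete, here is a simplified sketch of a conventional fixed-design calculation, not the optimal method or the re-estimation designs proposed in the paper. It sizes the diseased and non-diseased subgroups separately with a one-sided normal-approximation test against a minimum sensitivity or specificity, then inflates each by the assumed prevalence. All function names, parameter names, and numeric inputs are assumptions for illustration, and power is set per endpoint rather than jointly for the co-primary endpoints.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_for_proportion(p0, p1, alpha=0.025, power=0.80):
    """Subjects needed to show a proportion exceeds p0 if it truly equals p1.

    One-sided, one-sample z-test (normal approximation); requires p1 > p0.
    """
    if not 0 < p0 < p1 < 1:
        raise ValueError("need 0 < p0 < p1 < 1")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)
    z_power = z.inv_cdf(power)
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_power * sqrt(p1 * (1 - p1))
    return ceil((numerator / (p1 - p0)) ** 2)


def total_sample_size(prevalence, se0, se1, sp0, sp1, alpha=0.025, power=0.80):
    """Total recruits so that both subgroups reach their required size.

    se0/sp0: minimum sensitivity/specificity; se1/sp1: assumed true values.
    """
    n_diseased = n_for_proportion(se0, se1, alpha, power)
    n_healthy = n_for_proportion(sp0, sp1, alpha, power)
    # The subgroup that is rarer relative to its need drives the total.
    return ceil(max(n_diseased / prevalence, n_healthy / (1 - prevalence)))


if __name__ == "__main__":
    for prev in (0.1, 0.3, 0.5):
        print(prev, total_sample_size(prev, 0.70, 0.80, 0.70, 0.80))
```

Under this scheme a low assumed prevalence inflates the total so that the diseased subgroup is large enough, which leaves the non-diseased subgroup far larger than it needs to be; a wrong prevalence guess shifts the total in either direction. This is the situation the blinded re-estimation designs described above are meant to address.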


Endoscopy ◽  
2020 ◽  
Author(s):  
Mohammad Al-Haddad ◽  
Michael B Wallace ◽  
William Brugge ◽  
Sundeep Lakhtakia ◽  
Zhaoshen Li ◽  
...  

Abstract Background and study aims: Pancreatic cystic lesions (PCLs) are increasingly found on cross-sectional imaging, and a majority have low risk for malignancy. The added value of fine-needle aspiration (FNA) in risk stratification remains unclear. We evaluated the impact of 3 FNA needles on diagnostic accuracy, clinical management, and ability to accrue fluid for tumor markers. Patients and methods: A multicenter prospective trial randomized 250 patients with PCLs ≥13 mm to 19G Flex(2):19G(1):22G(1) needles with cross-over as repeated FNA procedures. Diagnostic accuracy was established at 2-year follow-up with final diagnosis from surgical histopathology or consensus diagnosis by experts based sequentially on clinical presentation, imaging, and aspirate analysis in blinded review. Results: Patients enrolled (36% symptomatic) had PCLs in head (44%), body (28%), and tail (26%). Percentage of cyst volume aspirated was 78% [72%-84%] for 19G Flex, 74% [64%-84%] for 22G, and 73% [63%-83%] for 19G (p=0.84). Successful FNA was significantly higher for 19G Flex (89% [82%-94%]) and 22G (82% [70%-90%]) compared to 19G (75% [63%-85%]) (p=0.02). Repeated FNA was required more frequently in head/uncinate lesions than body and tail (p<0.01). Diagnostic accuracy of cyst aspirate was 84% [73%-91%] against histopathology at 2-year follow-up (n=79), and 77% [70%-83%] against consensus diagnosis among non-resective cases (n=171). Related serious adverse events occurred in 1.2% [0.2%-3.5%] of patients. Conclusions: Our study results demonstrate a statistically significant difference among the three needles in overall success rate of aspiration, but not in percentage of cyst volume aspirated. Flexible needles may be particularly valuable in sampling cystic PCLs in head/uncinate of pancreas.

