Sample size and optimal design for logistic regression with binary interaction

Background: Bayesian response-adaptive designs, which data-adaptively alter the allocation ratio in favor of the better-performing treatment, are often criticized for engendering a non-trivial probability of a subject imbalance in favor of the inferior treatment, inflating the type I error rate, and increasing sample size requirements. In the literature, implementations of these designs using Thompson sampling methods have generally assumed a simple beta-binomial probability model; however, the effect of these choices on the resulting design operating characteristics relative to other reasonable alternatives has not been fully examined. Motivated by the Advanced R2Eperfusion STrategies for Refractory Cardiac Arrest trial, we posit that a logistic probability model coupled with an urn or permuted block randomization method will alleviate some of the practical limitations engendered by the conventional implementation of a two-arm Bayesian response-adaptive design with binary outcomes. In this article, we discuss the extent to which this solution works and when it does not. Methods: A computer simulation study was performed to evaluate the relative merits of a Bayesian response-adaptive design for the Advanced R2Eperfusion STrategies for Refractory Cardiac Arrest trial using Thompson sampling methods based on a logistic regression probability model coupled with either an urn or permuted block randomization method that limits deviations from the evolving target allocation ratio. The different implementations of the response-adaptive design were evaluated for type I error rate control across various null response rates and for power, among other performance metrics. Results: The logistic regression probability model engenders smaller average sample sizes with similar power, better control over the type I error rate, and more favorable treatment arm sample size distributions than the conventional beta-binomial probability model, and designs using the alternative randomization methods have a negligible chance of a sample size imbalance in the wrong direction. Conclusion: Pairing the logistic regression probability model with either of the alternative randomization methods results in a much improved response-adaptive design in regard to important operating characteristics, including type I error rate control and the risk of a sample size imbalance in favor of the inferior treatment.
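
To make the conventional approach mentioned in the abstract concrete, the following minimal sketch estimates a Thompson sampling allocation probability for a two-arm trial under a beta-binomial model. The function name, the Beta(1, 1) priors, the interim counts, and the Monte Carlo estimate are all assumptions chosen for illustration; none of them come from the article, and the logistic model and urn or permuted block randomization it proposes are not shown.

```python
import random

def thompson_allocation_prob(succ_a, fail_a, succ_b, fail_b,
                             draws=20000, seed=0):
    """Monte Carlo estimate of P(p_A > p_B | data).

    Assumes independent Beta(1, 1) priors, so each arm's posterior
    response rate is Beta(1 + successes, 1 + failures).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_a = rng.betavariate(1 + succ_a, 1 + fail_a)
        p_b = rng.betavariate(1 + succ_b, 1 + fail_b)
        if p_a > p_b:
            wins += 1
    return wins / draws

# Hypothetical interim data: arm A has 12/20 responders, arm B has 6/20.
prob_a = thompson_allocation_prob(12, 8, 6, 14)
print(f"Estimated P(A better than B): {prob_a:.3f}")
```

Under plain Thompson sampling, the next participant would be assigned to arm A with roughly this probability, which is why an early run of chance results can push allocation toward the wrong arm.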

Power and sample size for multivariate logistic modeling of unmatched case-control studies

Statistical Methods in Medical Research ◽

10.1177/0962280217737157 ◽

2017 ◽

Vol 28 (3) ◽

pp. 822-834

Author(s):

Mitchell H Gail ◽

Sebastien Haneuse

Keyword(s):

Logistic Regression ◽

Sample Size ◽

Case Control ◽

Simulation Methods ◽

Case Control Studies ◽

Control Data ◽

Logistic Analysis ◽

Sample Size Calculations ◽

Control Designs ◽

Univariate Analyses

Sample size calculations are needed to design and assess the feasibility of case-control studies. Although such calculations are readily available for simple case-control designs and univariate analyses, there is limited theory and software for multivariate unconditional logistic analysis of case-control data. Here we outline the theory needed to detect scalar exposure effects or scalar interactions while controlling for other covariates in logistic regression. Both analytical and simulation methods are presented, together with links to the corresponding software.
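
For orientation, the simple univariate case that the abstract describes as readily available can be sketched as follows. This is the usual normal-approximation sample size for comparing exposure proportions between cases and controls, not the multivariate method of the article; the function name and the example inputs are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def cases_needed(p0, odds_ratio, alpha=0.05, power=0.80):
    """Cases required (with an equal number of controls) to detect
    `odds_ratio` for a binary exposure whose prevalence in controls
    is `p0`, using a two-sided test of two proportions."""
    # Exposure prevalence among cases implied by the odds ratio.
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    p_bar = (p0 + p1) / 2
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    return ceil(numerator / (p1 - p0) ** 2)

# Hypothetical design: 30% of controls exposed, target odds ratio of 2.
n_cases = cases_needed(p0=0.30, odds_ratio=2.0)
print(n_cases)
```

Adjusting for other covariates or targeting an interaction generally changes the required sample size, which is the gap the article addresses.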

A Comparison Study of Goodness of Fit Tests of Logistic Regression in R: Simulation and Application to Breast Cancer Data

Academic Journal of Applied Mathematical Sciences ◽

10.32861/ajams.71.50.59 ◽

2020 ◽

pp. 50-59

Author(s):

El-Housainy A. Rady ◽

Mohamed R. Abonazel ◽

Mariam H. Metawe’e

Keyword(s):

Breast Cancer ◽

Logistic Regression ◽

Sample Size ◽

Null Hypothesis ◽

Goodness Of Fit ◽

Quadratic Term ◽

Breast Cancer Dataset ◽

Cancer Data ◽

Interaction Term ◽

Test Package

Goodness-of-fit (GOF) tests for logistic regression assess how well the model suits the data. The null hypothesis of every GOF test is that the model fits. R, a free software environment, offers many GOF tests across different packages. A Monte Carlo simulation was conducted to study two situations: first, the ability of each test, under its default settings, to accept the null hypothesis when the model truly fits; second, the power of these tests when the assumption that the linear combination of explanatory variables is sufficient is violated (by omitting a linear covariate term, a quadratic term, or an interaction term). The study also checked whether the same test gave the same results in different R packages. Because sample size is expected to affect the simulation results, the pattern of change in GOF test results was estimated under different sample sizes as well as different model settings. All tests accepted the null hypothesis (in more than 95% of simulation trials) when the model truly fit, except the modified Hosmer-Lemeshow test in the "LogisticDx" package under all model settings and Osius and Rojek's (OsRo) test when the true model had an interaction term between binary and categorical covariates. In addition, the le Cessie-van Houwelingen-Copas-Hosmer unweighted sum of squares (CHCH) test gave unexpectedly different results in different packages. In the power study, all tests had very low power when a covariate was missing. Generally, Stukel's test (package "LogisticDx") and the CHCH test (package "rms") reached power greater than 80% in detecting a missing quadratic term at smaller sample sizes, while the OsRo test (package "LogisticDx") was better at detecting a missing interaction term. Besides the simulation study, we evaluated the performance of the GOF tests using the breast cancer dataset.
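
As a rough illustration of what one such test computes, the sketch below evaluates the classical Hosmer-Lemeshow statistic from outcomes and fitted probabilities. It is written in Python rather than R, is not the modified version from the "LogisticDx" package, and uses a hypothetical function name and toy data; it also assumes every group's mean fitted probability lies strictly between 0 and 1.

```python
def hosmer_lemeshow_statistic(y, p, groups=10):
    """Hosmer-Lemeshow chi-square statistic for binary outcomes `y`
    and fitted probabilities `p`, using equal-size risk groups."""
    # Order observations by fitted probability, then cut into groups.
    order = sorted(range(len(y)), key=lambda i: p[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        observed = sum(y[i] for i in idx)
        expected = sum(p[i] for i in idx)
        mean_p = expected / len(idx)
        # Contribution: (O - E)^2 / (n_g * mean_p * (1 - mean_p))
        stat += (observed - expected) ** 2 / (len(idx) * mean_p * (1 - mean_p))
    return stat

# Toy data (hypothetical): two risk groups of four observations each.
p_hat = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75, 0.75, 0.75]
calibrated = hosmer_lemeshow_statistic([1, 0, 0, 0, 1, 1, 1, 0], p_hat, groups=2)
separated = hosmer_lemeshow_statistic([0, 0, 0, 0, 1, 1, 1, 1], p_hat, groups=2)
print(calibrated, separated)
```

The statistic is conventionally compared with a chi-square distribution on `groups - 2` degrees of freedom. Implementations can differ in how they form groups and handle ties, so default settings matter when comparing packages.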

Sample Size Determination and Optimal Design of Simple Pretest-Posttest Experimental Designs: Introduction, Software, and Illustrations

10.35542/osf.io/k5ey8 ◽

2021 ◽

Author(s):

Metin Bulus

Keyword(s):

Optimal Design ◽

Sample Size ◽

Small Sample ◽

Experimental Designs ◽

Sample Size Determination ◽

Size Determination ◽

Sample Sizes ◽

Control Groups ◽

Small Sample Sizes ◽

And Control

A recent systematic review of experimental studies conducted in Turkey between 2010 and 2020 reported that small sample sizes had been a significant drawback (Bulus and Koyuncu, 2021). A small proportion of the studies were small-scale true experiments (subjects randomized into treatment and control groups). The remaining studies consisted of quasi-experiments (subjects in treatment and control groups were matched on pretest or other covariates) and weak experiments (neither randomized nor matched, but with a control group). These studies had an average sample size below 70 across different domains and outcomes. Such small sample sizes imply a strong (and perhaps erroneous) assumption about the minimum relevant effect size (MRES) of an intervention before an experiment is conducted; that is, that a standardized intervention effect of Cohen’s d < 0.50 is not relevant to education policy or practice. Thus, an introduction to sample size determination for simple pretest-posttest experimental designs is warranted. This study describes the nuts and bolts of sample size determination, derives expressions for optimal design under differential cost per treatment and control unit, provides convenient tables to guide sample size decisions for MRES values in the range 0.20 ≤ Cohen’s d ≤ 0.50, and describes the relevant software along with illustrations.
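
The kind of calculation described here can be sketched with standard textbook approximations. The block below is an illustration under stated assumptions, not the expressions or software from the study: it uses a normal approximation for a two-group comparison, treats the pretest as a covariate that reduces residual variance by a factor of 1 - r^2, and assumes equal outcome variances in the cost-based allocation. All function and parameter names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80, pretest_r=0.0):
    """Approximate per-group n to detect standardized effect `d`.

    `pretest_r` is the assumed pretest-posttest correlation; adjusting
    for the pretest scales the required n by (1 - pretest_r ** 2).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(2 * (1 - pretest_r ** 2) * z ** 2 / d ** 2)

def optimal_allocation_ratio(cost_treatment, cost_control):
    """n_treatment / n_control that minimizes the variance of the mean
    difference for a fixed total budget (equal variances assumed)."""
    return sqrt(cost_control / cost_treatment)

for d in (0.20, 0.50):
    print(d, n_per_group(d), n_per_group(d, pretest_r=0.5))

# Hypothetical costs: a treatment unit costs four times a control unit.
print(optimal_allocation_ratio(cost_treatment=4.0, cost_control=1.0))
```

The square-root rule means that when treatment units are more expensive, the design shifts participants toward the cheaper control arm rather than splitting the sample evenly.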

Optimal design with variable cost and precision requirements

Canadian Journal of Forest Research ◽

10.1139/x89-241 ◽

1989 ◽

Vol 19 (12) ◽

pp. 1591-1597

Author(s):

Margaret Penner

Keyword(s):

Biological Sciences ◽

Optimal Design ◽

Sample Size ◽

Design Theory ◽

Sampling Design ◽

Unit Cost ◽

Evaluation Process ◽

Variable Cost ◽

Design Flexibility ◽

The Cost

A method for incorporating variable costs and differing precision requirements into optimal design theory is developed and discussed. In many studies and experiments, particularly in the biological sciences, the cost of each observation can vary considerably depending on the attributes of the sample. Ignoring observation costs leads to designs that maximize precision for a given sample size. However, by incorporating costs, efficiency is maximized by optimizing precision per unit cost. An example is presented that demonstrates the efficiency of a weighted optimal design in comparison with several alternatives. The weighted optimal design is most efficient at meeting the experimenter's precision objectives. Comparing designs allows the introduction of additional criteria such as design flexibility into the evaluation process. Explicitly incorporating both cost and precision in the search for a sampling design ensures time is wisely spent considering study objectives, including precision requirements.

A Modification of Simon's Optimal Design for Phase II Trials When the Criterion Is Median Sample Size

Controlled Clinical Trials ◽

10.1016/s0197-2456(99)00028-8 ◽

1999 ◽

Vol 20 (6) ◽

pp. 555-566 ◽

Cited By ~ 17

Author(s):

John J. Hanfelt ◽

Rebecca S. Slack ◽

Edmund A. Gehan

Keyword(s):

Optimal Design ◽

Sample Size ◽

Phase II ◽

Phase II Trials

Sample Size for Logistic Regression with Small Response Probability

Journal of the American Statistical Association ◽

10.1080/01621459.1981.10477597 ◽

1981 ◽

Vol 76 (373) ◽

pp. 27-32 ◽

Cited By ~ 73

Author(s):

Alice S. Whittemore

Keyword(s):

Logistic Regression ◽

Sample Size ◽

Response Probability

Smoking and substance abuse prevalence in adolescents in a city of Turkey

European Journal of Public Health ◽

10.1093/eurpub/ckz186.124 ◽

2019 ◽

Vol 29 (Supplement_4) ◽

Author(s):

B Mete ◽

E Pehlivan ◽

V Söyiler

Keyword(s):

Substance Abuse ◽

Substance Use ◽

Logistic Regression ◽

High School Students ◽

Sample Size ◽

Female Students ◽

Male Students ◽

Size Analysis ◽

Addictive Substance ◽

Abuse Risk

Background: The aim of this study was to determine the prevalence of smoking and substance abuse among young people aged 14-18 in a city in Turkey and to determine the relationship between smoking and substance abuse risk. Methods: This cross-sectional study was conducted on high school students studying in Bingöl city center. The study population consisted of 14000 students in 14 high schools. A sample size analysis with 80% power and a 99% confidence level gave a minimum required sample size of 1235. Students were selected at random within schools using stratified sampling, and questionnaires were administered under supervision after obtaining their consent. The chi-square test and binary logistic regression were used for data analysis. Results: The mean age of the students was 15.71 ± 1.16 (min-max: 14-18) and 49.5% were male. The prevalence of smoking among all students was 15.8%, and the frequency of use or trial of addictive substances, excluding smoking, was 5%. The prevalence of smoking was 24.1% among male students and 7.7% among female students. The rate of addictive substance use, excluding smoking, was 8.2% for male students and 1.9% for female students. According to the logistic regression results, the risk of substance abuse was 8-fold higher (95% CI: 3.32-19.95) in smokers (p = 0.001) and 2.5-fold higher (95% CI: 1.10-5.38) in males (p = 0.027). The risk of substance use increased 1.05-fold (95% CI: 1.02-1.08) as the number of cigarettes smoked daily increased (p = 0.001). The substance abuse risk of 18-year-olds was 1.5-fold higher (95% CI: 1.06-1.93) than that of 14-year-olds (p = 0.021). Conclusions: Smoking and addictive substance use in adolescents are particularly notable in male students (8.2%). This result is higher than the figure reported for İstanbul (7%). This may be because the province is located at a crossing point of drug trafficking routes. Smoking increases the risk of using other addictive substances (marijuana, heroin, etc.). Key messages: Smoking and substance abuse are an important health problem in adolescents according to this study. Male students who smoke are at greater risk of substance abuse than female students.
