A Comparison of Covariate Adjustment Approaches Under Model Misspecification In Individually Randomized Trials

Author(s):  
Mia S. Tackney ◽  
Tim Morris ◽  
Ian White ◽  
Clemence Leyrat ◽  
Karla Diaz-Ordaz ◽  
...  

Abstract Adjustment for baseline covariates in randomized trials has been shown to lead to gains in power and can protect against chance imbalances in covariates. For continuous covariates, there is a risk that the form of the relationship between the covariate and outcome is misspecified when taking an adjusted approach. Using a simulation study focusing on small to medium-sized individually randomized trials, we explore whether a range of adjustment methods are robust to misspecification, either in the covariate-outcome relationship or through an omitted covariate-treatment interaction. Specifically, we aim to identify potential settings where G-computation, Inverse Probability of Treatment Weighting (IPTW), Augmented Inverse Probability of Treatment Weighting (AIPTW) and Targeted Maximum Likelihood Estimation (TMLE) offer improvement over the commonly used Analysis of Covariance (ANCOVA). Our simulations show that all adjustment methods are generally robust to model misspecification if adjusting for a few covariates, sample size is 100 or larger, and there are no covariate-treatment interactions. When there is a non-linear interaction of treatment with a skewed covariate and sample size is small, all adjustment methods can suffer from bias; however, methods that allow for interactions (such as G-computation with interaction and IPTW) show improved results compared to ANCOVA. When there are a large number of covariates to adjust for, ANCOVA retains good properties while other methods suffer from under- or over-coverage. An outstanding issue for G-computation, IPTW and AIPTW in small samples is that standard errors are underestimated; development of small-sample corrections is needed.
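The contrast between ANCOVA and the interaction-aware estimators can be made concrete with a small sketch. The snippet below is not the authors' code; the data-generating model, variable names, and use of statsmodels are assumptions chosen only to show the mechanics of a small two-arm trial with a skewed covariate and a covariate-treatment interaction, comparing a plain ANCOVA fit against G-computation based on an outcome model with the interaction.

```python
# Sketch (not the authors' code): ANCOVA vs G-computation on a simulated
# two-arm trial with a skewed covariate and a covariate-treatment interaction.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2023)
n = 100                                          # small trial, as in the study
x = rng.lognormal(mean=0.0, sigma=1.0, size=n)   # skewed baseline covariate
a = rng.binomial(1, 0.5, size=n)                 # 1:1 randomization
y = 1.0 + 0.5 * a + 0.3 * np.log(x) + 0.4 * a * np.log(x) + rng.normal(size=n)
df = pd.DataFrame({"y": y, "a": a, "x": x})

# ANCOVA: linear adjustment, no interaction (potentially misspecified here)
ancova = smf.ols("y ~ a + x", data=df).fit()
print("ANCOVA estimate:", ancova.params["a"])

# G-computation with a treatment-covariate interaction:
# fit an outcome model, then average predictions under a=1 and a=0
gcomp = smf.ols("y ~ a * x", data=df).fit()
mu1 = gcomp.predict(df.assign(a=1)).mean()
mu0 = gcomp.predict(df.assign(a=0)).mean()
print("G-computation estimate:", mu1 - mu0)
```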

2011 ◽  
Vol 6 (2) ◽  
pp. 252-277 ◽  
Author(s):  
Stephen T. Ziliak

Abstract Student's exacting theory of errors, both random and real, marked a significant advance over ambiguous reports of plant life and fermentation asserted by chemists from Priestley and Lavoisier down to Pasteur and Johannsen, working at the Carlsberg Laboratory. One reason seems to be that William Sealy Gosset (1876–1937) aka "Student" – he of Student's t-table and test of statistical significance – rejected artificial rules about sample size, experimental design, and the level of significance, and took instead an economic approach to the logic of decisions made under uncertainty. In his job as Apprentice Brewer, Head Experimental Brewer, and finally Head Brewer of Guinness, Student produced small samples of experimental barley, malt, and hops, seeking guidance for industrial quality control and maximum expected profit at the large-scale brewery. In the process Student invented or inspired half of modern statistics. This article draws on original archival evidence, shedding light on several core yet neglected aspects of Student's methods, that is, Guinnessometrics, not discussed by Ronald A. Fisher (1890–1962). The focus is on Student's small-sample, economic approach to real error minimization, particularly in field and laboratory experiments he conducted on barley and malt, 1904 to 1937. Balanced designs of experiments, he found, are more efficient than random designs and have higher power to detect large and real treatment differences in a series of repeated and independent experiments. Student's world-class achievement poses a challenge to every science. Should statistical methods – such as the choice of sample size, experimental design, and level of significance – follow the purpose of the experiment, rather than the other way around? (JEL classification codes: C10, C90, C93, L66)


PEDIATRICS ◽  
1989 ◽  
Vol 83 (3) ◽  
pp. A72-A72
Author(s):  
Student

The believer in the law of small numbers practices science as follows: 1. He gambles his research hypotheses on small samples without realizing that the odds against him are unreasonably high. He overestimates power. 2. He has undue confidence in early trends (e.g., the data of the first few subjects) and in the stability of observed patterns (e.g., the number and identity of significant results). He overestimates significance. 3. In evaluating replications, his own or others', he has unreasonably high expectations about the replicability of significant results. He underestimates the breadth of confidence intervals. 4. He rarely attributes a deviation of results from expectations to sampling variability, because he finds a causal "explanation" for any discrepancy. Thus, he has little opportunity to recognize sampling variation in action. His belief in the law of small numbers, therefore, will forever remain intact.
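A quick calculation makes the "overestimates power" point tangible. The sketch below is illustrative only: the effect size (Cohen's d = 0.5) and group size are my own choices, and statsmodels' power routines stand in for whatever calculation a researcher might (fail to) do.

```python
# Illustration of how little power a small study actually has for a medium
# effect, and how many subjects would be needed for conventional 80% power.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
power_small = analysis.power(effect_size=0.5, nobs1=10, ratio=1.0, alpha=0.05)
n_needed = analysis.solve_power(effect_size=0.5, power=0.8, ratio=1.0, alpha=0.05)
print(f"Power with 10 subjects per group: {power_small:.2f}")   # roughly 0.18
print(f"n per group needed for 80% power: {n_needed:.0f}")      # roughly 64
```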


2017 ◽  
Vol 17 (9) ◽  
pp. 1623-1629 ◽  
Author(s):  
Berry Boessenkool ◽  
Gerd Bürger ◽  
Maik Heistermann

Abstract. High precipitation quantiles tend to rise with temperature, following the so-called Clausius–Clapeyron (CC) scaling. It is often reported that the CC-scaling relation breaks down and even reverts for very high temperatures. In our study, we investigate this reversal using observational climate data from 142 stations across Germany. One of the suggested meteorological explanations for the breakdown is limited moisture supply. Here we argue that, instead, it could simply originate from undersampling. As rainfall frequency generally decreases with higher temperatures, rainfall intensities as dictated by CC scaling are less likely to be recorded than for moderate temperatures. Empirical quantiles are conventionally estimated from order statistics via various forms of plotting position formulas. They have in common that their largest representable return period is given by the sample size. In small samples, high quantiles are underestimated accordingly. The small-sample effect is weaker, or disappears completely, when using parametric quantile estimates from a generalized Pareto distribution (GPD) fitted with L-moments. For those, we obtain quantiles of rainfall intensities that continue to rise with temperature.
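The undersampling argument can be reproduced in a few lines. The sketch below is a simplified illustration under an assumed "true" GPD for rainfall excesses: the empirical 99.9% quantile of a small sample is capped near the sample maximum, whereas a parametric GPD estimate extrapolates beyond it. Note that scipy fits the GPD by maximum likelihood here, not by the L-moments approach used in the paper.

```python
# Sketch of the undersampling effect: compare empirical and parametric (GPD)
# estimates of a high quantile in small samples against the assumed truth.
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
true_shape, true_scale = 0.1, 5.0          # assumed "true" rainfall-excess GPD
true_q999 = genpareto.ppf(0.999, true_shape, scale=true_scale)

n_events = 50                              # few wet events at very high temperatures
emp, par = [], []
for _ in range(2000):
    sample = genpareto.rvs(true_shape, scale=true_scale, size=n_events, random_state=rng)
    emp.append(np.quantile(sample, 0.999))             # order-statistics estimate
    c, loc, scale = genpareto.fit(sample, floc=0.0)    # ML fit with location fixed at 0
    par.append(genpareto.ppf(0.999, c, loc=loc, scale=scale))

print(f"true 99.9% quantile:      {true_q999:.1f}")
print(f"mean empirical estimate:  {np.mean(emp):.1f}")   # biased low, capped near the sample maximum
print(f"mean parametric estimate: {np.mean(par):.1f}")   # extrapolates beyond the sample maximum
```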


2012 ◽  
Vol 31 (20) ◽  
pp. 2169-2178 ◽  
Author(s):  
Steven Teerenstra ◽  
Sandra Eldridge ◽  
Maud Graff ◽  
Esther Hoop ◽  
George F. Borm

Author(s):  
David Benkeser ◽  
Iván Díaz ◽  
Alex Luedtke ◽  
Jodi Segal ◽  
Daniel Scharfstein ◽  
...  

Summary Time is of the essence in evaluating potential drugs and biologics for the treatment and prevention of COVID-19. There are currently over 400 clinical trials (phase 2 and 3) of treatments for COVID-19 registered on clinicaltrials.gov. Covariate adjustment is a statistical analysis method with potential to improve precision and reduce the required sample size for a substantial number of these trials. Though covariate adjustment is recommended by the U.S. Food and Drug Administration and the European Medicines Agency, it is underutilized, especially for the types of outcomes (binary, ordinal and time-to-event) that are common in COVID-19 trials. To demonstrate the potential value added by covariate adjustment in this context, we simulated two-arm, randomized trials comparing a hypothetical COVID-19 treatment versus standard of care, where the primary outcome is binary, ordinal, or time-to-event. Our simulated distributions are derived from two sources: longitudinal data on over 500 patients hospitalized at Weill Cornell Medicine New York Presbyterian Hospital, and a Centers for Disease Control and Prevention (CDC) preliminary description of 2449 cases. We found substantial precision gains from using covariate adjustment, equivalent to 9-21% reductions in the required sample size to achieve a desired power, for a variety of estimands (targets of inference) when the trial sample size was at least 200. We provide an R package and practical recommendations for implementing covariate adjustment. The estimators that we consider are robust to model misspecification.
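For the binary-outcome case, the kind of adjusted estimator studied here can be sketched as logistic-regression standardization (G-computation). The code below is not the authors' R package; the covariate, effect sizes, and sample size are placeholders chosen only to show the mechanics of the adjusted risk difference.

```python
# Sketch: covariate-adjusted risk difference for a binary outcome via
# logistic-regression standardization, on stand-in simulated trial data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 400
age = rng.normal(60, 15, size=n)            # assumed prognostic baseline covariate
trt = rng.binomial(1, 0.5, size=n)          # 1:1 randomization
logit = -4.0 + 0.05 * age - 0.6 * trt
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))
df = pd.DataFrame({"y": y, "trt": trt, "age": age})

# Unadjusted risk difference
rd_unadj = df.loc[df.trt == 1, "y"].mean() - df.loc[df.trt == 0, "y"].mean()

# Adjusted: fit a working logistic model, then standardize predictions over
# the observed covariate distribution under trt=1 and trt=0
fit = smf.logit("y ~ trt + age", data=df).fit(disp=0)
risk1 = fit.predict(df.assign(trt=1)).mean()
risk0 = fit.predict(df.assign(trt=0)).mean()
print(f"unadjusted RD: {rd_unadj:.3f}, adjusted RD: {risk1 - risk0:.3f}")
```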


2011 ◽  
Vol 65 ◽  
pp. 291-294
Author(s):  
Yao Hua Wang ◽  
Liang Wang ◽  
Hai Shan Yang ◽  
Bao Guo Zhu

To address a problem that commonly arises in assessing the high-reliability ignition of electro-explosive devices (EED), a new test method based on the information equivalence principle is proposed that requires a relatively small sample size. The method measures the information of a reliability test by the negative logarithm of the ignition probability of the EED, and converts the large-sample test prescribed by GJB376-1987, which uses a large number of stimulations, into a small-sample one. We apply this method to assess the ignition reliability of the EED used in an emergency opening system. The result is that only 29 samples are needed to demonstrate ignition reliability greater than 0.999 at a confidence level of at least 95%. Compared with the 2996 samples required by GJB376-1987, the method greatly reduces sample usage. Tests show that the small-sample test method based on the information equivalence principle is accurate and feasible for EED ignition reliability testing and can meet the objectives of experimental design.
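A rough sketch of the sample-size arithmetic may help. The zero-failure demonstration-test formula n ≥ ln(1 − C)/ln(R) gives roughly 2995 samples for R = 0.999 at C = 0.95, close to the 2996 cited from GJB376-1987 (conventions differ slightly). The information-equivalence step below is my reading of the principle described in the abstract, not the paper's exact derivation.

```python
# Worked sketch of the sample-size arithmetic (my reading of the information-
# equivalence idea, not the paper's derivation).
import math

C, R = 0.95, 0.999
n_classical = math.ceil(math.log(1 - C) / math.log(R))
print("zero-failure sample size:", n_classical)    # ~2995, close to the 2996 cited

# Information-equivalence reading: each trial carries information I = -ln(p),
# with p the ignition probability under the test condition, and the small test
# must accumulate as much information as the large one:
#     n_small * I_small = n_large * I_large
total_info = n_classical * (-math.log(R))          # information of the large test
n_small = 29                                       # sample size reported in the abstract
I_small_needed = total_info / n_small
print(f"required per-trial information in the small test: {I_small_needed:.3f}")
```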


1988 ◽  
Vol 13 (3) ◽  
pp. 142-146 ◽  
Author(s):  
David A. Cole

In the area of severe-profound retardation, researchers are faced with small sample sizes. The question of statistical power is critical. In this article, three commonly used tests for treatment-control group differences are compared with respect to their relative power: the posttest-only approach, the change-score approach, and an analysis of covariance (ANCOVA) approach. In almost all cases, the ANCOVA approach is more powerful than the other two, even when very small samples are involved. Finally, a fourth approach involving ANCOVA plus alternate rank assignments is examined and found to be superior even to the ANCOVA approach, especially in small-sample cases. Use of slightly more sophisticated statistics in small-sample research is recommended.
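The power ranking reported here is easy to reproduce by simulation. The sketch below uses an assumed data-generating model (correlated pre/post scores, 15 subjects per group, a standardized effect of 0.8); none of these values come from the article, and the alternate-ranks variant is not implemented.

```python
# Sketch: compare rejection rates of posttest-only, change-score, and ANCOVA
# analyses on small simulated treatment-control samples with correlated
# pre/post scores (assumed data-generating model).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(1)
n_per_group, effect, rho, n_sims = 15, 0.8, 0.6, 2000
hits = {"posttest": 0, "change": 0, "ancova": 0}

for _ in range(n_sims):
    group = np.repeat([0, 1], n_per_group)
    pre = rng.normal(size=2 * n_per_group)
    post = rho * pre + np.sqrt(1 - rho**2) * rng.normal(size=2 * n_per_group) + effect * group
    df = pd.DataFrame({"group": group, "pre": pre, "post": post, "change": post - pre})

    if stats.ttest_ind(post[group == 1], post[group == 0]).pvalue < 0.05:
        hits["posttest"] += 1
    if stats.ttest_ind(df.change[group == 1], df.change[group == 0]).pvalue < 0.05:
        hits["change"] += 1
    if smf.ols("post ~ group + pre", data=df).fit().pvalues["group"] < 0.05:
        hits["ancova"] += 1

print({k: v / n_sims for k, v in hits.items()})   # ANCOVA typically highest
```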


Author(s):  
J. Mullaert ◽  
M. Bouaziz ◽  
Y. Seeleuthner ◽  
B. Bigio ◽  
J-L. Casanova ◽  
...  

Abstract Many methods for rare variant association studies require permutations to assess the significance of tests. Standard permutations assume that all individuals are exchangeable and do not take population stratification (PS), a known confounding factor in genetic studies, into account. We propose a novel strategy, LocPerm, in which individuals are permuted only with their closest ancestry-based neighbors. We performed a simulation study, focusing on small samples, to evaluate and compare LocPerm with standard permutations and classical adjustment on first principal components. Under the null hypothesis, LocPerm was the only method providing an acceptable type I error, regardless of sample size and level of stratification. The power of LocPerm was similar to that of standard permutation in the absence of PS, and remained stable in different PS scenarios. We conclude that LocPerm is a method of choice for taking PS and/or small sample size into account in rare variant association studies.
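The core idea of permuting only among ancestry-similar individuals can be sketched in a few lines. The snippet below is a simplified stand-in, not the authors' implementation of LocPerm: it groups individuals by k-means clustering on the top principal components rather than by the nearest-neighbor construction the method actually uses.

```python
# Minimal sketch of a local permutation: phenotype labels are shuffled only
# within small groups of ancestry-similar individuals (k-means on ancestry PCs
# is an assumed simplification of LocPerm's nearest-neighbor grouping).
import numpy as np
from sklearn.cluster import KMeans

def local_permutation(phenotype, pcs, n_groups=20, rng=None):
    """Return a copy of `phenotype` shuffled within ancestry-based groups."""
    rng = np.random.default_rng(rng)
    groups = KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit_predict(pcs)
    permuted = phenotype.copy()
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        permuted[idx] = phenotype[rng.permutation(idx)]
    return permuted

# Example: 500 individuals, 4 ancestry PCs, binary case/control status
rng = np.random.default_rng(42)
pcs = rng.normal(size=(500, 4))
status = rng.binomial(1, 0.3, size=500)
perm_status = local_permutation(status, pcs, n_groups=20, rng=rng)
```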


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Vahid Ebrahimi ◽  
Zahra Bagheri ◽  
Zahra Shayan ◽  
Peyman Jafari

Assessing differential item functioning (DIF) using the ordinal logistic regression (OLR) model highly depends on the asymptotic sampling distribution of the maximum likelihood (ML) estimators. The ML estimation method, which is often used to estimate the parameters of the OLR model for DIF detection, may be substantially biased with small samples. This study aims to propose a new application of the elastic net regularized OLR model, as a special type of machine learning method, for assessing DIF between two groups with small samples. Accordingly, a simulation study was conducted to compare the powers and type I error rates of the regularized and nonregularized OLR models in detecting DIF under various conditions including moderate and severe magnitudes of DIF (DIF = 0.4 and 0.8), sample size (N), sample size ratio (R), scale length (I), and weighting parameter (w). The simulation results revealed that for I = 5 and regardless of R, the elastic net regularized OLR model with w = 0.1, as compared with the nonregularized OLR model, increased the power of detecting moderate uniform DIF (DIF = 0.4) by approximately 35% and 21% for N = 100 and 150, respectively. Moreover, for I = 10 and severe uniform DIF (DIF = 0.8), the average power of the elastic net regularized OLR model with 0.03 ≤ w ≤ 0.06, as compared with the nonregularized OLR model, increased by approximately 29.3% and 11.2% for N = 100 and 150, respectively. In these cases, the type I error rates of the regularized and nonregularized OLR models were below or close to the nominal level of 0.05. In general, this simulation study showed that the elastic net regularized OLR model outperformed the nonregularized OLR model, especially in extremely small sample size groups. Furthermore, the present research provides a guideline and some recommendations for researchers who conduct DIF studies with small sample sizes.
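As a rough illustration of the regularized DIF screen, the sketch below fits an elastic net penalized logistic regression of an item response on the matching variable and a group indicator. It is an assumption-laden simplification: scikit-learn offers only binary/multinomial logistic regression, so the item response is dichotomized here whereas the article's model is ordinal, and l1_ratio is used only as a loose analogue of the weighting parameter w.

```python
# Simplified sketch of regularized uniform-DIF screening with an elastic net
# penalized (binary) logistic regression; the article uses an ordinal model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
n = 100                                        # small-sample setting from the study
group = rng.binomial(1, 0.5, size=n)           # reference vs focal group
ability = rng.normal(size=n)
total_score = ability + rng.normal(scale=0.5, size=n)   # matching variable
dif_effect = 0.4                                        # "moderate" uniform DIF
logit = ability + dif_effect * group
item = rng.binomial(1, 1 / (1 + np.exp(-logit)))        # dichotomized item response

X = StandardScaler().fit_transform(np.column_stack([total_score, group]))
model = LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.1,       # loose analogue of the weighting parameter w
                           C=1.0, max_iter=5000).fit(X, item)
print("group (DIF) coefficient:", model.coef_[0][1])
```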

