A Comparison of Covariate Adjustment Approaches Under Model Misspecification In Individually Randomized Trials

Author(s):  
Mia S. Tackney ◽  
Tim Morris ◽  
Ian White ◽  
Clemence Leyrat ◽  
Karla Diaz-Ordaz ◽  
...  

Abstract Adjustment for baseline covariates in randomized trials has been shown to lead to gains in power and can protect against chance imbalances in covariates. For continuous covariates, there is a risk that the form of the relationship between the covariate and outcome is misspecified when taking an adjusted approach. Using a simulation study focusing on small to medium-sized individually randomized trials, we explore whether a range of adjustment methods are robust to misspecification, either in the covariate-outcome relationship or through an omitted covariate-treatment interaction. Specifically, we aim to identify potential settings where G-computation, Inverse Probability of Treatment Weighting (IPTW), Augmented Inverse Probability of Treatment Weighting (AIPTW) and Targeted Maximum Likelihood Estimation (TMLE) offer improvement over the commonly used Analysis of Covariance (ANCOVA). Our simulations show that all adjustment methods are generally robust to model misspecification if adjusting for a few covariates, sample size is 100 or larger, and there are no covariate-treatment interactions. When there is a non-linear interaction of treatment with a skewed covariate and sample size is small, all adjustment methods can suffer from bias; however, methods that allow for interactions (such as G-computation with interaction and IPTW) show improved results compared to ANCOVA. When there are a large number of covariates to adjust for, ANCOVA retains good properties while other methods suffer from under- or over-coverage. An outstanding issue for G-computation, IPTW and AIPTW in small samples is that standard errors are underestimated; development of small-sample corrections is needed.
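The contrast between ANCOVA and the interaction-aware estimators can be made concrete with a small sketch. The snippet below is not the authors' code; the data-generating model, variable names, and use of statsmodels are assumptions chosen only to show the mechanics of a small two-arm trial with a skewed covariate and a covariate-treatment interaction, comparing a plain ANCOVA fit against G-computation based on an outcome model with the interaction.

```python
# Sketch (not the authors' code): ANCOVA vs G-computation on a simulated
# two-arm trial with a skewed covariate and a covariate-treatment interaction.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2023)
n = 100                                          # small trial, as in the study
x = rng.lognormal(mean=0.0, sigma=1.0, size=n)   # skewed baseline covariate
a = rng.binomial(1, 0.5, size=n)                 # 1:1 randomization
y = 1.0 + 0.5 * a + 0.3 * np.log(x) + 0.4 * a * np.log(x) + rng.normal(size=n)
df = pd.DataFrame({"y": y, "a": a, "x": x})

# ANCOVA: linear adjustment, no interaction (potentially misspecified here)
ancova = smf.ols("y ~ a + x", data=df).fit()
print("ANCOVA estimate:", ancova.params["a"])

# G-computation with a treatment-covariate interaction:
# fit an outcome model, then average predictions under a=1 and a=0
gcomp = smf.ols("y ~ a * x", data=df).fit()
mu1 = gcomp.predict(df.assign(a=1)).mean()
mu0 = gcomp.predict(df.assign(a=0)).mean()
print("G-computation estimate:", mu1 - mu0)
```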

2011 ◽  
Vol 6 (2) ◽  
pp. 252-277 ◽  
Author(s):  
Stephen T. Ziliak

Abstract Student's exacting theory of errors, both random and real, marked a significant advance over ambiguous reports of plant life and fermentation asserted by chemists from Priestley and Lavoisier down to Pasteur and Johannsen, working at the Carlsberg Laboratory. One reason seems to be that William Sealy Gosset (1876–1937) aka "Student" – he of Student's t-table and test of statistical significance – rejected artificial rules about sample size, experimental design, and the level of significance, and took instead an economic approach to the logic of decisions made under uncertainty. In his job as Apprentice Brewer, Head Experimental Brewer, and finally Head Brewer of Guinness, Student produced small samples of experimental barley, malt, and hops, seeking guidance for industrial quality control and maximum expected profit at the large-scale brewery. In the process Student invented or inspired half of modern statistics. This article draws on original archival evidence, shedding light on several core yet neglected aspects of Student's methods, that is, Guinnessometrics, not discussed by Ronald A. Fisher (1890–1962). The focus is on Student's small-sample, economic approach to real error minimization, particularly in field and laboratory experiments he conducted on barley and malt, 1904 to 1937. Balanced designs of experiments, he found, are more efficient than random designs and have higher power to detect large and real treatment differences in a series of repeated and independent experiments. Student's world-class achievement poses a challenge to every science. Should statistical methods – such as the choice of sample size, experimental design, and level of significance – follow the purpose of the experiment, rather than the other way around? (JEL classification codes: C10, C90, C93, L66)


PEDIATRICS ◽  
1989 ◽  
Vol 83 (3) ◽  
pp. A72-A72
Author(s):  
Student

The believer in the law of small numbers practices science as follows: 1. He gambles his research hypotheses on small samples without realizing that the odds against him are unreasonably high. He overestimates power. 2. He has undue confidence in early trends (e.g., the data of the first few subjects) and in the stability of observed patterns (e.g., the number and identity of significant results). He overestimates significance. 3. In evaluating replications, his own or others', he has unreasonably high expectations about the replicability of significant results. He underestimates the breadth of confidence intervals. 4. He rarely attributes a deviation of results from expectations to sampling variability, because he finds a causal "explanation" for any discrepancy. Thus, he has little opportunity to recognize sampling variation in action. His belief in the law of small numbers, therefore, will forever remain intact.
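A quick calculation makes the "overestimates power" point tangible. The sketch below is illustrative only: the effect size (Cohen's d = 0.5) and group size are my own choices, and statsmodels' power routines stand in for whatever calculation a researcher might (fail to) do.

```python
# Illustration of how little power a small study actually has for a medium
# effect, and how many subjects would be needed for conventional 80% power.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
power_small = analysis.power(effect_size=0.5, nobs1=10, ratio=1.0, alpha=0.05)
n_needed = analysis.solve_power(effect_size=0.5, power=0.8, ratio=1.0, alpha=0.05)
print(f"Power with 10 subjects per group: {power_small:.2f}")   # roughly 0.18
print(f"n per group needed for 80% power: {n_needed:.0f}")      # roughly 64
```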


2017 ◽  
Vol 17 (9) ◽  
pp. 1623-1629 ◽  
Author(s):  
Berry Boessenkool ◽  
Gerd Bürger ◽  
Maik Heistermann

Abstract. High precipitation quantiles tend to rise with temperature, following the so-called Clausius–Clapeyron (CC) scaling. It is often reported that the CC-scaling relation breaks down and even reverts for very high temperatures. In our study, we investigate this reversal using observational climate data from 142 stations across Germany. One of the suggested meteorological explanations for the breakdown is limited moisture supply. Here we argue that, instead, it could simply originate from undersampling. As rainfall frequency generally decreases with higher temperatures, rainfall intensities as dictated by CC scaling are less likely to be recorded than for moderate temperatures. Empirical quantiles are conventionally estimated from order statistics via various forms of plotting position formulas. They have in common that their largest representable return period is given by the sample size. In small samples, high quantiles are underestimated accordingly. The small-sample effect is weaker, or disappears completely, when using parametric quantile estimates from a generalized Pareto distribution (GPD) fitted with L-moments. For those, we obtain quantiles of rainfall intensities that continue to rise with temperature.
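The undersampling argument can be reproduced in a few lines. The sketch below is a simplified illustration under an assumed "true" GPD for rainfall excesses: the empirical 99.9% quantile of a small sample is capped near the sample maximum, whereas a parametric GPD estimate extrapolates beyond it. Note that scipy fits the GPD by maximum likelihood here, not by the L-moments approach used in the paper.

```python
# Sketch of the undersampling effect: compare empirical and parametric (GPD)
# estimates of a high quantile in small samples against the assumed truth.
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
true_shape, true_scale = 0.1, 5.0          # assumed "true" rainfall-excess GPD
true_q999 = genpareto.ppf(0.999, true_shape, scale=true_scale)

n_events = 50                              # few wet events at very high temperatures
emp, par = [], []
for _ in range(2000):
    sample = genpareto.rvs(true_shape, scale=true_scale, size=n_events, random_state=rng)
    emp.append(np.quantile(sample, 0.999))             # order-statistics estimate
    c, loc, scale = genpareto.fit(sample, floc=0.0)    # ML fit with location fixed at 0
    par.append(genpareto.ppf(0.999, c, loc=loc, scale=scale))

print(f"true 99.9% quantile:      {true_q999:.1f}")
print(f"mean empirical estimate:  {np.mean(emp):.1f}")   # biased low, capped near the sample maximum
print(f"mean parametric estimate: {np.mean(par):.1f}")   # extrapolates beyond the sample maximum
```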


2012 ◽  
Vol 31 (20) ◽  
pp. 2169-2178 ◽  
Author(s):  
Steven Teerenstra ◽  
Sandra Eldridge ◽  
Maud Graff ◽  
Esther Hoop ◽  
George F. Borm

Author(s):  
David Benkeser ◽  
Iván Díaz ◽  
Alex Luedtke ◽  
Jodi Segal ◽  
Daniel Scharfstein ◽  
...  

Summary Time is of the essence in evaluating potential drugs and biologics for the treatment and prevention of COVID-19. There are currently over 400 clinical trials (phase 2 and 3) of treatments for COVID-19 registered on clinicaltrials.gov. Covariate adjustment is a statistical analysis method with potential to improve precision and reduce the required sample size for a substantial number of these trials. Though covariate adjustment is recommended by the U.S. Food and Drug Administration and the European Medicines Agency, it is underutilized, especially for the types of outcomes (binary, ordinal and time-to-event) that are common in COVID-19 trials. To demonstrate the potential value added by covariate adjustment in this context, we simulated two-arm, randomized trials comparing a hypothetical COVID-19 treatment versus standard of care, where the primary outcome is binary, ordinal, or time-to-event. Our simulated distributions are derived from two sources: longitudinal data on over 500 patients hospitalized at Weill Cornell Medicine New York Presbyterian Hospital, and a Centers for Disease Control and Prevention (CDC) preliminary description of 2449 cases. We found substantial precision gains from using covariate adjustment, equivalent to 9-21% reductions in the required sample size to achieve a desired power, for a variety of estimands (targets of inference) when the trial sample size was at least 200. We provide an R package and practical recommendations for implementing covariate adjustment. The estimators that we consider are robust to model misspecification.
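For the binary-outcome case, the kind of adjusted estimator studied here can be sketched as logistic-regression standardization (G-computation). The code below is not the authors' R package; the covariate, effect sizes, and sample size are placeholders chosen only to show the mechanics of the adjusted risk difference.

```python
# Sketch: covariate-adjusted risk difference for a binary outcome via
# logistic-regression standardization, on stand-in simulated trial data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 400
age = rng.normal(60, 15, size=n)            # assumed prognostic baseline covariate
trt = rng.binomial(1, 0.5, size=n)          # 1:1 randomization
logit = -4.0 + 0.05 * age - 0.6 * trt
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))
df = pd.DataFrame({"y": y, "trt": trt, "age": age})

# Unadjusted risk difference
rd_unadj = df.loc[df.trt == 1, "y"].mean() - df.loc[df.trt == 0, "y"].mean()

# Adjusted: fit a working logistic model, then standardize predictions over
# the observed covariate distribution under trt=1 and trt=0
fit = smf.logit("y ~ trt + age", data=df).fit(disp=0)
risk1 = fit.predict(df.assign(trt=1)).mean()
risk0 = fit.predict(df.assign(trt=0)).mean()
print(f"unadjusted RD: {rd_unadj:.3f}, adjusted RD: {risk1 - risk0:.3f}")
```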


2011 ◽  
Vol 65 ◽  
pp. 291-294
Author(s):  
Yao Hua Wang ◽  
Liang Wang ◽  
Hai Shan Yang ◽  
Bao Guo Zhu

To address a problem that commonly arises in assessing the high-reliability ignition of electro-explosive devices (EED), a new test method based on the information equivalence principle is proposed that requires a relatively small sample size. The method measures the information of a reliability test by the negative logarithm of the ignition probability of the EED, and converts the large-sample test prescribed by GJB376-1987, which uses a large number of stimulations, into a small-sample one. We apply this method to assess the ignition reliability of the EED used in an emergency opening system. The result is that only 29 samples are needed to demonstrate ignition reliability greater than 0.999 at a confidence level of at least 95%. Compared with the 2996 samples required by GJB376-1987, the method greatly reduces sample usage. Tests show that the small-sample test method based on the information equivalence principle is accurate and feasible for EED ignition reliability testing and can meet the objectives of experimental design.
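A rough sketch of the sample-size arithmetic may help. The zero-failure demonstration-test formula n ≥ ln(1 − C)/ln(R) gives roughly 2995 samples for R = 0.999 at C = 0.95, close to the 2996 cited from GJB376-1987 (conventions differ slightly). The information-equivalence step below is my reading of the principle described in the abstract, not the paper's exact derivation.

```python
# Worked sketch of the sample-size arithmetic (my reading of the information-
# equivalence idea, not the paper's derivation).
import math

C, R = 0.95, 0.999
n_classical = math.ceil(math.log(1 - C) / math.log(R))
print("zero-failure sample size:", n_classical)    # ~2995, close to the 2996 cited

# Information-equivalence reading: each trial carries information I = -ln(p),
# with p the ignition probability under the test condition, and the small test
# must accumulate as much information as the large one:
#     n_small * I_small = n_large * I_large
total_info = n_classical * (-math.log(R))          # information of the large test
n_small = 29                                       # sample size reported in the abstract
I_small_needed = total_info / n_small
print(f"required per-trial information in the small test: {I_small_needed:.3f}")
```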


1988 ◽  
Vol 13 (3) ◽  
pp. 142-146 ◽  
Author(s):  
David A. Cole

In the area of severe-profound retardation, researchers are faced with small sample sizes. The question of statistical power is critical. In this article, three commonly used tests for treatment-control group differences are compared with respect to their relative power: the posttest-only approach, the change-score approach, and an analysis of covariance (ANCOVA) approach. In almost all cases, the ANCOVA approach is more powerful than the other two, even when very small samples are involved. Finally, a fourth approach involving ANCOVA plus alternate rank assignments is examined and found to be superior even to the ANCOVA approach, especially in small-sample cases. Use of slightly more sophisticated statistics in small-sample research is recommended.
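The power ranking reported here is easy to reproduce by simulation. The sketch below uses an assumed data-generating model (correlated pre/post scores, 15 subjects per group, a standardized effect of 0.8); none of these values come from the article, and the alternate-ranks variant is not implemented.

```python
# Sketch: compare rejection rates of posttest-only, change-score, and ANCOVA
# analyses on small simulated treatment-control samples with correlated
# pre/post scores (assumed data-generating model).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(1)
n_per_group, effect, rho, n_sims = 15, 0.8, 0.6, 2000
hits = {"posttest": 0, "change": 0, "ancova": 0}

for _ in range(n_sims):
    group = np.repeat([0, 1], n_per_group)
    pre = rng.normal(size=2 * n_per_group)
    post = rho * pre + np.sqrt(1 - rho**2) * rng.normal(size=2 * n_per_group) + effect * group
    df = pd.DataFrame({"group": group, "pre": pre, "post": post, "change": post - pre})

    if stats.ttest_ind(post[group == 1], post[group == 0]).pvalue < 0.05:
        hits["posttest"] += 1
    if stats.ttest_ind(df.change[group == 1], df.change[group == 0]).pvalue < 0.05:
        hits["change"] += 1
    if smf.ols("post ~ group + pre", data=df).fit().pvalues["group"] < 0.05:
        hits["ancova"] += 1

print({k: v / n_sims for k, v in hits.items()})   # ANCOVA typically highest
```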


Author(s):  
J. Mullaert ◽  
M. Bouaziz ◽  
Y. Seeleuthner ◽  
B. Bigio ◽  
J-L. Casanova ◽  
...  

Abstract Many methods for rare variant association studies require permutations to assess the significance of tests. Standard permutations assume that all individuals are exchangeable and do not take population stratification (PS), a known confounding factor in genetic studies, into account. We propose a novel strategy, LocPerm, in which individuals are permuted only with their closest ancestry-based neighbors. We performed a simulation study, focusing on small samples, to evaluate and compare LocPerm with standard permutations and classical adjustment on first principal components. Under the null hypothesis, LocPerm was the only method providing an acceptable type I error, regardless of sample size and level of stratification. The power of LocPerm was similar to that of standard permutation in the absence of PS, and remained stable in different PS scenarios. We conclude that LocPerm is a method of choice for taking PS and/or small sample size into account in rare variant association studies.
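The core idea of permuting only among ancestry-similar individuals can be sketched in a few lines. The snippet below is a simplified stand-in, not the authors' implementation of LocPerm: it groups individuals by k-means clustering on the top principal components rather than by the nearest-neighbor construction the method actually uses.

```python
# Minimal sketch of a local permutation: phenotype labels are shuffled only
# within small groups of ancestry-similar individuals (k-means on ancestry PCs
# is an assumed simplification of LocPerm's nearest-neighbor grouping).
import numpy as np
from sklearn.cluster import KMeans

def local_permutation(phenotype, pcs, n_groups=20, rng=None):
    """Return a copy of `phenotype` shuffled within ancestry-based groups."""
    rng = np.random.default_rng(rng)
    groups = KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit_predict(pcs)
    permuted = phenotype.copy()
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        permuted[idx] = phenotype[rng.permutation(idx)]
    return permuted

# Example: 500 individuals, 4 ancestry PCs, binary case/control status
rng = np.random.default_rng(42)
pcs = rng.normal(size=(500, 4))
status = rng.binomial(1, 0.3, size=500)
perm_status = local_permutation(status, pcs, n_groups=20, rng=rng)
```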


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Vahid Ebrahimi ◽  
Zahra Bagheri ◽  
Zahra Shayan ◽  
Peyman Jafari

Assessing differential item functioning (DIF) using the ordinal logistic regression (OLR) model highly depends on the asymptotic sampling distribution of the maximum likelihood (ML) estimators. The ML estimation method, which is often used to estimate the parameters of the OLR model for DIF detection, may be substantially biased with small samples. This study aims to propose a new application of the elastic net regularized OLR model, as a special type of machine learning method, for assessing DIF between two groups with small samples. Accordingly, a simulation study was conducted to compare the powers and type I error rates of the regularized and nonregularized OLR models in detecting DIF under various conditions including moderate and severe magnitudes of DIF (DIF = 0.4 and 0.8), sample size (N), sample size ratio (R), scale length (I), and weighting parameter (w). The simulation results revealed that for I = 5 and regardless of R, the elastic net regularized OLR model with w = 0.1, as compared with the nonregularized OLR model, increased the power of detecting moderate uniform DIF (DIF = 0.4) by approximately 35% and 21% for N = 100 and 150, respectively. Moreover, for I = 10 and severe uniform DIF (DIF = 0.8), the average power of the elastic net regularized OLR model with 0.03 ≤ w ≤ 0.06, as compared with the nonregularized OLR model, increased by approximately 29.3% and 11.2% for N = 100 and 150, respectively. In these cases, the type I error rates of the regularized and nonregularized OLR models were below or close to the nominal level of 0.05. In general, this simulation study showed that the elastic net regularized OLR model outperformed the nonregularized OLR model, especially in extremely small sample size groups. Furthermore, the present research provides a guideline and some recommendations for researchers who conduct DIF studies with small sample sizes.
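As a rough illustration of the regularized DIF screen, the sketch below fits an elastic net penalized logistic regression of an item response on the matching variable and a group indicator. It is an assumption-laden simplification: scikit-learn offers only binary/multinomial logistic regression, so the item response is dichotomized here whereas the article's model is ordinal, and l1_ratio is used only as a loose analogue of the weighting parameter w.

```python
# Simplified sketch of regularized uniform-DIF screening with an elastic net
# penalized (binary) logistic regression; the article uses an ordinal model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
n = 100                                        # small-sample setting from the study
group = rng.binomial(1, 0.5, size=n)           # reference vs focal group
ability = rng.normal(size=n)
total_score = ability + rng.normal(scale=0.5, size=n)   # matching variable
dif_effect = 0.4                                        # "moderate" uniform DIF
logit = ability + dif_effect * group
item = rng.binomial(1, 1 / (1 + np.exp(-logit)))        # dichotomized item response

X = StandardScaler().fit_transform(np.column_stack([total_score, group]))
model = LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.1,       # loose analogue of the weighting parameter w
                           C=1.0, max_iter=5000).fit(X, item)
print("group (DIF) coefficient:", model.coef_[0][1])
```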

