How Robust Standard Errors Expose Methodological Problems They Do Not Fix, and What to Do About It

2015 ◽  
Vol 23 (2) ◽  
pp. 159-179 ◽  
Author(s):  
Gary King ◽  
Margaret E. Roberts

“Robust standard errors” are used in a vast array of scholarship to correct standard errors for model misspecification. However, when misspecification is bad enough to make classical and robust standard errors diverge, assuming that it is nevertheless not so bad as to bias everything else requires considerable optimism. And even if the optimism is warranted, settling for a misspecified model, with or without robust standard errors, will still bias estimators of all but a few quantities of interest. The resulting cavernous gap between theory and practice suggests that considerable gains in applied statistics may be possible. We seek to help researchers realize these gains via a more productive way to understand and use robust standard errors; a new general and easier-to-use “generalized information matrix test” statistic that can formally assess misspecification (based on differences between robust and classical variance estimates); and practical illustrations via simulations and real examples from published research. How robust standard errors are used needs to change, but instead of jettisoning this popular tool we show how to use it to provide effective clues about model misspecification, likely biases, and a guide to considerably more reliable, and defensible, inferences. Accompanying this article is software that implements the methods we describe.
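
The diagnostic idea in this abstract, that a gap between classical and robust standard errors is itself a symptom of misspecification, can be illustrated with a small sketch. It is not the authors' generalized information matrix test or their accompanying software; it only computes the classical and the HC0 sandwich standard error of an OLS slope by hand and compares them. The function names, the simulated data, and the choice of HC0 are assumptions made for illustration.

```python
import math
import random


def slope_standard_errors(x, y):
    """Return (slope, classical SE, HC0 robust SE) for y = a + b*x fit by OLS."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    d = [xi - x_bar for xi in x]  # centred regressor
    sxx = sum(di * di for di in d)
    slope = sum(di * (yi - y_bar) for di, yi in zip(d, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical: assumes one common error variance, giving s^2 / Sxx.
    s2 = sum(e * e for e in resid) / (n - 2)
    classical = math.sqrt(s2 / sxx)

    # Sandwich (HC0): each observation contributes its own squared residual.
    robust = math.sqrt(sum((di * e) ** 2 for di, e in zip(d, resid)) / sxx ** 2)
    return slope, classical, robust


def simulate(heteroskedastic, n=2000, seed=0):
    """Hypothetical data: error spread grows with |x| when heteroskedastic."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0.0, abs(xi) if heteroskedastic else 0.5)
         for xi in x]
    return slope_standard_errors(x, y)


for label, flag in [("constant variance", False), ("variance grows with |x|", True)]:
    slope, classical, robust = simulate(flag)
    print(f"{label}: robust/classical SE ratio = {robust / classical:.2f}")
```

A ratio near 1 is what a correctly specified variance model would produce; a ratio well away from 1 is the kind of divergence the abstract treats as a clue to look for misspecification rather than as something the robust estimate has fixed.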

Econometrica ◽  
1991 ◽  
Vol 59 (3) ◽  
pp. 787 ◽  
Author(s):  
Andrew Chesher ◽  
Richard Spady

2018 ◽  
Vol 43 (6) ◽  
pp. 721-750
Author(s):  
Daphna Harel ◽  
Russell J. Steele

Collapsing categories is a commonly used data reduction technique; however, to date there do not exist principled methods to determine whether collapsing categories is appropriate in practice. With ordinal responses under the partial credit model, when collapsing categories, the true model for the collapsed data is no longer a partial credit model, and therefore refitting a partial credit model may result in model misspecification. This article details the implementation and performance of an information matrix test (IMT) to assess the implications of collapsing categories for a given data set under the partial credit model and compares its performance to the application of a nominal response model (NRM) and the S-X² goodness-of-fit statistic. The IMT and NRM-based test are able to correctly determine the true number of categories for an item, given reasonable power through this goodness-of-fit test. We conclude by applying the test to a well-studied data set from the literature.


Methodology ◽  
2015 ◽  
Vol 11 (1) ◽  
pp. 3-12 ◽  
Author(s):  
Jochen Ranger ◽  
Jörg-Tobias Kuhn

In this manuscript, a new approach to the analysis of person fit is presented that is based on the information matrix test of White (1982). This test can be interpreted as a test of trait stability during the measurement situation. The test follows approximately a χ²-distribution. In small samples, the approximation can be improved by a higher-order expansion. The performance of the test is explored in a simulation study. This simulation study suggests that the test adheres to the nominal Type-I error rate well, although it tends to be conservative in very short scales. The power of the test is compared to the power of four alternative tests of person fit. This comparison corroborates that the power of the information matrix test is similar to the power of the alternative tests. Advantages and areas of application of the information matrix test are discussed.


2020 ◽  
pp. 1-20
Author(s):  
Chad Hazlett ◽  
Leonard Wainstein

When working with grouped data, investigators may choose between “fixed effects” models (FE) with specialized (e.g., cluster-robust) standard errors, or “multilevel models” (MLMs) employing “random effects.” We review the claims given in published works regarding this choice, then clarify how these approaches work and compare by showing that: (i) random effects employed in MLMs are simply “regularized” fixed effects; (ii) unmodified MLMs are consequently susceptible to bias, but there is a longstanding remedy; and (iii) the “default” MLM standard errors rely on narrow assumptions that can lead to undercoverage in many settings. Our review of over 100 papers using MLM in political science, education, and sociology shows that these “known” concerns have been widely ignored in practice. We describe how to debias MLM’s coefficient estimates, and provide an option to more flexibly estimate their standard errors. Most illuminating, once MLMs are adjusted in these two ways the point estimate and standard error for the target coefficient are exactly equal to those of the analogous FE model with cluster-robust standard errors. For investigators working with observational data and who are interested only in inference on the target coefficient, either approach is equally appropriate and preferable to uncorrected MLM.
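
Claim (i), that random effects are "regularized" fixed effects, can be sketched in the simplest possible setting: a random-intercept model with no covariates and known variance components. There the textbook predictor of a group's intercept is the group's mean deviation multiplied by a shrinkage weight. This is a standard result, not the authors' derivation or code. The function name, the use of the unweighted grand mean as a stand-in for the estimated intercept, and the variance values are all assumptions for illustration.

```python
def shrunken_group_effects(groups, tau2, sigma2):
    """Map each group name to (fixed-effect estimate, shrunken estimate).

    groups : dict mapping a group name to a list of outcomes
    tau2   : assumed variance of the group intercepts (between-group)
    sigma2 : assumed variance of the individual errors (within-group)
    """
    all_values = [v for values in groups.values() for v in values]
    # Assumption: the plain grand mean stands in for the model intercept.
    grand_mean = sum(all_values) / len(all_values)

    effects = {}
    for name, values in groups.items():
        n_j = len(values)
        fixed = sum(values) / n_j - grand_mean    # unregularized group effect
        weight = tau2 / (tau2 + sigma2 / n_j)     # between 0 and 1
        effects[name] = (fixed, weight * fixed)   # shrunk toward zero
    return effects


# Hypothetical data: a group of three observations and a group of one.
example = shrunken_group_effects({"a": [1.0, 2.0, 3.0], "b": [10.0]},
                                 tau2=1.0, sigma2=1.0)
for name, (fixed, shrunken) in example.items():
    print(name, fixed, shrunken)
```

Small groups get a smaller weight and are pulled harder toward the overall mean, which is the sense in which the random-effects estimate is a penalized version of the fixed-effects one; as `tau2` grows, the weight approaches 1 and the two coincide.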


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Otávio Bartalotti

In regression discontinuity designs (RD), for a given bandwidth, researchers can estimate standard errors based on different variance formulas obtained under different asymptotic frameworks. In the traditional approach the bandwidth shrinks to zero as sample size increases; alternatively, the bandwidth could be treated as fixed. The main theoretical results for RD rely on the former, while most applications in the literature treat the estimates as parametric, implementing the usual heteroskedasticity-robust standard errors. This paper develops the “fixed-bandwidth” alternative asymptotic theory for RD designs, which sheds light on the connection between both approaches. I provide alternative formulas (approximations) for the bias and variance of common RD estimators, and conditions under which both approximations are equivalent. Simulations document the improvements in test coverage that fixed-bandwidth approximations achieve relative to traditional approximations, especially when there is local heteroskedasticity. Feasible estimators of fixed-bandwidth standard errors are easy to implement and are akin to treating RD estimators as locally parametric, validating the common empirical practice of using heteroskedasticity-robust standard errors in RD settings. Bias mitigation approaches are discussed and a novel bootstrap higher-order bias correction procedure based on the fixed bandwidth asymptotics is suggested.


2021 ◽  
Author(s):  
Amanda Justine Lai ◽  
Ramya Ambikapathi ◽  
Oliver Cumming ◽  
Krisna Seng ◽  
Irene Velez ◽  
...  

Background: Inadequate nutrition in early life and exposure to sanitation-related enteric pathogens have been linked to poor growth outcomes in children. Despite rapid development in Cambodia, a high prevalence of growth faltering and stunting persists among children. This study aimed to assess nutrition and water, sanitation, and hygiene (WASH) variables and their association with nutritional status of children under 24 months in rural Cambodia. Methods: We conducted surveys in 491 villages across 55 rural communes in Cambodia in September 2016 to measure associations between child, household, and community-level risk factors for stunting and length-for-age z-score (LAZ). A primary survey measured child-level variables, including anthropometric measures and risk factors for growth faltering and stunting, for 4,036 children under 24 months of age from 3,877 households (approximately 8 households per village). A secondary survey of 5,341 households, including the same households from the primary survey and an additional 1,464 households (approximately 3 additional households per village) from the same villages, assessed village-level WASH variables to understand community WASH conditions that may influence child growth outcomes. For LAZ, we calculated bivariate and adjusted associations (as mean differences) with 95% confidence intervals using generalized estimating equations (GEEs) to fit linear regression models with robust standard errors. For stunting, we calculated unadjusted and adjusted prevalence ratios (PRs) with 95% confidence intervals using GEEs to fit Poisson regression models with robust standard errors. For all models assessing effects of household-level variables, we used GEEs to account for clustering at the village level.
Findings: After adjustment for potential confounders, presence of water and soap at a household's handwashing station was found to be significantly associated (p<0.05) with increased LAZ (adjusted mean difference in LAZ +0.10, 95% CI 0.03 to 0.16), and household use of an improved drinking water source was associated with less stunting in children compared to households that did not use an improved source of drinking water (aPR 0.81, 95% CI 0.66 to 0.98); breastfeeding and community-level access to an improved drinking water source were associated with a lower LAZ score (-0.16, 95% CI -0.27 to -0.05; -0.13, 95% CI -0.26 to 0.00). No other nutrition (i.e., dietary diversity, meal frequency) or sanitation variables (i.e., household's safe disposal of child stools, household-level sanitation, community-level sanitation) were found to be associated with LAZ scores or stunting in children under 24 months of age.

