Estimating group fixed effects in panel data with a binary dependent variable: How the LPM outperforms logistic regression in rare events data

2021 ◽  
Vol 93 ◽  
pp. 102486
Author(s):  
Joan C. Timoneda
2018 ◽  
Author(s):  
Paul D Allison

Standard fixed effects methods presume that effects of variables are symmetric: the effect of increasing a variable is the same as the effect of decreasing that variable but in the opposite direction. This is implausible for many social phenomena. York and Light (2017) showed how to estimate asymmetric models by estimating first-difference regressions in which the difference scores for the predictors are decomposed into positive and negative changes. In this paper, I show that there are several aspects of their method that need improvement. I also develop a data generating model that justifies the first-difference method but can be applied in more general settings. In particular, it can be used to construct asymmetric logistic regression models.


2019 ◽  
Vol 5 ◽  
pp. 237802311982644 ◽  
Author(s):  
Paul D. Allison

Standard fixed-effects methods presume that effects of variables are symmetric: The effect of increasing a variable is the same as the effect of decreasing that variable but in the opposite direction. This is implausible for many social phenomena. York and Light showed how to estimate asymmetric models by estimating first-difference regressions in which the difference scores for the predictors are decomposed into positive and negative changes. In this article, I show that there are several aspects of their method that need improvement. I also develop a data-generating model that justifies the first-difference method but can be applied in more general settings. In particular, it can be used to construct asymmetric logistic regression models.
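
The decomposition step described in this abstract can be illustrated with a minimal sketch. The function name `decompose_changes` and the sign convention (negative changes kept with their sign, so the two parts sum to the difference score) are assumptions made for illustration; the abstract does not state the convention the author uses.

```python
def decompose_changes(series):
    """Split one unit's first differences into positive and negative parts.

    Hypothetical helper: the negative part keeps its sign here, which is an
    assumed convention, so that pos[t] + neg[t] == diffs[t] for every t.
    """
    # First differences between consecutive time points.
    diffs = [later - earlier for earlier, later in zip(series, series[1:])]
    # Positive changes: the difference when it is an increase, else 0.
    pos = [max(d, 0) for d in diffs]
    # Negative changes: the difference when it is a decrease, else 0.
    neg = [min(d, 0) for d in diffs]
    return diffs, pos, neg


# One unit observed at five time points (made-up values).
x = [1, 3, 2, 2, 5]
diffs, pos, neg = decompose_changes(x)
print(diffs, pos, neg)
```

In a first-difference regression, `pos` and `neg` would enter as two separate predictors, which allows increases and decreases to have different coefficients.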


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Marjan Faghih ◽  
Zahra Bagheri ◽  
Dejan Stevanovic ◽  
Seyyed Mohhamad Taghi Ayatollahi ◽  
Peyman Jafari

The logistic regression (LR) model for assessing differential item functioning (DIF) is highly dependent on the asymptotic sampling distributions. However, for rare events data, the maximum likelihood estimation method may be biased and the asymptotic distributions may not be reliable. In this study, the performance of regular maximum likelihood (ML) estimation is compared with two bias-correction methods, weighted logistic regression (WLR) and Firth's penalized maximum likelihood (PML), for assessing DIF in imbalanced or rare events data. The power and type I error rate of the LR model for detecting DIF were investigated under different combinations of sample size, moderate and severe magnitudes of uniform DIF (DIF = 0.4 and 0.8), sample size ratio, number of items, and the degree of imbalance (τ). Under severe imbalance (τ = 0.069), the power of PML and ML was approximately 30% and 24% lower than that of WLR under DIF = 0.4, and 27% and 23% lower under DIF = 0.8, respectively. The present study revealed that WLR outperforms both the ML and PML estimation methods when logistic regression is used to evaluate DIF for imbalanced or rare events data.


2001 ◽  
Vol 9 (2) ◽  
pp. 137-163 ◽  
Author(s):  
Gary King ◽  
Langche Zeng

We study rare events data, binary dependent variables with dozens to thousands of times fewer ones (events, such as wars, vetoes, cases of political activism, or epidemiological infections) than zeros (“nonevents”). In many literatures, these variables have proven difficult to explain and predict, a problem that seems to have at least two sources. First, popular statistical procedures, such as logistic regression, can sharply underestimate the probability of rare events. We recommend corrections that outperform existing methods and change the estimates of absolute and relative risks by as much as some estimated effects reported in the literature. Second, commonly used data collection strategies are grossly inefficient for rare events data. The fear of collecting data with too few events has led to data collections with huge numbers of observations but relatively few, and poorly measured, explanatory variables, such as in international conflict data with more than a quarter-million dyads, only a few of which are at war. As it turns out, more efficient sampling designs exist for making valid inferences, such as sampling all available events (e.g., wars) and a tiny fraction of nonevents (peace). This enables scholars to save as much as 99% of their (nonfixed) data collection costs or to collect much more meaningful explanatory variables. We provide methods that link these two results, enabling both types of corrections to work simultaneously, and software that implements the methods developed.
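
For the sampling design described above (all events plus a small fraction of nonevents), one standard adjustment is prior correction of the logit intercept. The sketch below assumes the population event fraction `tau` is known and that `ybar` is the event fraction in the selected sample. The function names are hypothetical, and the abstract does not say whether this is the exact estimator implemented in the authors' software.

```python
import math


def prior_corrected_intercept(b0, tau, ybar):
    """Correct a logit intercept estimated on an event-enriched sample.

    b0   : intercept estimated on the sample (slopes need no correction)
    tau  : assumed-known fraction of events in the population
    ybar : fraction of events in the sample actually analyzed
    """
    return b0 - math.log(((1 - tau) / tau) * (ybar / (1 - ybar)))


def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


# Made-up numbers: a balanced sample drawn from a population with 1% events.
b0_sample = 0.0
b0_corrected = prior_corrected_intercept(b0_sample, tau=0.01, ybar=0.5)
print(logistic(b0_sample), logistic(b0_corrected))
```

Without the correction, the baseline probability implied by the sample intercept reflects the sampling design rather than the population; after it, the baseline probability in this toy case returns to the population rate.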


2018 ◽  
Vol 16 (1) ◽  
pp. 5
Author(s):  
Dante Mendes Aldrighi ◽  
Fernando Antonio Slaibe Postali ◽  
Maria Dolores Montoya Diaz

The literature has not reached a consensus on the motivation and implications of pyramidal ownership schemes. For some, such arrangements make it easier for controlling shareholders to expropriate outside investors. More recently, some studies have challenged this view and emphasized that their rationale lies in overcoming financial constraints. This paper focuses on whether firms owned through pyramidal schemes are more likely to be listed on the “Novo Mercado,” the Brazilian stock exchange’s premium listing segment created in 2000, which prohibits firms from issuing non-voting shares. We built a dataset of ownership data with annual observations for a panel of firms over the period 2003-2010 by hand-collecting data drawn from reports that firms submit periodically to the Brazilian securities regulator (CVM). Estimating fixed effects non-linear panel data models of a binary dependent variable, we find that firms listed on the Novo Mercado are less likely to be owned through a pyramid arrangement, a result that appears to be consistent with the expropriation view.


2018 ◽  
Vol 8 (1) ◽  
pp. 92-105 ◽  
Author(s):  
Scott J. Cook ◽  
Jude C. Hays ◽  
Robert J. Franzese

Most agree that models of binary time-series-cross-sectional data in political science often possess unobserved unit-level heterogeneity. Despite this, there is no clear consensus on how best to account for these potential unit effects, with many of the issues confronted seemingly misunderstood. For example, one oft-discussed concern with rare events data is the elimination of no-event units from the sample when estimating fixed effects models. Many argue that this is a reason to eschew fixed effects in favor of pooled or random effects models. We revisit this issue and clarify that the main concern with fixed effects models of rare events data is not inaccurate or inefficient coefficient estimation, but instead biased marginal effects. In short, only evaluating event-experiencing units gives an inaccurate estimate of the baseline risk, yielding inaccurate (often inflated) estimates of predictor effects. As a solution, we propose a penalized maximum likelihood fixed effects (PML-FE) estimator, which retains the complete sample by providing finite estimates of the fixed effects for each unit. We explore the small sample performance of PML-FE versus common alternatives via Monte Carlo simulations, evaluating the accuracy of both parameter and effects estimates. Finally, we illustrate our method with a model of civil war onset.
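
Why a penalized likelihood can give a finite effect for a unit with no events can be seen in the simplest special case, an intercept-only logit for a single unit. The sketch below uses the Firth (Jeffreys-prior) penalty, under which the intercept-only estimate has the closed form log((y + 0.5) / (n - y + 0.5)). This is an illustration under that assumption, not the authors' PML-FE estimator, and the function names are hypothetical.

```python
import math


def ml_intercept(y, n):
    """Unpenalized ML logit intercept for y events in n observations."""
    if y == 0:
        return float("-inf")  # no events: the estimate diverges
    if y == n:
        return float("inf")   # all events: the estimate diverges
    return math.log(y / (n - y))


def firth_intercept(y, n):
    """Firth-penalized logit intercept, intercept-only special case.

    Maximizing log-likelihood + 0.5 * log(n * p * (1 - p)) gives
    p = (y + 0.5) / (n + 1), which is finite on the logit scale for any y.
    """
    return math.log((y + 0.5) / (n - y + 0.5))


# A unit observed for 10 periods with no events (made-up numbers).
print(ml_intercept(0, 10), firth_intercept(0, 10))
```

The unpenalized estimate for the no-event unit is negative infinity, which is why such units drop out of conventional fixed effects logit estimation, while the penalized estimate stays finite and the unit can be kept.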

