Modeling Context-Dependent Latent Effect Heterogeneity

2019 ◽  
Vol 28 (1) ◽  
pp. 20-46
Author(s):  
Diogo Ferrari

Classical generalized linear models assume that marginal effects are homogeneous in the population given the observed covariates. Researchers can never be sure a priori whether that assumption is adequate. Recent literature in statistics and political science has proposed models that use Dirichlet process priors to deal with the possibility of latent heterogeneity in the covariate effects. In this paper, we extend and generalize those approaches and propose a hierarchical Dirichlet process of generalized linear models in which the latent heterogeneity can depend on context-level features. Such a model is important in comparative analyses when the data comes from different countries and the latent heterogeneity can be a function of country-level features. We provide a Gibbs sampler for the general model, a special Gibbs sampler for Gaussian outcome variables, and a Hamiltonian Monte Carlo within Gibbs to handle discrete outcome variables. We demonstrate the importance of accounting for latent heterogeneity with a Monte Carlo exercise and with two applications that replicate recent scholarly work. We show how Simpson’s paradox can emerge in the empirical analysis if latent heterogeneity is ignored and how the proposed model can be used to estimate heterogeneity in the effect of covariates.
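The Simpson's paradox mentioned in the abstract above can be illustrated with a toy example. The sketch below is not the paper's hierarchical Dirichlet process model; it assumes two hypothetical latent groups (`group_a`, `group_b`) with made-up data, each having a within-group slope of +1, and shows that a pooled least-squares fit that ignores group membership can recover a negative slope.

```python
# Toy illustration (hypothetical data, not from the paper): ignoring a
# latent grouping can flip the sign of an estimated effect.

def ols_slope(xs, ys):
    """Slope of the simple least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Two latent groups with the same positive effect of x (slope +1)
# but different baselines. Group membership is assumed unobserved.
group_a_x = [0, 1, 2, 3]
group_a_y = [10 + x for x in group_a_x]   # high baseline, low x values
group_b_x = [6, 7, 8, 9]
group_b_y = [0 + x for x in group_b_x]    # low baseline, high x values

slope_a = ols_slope(group_a_x, group_a_y)
slope_b = ols_slope(group_b_x, group_b_y)
slope_pooled = ols_slope(group_a_x + group_b_x, group_a_y + group_b_y)

print("within group A:", slope_a)
print("within group B:", slope_b)
print("pooled, groups ignored:", slope_pooled)
```

Both within-group slopes are positive while the pooled slope is negative, which is the kind of sign reversal a model that allows latent heterogeneity is meant to guard against.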

2006 ◽  
Vol 36 (1) ◽  
pp. 121-133 ◽  
Author(s):  
Esbjörn Ohlsson ◽  
Björn Johansson

Kaas, Dannenburg & Goovaerts (1997) generalized Jewell’s theorem on exact credibility, from the classical Bühlmann model to the (weighted) Bühlmann-Straub model. We extend this result further to the “Bühlmann-Straub model with a priori differences” (Bühlmann & Gisler, 2005). It turns out that exact credibility holds for a class of Tweedie models, including the Poisson, gamma and compound Poisson distributions – the most important distributions for insurance applications of generalized linear models (GLMs). Our results can also be viewed as an alternative to the HGLM approach for combining credibility and GLMs; see Nelder and Verrall (1997).
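For orientation, the standard Bühlmann-Straub credibility estimator that the abstract above builds on can be sketched as follows. This is a minimal illustration of the basic weighted model only, not the paper's extension to a priori differences or Tweedie models. The function name, the argument names, and the numbers are hypothetical, and the structural parameters (collective mean `mu`, expected process variance `sigma2`, variance of hypothetical means `tau2`) are assumed known, whereas in practice they would be estimated from a portfolio.

```python
# Minimal sketch of the basic Bühlmann-Straub credibility premium.
# All names and numbers here are illustrative assumptions.

def credibility_premium(observations, weights, mu, sigma2, tau2):
    """Return (Z, premium) for a single risk.

    observations: observed claim ratios X_j for this risk
    weights:      exposure weights w_j for each observation
    mu:           collective (portfolio) mean
    sigma2:       expected process variance
    tau2:         variance of the hypothetical means
    """
    total_weight = sum(weights)
    # Exposure-weighted mean of the risk's own experience
    weighted_mean = sum(w * x for w, x in zip(weights, observations)) / total_weight
    # Credibility factor: Z = w / (w + sigma2 / tau2)
    z = total_weight / (total_weight + sigma2 / tau2)
    # Credibility premium: blend of own experience and collective mean
    premium = z * weighted_mean + (1.0 - z) * mu
    return z, premium

z, premium = credibility_premium(
    observations=[100.0, 120.0],
    weights=[2.0, 3.0],
    mu=100.0,
    sigma2=50.0,
    tau2=10.0,
)
print("credibility factor:", z)
print("credibility premium:", premium)
```

With these inputs the risk's weighted mean is 112 and the credibility factor is 0.5, so the premium falls halfway between the risk's own experience and the collective mean.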


Author(s):  
Philip Odonkor ◽  
Kemper Lewis

Optimization research on operational strategies of energy use in building clusters has generally marginalized the effects of uncertainty in favor of reduced computational expense. This, however, leads to a significant disconnect between the expected energy cost and the average cost observed under uncertainty. Bridging this divide requires the incorporation of uncertainty analysis, which poses both technical and computational challenges. This paper addresses these challenges through the notion of a Pareto band, demonstrating its applicability towards developing resilient operational strategies in a timely and computationally efficient manner. Under the proposed approach, Monte Carlo simulations are leveraged to reveal an envelope of optimality contained within the energy cost solution space. This optimality envelope, formally introduced as a Pareto band, is then used to train generalized linear models (GLMs), enabling robust operational strategy predictions. The results obtained from this approach highlight significant improvements in energy cost performance under uncertainty.


Safety ◽  
2018 ◽  
Vol 4 (4) ◽  
pp. 57 ◽  
Author(s):  
Fatemeh Davoudi Kakhki ◽  
Steven Freeman ◽  
Gretchen Mosher

Insurance practitioners rely on statistical models to predict future claims in order to provide financial protection. Proper predictive statistical modeling is more challenging when analyzing low-frequency but high-cost claims. The paper investigated the use of predictive generalized linear models (GLMs) to address this challenge. Workers’ compensation claims with costs equal to or more than US$100,000 were analyzed in agribusiness industries in the Midwest of the USA from 2008 to 2016. Predictive GLMs were built with gamma, Weibull, and lognormal distributions using the lasso penalization method. Monte Carlo simulation models were developed to check the performance of predictive models in cost estimation. The results show that the GLM with gamma distribution has the highest predictive power (R² = 0.79). Injury characteristics and worker’s occupation were predictive of large claims’ occurrence and costs. The conclusions of this study are useful in modifying and estimating insurance pricing within high-risk agribusiness industries. The approach of this study can be used as a framework to forecast workers’ compensation claim amounts for rare, high-cost events in other industries. This work is useful for insurance practitioners concerned with statistical and predictive modeling in financial risk analysis.

