prior variance
Recently Published Documents


TOTAL DOCUMENTS: 8 (five years: 4)
H-INDEX: 3 (five years: 0)

2021 ◽  
Author(s):  
Sareh Nabi ◽  
Houssam Nassif ◽  
Joseph Hong ◽  
Hamed Mamani ◽  
Guido Imbens

Adding domain knowledge to a learning system is known to improve results. In multiparameter Bayesian frameworks, such knowledge is incorporated as a prior. On the other hand, the various model parameters can have different learning rates in real-world problems, especially with skewed data. Two often-faced challenges in operations management and management science applications are the absence of informative priors and the inability to control parameter learning rates. In this study, we propose a hierarchical empirical Bayes approach that addresses both challenges and that can generalize to any Bayesian framework. Our method learns empirical meta-priors from the data itself and uses them to decouple the learning rates of first-order and second-order features (or any other given feature grouping) in a generalized linear model. Because the first-order features are likely to have a more pronounced effect on the outcome, focusing on learning first-order weights first is likely to improve performance and convergence time. Our empirical Bayes method clamps features in each group together and uses the deployed model’s observed data to empirically compute a hierarchical prior in hindsight. We report theoretical results for the unbiasedness, strong consistency, and optimal frequentist cumulative regret properties of our meta-prior variance estimator. We apply our method to a standard supervised learning optimization problem, as well as to an online combinatorial optimization problem in a contextual bandit setting implemented in an Amazon production system. In both simulations and live experiments, our method shows marked improvements, especially in cases of low traffic. Our findings are promising because optimizing over sparse data is often a challenge. This paper was accepted by Hamid Nazerzadeh, special issue on data-driven prescriptive analytics.
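As a concrete illustration of the meta-prior idea, here is a minimal two-stage sketch in Python (not the authors' production system): a Bayesian ridge model is first fit with a vague common prior, then each feature group's prior variance is re-estimated empirically from the spread of that group's stage-one weights, so the strongly informative first-order group keeps a wide prior while the weak second-order group is shrunk harder. The group sizes, variance floor, and synthetic data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 3 strong first-order features, 6 weak second-order features.
n, d1, d2 = 500, 3, 6
X = rng.normal(size=(n, d1 + d2))
w_true = np.concatenate([rng.normal(0, 2.0, d1), rng.normal(0, 0.2, d2)])
y = X @ w_true + rng.normal(0, 1.0, n)

def posterior_mean(X, y, prior_var, noise_var=1.0):
    """Bayesian ridge: posterior mean under a diagonal Gaussian prior."""
    A = X.T @ X / noise_var + np.diag(1.0 / prior_var)
    return np.linalg.solve(A, X.T @ y / noise_var)

# Stage 1: fit with a vague common prior to get rough weight estimates.
w_hat = posterior_mean(X, y, prior_var=np.full(d1 + d2, 10.0))

# Stage 2: empirical meta-prior -- one shared variance per feature group,
# estimated from the observed spread of that group's stage-1 weights.
groups = [np.arange(d1), np.arange(d1, d1 + d2)]
prior_var = np.empty(d1 + d2)
for idx in groups:
    prior_var[idx] = max(np.var(w_hat[idx]), 1e-3)  # floor avoids a degenerate prior

# Stage 3: refit; the tighter second-order prior shrinks that group harder.
w_eb = posterior_mean(X, y, prior_var)
print(np.round(prior_var, 3))
```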


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Andrew T. Ching ◽  
Ignatius Horstmann ◽  
Hyunwoo Lim

Abstract In “Marketing Information: A Competitive Analysis” (Sarvary, M., and P. M. Parker. 1997. Marketing Science 16 (1): 24–38; hereafter S&P), the authors argue that in part of the parameter space they considered, a reduction in the price of one information product can lead to an increase in demand for another information product, i.e., information products can be gross complements. This result is surprising and has potentially important marketing implications. We show that S&P obtain this complementarity result by implicitly making the following internally inconsistent assumptions: (i) after purchasing information products, consumers update their beliefs using a Bayesian updating rule that assumes a diffuse initial prior (i.e., their initial prior variance is ∞ before receiving any information); (ii) if consumers choose not to purchase any information product, their initial prior variance is assumed to be 1 (implied by the utility function specification). This internal inconsistency leads to the possibility that, when information products are uncorrelated and their variances are close to 1, marginal utility is increasing in the number of products purchased, and hence information products can be complements in their model. We show that if we remove this internal inconsistency, then in the parameter space considered by S&P, information products cannot be complements because the marginal utility of information products is diminishing. We also show that, in parts of the parameter space not considered by S&P, information products can be complements; this part of the parameter space requires consumers’ initial prior to be relatively precise and information products to be highly correlated (either positively or negatively).
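The inconsistency is easy to see in a standard normal-normal updating sketch (a stylized reading of the argument, not S&P's full model; signal variances are set to 1 and signals are uncorrelated). Pricing the no-purchase case at variance 1 while updating purchases under a diffuse prior makes the variance reduction from the second product exceed that from the first, which is exactly the complementarity pattern; restoring a consistent unit prior makes the marginal reduction strictly diminishing:

```python
import numpy as np

def post_var_inconsistent(k, s2=1.0):
    """S&P's implicit rule: variance 1 with no purchase, but diffuse-prior
    updating (prior variance = infinity) once k >= 1 products are bought."""
    return 1.0 if k == 0 else 1.0 / (k / s2)

def post_var_consistent(k, s2=1.0):
    """Consistent rule: prior variance 1 throughout; precisions add."""
    return 1.0 / (1.0 + k / s2)

for rule in (post_var_inconsistent, post_var_consistent):
    v = [rule(k) for k in range(4)]
    marginal = np.round(np.diff([-x for x in v]), 3)  # variance reduction per product
    print(rule.__name__, marginal)
# inconsistent: [0.    0.5   0.167] -> increasing then decreasing (complements)
# consistent:   [0.5   0.167 0.083] -> strictly diminishing (no complements)
```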


Author(s):  
Olawale B. Akanbi ◽  
Olusanya E. Olubusoye ◽  
Oluwaseun O. Odeyemi

This study examines the sensitivity of the posterior mean to changes in the prior assumptions. Three plausible choices of prior are considered: informative, relatively non-informative, and non-informative. The paper asks how much prior information is needed to cause a notable change in the Bayesian posterior point estimate, and develops a framework for evaluating a bound for a robust posterior point estimate. The Ellipsoid Bound theorem is employed to derive the ellipsoid bound for an independent normal-gamma prior distribution. A modified ellipsoid bound for a large prior is established by varying the size of the prior variance-covariance matrix of the independent normal-gamma prior. This bound represents the range of the posterior mean both when it is insensitive and when it is sensitive, in both location and spread. The results show that, for a large prior parameter value (greater than the OLS estimate) with a positive definite prior variance-covariance matrix, and an interval of prior parameter values that contains the OLS estimate, the posterior estimate will be less than both the OLS and the prior estimates. Similarly, if the lower bound of the range of prior parameter values is greater than the OLS estimate, the posterior estimate will be greater than the OLS estimate but smaller than the prior estimate. Furthermore, no matter the degree of confidence in the prior values, the information in the data is powerful enough to modify the posterior.
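A minimal numerical sketch of the sensitivity being studied (assuming the error variance is held fixed at its known value, so the conditional posterior mean of the coefficients has a closed form; the data and prior settings are illustrative): as the prior variance-covariance matrix is scaled up, the posterior mean moves from the prior mean toward the OLS estimate, matching the paper's observation that the data can override even a confident prior.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
sigma2 = 1.0
y = X @ beta_true + rng.normal(0, np.sqrt(sigma2), n)

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

def posterior_mean(beta0, V0):
    """Conditional posterior mean of beta under an independent normal(-gamma)
    prior, with sigma^2 fixed (the full posterior would need Gibbs sampling)."""
    prec = np.linalg.inv(V0) + X.T @ X / sigma2
    return np.linalg.solve(prec, np.linalg.inv(V0) @ beta0 + X.T @ y / sigma2)

beta0 = beta_ols + 1.0                    # prior mean above the OLS estimate
for scale in (0.01, 1.0, 100.0):          # tight -> vague prior covariance
    print(scale, np.round(posterior_mean(beta0, scale * np.eye(p)), 3))
# As the prior variance grows, the posterior mean moves from the prior mean
# toward the OLS estimate: the data override even a confident prior.
```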


2020 ◽  
Author(s):  
Michael Cai ◽  
Marco Del Negro ◽  
Edward Herbst ◽  
Ethan Matlin ◽  
Reca Sarfati ◽  
...  

Summary This paper illustrates the usefulness of sequential Monte Carlo (SMC) methods in approximating dynamic stochastic general equilibrium (DSGE) model posterior distributions. We show how the tempering schedule can be chosen adaptively, document the accuracy and runtime benefits of generalized data tempering for ‘online’ estimation (that is, re-estimating a model as new data become available), and provide examples of multimodal posteriors that are well captured by SMC methods. We then use the online estimation of the DSGE model to compute pseudo-out-of-sample density forecasts and study the sensitivity of the predictive performance to changes in the prior distribution. We find that making priors less informative (compared with the benchmark priors used in the literature) by increasing the prior variance does not lead to a deterioration of forecast accuracy.
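A toy version of the adaptive tempering step is sketched below (a one-dimensional bimodal likelihood stands in for the DSGE likelihood, the prior is N(0, 3^2), and the ESS target of n/2 and random-walk mutation are illustrative choices, not the authors' implementation). Each tempering increment is chosen by bisection as the largest step that keeps the effective sample size above the target, after which particles are resampled and mutated:

```python
import numpy as np

rng = np.random.default_rng(2)

def log_lik(theta):
    """Toy bimodal log-likelihood standing in for a DSGE likelihood."""
    return np.logaddexp(-0.5 * ((theta - 2) / 0.5) ** 2,
                        -0.5 * ((theta + 2) / 0.5) ** 2)

def ess(lw):
    """Effective sample size of a set of log-weights."""
    w = np.exp(lw - lw.max())
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

n = 2000
theta = rng.normal(0, 3, n)            # draws from the N(0, 3^2) prior
logw = np.zeros(n)
phi = 0.0

while phi < 1.0:
    # Adaptive tempering: bisect for the largest step keeping ESS >= n/2.
    lo, hi = phi, 1.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if ess(logw + (mid - phi) * log_lik(theta)) >= n / 2:
            lo = mid
        else:
            hi = mid
    new_phi = lo if ess(logw + (1.0 - phi) * log_lik(theta)) < n / 2 else 1.0
    logw += (new_phi - phi) * log_lik(theta)
    phi = new_phi
    # Resample, then jitter with a random-walk MH step targeting the
    # current tempered posterior (the "mutation" phase of SMC).
    w = np.exp(logw - logw.max()); w /= w.sum()
    theta = theta[rng.choice(n, n, p=w)]
    logw = np.zeros(n)
    prop = theta + rng.normal(0, 0.5, n)
    logp = lambda t: phi * log_lik(t) - 0.5 * (t / 3) ** 2
    accept = np.log(rng.uniform(size=n)) < logp(prop) - logp(theta)
    theta = np.where(accept, prop, theta)

print(np.round([theta.mean(), theta.std()], 2))  # particles cover both modes
```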


2018 ◽  
Vol 146 (11) ◽  
pp. 3605-3622 ◽  
Author(s):  
Elizabeth A. Satterfield ◽  
Daniel Hodyss ◽  
David D. Kuhl ◽  
Craig H. Bishop

Abstract Because of imperfections in ensemble data assimilation schemes, one cannot assume that the ensemble-derived covariance matrix is equal to the true error covariance matrix. Here, we describe a simple and intuitively compelling method to fit calibration functions of the ensemble sample variance to the mean of the distribution of true error variances, given an ensemble estimate. We demonstrate that the use of such calibration functions is consistent with theory showing that, when sampling error in the prior variance estimate is considered, the gain that minimizes the posterior error variance uses the expected true prior variance, given an ensemble sample variance. Once the calibration function has been fitted, it can be combined with ensemble-based and climatologically based error correlation information to obtain a generalized hybrid error covariance model. When the calibration function is chosen to be a linear function of the ensemble variance, the generalized hybrid error covariance model is the widely used linear hybrid consisting of a weighted sum of a climatological and an ensemble-based forecast error covariance matrix. However, when the calibration function is chosen to be, say, a cubic function of the ensemble sample variance, the generalized hybrid error covariance model is a nonlinear function of the ensemble estimate. We consider idealized univariate data assimilation and multivariate cycling ensemble data assimilation to demonstrate that the generalized hybrid error covariance model closely approximates the optimal weights found through computationally expensive tuning in the linear case and, in the nonlinear case, outperforms any plausible linear model.
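A univariate sketch of the calibration idea (assuming a gamma climatology of true variances and chi-squared sampling noise for an M-member ensemble; the authors fit to the conditional mean of true variance given the ensemble estimate, rather than to a raw polynomial regression): regressing the true variance on polynomial functions of the ensemble sample variance approximates E[true variance | sample variance], and the degree-1 fit recovers the familiar linear hybrid as a weighted sum of an ensemble term and a constant climatological term.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate a climatological distribution of true error variances and, for
# each, an M-member ensemble sample variance (chi-squared sampling noise).
M, n = 10, 20000
true_var = rng.gamma(shape=2.0, scale=0.5, size=n)     # climatology of variances
sample_var = true_var * rng.chisquare(M - 1, n) / (M - 1)

# Fit linear and cubic calibration functions of the ensemble variance to the
# true variance; the least-squares fit approximates E[true var | sample var].
for deg in (1, 3):
    coef = np.polyfit(sample_var, true_var, deg)
    print(deg, np.round(coef, 3))
# The linear fit is the familiar hybrid: calibrated variance =
# a * ensemble variance + b (ensemble term plus climatological constant).

# Using the (cubic) calibrated variance in a scalar analysis step, in place
# of the raw ensemble variance, when forming the gain:
obs_var = 0.5
cal_var = np.polyval(coef, sample_var)
gain = cal_var / (cal_var + obs_var)
print(np.round(gain[:5], 3))
```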


2016 ◽  
Vol 27 (12) ◽  
pp. 1562-1572 ◽  
Author(s):  
Georgie Powell ◽  
Zoe Meredith ◽  
Rebecca McMillin ◽  
Tom C. A. Freeman

According to Bayesian models, perception and cognition depend on the optimal combination of noisy incoming evidence with prior knowledge of the world. Individual differences in perception should therefore be jointly determined by a person’s sensitivity to incoming evidence and his or her prior expectations. It has been proposed that individuals with autism have flatter prior distributions than do nonautistic individuals, which suggests that prior variance is linked to the degree of autistic traits in the general population. We tested this idea by studying how perceived speed changes during pursuit eye movement and at low contrast. We found that individual differences in these two motion phenomena were predicted by differences in thresholds and autistic traits when combined in a quantitative Bayesian model. Our findings therefore support the flatter-prior hypothesis and suggest that individual differences in prior expectations are more systematic than previously thought. In order to be revealed, however, individual differences in sensitivity must also be taken into account.
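The underlying observer model can be sketched with a textbook Gaussian computation (not the authors' fitted model; the slow-speed prior mean of zero and the example noise levels are assumptions). Perceived speed is the precision-weighted average of the noisy measurement and the prior, so a flatter prior pulls estimates less toward slow speeds, while higher sensory noise (e.g., at low contrast) increases the prior's pull:

```python
import numpy as np

def perceived_speed(v_measured, meas_sd, prior_sd, prior_mean=0.0):
    """Posterior mean of a Gaussian observer with a slow-speed prior
    (centered on zero): precision-weighted average of the noisy
    measurement and the prior expectation."""
    w = prior_sd**2 / (prior_sd**2 + meas_sd**2)  # weight on the measurement
    return w * v_measured + (1 - w) * prior_mean

v = 5.0
for prior_sd in (1.0, 10.0):       # tight vs flat slow-speed prior
    print(prior_sd, round(perceived_speed(v, meas_sd=2.0, prior_sd=prior_sd), 2))
# tight prior -> speed strongly underestimated; flat prior -> near-veridical.
```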


2012 ◽  
Vol 32 (13) ◽  
pp. 2221-2234 ◽  
Author(s):  
Jin Zhang ◽  
Thomas M. Braun ◽  
Jeremy M.G. Taylor
