Bayesian Model Weighting: The Many Faces of Model Averaging

Water ◽  
2020 ◽  
Vol 12 (2) ◽  
pp. 309
Author(s):  
Marvin Höge ◽  
Anneli Guthke ◽  
Wolfgang Nowak

Model averaging makes it possible to use multiple models for one modelling task, such as predicting a certain quantity of interest. Several Bayesian approaches exist, all of which yield a weighted average of predictive distributions. However, they are often applied improperly, which can lead to false conclusions. In this study, we focus on Bayesian Model Selection (BMS) and Averaging (BMA), Pseudo-BMS/BMA and Bayesian Stacking. We want to foster their proper use by, first, clarifying their theoretical background and, second, contrasting their behaviours in an applied groundwater modelling task. We show that only Bayesian Stacking has the goal of model averaging for improved predictions by model combination. The other approaches pursue the quest of finding a single best model as the ultimate goal, and use model averaging only as a preliminary stage to prevent rash model choice; improved predictions are thereby not guaranteed. In accordance with the so-called M-settings that clarify the assumed relations between models and truth, we elicit which method is most promising.
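The BMS/BMA weighting contrasted in this abstract can be sketched in a few lines: posterior model probabilities are proportional to each model's marginal likelihood times its prior, and the averaged prediction weights the models accordingly. A minimal sketch; the log-evidence values and predictive means below are made up for illustration:

```python
import numpy as np

# Sketch of BMS/BMA weighting: posterior model probabilities are
# proportional to (marginal likelihood x model prior). All numbers
# here are hypothetical.
log_evidence = np.array([-120.3, -118.7, -125.1])   # log p(D | M_k), assumed
prior = np.array([1.0, 1.0, 1.0]) / 3.0             # uniform model prior

log_post = np.log(prior) + log_evidence
w = np.exp(log_post - log_post.max())               # subtract max for stability
w /= w.sum()                                        # posterior model probabilities

model_means = np.array([2.1, 2.4, 1.9])             # hypothetical predictive means
bma_mean = float(w @ model_means)                   # weighted-average prediction
```

Because the weights depend exponentially on the evidence, they concentrate on one model as data accumulate, which is why BMS/BMA ultimately seeks a single best model rather than a lasting combination.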

2021 ◽  
Author(s):  
Marvin Höge ◽  
Anneli Guthke ◽  
Wolfgang Nowak

In environmental modelling, it is usually the case that multiple models are plausible, e.g. for predicting a certain quantity of interest. Using model rating methods, we typically want to elicit a single best model or the optimal average of these models. However, such methods are often improperly applied, which can lead to false conclusions.

Using three different Bayesian approaches to model selection or averaging as examples (namely 1. Bayesian Model Selection and Averaging (BMS/BMA), 2. Pseudo-BMS/BMA and 3. Bayesian Stacking), we show how very similar-looking methods pursue vastly different goals and lead to deviating results for model selection or averaging.

All three yield a weighted average of predictive distributions. Yet, only Bayesian Stacking has the goal of averaging for improved predictions in the sense of an actual (optimal) model combination. The other approaches pursue the quest of finding a single best model as the ultimate goal, albeit on different premises, and use model averaging only as a preliminary stage to prevent rash model choice.

We want to foster their proper use by, first, clarifying their theoretical background and, second, contrasting their behaviours in an applied groundwater modelling task. Third, we show how the insights gained from these Bayesian methods transfer to other (also non-Bayesian) model rating methods, and we draw general conclusions about multi-model usage based on model weighting.
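For contrast with posterior-probability weighting, Bayesian Stacking chooses simplex weights that maximize an out-of-sample predictive score. A minimal sketch, with a hypothetical matrix of held-out predictive densities:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of the stacking objective: find simplex weights maximizing the
# mean log pointwise predictive density on held-out data. lpd[i, k] holds
# the (hypothetical) predictive density of model k at held-out point i.
lpd = np.array([[0.8, 0.3, 0.5],
                [0.6, 0.7, 0.2],
                [0.9, 0.4, 0.6],
                [0.5, 0.6, 0.3]])

def neg_score(z):
    w = np.exp(z) / np.exp(z).sum()   # softmax keeps weights on the simplex
    return -np.mean(np.log(lpd @ w))

res = minimize(neg_score, np.zeros(3))
w_stack = np.exp(res.x) / np.exp(res.x).sum()   # stacking weights
```

Unlike BMS/BMA weights, these need not concentrate on one model: a genuine combination can score better out of sample than any single member.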


Author(s):  
Giuseppe De Luca ◽  
Jan R. Magnus

In this article, we describe the estimation of linear regression models with uncertainty about the choice of the explanatory variables. We introduce the Stata commands bma and wals, which implement, respectively, the exact Bayesian model-averaging estimator and the weighted-average least-squares estimator developed by Magnus, Powell, and Prüfer (2010, Journal of Econometrics 154: 139–153). Unlike standard pretest estimators that are based on some preliminary diagnostic test, these model-averaging estimators provide a coherent way of making inference on the regression parameters of interest by taking into account the uncertainty due to both the estimation and the model selection steps. Special emphasis is given to several practical issues that users are likely to face in applied work: equivariance to certain transformations of the explanatory variables, stability, accuracy, computing speed, and out-of-memory problems. The performance of the bma and wals commands is illustrated using simulated data and empirical applications from the literature on model-averaging estimation.


2010 ◽  
Vol 138 (1) ◽  
pp. 190-202 ◽  
Author(s):  
Chris Fraley ◽  
Adrian E. Raftery ◽  
Tilmann Gneiting

Abstract Bayesian model averaging (BMA) is a statistical postprocessing technique that generates calibrated and sharp predictive probability density functions (PDFs) from forecast ensembles. It represents the predictive PDF as a weighted average of PDFs centered on the bias-corrected ensemble members, where the weights reflect the relative skill of the individual members over a training period. This work adapts the BMA approach to situations that arise frequently in practice; namely, when one or more of the member forecasts are exchangeable, and when there are missing ensemble members. Exchangeable members differ in random perturbations only, such as the members of bred ensembles, singular vector ensembles, or ensemble Kalman filter systems. Accounting for exchangeability simplifies the BMA approach, in that the BMA weights and the parameters of the component PDFs can be assumed to be equal within each exchangeable group. With these adaptations, BMA can be applied to postprocess multimodel ensembles of any composition. In experiments with surface temperature and quantitative precipitation forecasts from the University of Washington mesoscale ensemble and ensemble Kalman filter systems over the Pacific Northwest, the proposed extensions yield good results. The BMA method is robust to exchangeability assumptions, and the BMA postprocessed combined ensemble shows better verification results than any of the individual, raw, or BMA postprocessed ensemble systems. These results suggest that statistically postprocessed multimodel ensembles can outperform individual ensemble systems, even in cases in which one of the constituent systems is superior to the others.
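The mixture form described in this abstract can be sketched directly: the predictive PDF is a weighted sum of normal densities centered on the bias-corrected members, and members of one exchangeable group share a weight and spread. Every number below is illustrative, not fitted:

```python
import numpy as np

# Sketch of the BMA predictive PDF for a forecast ensemble: a mixture of
# normal densities centered on bias-corrected members. Members 2 and 3 are
# treated as one exchangeable group sharing a weight; all values are
# illustrative.
members = np.array([11.2, 12.0, 12.5])   # bias-corrected member forecasts
weights = np.array([0.5, 0.25, 0.25])    # equal weights within the group
sigma = 1.3                              # common spread parameter

def bma_pdf(y):
    comp = np.exp(-0.5 * ((y - members) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return float(np.dot(weights, comp))

# The mixture should integrate to one; check with a Riemann sum
grid = np.linspace(0.0, 25.0, 2001)
mass = sum(bma_pdf(y) for y in grid) * (grid[1] - grid[0])
```

Tying weights and spreads within exchangeable groups is what reduces the number of free parameters and lets the method postprocess multimodel ensembles of any composition.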


2012 ◽  
Vol 20 (3) ◽  
pp. 271-291 ◽  
Author(s):  
Jacob M. Montgomery ◽  
Florian M. Hollenbach ◽  
Michael D. Ward

We present ensemble Bayesian model averaging (EBMA) and illustrate its ability to aid scholars in the social sciences to make more accurate forecasts of future events. In essence, EBMA improves prediction by pooling information from multiple forecast models to generate ensemble predictions similar to a weighted average of component forecasts. The weight assigned to each forecast is calibrated via its performance in some validation period. The aim is not to choose some “best” model, but rather to incorporate the insights and knowledge implicit in various forecasting efforts via statistical postprocessing. After presenting the method, we show that EBMA increases the accuracy of out-of-sample forecasts relative to component models in three applied examples: predicting the occurrence of insurgencies around the Pacific Rim, forecasting vote shares in U.S. presidential elections, and predicting the votes of U.S. Supreme Court Justices.
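The pooling idea can be sketched minimally: each component forecast is weighted by how well it fit a validation period, and the ensemble forecast is the weighted average. The likelihood-based weights below are a simple stand-in, not the calibration EBMA actually uses:

```python
import numpy as np

# Toy sketch of the EBMA pooling step: component forecast probabilities
# for a binary event are averaged with weights reflecting each model's
# validation-period fit. All numbers are hypothetical.
val_loglik = np.array([-40.2, -38.5, -43.0])   # hypothetical validation fit
w = np.exp(val_loglik - val_loglik.max())      # stable exponentiation
w /= w.sum()                                   # calibration weights

component_probs = np.array([0.7, 0.4, 0.6])    # model forecasts for one event
ensemble_prob = float(w @ component_probs)     # pooled forecast
```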


2007 ◽  
Vol 135 (9) ◽  
pp. 3209-3220 ◽  
Author(s):  
J. Mc Lean Sloughter ◽  
Adrian E. Raftery ◽  
Tilmann Gneiting ◽  
Chris Fraley

Abstract Bayesian model averaging (BMA) is a statistical way of postprocessing forecast ensembles to create predictive probability density functions (PDFs) for weather quantities. It represents the predictive PDF as a weighted average of PDFs centered on the individual bias-corrected forecasts, where the weights are posterior probabilities of the models generating the forecasts and reflect the forecasts’ relative contributions to predictive skill over a training period. It was developed initially for quantities whose PDFs can be approximated by normal distributions, such as temperature and sea level pressure. BMA does not apply in its original form to precipitation, because the predictive PDF of precipitation is nonnormal in two major ways: it has a positive probability of being equal to zero, and it is skewed. In this study BMA is extended to probabilistic quantitative precipitation forecasting. The predictive PDF corresponding to one ensemble member is a mixture of a discrete component at zero and a gamma distribution. Unlike methods that predict the probability of exceeding a threshold, BMA gives a full probability distribution for future precipitation. The method was applied to daily 48-h forecasts of 24-h accumulated precipitation in the North American Pacific Northwest in 2003–04 using the University of Washington mesoscale ensemble. It yielded predictive distributions that were calibrated and sharp. It also gave probability of precipitation forecasts that were much better calibrated than those based on consensus voting of the ensemble members. It gave better estimates of the probability of high-precipitation events than logistic regression on the cube root of the ensemble mean.
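The zero-inflated component described above, for a single ensemble member, can be sketched as a mixture of a point mass at zero and a gamma distribution. The parameters here are illustrative values that would in practice be fitted over a training period:

```python
import numpy as np
from scipy.stats import gamma

# Sketch of one member's predictive distribution for precipitation: a
# discrete point mass at zero mixed with a gamma density for positive
# amounts. p_zero, shape and scale are illustrative, not fitted.
p_zero = 0.4                  # probability of exactly zero precipitation
shape, scale = 1.8, 3.0       # gamma parameters for positive amounts (mm)

def member_cdf(y):
    # P(Y <= y): the spike at zero plus the continuous gamma part
    return p_zero + (1.0 - p_zero) * gamma.cdf(y, a=shape, scale=scale)

# Unlike a threshold model, the full distribution is available, e.g.:
prob_exceed_10mm = 1.0 - member_cdf(10.0)
```

The full BMA predictive distribution is then a weighted average of such member components, which is why it delivers complete probability forecasts rather than single exceedance probabilities.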


2016 ◽  
Vol 30 (15) ◽  
pp. 1541002
Author(s):  
Gianpiero Gervino ◽  
Giovanni Mana ◽  
Carlo Palmisano

In this paper, we consider the problems of identifying the most appropriate model for a given physical system and of assessing the model contribution to the measurement uncertainty. The above problems are studied in terms of Bayesian model selection and model averaging. As the evaluation of the “evidence”, i.e., the integral of Likelihood × Prior over the space of the measurand and the parameters, becomes impracticable when this space has many dimensions, it is necessary to consider an appropriate numerical strategy. Among the many algorithms for calculating the evidence, we have investigated ellipsoidal nested sampling, a technique based on three pillars: the study of the iso-likelihood contour lines of the integrand, a probabilistic estimate of the volume of the parameter space contained within the iso-likelihood contours, and random sampling from hyperellipsoids embedded in the integration variables. This paper lays out the essential ideas of this approach.
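A toy, rejection-based nested-sampling loop (not the ellipsoidal variant the paper studies) illustrates how the evidence is accumulated from a shrinking prior volume. The Gaussian likelihood and all settings are illustrative:

```python
import numpy as np

# Toy nested-sampling estimate of the evidence Z = integral of L(theta)
# over a uniform prior on [0, 1]. Plain rejection sampling replaces the
# worst live point with a prior draw above its likelihood, and Z
# accumulates over the shrinking prior volume X_i ~ exp(-i / n_live).
rng = np.random.default_rng(1)

def loglike(theta):
    return -0.5 * ((theta - 0.5) / 0.1) ** 2   # toy Gaussian likelihood

n_live, n_iter = 200, 1500
live = rng.uniform(size=n_live)
logL = loglike(live)
Z, X_prev = 0.0, 1.0
for i in range(n_iter):
    worst = int(np.argmin(logL))
    X = np.exp(-(i + 1) / n_live)              # deterministic shrinkage estimate
    Z += np.exp(logL[worst]) * (X_prev - X)    # shell volume x likelihood
    X_prev = X
    while True:                                # draw above the current threshold
        t = rng.uniform()
        if loglike(t) > logL[worst]:
            live[worst], logL[worst] = t, loglike(t)
            break
Z += float(np.mean(np.exp(logL))) * X_prev     # remaining live-point mass
# For this likelihood the exact value is about 0.2507
```

The ellipsoidal variant replaces the naive rejection step with draws from hyperellipsoids fitted to the live points, which is what makes the method practical in higher dimensions.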


Author(s):  
Don van den Bergh ◽  
Merlise A. Clyde ◽  
Akash R. Komarlu Narendra Gupta ◽  
Tim de Jong ◽  
Quentin F. Gronau ◽  
...  

Abstract Linear regression analyses commonly involve two consecutive stages of statistical inquiry. In the first stage, a single ‘best’ model is defined by a specific selection of relevant predictors; in the second stage, the regression coefficients of the winning model are used for prediction and for inference concerning the importance of the predictors. However, such second-stage inference ignores the model uncertainty from the first stage, resulting in overconfident parameter estimates that generalize poorly. These drawbacks can be overcome by model averaging, a technique that retains all models for inference, weighting each model’s contribution by its posterior probability. Although conceptually straightforward, model averaging is rarely used in applied research, possibly due to the lack of easily accessible software. To bridge the gap between theory and practice, we provide a tutorial on linear regression using Bayesian model averaging in JASP, based on the BAS package in R. Firstly, we provide theoretical background on linear regression, Bayesian inference, and Bayesian model averaging. Secondly, we demonstrate the method on an example data set from the World Happiness Report. Lastly, we discuss limitations of model averaging and directions for dealing with violations of model assumptions.
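The averaging step described above can be sketched without BAS: fit every non-empty predictor subset, weight each submodel (here by a BIC-based approximation to its posterior probability, a rough stand-in for the exact Bayesian weights), and average the coefficients. The data are simulated:

```python
import numpy as np
from itertools import combinations

# Minimal sketch of model-averaged regression (not the BAS implementation):
# OLS on every non-empty predictor subset, BIC-based approximate posterior
# weights, coefficients averaged across submodels. Simulated data with
# true coefficients (1.0, 0.5, 0.0).
rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

betas, bics = [], []
for k in range(1, p + 1):
    for cols in combinations(range(p), k):
        idx = list(cols)
        beta, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
        rss = float(np.sum((y - X[:, idx] @ beta) ** 2))
        bics.append(n * np.log(rss / n) + k * np.log(n))
        full = np.zeros(p)
        full[idx] = beta
        betas.append(full)

bics = np.array(bics)
w = np.exp(-0.5 * (bics - bics.min()))
w /= w.sum()                      # approximate posterior model probabilities
bma_beta = w @ np.array(betas)    # model-averaged coefficients
```

Because every submodel contributes, the averaged coefficients carry the first-stage model uncertainty instead of conditioning on a single winner.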


2018 ◽  
Vol 10 (8) ◽  
pp. 2801 ◽  
Author(s):  
Krzysztof Drachal

Forecasting commodity prices in rapidly changing markets is a hard problem to tackle. However, being able to determine important price predictors in a time-varying setting is crucial for sustainability initiatives. For example, the 2000s commodities boom raised the question of whether commodities markets have become over-financialized; in the case of agricultural commodities, it was questioned whether speculative pressures increase food prices. Recently, a new Bayesian model combination scheme has been proposed: Dynamic Model Averaging (DMA). This method has already been applied with success in certain markets. First, it joins uncertainty about the model and the explanatory variables with a time-varying parameters approach, and it can capture structural breaks and respond to market disturbances. Secondly, it can deal with numerous explanatory variables in a data-rich environment. Like Bayesian Model Averaging (BMA), Dynamic Model Averaging (DMA), Dynamic Model Selection (DMS) and the Median Probability Model (MED) start from Time-Varying Parameter (TVP) regressions. All of these methods were applied to 69 spot commodities prices over the period between Dec 1983 and Oct 2017. In approximately 80% of cases, according to the Diebold–Mariano test, DMA produced statistically significantly more accurate forecasts than benchmark forecasts (such as the naive method or ARIMA). Moreover, amongst all the considered model types, DMA was the (significantly) most accurate one in 22% of cases. MED most often minimised the forecast errors (28%); however, as clarified in the text, this was due to a specific initial parameter setting. The second ”best” model type was MED, meaning that, in the case of model selection, relying on the highest posterior probability is not always preferable.
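The DMA recursion itself is compact: at each period, model probabilities are flattened by a forgetting factor and then updated by each model's one-step predictive likelihood. A toy sketch with made-up likelihood values:

```python
import numpy as np

# Toy sketch of the DMA weight recursion with forgetting factor alpha:
# each period the model probabilities are flattened (pi ** alpha), then
# updated by every model's predictive likelihood. Likelihood values are
# made up.
alpha = 0.95
pi = np.full(3, 1.0 / 3.0)                 # initial model probabilities
pred_lik = np.array([[0.2, 0.5, 0.3],      # p(y_t | model k) for t = 1..3
                     [0.1, 0.6, 0.4],
                     [0.3, 0.7, 0.2]])

for lik in pred_lik:
    prior = pi ** alpha
    prior /= prior.sum()                   # forgetting step
    pi = prior * lik
    pi /= pi.sum()                         # Bayes update for this period
```

The forgetting step keeps past evidence from dominating, which is how the weights can track structural breaks; DMS would simply pick the argmax of pi at each period.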


2020 ◽  
Vol 12 (24) ◽  
pp. 4009
Author(s):  
Khalil Ur Rahman ◽  
Songhao Shang

Substantial uncertainties are associated with satellite precipitation datasets (SPDs), and these are further amplified over complex terrain and diverse climate regions. The current study develops a regional blended precipitation dataset (RBPD) over Pakistan from selected SPDs in different regions using a dynamic weighted average least squares (WALS) algorithm from 2007 to 2018 with 0.25° spatial resolution and one-day temporal resolution. Several SPDs, including Global Precipitation Measurement (GPM)-based Integrated Multi-Satellite Retrievals for GPM (IMERG), Tropical Rainfall Measurement Mission (TRMM) Multi-Satellite Precipitation Analysis (TMPA) 3B42-v7, Precipitation Estimates from Remotely Sensed Information Using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR), ERA-Interim (a reanalysis dataset), SM2RAIN-CCI, and SM2RAIN-ASCAT, are evaluated to select appropriate blending SPDs in different climate regions. Six statistical indices, including mean bias (MB), mean absolute error (MAE), unbiased root mean square error (ubRMSE), correlation coefficient (R), Kling–Gupta efficiency (KGE), and Theil’s U coefficient, are used to assess the WALS-RBPD performance over 102 rain gauges (RGs) in Pakistan. The results showed that WALS-RBPD assigned higher weights to IMERG in the glacial, humid, and arid regions, while SM2RAIN-ASCAT had higher weights across the hyper-arid region. The average weights of IMERG (SM2RAIN-ASCAT) are 29.03% (23.90%), 30.12% (24.19%), 31.30% (27.84%), and 27.65% (32.02%) across the glacial, humid, arid, and hyper-arid regions, respectively. IMERG dominated the monsoon and pre-monsoon seasons with average weights of 34.87% and 31.70%, while SM2RAIN-ASCAT performed best during the post-monsoon and winter seasons with average weights of 37.03% and 38.69%, respectively.
Spatial evaluation of WALS-RBPD showed relatively poorer performance at high altitudes (glacial and humid regions) and better performance in plain areas (arid and hyper-arid regions). Likewise, temporal assessment showed poorer performance during the intense precipitation seasons (monsoon and pre-monsoon) than during the post-monsoon and winter seasons. Skill scores are used to quantify the improvements of WALS-RBPD against previously developed blended precipitation datasets (BPDs) based on WALS (WALS-BPD), dynamic clustered Bayesian model averaging (DCBA-BPD), and dynamic Bayesian model averaging (DBMA-BPD). On the one hand, skill scores show relatively small improvements of WALS-RBPD against WALS-BPD, where the maximum improvements are observed in glacial (humid) regions, with skill scores of 29.89% (28.69%) in MAE, 27.25% (23.89%) in ubRMSE, and 24.37% (28.95%) in MB. On the other hand, the highest improvements are observed against DBMA-BPD, with average improvements across glacial (humid) regions of 39.74% (36.93%), 38.27% (33.06%), and 39.16% (30.47%) in MB, MAE, and ubRMSE, respectively. Overall, RBPDs are a potential alternative for data-scarce regions and areas with complex topography.
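The skill scores quoted above appear to be percentage reductions of an error metric relative to a reference dataset; assuming that convention, the computation is one line, with the metric values below purely hypothetical:

```python
# Sketch under the assumed convention: skill score as the percentage
# reduction in an error metric (e.g. MAE) relative to a reference dataset.
def skill_score(err_new, err_ref):
    return 100.0 * (err_ref - err_new) / err_ref

# hypothetical MAE values for the blended and reference datasets
improvement = skill_score(err_new=1.2, err_ref=1.7)   # about 29.4%
```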

