Improving Predictions using Ensemble Bayesian Model Averaging

2012 ◽  
Vol 20 (3) ◽  
pp. 271-291 ◽  
Author(s):  
Jacob M. Montgomery ◽  
Florian M. Hollenbach ◽  
Michael D. Ward

We present ensemble Bayesian model averaging (EBMA) and illustrate its ability to aid scholars in the social sciences to make more accurate forecasts of future events. In essence, EBMA improves prediction by pooling information from multiple forecast models to generate ensemble predictions similar to a weighted average of component forecasts. The weight assigned to each forecast is calibrated via its performance in some validation period. The aim is not to choose some “best” model, but rather to incorporate the insights and knowledge implicit in various forecasting efforts via statistical postprocessing. After presenting the method, we show that EBMA increases the accuracy of out-of-sample forecasts relative to component models in three applied examples: predicting the occurrence of insurgencies around the Pacific Rim, forecasting vote shares in U.S. presidential elections, and predicting the votes of U.S. Supreme Court Justices.
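The weighting idea described above can be sketched in a few lines. This is only an illustration, not the authors' implementation: it assumes continuous outcomes, Gaussian component densities with a fixed common standard deviation, and no bias correction. The function names (`fit_bma_weights`, `ensemble_mean`) and the `sd` parameter are hypothetical. Weights are estimated with a simple EM loop over a validation period.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_bma_weights(forecasts, observed, sd=1.0, n_iter=200):
    """Estimate model weights by EM on a validation period.

    forecasts[k][t] is the forecast of model k for validation case t.
    sd is an assumed common spread of each component (held fixed here).
    """
    n_models = len(forecasts)
    weights = [1.0 / n_models] * n_models
    for _ in range(n_iter):
        totals = [0.0] * n_models
        for t, y in enumerate(observed):
            # E-step: responsibility of each model for observation t.
            dens = [weights[k] * normal_pdf(y, forecasts[k][t], sd)
                    for k in range(n_models)]
            norm = sum(dens)
            if norm == 0.0:
                continue  # no model assigns usable density to this case
            for k in range(n_models):
                totals[k] += dens[k] / norm
        total = sum(totals)
        if total == 0.0:
            break  # keep the previous weights if nothing was usable
        # M-step: new weight is the average responsibility.
        weights = [x / total for x in totals]
    return weights


def ensemble_mean(weights, new_forecasts):
    """Weighted average of the component forecasts for a new case."""
    return sum(w * f for w, f in zip(weights, new_forecasts))
```

In this sketch, a model that tracks the validation outcomes closely ends up with most of the weight, but the other models still contribute to the pooled prediction, which matches the stated aim of not selecting a single "best" model.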

2019 ◽  
Vol 220 (2) ◽  
pp. 1368-1378
Author(s):  
M Bertin ◽  
S Marin ◽  
C Millet ◽  
C Berge-Thierry

SUMMARY In low-seismicity areas such as Europe, seismic records do not cover the whole range of variable configurations required for seismic hazard analysis. Usually, a set of empirical models established in such context (the Mediterranean Basin, northeast U.S.A., Japan, etc.) is considered through a logic-tree-based selection process. This approach is mainly based on the scientist’s expertise and ignores the uncertainty in model selection. One important potential consequence of neglecting model uncertainty is that we assign more precision to our inference than is warranted by the data, which leads to overconfident decisions. In this paper, we investigate the Bayesian model averaging (BMA) approach, using nine ground-motion prediction equations (GMPEs) issued from several databases. The BMA method has become an important tool to deal with model uncertainty, especially in empirical settings with a large number of potential models and a relatively limited number of observations. Two numerical techniques for implementing BMA, based on the Markov chain Monte Carlo method and the maximum likelihood estimation approach, are presented and applied to around 1000 records issued from the RESORCE-2013 database. In the example considered, it is shown that BMA provides both a hierarchy of GMPEs and an improved out-of-sample predictive performance.


Author(s):  
Giuseppe De Luca ◽  
Jan R. Magnus

In this article, we describe the estimation of linear regression models with uncertainty about the choice of the explanatory variables. We introduce the Stata commands bma and wals, which implement, respectively, the exact Bayesian model-averaging estimator and the weighted-average least-squares estimator developed by Magnus, Powell, and Prüfer (2010, Journal of Econometrics 154: 139–153). Unlike standard pretest estimators that are based on some preliminary diagnostic test, these model-averaging estimators provide a coherent way of making inference on the regression parameters of interest by taking into account the uncertainty due to both the estimation and the model selection steps. Special emphasis is given to several practical issues that users are likely to face in applied work: equivariance to certain transformations of the explanatory variables, stability, accuracy, computing speed, and out-of-memory problems. Performances of our bma and wals commands are illustrated using simulated data and empirical applications from the literature on model-averaging estimation.


2010 ◽  
Vol 138 (1) ◽  
pp. 190-202 ◽  
Author(s):  
Chris Fraley ◽  
Adrian E. Raftery ◽  
Tilmann Gneiting

Abstract Bayesian model averaging (BMA) is a statistical postprocessing technique that generates calibrated and sharp predictive probability density functions (PDFs) from forecast ensembles. It represents the predictive PDF as a weighted average of PDFs centered on the bias-corrected ensemble members, where the weights reflect the relative skill of the individual members over a training period. This work adapts the BMA approach to situations that arise frequently in practice; namely, when one or more of the member forecasts are exchangeable, and when there are missing ensemble members. Exchangeable members differ in random perturbations only, such as the members of bred ensembles, singular vector ensembles, or ensemble Kalman filter systems. Accounting for exchangeability simplifies the BMA approach, in that the BMA weights and the parameters of the component PDFs can be assumed to be equal within each exchangeable group. With these adaptations, BMA can be applied to postprocess multimodel ensembles of any composition. In experiments with surface temperature and quantitative precipitation forecasts from the University of Washington mesoscale ensemble and ensemble Kalman filter systems over the Pacific Northwest, the proposed extensions yield good results. The BMA method is robust to exchangeability assumptions, and the BMA postprocessed combined ensemble shows better verification results than any of the individual, raw, or BMA postprocessed ensemble systems. These results suggest that statistically postprocessed multimodel ensembles can outperform individual ensemble systems, even in cases in which one of the constituent systems is superior to the others.


2007 ◽  
Vol 135 (9) ◽  
pp. 3209-3220 ◽  
Author(s):  
J. Mc Lean Sloughter ◽  
Adrian E. Raftery ◽  
Tilmann Gneiting ◽  
Chris Fraley

Abstract Bayesian model averaging (BMA) is a statistical way of postprocessing forecast ensembles to create predictive probability density functions (PDFs) for weather quantities. It represents the predictive PDF as a weighted average of PDFs centered on the individual bias-corrected forecasts, where the weights are posterior probabilities of the models generating the forecasts and reflect the forecasts’ relative contributions to predictive skill over a training period. It was developed initially for quantities whose PDFs can be approximated by normal distributions, such as temperature and sea level pressure. BMA does not apply in its original form to precipitation, because the predictive PDF of precipitation is nonnormal in two major ways: it has a positive probability of being equal to zero, and it is skewed. In this study BMA is extended to probabilistic quantitative precipitation forecasting. The predictive PDF corresponding to one ensemble member is a mixture of a discrete component at zero and a gamma distribution. Unlike methods that predict the probability of exceeding a threshold, BMA gives a full probability distribution for future precipitation. The method was applied to daily 48-h forecasts of 24-h accumulated precipitation in the North American Pacific Northwest in 2003–04 using the University of Washington mesoscale ensemble. It yielded predictive distributions that were calibrated and sharp. It also gave probability of precipitation forecasts that were much better calibrated than those based on consensus voting of the ensemble members. It gave better estimates of the probability of high-precipitation events than logistic regression on the cube root of the ensemble mean.
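The mixture structure described above can be written out as follows. The notation is an assumption introduced here for illustration and is not taken from the abstract: $f_k$ is the forecast of ensemble member $k$, $w_k$ its BMA weight, and $g_k$ a gamma density for the precipitation amount given that it is nonzero.

```latex
p(y \mid f_1, \dots, f_K)
  = \sum_{k=1}^{K} w_k \Big[ P(y = 0 \mid f_k)\, \mathbf{1}[y = 0]
  + P(y > 0 \mid f_k)\, g_k(y \mid f_k)\, \mathbf{1}[y > 0] \Big],
  \qquad \sum_{k=1}^{K} w_k = 1
```

The first term carries the positive probability of exactly zero precipitation, and the second term carries the skewed distribution of positive amounts, which are the two departures from normality named in the abstract.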


2011 ◽  
Vol 139 (5) ◽  
pp. 1626-1636 ◽  
Author(s):  
Richard M. Chmielecki ◽  
Adrian E. Raftery

Bayesian model averaging (BMA) is a statistical postprocessing technique that has been used in probabilistic weather forecasting to calibrate forecast ensembles and generate predictive probability density functions (PDFs) for weather quantities. The authors apply BMA to probabilistic visibility forecasting using a predictive PDF that is a mixture of discrete point mass and beta distribution components. Three approaches to developing predictive PDFs for visibility are developed, each using BMA to postprocess an ensemble of visibility forecasts. In the first approach, the ensemble is generated by a translation algorithm that converts predicted concentrations of hydrometeorological variables into visibility. The second approach augments the raw ensemble visibility forecasts with model forecasts of relative humidity and quantitative precipitation. In the third approach, the ensemble members are generated from relative humidity and precipitation alone. These methods are applied to 12-h ensemble forecasts from 2007 to 2008 and are tested against verifying observations recorded at Automated Surface Observing Stations in the Pacific Northwest. Each of the three methods produces predictive PDFs that are calibrated and sharp with respect to both climatology and the raw ensemble.


2007 ◽  
Vol 135 (4) ◽  
pp. 1364-1385 ◽  
Author(s):  
Laurence J. Wilson ◽  
Stephane Beauregard ◽  
Adrian E. Raftery ◽  
Richard Verret

Abstract Bayesian model averaging (BMA) has recently been proposed as a way of correcting underdispersion in ensemble forecasts. BMA is a standard statistical procedure for combining predictive distributions from different sources. The output of BMA is a probability density function (pdf), which is a weighted average of pdfs centered on the bias-corrected forecasts. The BMA weights reflect the relative contributions of the component models to the predictive skill over a training sample. The variance of the BMA pdf is made up of two components, the between-model variance, and the within-model error variance, both estimated from the training sample. This paper describes the results of experiments with BMA to calibrate surface temperature forecasts from the 16-member Canadian ensemble system. Using one year of ensemble forecasts, BMA was applied for different training periods ranging from 25 to 80 days. The method was trained on the most recent forecast period, then applied to the next day’s forecasts as an independent sample. This process was repeated through the year, and forecast quality was evaluated using rank histograms, the continuous rank probability score, and the continuous rank probability skill score. An examination of the BMA weights provided a useful comparative evaluation of the component models, both for the ensemble itself and for the ensemble augmented with the unperturbed control forecast and the higher-resolution deterministic forecast. Training periods around 40 days provided a good calibration of the ensemble dispersion. Both full regression and simple bias-correction methods worked well to correct the bias, except that the full regression failed to completely remove seasonal trend biases in spring and fall. Simple correction of the bias was sufficient to produce positive forecast skill out to 10 days with respect to climatology, which was improved by the BMA. 
The addition of the control forecast and the full-resolution model forecast to the ensemble produced modest improvement in the forecasts for ranges out to about 7 days. Finally, BMA produced significantly narrower 90% prediction intervals compared to a simple Gaussian bias correction, while achieving similar overall accuracy.
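The variance decomposition mentioned in this abstract can be illustrated with a short sketch. It assumes a single within-model error variance shared by all components, which is a simplification chosen here and may differ from the paper's setup. The function and argument names are hypothetical.

```python
def bma_mean_and_variance(weights, corrected_forecasts, within_variance):
    """Mean and variance of a BMA mixture with a common component variance.

    weights: BMA weights, assumed to sum to 1.
    corrected_forecasts: bias-corrected forecasts, one per component.
    within_variance: assumed common within-model error variance.
    """
    mean = sum(w * f for w, f in zip(weights, corrected_forecasts))
    # Between-model variance: weighted spread of forecasts around the BMA mean.
    between = sum(w * (f - mean) ** 2
                  for w, f in zip(weights, corrected_forecasts))
    return mean, between + within_variance
```

For example, two equally weighted forecasts of 10 and 14 with a within-model variance of 1 give a mean of 12 and a total variance of 4 + 1 = 5.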


2020 ◽  
Vol 12 (24) ◽  
pp. 4009
Author(s):  
Khalil Ur Rahman ◽  
Songhao Shang

Substantial uncertainties are associated with satellite precipitation datasets (SPDs), which are further amplified over complex terrain and diverse climate regions. The current study develops a regional blended precipitation dataset (RBPD) over Pakistan from selected SPDs in different regions using a dynamic weighted average least squares (WALS) algorithm from 2007 to 2018 with 0.25° spatial resolution and one-day temporal resolution. Several SPDs, including Global Precipitation Measurement (GPM)-based Integrated Multi-Satellite Retrievals for GPM (IMERG), Tropical Rainfall Measurement Mission (TRMM) Multi-Satellite Precipitation Analysis (TMPA) 3B42-v7, Precipitation Estimates from Remotely Sensed Information Using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR), ERA-Interim (reanalysis dataset), SM2RAIN-CCI, and SM2RAIN-ASCAT are evaluated to select appropriate blending SPDs in different climate regions. Six statistical indices, including mean bias (MB), mean absolute error (MAE), unbiased root mean square error (ubRMSE), correlation coefficient (R), Kling–Gupta efficiency (KGE), and Theil’s U coefficient, are used to assess the WALS-RBPD performance over 102 rain gauges (RGs) in Pakistan. The results showed that WALS-RBPD had assigned higher weights to IMERG in the glacial, humid, and arid regions, while SM2RAIN-ASCAT had higher weights across the hyper-arid region. The average weights of IMERG (SM2RAIN-ASCAT) are 29.03% (23.90%), 30.12% (24.19%), 31.30% (27.84%), and 27.65% (32.02%) across glacial, humid, arid, and hyper-arid regions, respectively. IMERG dominated monsoon and pre-monsoon seasons with average weights of 34.87% and 31.70%, while SM2RAIN-ASCAT depicted high performance during post-monsoon and winter seasons with average weights of 37.03% and 38.69%, respectively. 
Spatial scale evaluation of WALS-RBPD showed relatively poorer performance at high altitudes (glacial and humid regions) and better performance in plain areas (arid and hyper-arid regions). Moreover, temporal scale assessment showed poorer performance during intense precipitation seasons (monsoon and pre-monsoon) than during post-monsoon and winter seasons. Skill scores are used to quantify the improvements of WALS-RBPD against previously developed blended precipitation datasets (BPDs) based on WALS (WALS-BPD), dynamic clustered Bayesian model averaging (DCBA-BPD), and dynamic Bayesian model averaging (DBMA-BPD). On the one hand, skill scores show relatively low improvements of WALS-RBPD against WALS-BPD, where maximum improvements are observed in glacial (humid) regions with skill scores of 29.89% (28.69%) in MAE, 27.25% (23.89%) in ubRMSE, and 24.37% (28.95%) in MB. On the other hand, the highest improvements are observed against DBMA-BPD with average improvements across glacial (humid) regions of 39.74% (36.93%), 38.27% (33.06%), and 39.16% (30.47%) in MB, MAE, and ubRMSE, respectively. The development of RBPDs is recommended as a potential alternative for data-scarce regions and areas with complex topography.
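One of the indices listed in this abstract, the Kling–Gupta efficiency, can be sketched as follows. This uses the commonly cited form built from correlation, variability ratio, and bias ratio. Whether the study used this form or a modified variant is not stated in the abstract, so treat the choice as an assumption. The function name is hypothetical.

```python
import math


def kling_gupta_efficiency(simulated, observed):
    """KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2).

    r: linear correlation, alpha: ratio of standard deviations,
    beta: ratio of means (simulated over observed).
    Assumes non-constant series and a nonzero observed mean.
    """
    n = len(observed)
    mean_s = sum(simulated) / n
    mean_o = sum(observed) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in simulated) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    cov = sum((s - mean_s) * (o - mean_o)
              for s, o in zip(simulated, observed)) / n
    r = cov / (sd_s * sd_o)
    alpha = sd_s / sd_o
    beta = mean_s / mean_o
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2
                           + (beta - 1.0) ** 2)
```

A value of 1 indicates a perfect match with the gauge observations; lower values reflect some combination of weaker correlation, mismatched variability, and bias.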


2021 ◽  
Vol 13 (9) ◽  
pp. 1662
Author(s):  
Khalil Ur Rahman ◽  
Songhao Shang ◽  
Muhammad Zohaib

The current study evaluates the potential of merged satellite precipitation datasets (MSPDs) against rain gauges (RGs) and satellite precipitation datasets (SPDs) in monitoring meteorological drought over Pakistan during 2000–2015. MSPDs evaluated in the current study include Regional Weighted Average Least Square (RWALS), Weighted Average Least Square (WALS), Dynamic Clustered Bayesian Model Averaging (DCBA), and Dynamic Bayesian Model Averaging (DBMA) algorithms, while the SPDs are Global Precipitation Measurement (GPM)-based Integrated Multi-Satellite Retrievals for GPM (IMERG-V06), Tropical Rainfall Measurement Mission (TRMM) Multi-Satellite Precipitation Analysis (TMPA 3B42 V7), Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN), and ERA-Interim (reanalysis dataset). Several standardized precipitation indices (SPIs), including SPI-1, SPI-3, and SPI-12, are used to evaluate the performances of RGs, SPDs, and MSPDs across Pakistan as well as on a regional scale. The Mann–Kendall (MK) test is used to assess the trend of meteorological drought across different climate regions of Pakistan using these SPI indices. Results revealed higher performance of MSPDs than SPDs when compared against RGs for SPI estimates. The seasonal evaluation of SPIs from RGs, MSPDs, and SPDs in a representative drought year (2008) revealed mild to moderate wetness in the monsoon season and mild to moderate drought in the winter season across Pakistan. However, the drought severity ranges from mild to severe in different years across different climate regions. MAPD (mean absolute percentage difference) shows high accuracy (MAPD < 10%) for RWALS-MSPD, good accuracy (10% < MAPD < 20%) for WALS-MSPD and DCBA-MSPD, and good to reasonable accuracy (20% < MAPD < 50%) for DCBA in different climate regions. Furthermore, MSPDs show a drought trend consistent with that of RGs, while SPDs show poor performance. 
Overall, this study demonstrated significantly improved performance of MSPDs in monitoring the meteorological drought.
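The Mann–Kendall trend test mentioned in this abstract can be sketched in its basic form. This version assumes no tied values and no serial-correlation correction. The study's exact variant is not given in the abstract, and the function name is hypothetical.

```python
import math


def mann_kendall(series):
    """Return the Mann-Kendall S statistic and its normal-approximation Z.

    Assumes no ties in the series; Z uses the usual continuity correction.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z
```

Applied to an SPI series, a large positive Z would suggest a trend toward wetter conditions and a large negative Z a trend toward drier conditions, with |Z| > 1.96 corresponding to the conventional 5% two-sided significance level.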

