Summary Verification Measures and Their Interpretation for Ensemble Forecasts

Ensemble prediction systems produce forecasts that represent the probability distribution of a continuous forecast variable. Most often, the verification problem is simplified by transforming the ensemble forecast into probability forecasts for discrete events, where the events are defined by one or more threshold values. Then, skill is evaluated using the mean-square error (MSE; i.e., Brier) skill score for binary events, or the ranked probability skill score (RPSS) for multicategory events. A framework is introduced that generalizes this approach, by describing the forecast quality of ensemble forecasts as a continuous function of the threshold value. Viewing ensemble forecast quality this way leads to the interpretation of the RPSS and the continuous ranked probability skill score (CRPSS) as measures of the weighted-average skill over the threshold values. It also motivates additional measures, derived to summarize other features of a continuous forecast quality function, which can be interpreted as descriptions of the function’s geometric shape. The measures can be computed not only for skill, but also for skill score decompositions, which characterize the resolution, reliability, discrimination, and other aspects of forecast quality. Collectively, they provide convenient metrics for comparing the performance of an ensemble prediction system at different locations, lead times, or issuance times, or for comparing alternative forecasting systems.
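
As a rough illustration of the threshold view described in this abstract, the Python sketch below computes a Brier skill score at each threshold and then averages those scores over a grid of thresholds. The function names, the use of the sample climatology of the observations as the reference forecast, and the choice of weights (proportional to the reference Brier score at each threshold) are assumptions made for this example, not details taken from the paper.

```python
def nonexceedance_probability(members, threshold):
    """Forecast probability of the event {value <= threshold} from ensemble members."""
    return sum(m <= threshold for m in members) / len(members)


def brier_score(ensembles, observations, threshold):
    """Mean-square error of the probability forecasts for one threshold."""
    total = 0.0
    for members, obs in zip(ensembles, observations):
        p = nonexceedance_probability(members, threshold)
        o = 1.0 if obs <= threshold else 0.0
        total += (p - o) ** 2
    return total / len(observations)


def reference_brier_score(observations, threshold):
    """Brier score of a constant forecast equal to the sample climatology (assumed reference)."""
    clim = sum(o <= threshold for o in observations) / len(observations)
    return clim * (1.0 - clim)


def brier_skill_score(ensembles, observations, threshold):
    """Skill relative to the reference; undefined (nan) if the event never or always occurs."""
    ref = reference_brier_score(observations, threshold)
    if ref == 0.0:
        return float("nan")
    return 1.0 - brier_score(ensembles, observations, threshold) / ref


def weighted_average_skill(ensembles, observations, thresholds):
    """Average the threshold-wise skill, weighting each threshold by its reference score."""
    numerator = 0.0
    denominator = 0.0
    for t in thresholds:
        weight = reference_brier_score(observations, t)
        if weight == 0.0:
            continue  # skill is undefined at this threshold, so it gets no weight
        numerator += weight * brier_skill_score(ensembles, observations, t)
        denominator += weight
    return numerator / denominator if denominator > 0.0 else float("nan")
```

With weights chosen this way, the weighted average equals one minus the ratio of the summed forecast and reference Brier scores on the threshold grid, which is a discrete analogue of a skill score built from the continuous ranked probability score.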

Simplifying a hydrological ensemble prediction system with a backward greedy selection of members – Part 1: Optimization criteria

Hydrology and Earth System Sciences Discussions ◽

10.5194/hessd-8-2739-2011 ◽

2011 ◽

Vol 8 (2) ◽

pp. 2739-2782 ◽

Cited By ~ 4

Author(s):

D. Brochero ◽

F. Anctil ◽

C. Gagné

Keyword(s):

Computational Time ◽

Ensemble Prediction ◽

Prediction System ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Weather Forecasts ◽

Criterion Score ◽

Prediction Systems ◽

The Relationship ◽

Selection Of

Abstract. Hydrological Ensemble Prediction Systems (HEPS), obtained by forcing rainfall-runoff models with Meteorological Ensemble Prediction Systems (MEPS), have been recognized as useful approaches to quantify uncertainties of hydrological forecasting systems. This task is complex both in terms of the coupling of information and computational time, which may create an operational barrier. The main objective of the current work is to assess the degree of simplification (reduction of members) of a HEPS configured with 16 lumped hydrological models driven by the 50 weather ensemble forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF). Here, the selection of the most relevant members is proposed using a backward greedy technique with k-fold cross-validation, allowing an optimal use of the information. The methodology draws from a multi-criterion score that represents the combination of resolution, reliability, consistency, and diversity. Results show that the degree of reduction of members can be established in terms of maximum number of members required (complexity of the HEPS) or the maximization of the relationship between the different scores (performance).

Test of a Poor Man’s Ensemble Prediction System for Short-Range Probability Forecasting

Monthly Weather Review ◽

10.1175/mwr2911.1 ◽

2005 ◽

Vol 133 (7) ◽

pp. 1825-1839 ◽

Cited By ~ 36

Author(s):

A. Arribas ◽

K. B. Robertson ◽

K. R. Mylne

Keyword(s):

Short Range ◽

Weather Prediction ◽

Ensemble Prediction ◽

Small Subset ◽

Prediction System ◽

Ensemble Forecasts ◽

Sea Level Pressure ◽

Ensemble Prediction System ◽

To Come ◽

Prediction Systems

Abstract Current operational ensemble prediction systems (EPSs) are designed specifically for medium-range forecasting, but there is also considerable interest in predictability in the short range, particularly for potential severe-weather developments. A possible option is to use a poor man’s ensemble prediction system (PEPS) comprising output from different numerical weather prediction (NWP) centers. By making use of a range of different models and independent analyses, a PEPS provides essentially a random sampling of both the initial condition and model evolution errors. In this paper the authors investigate the ability of a PEPS using up to 14 models from nine operational NWP centers. The ensemble forecasts are verified for a 101-day period and five variables: mean sea level pressure, 500-hPa geopotential height, temperature at 850 hPa, 2-m temperature, and 10-m wind speed. Results are compared with the operational ECMWF EPS, using the ECMWF analysis as the verifying “truth.” It is shown that, despite its smaller size, PEPS is an efficient way of producing ensemble forecasts and can provide competitive performance in the short range. The best relative performance is found to come from hybrid configurations combining output from a small subset of the ECMWF EPS with other different NWP models.

On the Impact of Short-Range Meteorological Forecasts for Ensemble Streamflow Predictions

Journal of Hydrometeorology ◽

10.1175/2008jhm959.1 ◽

2008 ◽

Vol 9 (6) ◽

pp. 1301-1317 ◽

Cited By ~ 53

Author(s):

Guillaume Thirel ◽

Fabienne Rousset-Regimbeau ◽

Eric Martin ◽

Florence Habets

Keyword(s):

Short Range ◽

Ensemble Prediction ◽

Prediction System ◽

Streamflow Prediction ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Ensemble Streamflow Prediction ◽

Prediction Systems ◽

Set Up ◽

The Impact

Abstract Ensemble streamflow prediction systems are emerging in the international scientific community in order to better assess hydrologic threats. Two ensemble streamflow prediction systems (ESPSs) were set up at Météo-France using ensemble forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF) Ensemble Prediction System for the first one, and from the Prévision d’Ensemble Action de Recherche Petite Echelle Grande Echelle (PEARP) ensemble prediction system of Météo-France for the second. This paper evaluates their capacity to better anticipate severe hydrological events and, more generally, estimates the overall quality of both ESPSs. The two ensemble predictions were used as input for the same hydrometeorological model. The skills of both ensemble streamflow prediction systems were evaluated over all of France for the precipitation input and streamflow prediction during a 569-day period and for a 2-day short-range scale. The ensemble streamflow prediction system based on the PEARP data was the best for floods and small basins, and the ensemble streamflow prediction system based on the ECMWF data seemed the best adapted for low flows and large basins.

Decomposition of a New Proper Score for Verification of Ensemble Forecasts

Monthly Weather Review ◽

10.1175/mwr-d-14-00150.1 ◽

2015 ◽

Vol 143 (5) ◽

pp. 1517-1532 ◽

Cited By ~ 3

Author(s):

H. M. Christensen

Keyword(s):

Standard Deviation ◽

Ensemble Forecast ◽

Ensemble Prediction ◽

Brier Score ◽

Continuous Variables ◽

Ensemble Forecasts ◽

Atmospheric Flow ◽

Ensemble Prediction System ◽

Weather Forecasts ◽

Medium Range

Abstract A new proper score, the error-spread score (ES), has recently been proposed for evaluation of ensemble forecasts of continuous variables. The ES is formulated with respect to the moments of the ensemble forecast. It is particularly sensitive to evaluating how well an ensemble forecast represents uncertainty: is the probabilistic forecast well calibrated? In this paper, it is shown that the ES can be decomposed into its reliability, resolution, and uncertainty components in a similar way to the Brier score. The first term evaluates the reliability of the forecast standard deviation and skewness, rewarding systems where the forecast moments reliably indicate the properties of the verification. The second term evaluates the resolution of the forecast standard deviation and skewness, and rewards systems where the forecast moments vary from the climatological moments according to the predictability of the atmospheric flow. The uncertainty term depends only on the observed error distribution and is independent of the forecast standard deviation or skewness. The decomposition was demonstrated using forecasts made with the European Centre for Medium-Range Weather Forecasts ensemble prediction system, and was able to identify the source of the skill in the forecasts at different latitudes.

A Comparison of the ECMWF, MSC, and NCEP Global Ensemble Prediction Systems

Monthly Weather Review ◽

10.1175/mwr2905.1 ◽

2005 ◽

Vol 133 (5) ◽

pp. 1076-1097 ◽

Cited By ~ 406

Author(s):

Roberto Buizza ◽

P. L. Houtekamer ◽

Gerald Pellerin ◽

Zoltan Toth ◽

Yuejian Zhu ◽

...

Keyword(s):

Ensemble Forecasting ◽

Ensemble Prediction ◽

Forecast Errors ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Weather Forecasts ◽

Prediction Systems ◽

Environmental Prediction ◽

Global Systems

Abstract The present paper summarizes the methodologies used at the European Centre for Medium-Range Weather Forecasts (ECMWF), the Meteorological Service of Canada (MSC), and the National Centers for Environmental Prediction (NCEP) to simulate the effect of initial and model uncertainties in ensemble forecasting. The characteristics of the three systems are compared for a 3-month period between May and July 2002. The main conclusions of the study are the following: the performance of ensemble prediction systems strongly depends on the quality of the data assimilation system used to create the unperturbed (best) initial condition and the numerical model used to generate the forecasts; a successful ensemble prediction system should simulate the effect of both initial and model-related uncertainties on forecast errors; and for all three global systems, the spread of ensemble forecasts is insufficient to systematically capture reality, suggesting that none of them is able to simulate all sources of forecast uncertainty. The relative strengths and weaknesses of the three systems identified in this study can offer guidelines for the future development of ensemble forecasting techniques.

Simplifying a hydrological ensemble prediction system with a backward greedy selection of members – Part 1: Optimization criteria

Hydrology and Earth System Sciences ◽

10.5194/hess-15-3307-2011 ◽

2011 ◽

Vol 15 (11) ◽

pp. 3307-3325 ◽

Cited By ~ 11

Author(s):

D. Brochero ◽

F. Anctil ◽

C. Gagné

Keyword(s):

Computational Time ◽

Ensemble Prediction ◽

Hydrological Models ◽

Rainfall Runoff ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Weather Forecasts ◽

Prediction Systems ◽

The Relationship ◽

Selection Of

Abstract. Hydrological Ensemble Prediction Systems (HEPS), obtained by forcing rainfall-runoff models with Meteorological Ensemble Prediction Systems (MEPS), have been recognized as useful approaches to quantify uncertainties of hydrological forecasting systems. This task is complex both in terms of the coupling of information and computational time, which may create an operational barrier. The main objective of the current work is to assess the degree of simplification (reduction of the number of hydrological members) that can be achieved with a HEPS configured using 16 lumped hydrological models driven by the 50 weather ensemble forecasts from the European Centre for Medium-range Weather Forecasts (ECMWF). Here, Backward Greedy Selection (BGS) is proposed to assess the weight that each model must represent within a subset that offers similar or better performance than a reference set of 800 hydrological members. These hydrological models' weights represent the participation of each hydrological model within a simplified HEPS which would issue real-time forecasts in a relatively short computational time. The methodology uses a variation of the k-fold cross-validation, allowing an optimal use of the information, and employs a multi-criterion framework that represents the combination of resolution, reliability, consistency, and diversity. Results show that the degree of reduction of members can be established in terms of maximum number of members required (complexity of the HEPS) or the maximization of the relationship between the different scores (performance).
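
The general idea of backward greedy selection mentioned in this abstract can be sketched as follows: start from the full set of members and repeatedly drop the member whose removal gives the best score, until a target size is reached. The function names and the stand-in criterion (mean absolute error of the mean of the selected members) are hypothetical choices for this example. The paper's multi-criterion score, its member weighting, and its k-fold cross-validation are not reproduced here.

```python
def ensemble_mean_error(selected, forecasts, observations):
    """Stand-in criterion (assumed): mean absolute error of the selected members' mean."""
    total = 0.0
    for t, obs in enumerate(observations):
        mean = sum(forecasts[m][t] for m in selected) / len(selected)
        total += abs(mean - obs)
    return total / len(observations)


def backward_greedy_selection(forecasts, observations, target_size,
                              criterion=ensemble_mean_error):
    """Return the indices of the members kept after greedy backward elimination.

    forecasts[m][t] is the forecast of member m at time step t.
    A lower criterion value is treated as better.
    """
    selected = list(range(len(forecasts)))
    while len(selected) > max(target_size, 1):
        best_subset = None
        best_score = None
        # Try removing each remaining member and keep the best resulting subset.
        for member in selected:
            candidate = [k for k in selected if k != member]
            score = criterion(candidate, forecasts, observations)
            if best_score is None or score < best_score:
                best_subset, best_score = candidate, score
        selected = best_subset
    return selected
```

Each pass evaluates one candidate subset per remaining member, so reducing M members to a handful costs on the order of M squared criterion evaluations. That cost is one reason to keep the criterion cheap in a sketch like this.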

Initial perturbations based on Ensemble Transform Kalman Filter with rescaling method for ensemble forecast

Weather and Forecasting ◽

10.1175/waf-d-20-0176.1 ◽

2021 ◽

Author(s):

Jingzhuo Wang ◽

Jing Chen ◽

Hanbin Zhang ◽

Hua Tian ◽

Yining Shi

Keyword(s):

Growth Rate ◽

Kalman Filter ◽

Weather Forecasting ◽

Initial Perturbation ◽

Ensemble Forecast ◽

Ensemble Prediction ◽

Forecast Errors ◽

Model Uncertainties ◽

Ensemble Prediction System ◽

Probabilistic Forecasts

Abstract Ensemble forecasting is a method to faithfully describe initial and model uncertainties in a weather forecasting system. Initial uncertainties are much more important than model uncertainties in short-range numerical prediction. Currently, initial uncertainties are described by the Ensemble Transform Kalman Filter (ETKF) initial perturbation method in the Global and Regional Assimilation and Prediction Enhanced System-Regional Ensemble Prediction System (GRAPES-REPS). However, the ETKF method of GRAPES-REPS cannot yield an initial perturbation distribution similar to the analysis error. To improve the method, we introduce a regional rescaling factor into the ETKF method (we call it ETKF_R). We also compare the results between the ETKF and ETKF_R methods and further demonstrate how rescaling can affect the initial perturbation characteristics as well as the ensemble forecast skills. The characteristics of the initial ensemble perturbation improve after applying the ETKF_R method. For example, the initial perturbation structures become more reasonable, the perturbations are better able to explain the forecast errors at short lead times, and the lower kinetic energy spectrum and perturbation energy at the initial forecast times can lead to a higher growth rate of the perturbations. Additionally, the ensemble forecast verification results suggest that the ETKF_R method has a better spread-skill relationship, a faster ensemble spread growth rate, and a more reasonable rank histogram distribution than ETKF. Furthermore, the rescaling has only a minor impact on the assessment of the sharpness of probabilistic forecasts. The above results all suggest that ETKF_R can be effectively applied to the operational GRAPES-REPS.
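
One of the verification diagnostics named in this abstract, the rank histogram, can be sketched in a few lines of Python. The function name and the handling of ties (a member equal to the observation is counted as not below it) are assumptions for this example; the abstract does not describe how the authors computed their histograms.

```python
def rank_histogram(ensembles, observations):
    """Count how often the observation falls in each rank among the ensemble members.

    ensembles[i] is the list of member values for case i; all cases are assumed
    to have the same number of members. Bin r counts the cases in which exactly
    r members lie strictly below the observation.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(m < obs for m in members)
        counts[rank] += 1
    return counts
```

A roughly flat histogram is usually read as consistent with adequate ensemble spread, while a U shape suggests that observations fall outside the ensemble too often.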

The skill assessment of ENSO prediction issued by JMA ensemble prediction system and CFSv2

IOP Conference Series Earth and Environmental Science ◽

10.1088/1755-1315/893/1/012047 ◽

2021 ◽

Vol 893 (1) ◽

pp. 012047

Author(s):

R Rahmat ◽

A M Setiawan ◽

Supari

Keyword(s):

Southern Oscillation ◽

Skill Score ◽

Skill Assessment ◽

Japan Meteorological Agency ◽

Ensemble Prediction ◽

In Situ Observation ◽

Prediction System ◽

Ensemble Prediction System ◽

Enso Prediction ◽

The Government

Abstract Indonesia's climate is strongly affected by the El Niño-Southern Oscillation (ENSO), one of its climate drivers. ENSO prediction for the upcoming months or year is crucial for the government when designing further strategic policy. Besides producing its own ENSO prediction, BMKG also regularly releases the ENSO status and predictions collected from other climate centers, such as the Japan Meteorological Agency (JMA) and the National Oceanic and Atmospheric Administration (NOAA). However, the skill of these products is not yet well known. The aim of this study is to conduct a simple assessment of the skill of the JMA Ensemble Prediction System (EPS) and NOAA Climate Forecast System version 2 (CFSv2) ENSO predictions using the World Meteorological Organization (WMO) Standard Verification System for Long Range Forecast (SVS-LRF) method. The two ENSO predictions are also compared with each other using Student's t-test. The ENSO prediction data were obtained from the ENSO JMA and ENSO NCEP forecast archive files, while the observed Nino 3.4 values were calculated from the Centennial in situ Observation-Based Estimates (COBE) Sea Surface Temperature Anomaly (SSTA). The ENSO predictions issued by both JMA and NCEP show good skill at lead times of 1 to 3 months, as indicated by high correlation coefficients and positive values of the Mean Square Skill Score (MSSS). However, the skill of both predictions is significantly reduced for the May-August target months. ENSO predictions issued for this period therefore need careful interpretation.
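
The abstract does not state the formula behind the Mean Square Skill Score. A commonly used definition is MSSS = 1 - MSE_forecast / MSE_climatology, so positive values mean the forecast beats climatology. The sketch below follows that definition and uses the sample mean of the observations as the climatological forecast, which is an assumption for this example; an operational SVS-LRF calculation may use a cross-validated climatology instead. The function name is hypothetical.

```python
def mean_square_skill_score(forecasts, observations):
    """MSSS = 1 - MSE_forecast / MSE_climatology.

    The climatological forecast is taken to be the sample mean of the
    observations (an assumption made for this sketch).
    """
    n = len(observations)
    climatology = sum(observations) / n
    mse_forecast = sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / n
    mse_climatology = sum((climatology - o) ** 2 for o in observations) / n
    if mse_climatology == 0.0:
        return float("nan")  # observations never vary, so the score is undefined
    return 1.0 - mse_forecast / mse_climatology
```

A score of 1 corresponds to a perfect forecast, 0 to a forecast no better than climatology, and negative values to a forecast worse than climatology.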

Forest-Based and Semiparametric Methods for the Postprocessing of Rainfall Ensemble Forecasting

Weather and Forecasting ◽

10.1175/waf-d-18-0149.1 ◽

2019 ◽

Vol 34 (3) ◽

pp. 617-634 ◽

Cited By ~ 8

Author(s):

Maxime Taillardat ◽

Anne-Laure Fougères ◽

Philippe Naveau ◽

Olivier Mestre

Keyword(s):

Heavy Rainfall ◽

Hybrid Methods ◽

Ensemble Prediction ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Wide Range ◽

Selection Step ◽

Heavy Tailed ◽

Ensemble Model Output Statistics ◽

Model Output Statistics

Abstract To satisfy a wide range of end users, rainfall ensemble forecasts have to be skillful for both low precipitation and extreme events. We introduce local statistical postprocessing methods based on quantile regression forests and gradient forests with a semiparametric extension for heavy-tailed distributions. These hybrid methods make use of the forest-based outputs to fit a parametric distribution that is suitable to model jointly low, medium, and heavy rainfall intensities. Our goal is to improve ensemble quality and value for all rainfall intensities. The proposed methods are applied to daily 51-h forecasts of 6-h accumulated precipitation from 2012 to 2015 over France using the Météo-France ensemble prediction system called Prévision d’Ensemble ARPEGE (PEARP). They are verified with a cross-validation strategy and compete favorably with state-of-the-art methods like analog ensemble or ensemble model output statistics. Our methods do not assume any parametric links between the variables to calibrate and possible covariates. They do not require any variable selection step and can make use of more than 60 predictors available such as summary statistics on the raw ensemble, deterministic forecasts of other parameters of interest, or probabilities of convective rainfall. In addition to improvements in overall performance, hybrid forest-based procedures produced the largest skill improvements for forecasting heavy rainfall events.

Postprocessing and Visualization Techniques for Convection-Allowing Ensembles

Bulletin of the American Meteorological Society ◽

10.1175/bams-d-18-0041.1 ◽

2019 ◽

Vol 100 (7) ◽

pp. 1245-1258 ◽

Cited By ~ 11

Author(s):

Brett Roberts ◽

Israel L. Jirak ◽

Adam J. Clark ◽

Steven J. Weiss ◽

John S. Kain

Keyword(s):

Real Time ◽

Data Extraction ◽

Deep Convection ◽

Weather Prediction ◽

Ensemble Forecast ◽

Ensemble Prediction ◽

Data Volume ◽

Explicit Simulation ◽

Prediction Systems ◽

Visualization Techniques

Abstract Since the early 2000s, growing computing resources for numerical weather prediction (NWP) and scientific advances enabled development and testing of experimental, real-time deterministic convection-allowing models (CAMs). By the late 2000s, continued advancements spurred development of CAM ensemble forecast systems, through which a broad range of successful forecasting applications have been demonstrated. This work has prepared the National Weather Service (NWS) for practical usage of the High Resolution Ensemble Forecast (HREF) system, which was implemented operationally in November 2017. Historically, methods for postprocessing and visualizing products from regional and global ensemble prediction systems (e.g., ensemble means and spaghetti plots) have been applied to fields that provide information on mesoscale to synoptic-scale processes. However, much of the value from CAMs is derived from the explicit simulation of deep convection and associated storm-attribute fields like updraft helicity and simulated reflectivity. Thus, fully exploiting CAM ensembles for forecasting applications has required the development of fundamentally new data extraction, postprocessing, and visualization strategies. In the process, challenges imposed by the immense data volume inherent to these systems required new approaches when considering diverse factors like forecaster interpretation and computational expense. In this article, we review the current state of postprocessing and visualization for CAM ensembles, with a particular focus on forecast applications for severe convective hazards that have been evaluated within NOAA’s Hazardous Weather Testbed. The HREF web viewer implemented at the NWS Storm Prediction Center (SPC) is presented as a prototype for deploying these techniques in real time on a flexible and widely accessible platform.
