Beyond univariate calibration: verifying spatial structure in ensembles of forecast fields

2020 ◽  
Vol 27 (3) ◽  
pp. 411-427
Author(s):  
Josh Jacobson ◽  
William Kleiber ◽  
Michael Scheuerer ◽  
Joseph Bellier

Abstract. Most available verification metrics for ensemble forecasts focus on univariate quantities. That is, they assess whether the ensemble provides an adequate representation of the forecast uncertainty about the quantity of interest at a particular location and time. For spatially indexed ensemble forecasts, however, it is also important that forecast fields reproduce the spatial structure of the observed field and represent the uncertainty about spatial properties such as the size of the area for which heavy precipitation, high winds, critical fire weather conditions, etc., are expected. In this article we study the properties of the fraction of threshold exceedance (FTE) histogram, a new diagnostic tool designed for spatially indexed ensemble forecast fields. Defined as the fraction of grid points where a prescribed threshold is exceeded, the FTE is calculated for the verification field and separately for each ensemble member. It yields a projection of a – possibly high-dimensional – multivariate quantity onto a univariate quantity that can be studied with standard tools like verification rank histograms. This projection is appealing since it reflects a spatial property that is intuitive and directly relevant in applications, though it is not obvious whether the FTE is sufficiently sensitive to misrepresentation of spatial structure in the ensemble. In a comprehensive simulation study we find that departures from uniformity of the FTE histograms can indeed be related to forecast ensembles with biased spatial variability and that these histograms detect shortcomings in the spatial structure of ensemble forecast fields that are not obvious by eye. For demonstration, FTE histograms are applied in the context of spatially downscaled ensemble precipitation forecast fields from NOAA's Global Ensemble Forecast System.
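As a rough illustration of the definition above, the following minimal Python sketch computes the FTE of a gridded field and the rank of the verification FTE among the ensemble-member FTEs, which is the quantity a verification rank histogram would tally over many forecast cases. The function names `fte` and `fte_rank` and the nested-list field layout are assumptions made for this example, and ties are handled by a simple strict comparison rather than by any particular scheme from the article.

```python
from typing import Sequence

Field = Sequence[Sequence[float]]  # 2-D grid stored as nested sequences (assumed layout)


def fte(field: Field, threshold: float) -> float:
    """Fraction of grid points whose value exceeds the threshold."""
    values = [v for row in field for v in row]
    return sum(v > threshold for v in values) / len(values)


def fte_rank(verification: Field, ensemble: Sequence[Field], threshold: float) -> int:
    """Rank (1..K+1) of the verification FTE among the K member FTEs.

    Simplification for this sketch: ties are not randomized; only members
    with a strictly smaller FTE push the verification rank upward.
    """
    obs = fte(verification, threshold)
    member_ftes = [fte(member, threshold) for member in ensemble]
    return 1 + sum(m < obs for m in member_ftes)


# Toy 2 x 2 example with three ensemble members and a threshold of 5.
verification = [[0.0, 6.0], [7.0, 1.0]]      # FTE = 0.5
ensemble = [
    [[0.0, 0.0], [6.0, 0.0]],                # FTE = 0.25
    [[6.0, 6.0], [6.0, 0.0]],                # FTE = 0.75
    [[0.0, 0.0], [0.0, 0.0]],                # FTE = 0.0
]
rank = fte_rank(verification, ensemble, threshold=5.0)
```

Repeating the rank computation over many forecast cases and counting how often each rank occurs would give the FTE histogram; a flat histogram is what one would expect if the ensemble represents the exceedance area consistently.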

2020 ◽  
Author(s):  
Joshuah Jacobson ◽  
William Kleiber ◽  
Michael Scheuerer ◽  
Joseph Bellier

Abstract. Most available verification metrics for ensemble forecasts focus on univariate quantities. That is, they assess whether the ensemble provides an adequate representation of the forecast uncertainty about the quantity of interest at a particular location and time. For spatially-indexed ensemble forecasts, however, it is also important that forecast fields reproduce the spatial structure of the observed field, and represent the uncertainty about spatial properties such as the size of the area for which heavy precipitation, high winds, critical fire weather conditions, etc. are expected. In this article we study the properties of a new diagnostic tool designed for spatially-indexed ensemble forecast fields. The metric is based on a level-crossing statistic that we term the fraction of threshold exceedance (FTE), and is calculated for the verification field, and separately for each ensemble member. The FTE yields a projection of a – possibly high-dimensional – multivariate quantity onto a univariate quantity that can be studied with standard tools like verification rank histograms. This projection is appealing since it reflects a spatial property that is intuitive and directly relevant in applications, though it is not obvious whether the FTE is sufficiently sensitive to misrepresentation of spatial structure in the ensemble. In a comprehensive simulation study we find that departures from uniformity of these so-called FTE histograms can indeed be related to forecast ensembles with biased spatial variability, and that these histograms detect shortcomings in the spatial structure of ensemble forecast fields that are not obvious by eye. For demonstration, FTE histograms are applied in the context of spatially downscaled ensemble precipitation forecast fields from NOAA's Global Ensemble Forecast System.


2012 ◽  
Vol 8 (1) ◽  
pp. 53-57
Author(s):  
S. Siegert ◽  
J. Bröcker ◽  
H. Kantz

Abstract. In numerical weather prediction, ensembles are used to retrieve probabilistic forecasts of future weather conditions. We consider events where the verification is smaller than the smallest, or larger than the largest ensemble member of a scalar ensemble forecast. These events are called outliers. In a statistically consistent K-member ensemble, outliers should occur with a base rate of 2/(K+1). In operational ensembles this base rate tends to be higher. We study the predictability of outlier events in terms of the Brier Skill Score and find that forecast probabilities can be calculated which are more skillful than the unconditional base rate. This is shown analytically for statistically consistent ensembles. Using logistic regression, forecast probabilities for outlier events in an operational ensemble are calculated. These probabilities exhibit positive skill which is quantitatively similar to the analytical results. Possible causes of these results as well as their consequences for ensemble interpretation are discussed.
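The base rate of 2/(K+1) follows because, in a statistically consistent ensemble, the verification is equally likely to fall into any of the K+1 rank positions, and two of those positions (below the smallest member and above the largest) count as outliers. The short Python sketch below checks this numerically; it assumes, purely for illustration, that members and verification are independent draws from the same standard normal distribution, and the function names are hypothetical.

```python
import random


def outlier_base_rate(k: int) -> float:
    """Theoretical outlier frequency for a consistent K-member ensemble."""
    return 2.0 / (k + 1)


def simulated_outlier_rate(k: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the outlier frequency.

    Assumption for this sketch: members and verification are i.i.d.
    standard normal draws, which makes the ensemble consistent by design.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        members = [rng.gauss(0.0, 1.0) for _ in range(k)]
        verification = rng.gauss(0.0, 1.0)
        if verification < min(members) or verification > max(members):
            hits += 1
    return hits / trials


theory = outlier_base_rate(9)                    # 2 / 10 for a 9-member ensemble
estimate = simulated_outlier_rate(9, trials=20000)
```

With enough trials the estimate should settle near the theoretical value; an operational ensemble that is underdispersive would show a higher outlier frequency than this.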


2018 ◽  
Vol 146 (12) ◽  
pp. 4079-4098 ◽  
Author(s):  
Thomas M. Hamill ◽  
Michael Scheuerer

Abstract Hamill et al. described a multimodel ensemble precipitation postprocessing algorithm that is used operationally by the U.S. National Weather Service (NWS). This article describes further changes that produce improved, reliable, and skillful probabilistic quantitative precipitation forecasts (PQPFs) for single or multimodel prediction systems. For multimodel systems, final probabilities are produced through the linear combination of PQPFs from the constituent models. The new methodology is applied to each prediction system. Prior to adjustment of the forecasts, parametric cumulative distribution functions (CDFs) of model and analyzed climatologies are generated using the previous 60 days’ forecasts and analyses and supplemental locations. The CDFs, which can be stored with minimal disk space, are then used for quantile mapping to correct state-dependent bias for each member. In this stage, the ensemble is also enlarged using a stencil of forecast values from the 5 × 5 surrounding grid points. Different weights and dressing distributions are assigned to the sorted, quantile-mapped members, with generally larger weights for outlying members and broader dressing distributions for members with heavier precipitation. Probability distributions are generated from the weighted sum of the dressing distributions. The NWS Global Ensemble Forecast System (GEFS), the Canadian Meteorological Centre (CMC) global ensemble, and the European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble forecast data are postprocessed for April–June 2016. Single prediction system postprocessed forecasts are generally reliable and skillful. Multimodel PQPFs are roughly as skillful as the ECMWF system alone. Postprocessed guidance was generally more skillful than guidance using the Gamma distribution approach of Scheuerer and Hamill, with coefficients generated from data pooled across the United States.


2005 ◽  
Vol 20 (4) ◽  
pp. 609-626 ◽  
Author(s):  
Matthew S. Wandishin ◽  
Michael E. Baldwin ◽  
Steven L. Mullen ◽  
John V. Cortinas

Abstract Short-range ensemble forecasting is extended to a critical winter weather problem: forecasting precipitation type. Forecast soundings from the operational NCEP Short-Range Ensemble Forecast system are combined with five precipitation-type algorithms to produce probabilistic forecasts from January through March 2002. Thus the ensemble combines model diversity, initial condition diversity, and postprocessing algorithm diversity. All verification numbers are conditioned on both the ensemble and observations recording some form of precipitation. This separates the forecast of type from the yes–no precipitation forecast. The ensemble is very skillful in forecasting rain and snow but it is only moderately skillful for freezing rain and unskillful for ice pellets. However, even for the unskillful forecasts the ensemble shows some ability to discriminate between the different precipitation types and thus provides some positive value to forecast users. Algorithm diversity is shown to be as important as initial condition diversity in terms of forecast quality, although neither has as big an impact as model diversity. The algorithms have their individual strengths and weaknesses, but no algorithm is clearly better or worse than the others overall.


2020 ◽  
Author(s):  
Martin Widmann ◽  
Michael Angus ◽  
Andrew Orr ◽  
Gregor Leckebusch

It is estimated that around 10% of India’s population (or 130 million people) are acutely exposed to flooding resulting from intense rainfall, particularly during the main monsoon season (June to September). Such severe weather and accompanying flooding can result in considerable disruption to human communities and individuals by causing loss of life, damage to property, loss of livestock, destruction of crops and agricultural land, and deterioration of health conditions owing to waterborne diseases. To provide early warning of these heavy rainfall events, reliable impact-focused forecasting from operational weather forecasting centres such as NCMRWF (National Centre for Medium Range Weather Forecasting) and IMD (Indian Meteorological Department) is crucial.

Yet, despite the advances in numerical weather predictions and the state-of-the-art models used in India, accurately forecasting extreme weather at these scales is still difficult, and the intensity and spatial structure of predicted precipitation can both exhibit large errors. These errors are mainly caused by the limited resolution of weather forecasting models, and the resulting lack of adequate representation of small-scale processes. Some of them can be substantially reduced by statistical postprocessing of the forecasts taking into account past observations. However, currently no postprocessing methods for precipitation are applied to the weather forecasts over India.

Here, we present first results of postprocessing precipitation ensemble forecasts for India with local Quantile Mapping. Given our focus on heavy precipitation and the associated problem of a low number of cases and high sampling variability for the simulated and observed empirical Probability Density Functions (PDFs), we employ both standard, non-parametric PDFs and parametric PDFs based on the Gamma and Generalised Extreme Value distributions.

This work is part of the ‘Weather and Climate Science for Service Partnership India’ (WCSSP-India) project ‘Heavy Precipitation forecast Postprocessing over India (HEPPI)’. Quantile Mapping is a member-by-member postprocessing method that essentially retains the spatial structure of the raw simulation. Within HEPPI we will also test ensemble-based methods and methods that adjust the spatial structure. This work provides the basis for further integration of meteorological and hydrological predictions.
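To make the idea of quantile mapping concrete, here is a minimal non-parametric sketch in Python. It is not the HEPPI implementation: the function name `quantile_map`, the use of plain empirical distributions, and the toy climatologies are all assumptions for illustration. A forecast value is assigned its empirical non-exceedance probability in the model climatology and is then replaced by the observed value at that same probability.

```python
from bisect import bisect_right
from typing import Sequence


def quantile_map(x: float,
                 model_climatology: Sequence[float],
                 observed_climatology: Sequence[float]) -> float:
    """Map a forecast value onto the observed climatology (empirical version).

    Step 1: p = fraction of model-climatology values <= x.
    Step 2: return the smallest observed value whose empirical CDF is >= p.
    Integer arithmetic is used for the index to avoid floating-point rounding.
    """
    model = sorted(model_climatology)
    obs = sorted(observed_climatology)
    count = bisect_right(model, x)                    # number of model values <= x
    # ceil(count * len(obs) / len(model)) via integer floor division, then 0-based
    idx = -((-count * len(obs)) // len(model)) - 1
    idx = min(max(idx, 0), len(obs) - 1)              # clamp values outside the sample
    return obs[idx]


# Toy climatologies: the model is assumed to underestimate amounts by a factor of two.
model_clim = [0.0, 1.0, 2.0, 3.0, 4.0]
obs_clim = [0.0, 2.0, 4.0, 6.0, 8.0]
corrected = quantile_map(2.0, model_clim, obs_clim)
```

Because each member is mapped independently, point by point, the spatial pattern of the raw member is largely preserved, which matches the abstract's remark that the method essentially retains the spatial structure of the raw simulation. With few heavy-precipitation cases the empirical tails become noisy, which is the motivation the abstract gives for also trying parametric distributions.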


2013 ◽  
Vol 20 (5) ◽  
pp. 759-770 ◽  
Author(s):  
T. Haszpra ◽  
I. Lagzi ◽  
T. Tél

Abstract. The dispersion of aerosol particle pollutants is studied using 50 members of an ensemble forecast in the example of a hypothetical free atmospheric emission above Fukushima over a period of 2.5 days. Considerable differences are found among the dispersion predictions of the different ensemble members, as well as between the ensemble mean and the deterministic result at the end of the observation period. The variance is found to decrease with the particle size. The geographical area where a threshold concentration is exceeded in at least one ensemble member expands to a 5–10 times larger region than the area from the deterministic forecast, both for air column "concentration" and in the "deposition" field. We demonstrate that the root-mean-square distance of any particle from its own clones in the ensemble members can reach values on the order of one thousand kilometers. Even the centers of mass of the particle cloud of the ensemble members deviate considerably from that obtained by the deterministic forecast. All these indicate that an investigation of the dispersion of aerosol particles in the spirit of ensemble forecast contains useful hints for the improvement of risk assessment.


2017 ◽  
Vol 32 (4) ◽  
pp. 1569-1583 ◽  
Author(s):  
Adam J. Clark

Abstract Methods for generating ensemble mean precipitation forecasts from convection-allowing model (CAM) ensembles based on a simple average of all members at each grid point can have limited utility because of amplitude reduction and overprediction of light precipitation areas caused by averaging complex spatial fields with strong gradients and high-amplitude features. To combat these issues with the simple ensemble mean, a method known as probability matching is commonly used to replace the ensemble mean amounts with amounts sampled from the distribution of ensemble member forecasts, which results in a field that has a bias approximately equal to the average bias of the ensemble members. Thus, the probability matched mean (PM mean hereafter) is viewed as a better representation of the ensemble members compared to the mean, and previous studies find that it is more skillful than any of the individual members. Herein, using nearly a year’s worth of data from a CAM-based ensemble running in real time at the National Severe Storms Laboratory, evidence is provided that the superior performance of the PM mean is at least partially an artifact of the spatial redistribution of precipitation amounts that occurs when the PM mean is computed over a large domain. Specifically, the PM mean enlarges big areas of heavy precipitation and shrinks or even eliminates smaller ones. An alternative approach for the PM mean is developed that restricts the grid points used to those within a specified radius of influence. The new approach has an improved spatial representation of precipitation and is found to perform more skillfully than the PM mean at large scales when using neighborhood-based verification metrics.
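The domain-wide probability matching described above can be sketched as follows. This is one common formulation and not necessarily the exact procedure used in the article: the spatial pattern comes from the ranks of the simple ensemble mean, while the amounts come from the pooled member values, thinned to one value per grid point. The function name, the flattened-field layout, and the choice of taking the middle value of each block of K pooled values are assumptions of this example.

```python
from typing import List, Sequence


def probability_matched_mean(members: Sequence[Sequence[float]]) -> List[float]:
    """Domain-wide PM mean for K flattened fields of equal length N."""
    k = len(members)
    n = len(members[0])

    # Simple ensemble mean at each grid point: supplies the spatial pattern.
    ens_mean = [sum(member[i] for member in members) / k for i in range(n)]

    # Pool all K*N member values, largest first, and keep one value per
    # block of K (assumed here: the middle of each block), leaving N amounts.
    pooled = sorted((v for member in members for v in member), reverse=True)
    amounts = pooled[k // 2::k]

    # Grid points ordered from largest to smallest ensemble mean.
    order = sorted(range(n), key=lambda i: ens_mean[i], reverse=True)

    # The point with the j-th largest mean receives the j-th largest amount.
    pm = [0.0] * n
    for amount, grid_index in zip(amounts, order):
        pm[grid_index] = amount
    return pm


# Two members on a four-point domain.
members = [
    [0.0, 0.0, 10.0, 2.0],
    [0.0, 4.0, 8.0, 0.0],
]
pm_field = probability_matched_mean(members)
```

Because the ranking is done over the whole domain, amounts can migrate between unrelated precipitation features, which is the redistribution effect the abstract describes. A localized variant would repeat the same steps using only the grid points within a chosen radius of each target point.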


2011 ◽  
Vol 26 (3) ◽  
pp. 319-336 ◽  
Author(s):  
Christopher J. Schaffer ◽  
William A. Gallus ◽  
Moti Segal

Abstract Four new approaches of postprocessing quantitative precipitation forecasts (QPFs) from model ensemble output were used to generate probability of precipitation (POP) tables in order to develop a forecasting method that could outperform a traditional method that relies upon calibration of POP forecasts derived using equal weighting of ensemble members. Early warm season 10-member ensemble output from the NOAA Hazardous Weather Testbed Spring Experiments was used, with 29 cases serving as a training set to create the POP tables and 20 cases used as a test set. The new approaches use QPF–POP relationships based on two properties termed precipitation amount characteristic (PAC) and ensemble member agreement. Exploratory results are presented for 20-km grid spacing and selectively for 4-km grid spacing. In the first approach, POPs were based on a binned PAC and the number of ensemble members with 6-h precipitation accumulations greater than given thresholds. In a second approach, a neighborhood method was used to find the number of points in a given neighborhood area around each of the domain grid points with precipitation amounts greater than a given threshold, while also considering the binned PAC representative of the neighborhood. A third approach synthesized the previous methods and led to an increase in skill relative to the individual methods, and a fourth approach using a combination of methods produced forecasts with even greater skill. All of the forecasts from the four approaches were improved statistically significantly compared to the calibrated traditional method’s forecasts at 20-km grid spacing. The second approach on its own showed skill comparable to that obtained by a traditional calibrated 10-member ensemble, so adopting this approach alone could potentially save computer resources that could then be used for model refinements, only sacrificing the increased skill that could have been obtained by using the fourth approach.
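The counting step of the neighborhood approach can be illustrated with a small Python sketch. It covers only the count of threshold exceedances around each grid point; the binned PAC and the lookup of POP values from the trained tables are not reproduced. The square neighborhood, the function name, and the toy field are assumptions for this example.

```python
from typing import List, Sequence


def neighborhood_exceedance_count(field: Sequence[Sequence[float]],
                                  threshold: float,
                                  radius: int) -> List[List[int]]:
    """For each grid point, count points within a square neighborhood
    (half-width `radius`, clipped at the domain edge) exceeding the threshold."""
    rows, cols = len(field), len(field[0])
    counts = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ii, jj = i + di, j + dj
                    inside = 0 <= ii < rows and 0 <= jj < cols
                    if inside and field[ii][jj] > threshold:
                        counts[i][j] += 1
    return counts


# Toy 3 x 3 accumulation field with two wet points.
qpf = [
    [0.0, 5.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 5.0],
]
counts = neighborhood_exceedance_count(qpf, threshold=1.0, radius=1)
```

In a setup like the one described, such counts would serve as one of the predictors used to look up a POP value, with the training cases supplying the observed frequencies.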


2018 ◽  
Vol 19 (10) ◽  
pp. 1689-1706 ◽  
Author(s):  
Thomas E. Adams III ◽  
Randel Dymond

Abstract This study presents findings from a real-time forecast experiment that compares legacy deterministic hydrologic stage forecasts to ensemble mean and median stage forecasts from the NOAA/NWS Meteorological Model-Based Ensemble Forecast System (MMEFS). The NOAA/NWS Ohio River Forecast Center (OHRFC) area of responsibility defines the experimental region. Real-time forecasts from subbasins at 54 forecast point locations, ranging in drainage area, geographic location within the Ohio River valley, and watershed response time, serve as the basis for analyses. In the experiment, operational hydrologic forecasts, with a 24-h quantitative precipitation forecast (QPF) and forecast temperatures, are compared to MMEFS hydrologic ensemble mean and median forecasts, with model forcings from the NOAA/NWS National Centers for Environmental Prediction (NCEP) North American Ensemble Forecast System (NAEFS), over the period from 30 November 2010 through 24 May 2012. Experiments indicate that MMEFS ensemble mean and median forecasts exhibit lower errors beginning at about lead time 90 h when forecasts at all locations are aggregated. With fast response basins that peak at ≤24 h, ensemble mean and median forecasts exhibit lower errors much earlier, beginning at about lead time 36 h, which suggests the viability of using MMEFS ensemble forecasts as an alternative to OHRFC legacy forecasts. Analyses show that ensemble median forecasts generally exhibit smaller errors than ensemble mean forecasts for all stage ranges. Verification results suggest that OHRFC MMEFS NAEFS ensemble forecasts are reasonable, but needed improvements are identified.


2014 ◽  
Vol 142 (4) ◽  
pp. 1542-1555 ◽  
Author(s):  
Jianguo Liu ◽  
Zhenghui Xie

Abstract Bayesian model averaging (BMA) probability quantitative precipitation forecast (PQPF) models were established by calibrating their parameters using 1–7-day ensemble forecasts of 24-h accumulated precipitation, and observations from 43 meteorological stations in the Huaihe Basin. Forecasts were provided by four single-center (model) ensemble prediction systems (EPSs) and their multicenter (model) grand ensemble systems, which consider exchangeable members (EGE) in The Observing System Research and Predictability Experiment (THORPEX) Interactive Grand Global Ensemble (TIGGE). The four single-center EPSs were from the China Meteorological Administration (CMA), the European Centre for Medium-Range Weather Forecasts (ECMWF), the National Centers for Environmental Prediction (NCEP), and the Met Office (UKMO). Comparisons between the raw ensemble, logistic regression, and BMA for PQPFs suggested that the BMA predictive models performed better than the raw ensemble forecasts and logistic regression. The verification and comparison of five BMA EPSs for PQPFs in the study area showed that the UKMO and ECMWF were a little superior to the NCEP and CMA in general for lead times of 1–7 days for the single-center EPSs. The BMA model for EGE outperformed those for single-center EPSs for all 1–7-day ensemble forecasts, and mostly improved the quality of PQPF. Based on the percentile forecasts from the BMA predictive PDFs for EGE, a heavy-precipitation warning scheme is proposed for the test area.

