Beyond Univariate Calibration: Verifying Spatial Structure in Ensembles of Forecast Fields


2020 ◽  
Vol 27 (3) ◽  
pp. 411-427
Author(s):  
Josh Jacobson ◽  
William Kleiber ◽  
Michael Scheuerer ◽  
Joseph Bellier

Abstract. Most available verification metrics for ensemble forecasts focus on univariate quantities. That is, they assess whether the ensemble provides an adequate representation of the forecast uncertainty about the quantity of interest at a particular location and time. For spatially indexed ensemble forecasts, however, it is also important that forecast fields reproduce the spatial structure of the observed field and represent the uncertainty about spatial properties such as the size of the area for which heavy precipitation, high winds, critical fire weather conditions, etc., are expected. In this article we study the properties of the fraction of threshold exceedance (FTE) histogram, a new diagnostic tool designed for spatially indexed ensemble forecast fields. Defined as the fraction of grid points where a prescribed threshold is exceeded, the FTE is calculated for the verification field and separately for each ensemble member. It yields a projection of a – possibly high-dimensional – multivariate quantity onto a univariate quantity that can be studied with standard tools like verification rank histograms. This projection is appealing since it reflects a spatial property that is intuitive and directly relevant in applications, though it is not obvious whether the FTE is sufficiently sensitive to misrepresentation of spatial structure in the ensemble. In a comprehensive simulation study we find that departures from uniformity of the FTE histograms can indeed be related to forecast ensembles with biased spatial variability and that these histograms detect shortcomings in the spatial structure of ensemble forecast fields that are not obvious by eye. For demonstration, FTE histograms are applied in the context of spatially downscaled ensemble precipitation forecast fields from NOAA's Global Ensemble Forecast System.
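As a concrete illustration of the FTE diagnostic described above, the sketch below computes the fraction of threshold exceedance for a verification field and for each ensemble member, then tallies the verification rank into an FTE histogram. The grid size, threshold, toy Gaussian fields, and random tie-breaking rule are illustrative assumptions, not the authors' exact setup.

```python
import numpy as np

def fte(field, threshold):
    """Fraction of threshold exceedance: share of grid points above threshold."""
    return np.mean(field > threshold)

def fte_rank(obs_field, ens_fields, threshold, rng):
    """Rank (1-based) of the observed FTE within the ensemble FTEs.
    Ties are resolved at random, as is standard for rank histograms."""
    obs_fte = fte(obs_field, threshold)
    ens_fte = np.array([fte(m, threshold) for m in ens_fields])
    below = np.sum(ens_fte < obs_fte)
    ties = np.sum(ens_fte == obs_fte)
    return int(below + rng.integers(0, ties + 1) + 1)

# Toy example: a statistically consistent 10-member ensemble over many cases,
# with member 0 playing the role of the verification field
rng = np.random.default_rng(0)
K, ncases = 10, 2000
ranks = []
for _ in range(ncases):
    fields = rng.normal(size=(K + 1, 16, 16))
    ranks.append(fte_rank(fields[0], fields[1:], threshold=0.5, rng=rng))
hist, _ = np.histogram(ranks, bins=np.arange(0.5, K + 2))
# For a consistent ensemble the FTE histogram should be close to flat;
# biased spatial variability in the members would show up as a slope or U-shape.
```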


2012 ◽  
Vol 8 (1) ◽  
pp. 53-57
Author(s):  
S. Siegert ◽  
J. Bröcker ◽  
H. Kantz

Abstract. In numerical weather prediction, ensembles are used to retrieve probabilistic forecasts of future weather conditions. We consider events where the verification is smaller than the smallest, or larger than the largest ensemble member of a scalar ensemble forecast. These events are called outliers. In a statistically consistent K-member ensemble, outliers should occur with a base rate of 2/(K+1). In operational ensembles this base rate tends to be higher. We study the predictability of outlier events in terms of the Brier Skill Score and find that forecast probabilities can be calculated which are more skillful than the unconditional base rate. This is shown analytically for statistically consistent ensembles. Using logistic regression, forecast probabilities for outlier events in an operational ensemble are calculated. These probabilities exhibit positive skill which is quantitatively similar to the analytical results. Possible causes of these results as well as their consequences for ensemble interpretation are discussed.
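The 2/(K+1) base rate quoted above can be checked with a quick Monte Carlo experiment. The standard-normal toy ensemble below is an assumption for illustration (any distribution shared by verification and members gives the same rate), not the operational ensemble studied in the abstract.

```python
import numpy as np

# Monte Carlo check of the outlier base rate 2/(K+1) for a statistically
# consistent ensemble: verification and members drawn from the same distribution.
rng = np.random.default_rng(42)
K, ncases = 9, 100_000
members = rng.normal(size=(ncases, K))
verif = rng.normal(size=ncases)

# An outlier: verification below the smallest or above the largest member.
outlier = (verif < members.min(axis=1)) | (verif > members.max(axis=1))
rate = outlier.mean()
expected = 2 / (K + 1)   # = 0.2 for K = 9
# rate ≈ 0.2, matching the analytical base rate
```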


2005 ◽  
Vol 20 (4) ◽  
pp. 609-626 ◽  
Author(s):  
Matthew S. Wandishin ◽  
Michael E. Baldwin ◽  
Steven L. Mullen ◽  
John V. Cortinas

Abstract Short-range ensemble forecasting is extended to a critical winter weather problem: forecasting precipitation type. Forecast soundings from the operational NCEP Short-Range Ensemble Forecast system are combined with five precipitation-type algorithms to produce probabilistic forecasts from January through March 2002. Thus the ensemble combines model diversity, initial condition diversity, and postprocessing algorithm diversity. All verification numbers are conditioned on both the ensemble and observations recording some form of precipitation. This separates the forecast of type from the yes–no precipitation forecast. The ensemble is very skillful in forecasting rain and snow but it is only moderately skillful for freezing rain and unskillful for ice pellets. However, even for the unskillful forecasts the ensemble shows some ability to discriminate between the different precipitation types and thus provides some positive value to forecast users. Algorithm diversity is shown to be as important as initial condition diversity in terms of forecast quality, although neither has as big an impact as model diversity. The algorithms have their individual strengths and weaknesses, but no algorithm is clearly better or worse than the others overall.
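A minimal sketch of how probabilistic precipitation-type forecasts can be formed from the member-algorithm combinations described above: each ensemble sounding run through each algorithm casts one categorical vote, and the forecast probability of a type is its vote share. The function name, type labels, and toy votes are illustrative assumptions, not the operational NCEP setup.

```python
import numpy as np

TYPES = ["rain", "snow", "freezing_rain", "ice_pellets"]

def type_probabilities(votes):
    """votes: 2-D list/array of type labels, shape (n_members, n_algorithms).
    Each member-algorithm pair casts one categorical vote; the forecast
    probability of each type is its share of all votes."""
    flat = np.asarray(votes).ravel()
    return {t: float(np.mean(flat == t)) for t in TYPES}

# Toy example: 5 ensemble members x 3 precipitation-type algorithms
votes = [["snow", "snow", "rain"],
         ["snow", "snow", "snow"],
         ["snow", "rain", "snow"],
         ["snow", "snow", "snow"],
         ["freezing_rain", "snow", "snow"]]
probs = type_probabilities(votes)
# probs["snow"] = 12/15 = 0.8
```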


2020 ◽  
Author(s):  
Martin Widmann ◽  
Michael Angus ◽  
Andrew Orr ◽  
Gregor Leckebusch

<p>It is estimated that around 10% of India’s population (or 130 million people) are acutely exposed to flooding resulting from intense rainfall, particularly during the main monsoon season (June to September). Such severe weather and accompanying flooding can result in considerable disruption to human communities and individuals by causing loss of life, damage to property, loss of livestock, destruction of crops and agricultural land, and deterioration of health conditions owing to waterborne diseases. To provide early warning of these heavy rainfall events, reliable impact-focused forecasting from operational weather forecasting centres such as NCMRWF (National Centre for Medium Range Weather Forecasting) and IMD (India Meteorological Department) is crucial.</p><p>Yet, despite the advances in numerical weather prediction and the state-of-the-art models used in India, accurately forecasting extreme weather at these scales is still difficult, and the intensity and spatial structure of predicted precipitation can both exhibit large errors. These errors are mainly caused by the limited resolution of weather forecasting models and the resulting lack of adequate representation of small-scale processes. Some of these errors can be substantially reduced by statistical postprocessing of the forecasts that takes past observations into account. However, no postprocessing methods for precipitation are currently applied to the weather forecasts over India.</p><p>Here, we present first results of postprocessing precipitation ensemble forecasts for India with local Quantile Mapping. Given our focus on heavy precipitation and the associated problem of a low number of cases and high sampling variability for the simulated and observed empirical Probability Density Functions (PDFs), we employ both standard non-parametric PDFs and parametric PDFs based on the Gamma and Generalised Extreme Value distributions.</p><p>This work is part of the ‘Weather and Climate Science for Service Partnership India’ (WCSSP-India) project ‘Heavy Precipitation forecast Postprocessing over India (HEPPI)’. Quantile Mapping is a member-by-member postprocessing method that essentially retains the spatial structure of the raw simulation. Within HEPPI we will also test ensemble-based methods and methods that adjust the spatial structure. This work provides the basis for further integration of meteorological and hydrological predictions.</p>
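A minimal sketch of the non-parametric variant of quantile mapping described above, assuming empirical climatologies of forecasts and observations at one location: each forecast value is replaced by the observed-climatology quantile at the probability level the value holds in the forecast climatology. The parametric Gamma/GEV variants would replace the empirical quantile functions with fitted distributions. All names and toy data are illustrative.

```python
import numpy as np

def quantile_map(forecast, fcst_clim, obs_clim):
    """Empirical quantile mapping: map each forecast value to the observed
    quantile at the same probability level it holds in the forecast climatology."""
    fcst_clim = np.sort(fcst_clim)
    obs_clim = np.sort(obs_clim)
    # Probability level of each forecast value within the forecast climatology
    p = np.searchsorted(fcst_clim, forecast, side="right") / len(fcst_clim)
    p = np.clip(p, 1e-6, 1 - 1e-6)
    # Map back through the observed empirical quantile function
    return np.quantile(obs_clim, p)

# Toy example: the model is wet-biased by a factor of 2
rng = np.random.default_rng(1)
obs_clim = rng.gamma(shape=2.0, scale=1.0, size=5000)
fcst_clim = 2.0 * rng.gamma(shape=2.0, scale=1.0, size=5000)
corrected = quantile_map(np.array([4.0]), fcst_clim, obs_clim)
# corrected ≈ 2.0, i.e. the wet bias is removed
```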


2013 ◽  
Vol 20 (5) ◽  
pp. 759-770 ◽  
Author(s):  
T. Haszpra ◽  
I. Lagzi ◽  
T. Tél

Abstract. The dispersion of aerosol particle pollutants is studied using 50 members of an ensemble forecast for the example of a hypothetical free-atmospheric emission above Fukushima over a period of 2.5 days. Considerable differences are found among the dispersion predictions of the different ensemble members, as well as between the ensemble mean and the deterministic result at the end of the observation period. The variance is found to decrease with particle size. The geographical area where a threshold concentration is exceeded in at least one ensemble member expands to a region 5–10 times larger than the area from the deterministic forecast, both for the air column "concentration" and for the "deposition" field. We demonstrate that the root-mean-square distance of any particle from its own clones in the ensemble members can reach values on the order of one thousand kilometers. Even the centers of mass of the particle clouds of the ensemble members deviate considerably from that obtained by the deterministic forecast. All this indicates that investigating the dispersion of aerosol particles in the spirit of ensemble forecasting provides useful guidance for improving risk assessment.
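The root-mean-square spread of a particle's clones quoted above can be computed from member-wise particle positions. The sketch below measures RMS distance from the clones' center of mass (one common convention, assumed here) on toy data; the positions, units, and spread are illustrative, not the Fukushima simulation results.

```python
import numpy as np

def rms_clone_distance(positions):
    """positions: (n_members, 2) array of (x, y) locations (km) of the same
    particle as advected by each ensemble member. Returns the RMS distance
    of the clones from their common center of mass."""
    com = positions.mean(axis=0)
    return float(np.sqrt(np.mean(np.sum((positions - com) ** 2, axis=1))))

# Toy example: 50 clones scattered with ~700 km standard deviation per axis
rng = np.random.default_rng(7)
positions = rng.normal(loc=[1000.0, 500.0], scale=700.0, size=(50, 2))
spread = rms_clone_distance(positions)
# spread on the order of one thousand kilometers, as in the abstract
```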


2018 ◽  
Vol 19 (10) ◽  
pp. 1689-1706 ◽  
Author(s):  
Thomas E. Adams III ◽  
Randel Dymond

Abstract This study presents findings from a real-time forecast experiment that compares legacy deterministic hydrologic stage forecasts to ensemble mean and median stage forecasts from the NOAA/NWS Meteorological Model-Based Ensemble Forecast System (MMEFS). The NOAA/NWS Ohio River Forecast Center (OHRFC) area of responsibility defines the experimental region. Real-time forecasts from subbasins at 54 forecast point locations, ranging in drainage area, geographic location within the Ohio River valley, and watershed response time serve as the basis for analyses. In the experiment, operational hydrologic forecasts, with a 24-h quantitative precipitation forecast (QPF) and forecast temperatures, are compared to MMEFS hydrologic ensemble mean and median forecasts, with model forcings from the NOAA/NWS National Centers for Environmental Prediction (NCEP) North American Ensemble Forecast System (NAEFS), over the period from 30 November 2010 through 24 May 2012. Experiments indicate that MMEFS ensemble mean and median forecasts exhibit lower errors beginning at about lead time 90 h when forecasts at all locations are aggregated. With fast response basins that peak at ≤24 h, ensemble mean and median forecasts exhibit lower errors much earlier, beginning at about lead time 36 h, which suggests the viability of using MMEFS ensemble forecasts as an alternative to OHRFC legacy forecasts. Analyses show that ensemble median forecasts generally exhibit smaller errors than ensemble mean forecasts for all stage ranges. Verification results suggest that OHRFC MMEFS NAEFS ensemble forecasts are reasonable, but needed improvements are identified.


2014 ◽  
Vol 142 (4) ◽  
pp. 1542-1555 ◽  
Author(s):  
Jianguo Liu ◽  
Zhenghui Xie

Abstract Bayesian model averaging (BMA) probabilistic quantitative precipitation forecast (PQPF) models were established by calibrating their parameters using 1–7-day ensemble forecasts of 24-h accumulated precipitation and observations from 43 meteorological stations in the Huaihe Basin. Forecasts were provided by four single-center (model) ensemble prediction systems (EPSs) and their multicenter (model) grand ensemble system with exchangeable members (EGE) in The Observing System Research and Predictability Experiment (THORPEX) Interactive Grand Global Ensemble (TIGGE). The four single-center EPSs were from the China Meteorological Administration (CMA), the European Centre for Medium-Range Weather Forecasts (ECMWF), the National Centers for Environmental Prediction (NCEP), and the Met Office (UKMO). Comparisons between the raw ensemble, logistic regression, and BMA for PQPFs suggested that the BMA predictive models performed better than the raw ensemble forecasts and logistic regression. The verification and comparison of the five BMA EPSs for PQPFs in the study area showed that, among the single-center EPSs, UKMO and ECMWF were generally somewhat superior to NCEP and CMA for lead times of 1–7 days. The BMA model for EGE outperformed those for the single-center EPSs for all 1–7-day ensemble forecasts and mostly improved the quality of the PQPF. Based on the percentile forecasts from the BMA predictive PDFs for EGE, a heavy-precipitation warning scheme is proposed for the test area.
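A BMA predictive distribution is a weighted mixture of per-member component PDFs, from which percentile forecasts and exceedance probabilities (as used in the warning scheme above) can be read off. The sketch below samples from such a mixture with fixed illustrative weights and simple gamma components; real precipitation BMA typically adds a point mass at zero and fits weights and component parameters by maximum likelihood, so every number here is an assumption.

```python
import numpy as np

def bma_sample(weights, member_means, size, rng):
    """Draw samples from a BMA predictive mixture. Each member k contributes
    a gamma component (shape fixed at 2 purely for illustration) whose mean
    tracks the member's forecast; 'weights' are the BMA weights."""
    weights = np.asarray(weights) / np.sum(weights)
    k = rng.choice(len(weights), size=size, p=weights)        # pick a component
    means = np.asarray(member_means)[k]
    shape = 2.0
    return rng.gamma(shape, means / shape, size=size)         # scale = mean/shape

# Toy example: 4 members with illustrative (not fitted) BMA weights
rng = np.random.default_rng(3)
samples = bma_sample([0.4, 0.3, 0.2, 0.1], [10.0, 8.0, 15.0, 5.0], 100_000, rng)
p90 = np.quantile(samples, 0.9)        # percentile forecast from the BMA PDF
prob_heavy = np.mean(samples > 25.0)   # heavy-precipitation exceedance probability
# p90 ≈ 20 for these toy settings
```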


2021 ◽  
Author(s):  
Marion Mittermaier ◽  
Seonaid Anderson ◽  
Ric Crocker ◽  
Steven Cole ◽  
Robert Moore ◽  
...  

<p>Forecasting the potential for flood-producing precipitation and any subsequent flooding is a challenging task; the process is highly non-linear and inherently uncertain. Acknowledging and accounting for the uncertainty in precipitation and flood forecasts has become increasingly important with the move to risk-based warning and guidance services, which combine the likelihood of flooding with the potential impact on society and the environment.</p><p>A standard approach to accounting for uncertainty is to generate ensemble forecasts. Here the national Grid-to-Grid (G2G) model is coupled to a Best Medium Range (BMR) ensemble which consists of three models spanning different time horizons: an ensemble nowcast for the first 6 h, which is blended with the short-range 2.2 km Met Office Global and Regional Ensemble Prediction System (MOGREPS-UK) up to 36 h and the ~20 km global MOGREPS-G up to day 6. The G2G model is driven by 15-minute accumulations on a 1 km grid.</p><p>Sixteen months of precipitation and river flow ensemble forecasts have been processed to develop and assess a joint verification framework which can facilitate the evaluation of the end-to-end forecasting chain. The analysis concluded the following: (1) Daily precipitation accumulations provide the best guidance in terms of rain volume for hydrological impacts, perhaps because they remove the impact of timing errors at the sub-daily scale; sub-daily precipitation can, however, be more closely related to river flow on an ensemble member-by-member basis. (2) Observation uncertainty is important: the same forecasts verified against three different observed precipitation sources (rain gauge, radar, or merged) can yield markedly different results and interpretations, and G2G river flow performance can likewise be affected when driven by these datasets rather than forecasts. (3) The change in precipitation intensity with model is evident and has an impact on downstream modelling and verification. (4) The period used for ensemble verification should be at least two years: the 16-month test period was sufficient for generating enough precipitation threshold exceedances for the 95th percentile, but insufficient for higher thresholds and for river flow thresholds above half the median annual maximum flood at sub-regional scales. (5) A new method of presenting Time-Window Probabilities (TWPs) has been developed for precipitation thresholds that are hydrologically relevant; verification shows that these probabilities are larger and more reliable, so users can have greater confidence in them. (6) Overall precipitation forecast skill was far more uniform than for river flow, primarily because the atmosphere is a continuum whilst catchments are finite and subject to external, non-atmospheric factors such as antecedent moisture. (7) Though G2G can be sensitive to precipitation outliers, the precipitation ensemble is generally under-spread, and this spread does not appear to amplify or propagate to enhance the river flow ensemble spread, so spread is reduced rather than increased in the downstream application.</p>
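One plausible reading of a Time-Window Probability, sketched below on toy data: the fraction of ensemble members whose precipitation exceeds a threshold at least once within a sliding time window. The operational TWP definition may differ; the function name, threshold, and numbers here are assumptions for illustration only.

```python
import numpy as np

def time_window_prob(precip, threshold, window):
    """precip: (n_members, n_times) array of accumulations. For each window
    start, return the fraction of members whose precipitation exceeds
    'threshold' at least once within the next 'window' time steps."""
    n_members, n_times = precip.shape
    exceed = precip > threshold
    probs = []
    for t0 in range(n_times - window + 1):
        hit = exceed[:, t0:t0 + window].any(axis=1)   # member exceeds anywhere in window
        probs.append(hit.mean())
    return np.array(probs)

# Toy example: 4 members, 6 time steps, 10 mm threshold, 3-step windows
precip = np.array([[2, 12, 3, 1, 0, 0],
                   [1, 11, 2, 4, 0, 0],
                   [0,  3, 1, 2, 0, 0],
                   [9,  8, 13, 2, 0, 0]])
twp = time_window_prob(precip, threshold=10, window=3)
# twp[0] = 0.75: three of four members exceed 10 mm somewhere in steps 0-2
```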


2014 ◽  
Vol 142 (12) ◽  
pp. 4519-4541 ◽  
Author(s):  
Glen S. Romine ◽  
Craig S. Schwartz ◽  
Judith Berner ◽  
Kathryn R. Fossell ◽  
Chris Snyder ◽  
...  

Abstract Ensembles provide an opportunity to greatly improve short-term prediction of local weather hazards, yet generating reliable predictions remains a significant challenge. In particular, convection-permitting ensemble forecast systems (CPEFSs) have persistent problems with underdispersion. Representing initial and/or lateral boundary condition uncertainty along with forecast model error provides a foundation for building a more dependable CPEFS, but best practice for ensemble system design is not well established. Several configurations of CPEFSs are examined in which ensemble forecasts are nested within a larger domain, drawing initial conditions from a downscaled, continuously cycled ensemble data assimilation system that provides state-dependent initial condition uncertainty. The control ensemble forecast, with initial condition uncertainty only, is skillful but underdispersive. To improve the reliability of the ensemble forecasts, the control ensemble is supplemented with 1) perturbed lateral boundary conditions, or with model error representation using either 2) stochastic kinetic energy backscatter or 3) stochastically perturbed parameterization tendencies. Forecasts are evaluated against stage IV accumulated precipitation analyses and radiosonde observations. Perturbed ensemble forecasts are also compared to the control forecast to assess the relative impact of adding forecast perturbations. For precipitation forecasts, all perturbation approaches improve ensemble reliability relative to the control CPEFS. Deterministic forecast skill of individual ensemble members, verified against radiosonde observations, decreases when forecast perturbations are added, while ensemble mean forecasts remain similarly skillful to the control.


2014 ◽  
Vol 29 (5) ◽  
pp. 1259-1265 ◽  
Author(s):  
David R. Novak ◽  
Keith F. Brill ◽  
Wallace A. Hogsett

Abstract An objective technique to determine forecast snowfall ranges consistent with the risk tolerance of users is demonstrated. The forecast snowfall ranges are based on percentiles from probability distribution functions that are assumed to be perfectly calibrated. A key feature of the technique is that the snowfall range varies dynamically, with the resultant ranges varying based on the spread of ensemble forecasts at a given forecast projection, for a particular case, for a particular location. Furthermore, this technique allows users to choose their risk tolerance, quantified in terms of the expected false alarm ratio for forecasts of snowfall range. The technique is applied to the 4–7 March 2013 snowstorm at two different locations (Chicago, Illinois, and Washington, D.C.) to illustrate its use in different locations with different forecast uncertainties. The snowfall range derived from the Weather Prediction Center Probabilistic Winter Precipitation Forecast suite is found to be statistically reliable for the day 1 forecast during the 2013/14 season, providing confidence in the practical applicability of the technique.
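The percentile-range idea above can be sketched directly: with a perfectly calibrated predictive distribution, the central interval between the α/2 and 1 − α/2 percentiles is missed by the verification with probability α, so α plays the role of the user's risk tolerance (the expected false alarm ratio of the range forecast). The toy 21-member snowfall ensemble below is an assumption for illustration, not the Weather Prediction Center product.

```python
import numpy as np

def snowfall_range(ens_snowfall, risk_tolerance):
    """Central forecast range from ensemble percentiles. Assuming a perfectly
    calibrated predictive distribution, the verification falls outside the
    central interval with probability 'risk_tolerance', which therefore acts
    as the expected false alarm ratio for the range forecast."""
    lo = 100 * (risk_tolerance / 2)
    hi = 100 * (1 - risk_tolerance / 2)
    return np.percentile(ens_snowfall, [lo, hi])

# Toy example: a 21-member snowfall forecast (inches); a risk-averse user
# picks a 10% risk tolerance, a risk-tolerant user accepts 50%
ens = np.array([3, 4, 4, 5, 5, 6, 6, 6, 7, 7, 8,
                8, 8, 9, 9, 10, 11, 12, 13, 15, 18.0])
wide = snowfall_range(ens, 0.10)    # [4., 15.]: rarely busts, less precise
narrow = snowfall_range(ens, 0.50)  # [6., 10.]: sharper, busts more often
```

Note how the range widens or narrows with the ensemble spread of a given case, which is the dynamic behavior the abstract describes.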

