On reliability analysis of multi-categorical forecasts

2008
Vol 15 (4)
pp. 661-673
Author(s):
J. Bröcker

Abstract. Reliability analysis of probabilistic forecasts, in particular through the rank histogram or Talagrand diagram, is revisited. Two shortcomings are pointed out: Firstly, a uniform rank histogram is but a necessary condition for reliability. Secondly, if the forecast is assumed to be reliable, an indication is needed of how far a histogram is expected to deviate from uniformity merely due to randomness. Concerning the first shortcoming, it is suggested that forecasts be grouped or stratified along suitable criteria, and that reliability be analyzed individually for each forecast stratum. A reliable forecast should have uniform histograms for all individual forecast strata, not only for all forecasts as a whole. As to the second shortcoming, instead of the observed frequencies, the probability of the observed frequency is plotted, providing an indication of the likelihood of the result under the hypothesis that the forecast is reliable. Furthermore, a goodness-of-fit statistic is discussed which is essentially the reliability term of the Ignorance score. The discussed tools are applied to medium-range forecasts for 2-m temperature anomalies at several locations and lead times. The forecasts are stratified along the expected ranked probability score. Those forecasts which feature a high expected score turn out to be particularly unreliable.
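
The two ideas in this abstract can be illustrated with a minimal Python sketch. It is not the paper's code: the function names are hypothetical, ties between members and observation are not randomized, and the "probability of the observed frequency" is read here as the binomial cumulative probability of each bin count under the hypothesis that all ranks are equally likely. The stratum labels are arbitrary placeholders, whereas the paper stratifies by expected ranked probability score.

```python
import math


def rank_histogram(ensembles, observations):
    """Count how often the observation falls at each rank among the members."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)  # ties are not randomized here
        counts[rank] += 1
    return counts


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def count_probabilities(counts):
    """Cumulative probability of each bin count if every rank were equally likely."""
    n = sum(counts)
    p = 1.0 / len(counts)
    return [binomial_cdf(c, n, p) for c in counts]


def stratified_histograms(ensembles, observations, strata):
    """One rank histogram per stratum label (labels are assumed to be given)."""
    grouped = {}
    for members, obs, label in zip(ensembles, observations, strata):
        ens_list, obs_list = grouped.setdefault(label, ([], []))
        ens_list.append(members)
        obs_list.append(obs)
    return {label: rank_histogram(e, o) for label, (e, o) in grouped.items()}


# Toy data: four forecasts with the same three-member ensemble.
ensembles = [[1.0, 2.0, 3.0]] * 4
observations = [0.5, 1.5, 2.5, 3.5]
counts = rank_histogram(ensembles, observations)
print(counts)
print(count_probabilities(counts))
print(stratified_histograms(ensembles, observations, ["a", "a", "b", "b"]))
```

In the toy data the pooled histogram is perfectly flat while each stratum's histogram is not, which is the situation the first shortcoming describes.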

2020
Vol 148 (8)
pp. 3489-3506
Author(s):
Michael Scheuerer
Matthew B. Switanek
Rochelle P. Worsnop
Thomas M. Hamill

Abstract Forecast skill of numerical weather prediction (NWP) models for precipitation accumulations over California is rather limited at subseasonal time scales, and the low signal-to-noise ratio makes it challenging to extract information that provides reliable probabilistic forecasts. A statistical postprocessing framework is proposed that uses an artificial neural network (ANN) to establish relationships between NWP ensemble forecast and gridded observed 7-day precipitation accumulations, and to model the increase or decrease of the probabilities for different precipitation categories relative to their climatological frequencies. Adding predictors with geographic information and location-specific normalization of forecast information permits the use of a single ANN for the entire forecast domain and thus reduces the risk of overfitting. In addition, a convolutional neural network (CNN) framework is proposed that extends the basic ANN and takes images of large-scale predictors as inputs that inform local increase or decrease of precipitation probabilities relative to climatology. Both methods are demonstrated with ECMWF ensemble reforecasts over California for lead times up to 4 weeks. They compare favorably with a state-of-the-art postprocessing technique developed for medium-range ensemble precipitation forecasts, and their forecast skill relative to climatology is positive everywhere within the domain. The magnitude of skill, however, is low for week-3 and week-4, and suggests that additional sources of predictability need to be explored.


2021
Vol 11 (22)
pp. 10852
Author(s):
Gregor Skok
Doruntina Hoxha
Žiga Zaplotnik

This study investigates the potential of direct prediction of daily extremes of temperature at 2 m from a vertical profile measurement using neural networks (NNs). The analysis is based on 3800 daily profiles measured in the period 2004–2019. Various setups of dense sequential NNs are trained to predict the daily extremes at different lead times ranging from 0 to 500 days into the future. The short- to medium-range forecasts rely mainly on the profile data from the lowest layer—mostly on the temperature in the lowest 1 km. For the long-range forecasts (e.g., 100 days), the NN relies on the data from the whole troposphere. The error increases with forecast lead time, but at the same time, it exhibits periodic behavior for long lead times. The NN forecast beats the persistence forecast but becomes worse than the climatological forecast on day two or three. The forecast slightly improves when the previous-day measurements of temperature extremes are added as a predictor. The best forecast is obtained when the climatological value is added as well, with the biggest improvement in the long-term range where the error is constrained to the climatological forecast error.


2020
Vol 21 (7)
pp. 1405-1423
Author(s):
Zachary P. Brodeur
Scott Steinschneider

Abstract Forecasts of heavy precipitation delivered by atmospheric rivers (ARs) are becoming increasingly important for both flood control and water supply management in reservoirs across California. This study examines the hypothesis that medium-range forecasts of heavy precipitation at the basin scale exhibit recurrent spatial biases that are driven by mesoscale and synoptic-scale features of associated AR events. This hypothesis is tested for heavy precipitation events in the Sacramento River basin using 36 years of NCEP medium-range reforecasts from 1984 to 2019. For each event we cluster precipitation forecast error across western North America for lead times ranging from 1 to 15 days. Integrated vapor transport (IVT), 500-hPa geopotential heights, and landfall characteristics of ARs are composited across clusters and lead times to diagnose the causes of precipitation forecast biases. We investigate the temporal evolution of forecast error to characterize its persistence across lead times, and explore the accuracy of forecasted IVT anomalies across different domains of the North American west coast during heavy precipitation events in the Sacramento basin. Our results identify recurrent spatial patterns of precipitation forecast error consistent with errors of forecasted synoptic-scale features, especially at long (5–15 days) leads. Moreover, we find evidence that forecasts of AR landfalls well outside of the latitudinal bounds of the Sacramento basin precede heavy precipitation events within the basin. These results suggest the potential for using medium-range forecasts of large-scale climate features across the Pacific–North American sector, rather than just local forecasts of basin-scale precipitation, when designing forecast-informed reservoir operations.


2021
Vol 0 (0)
Author(s):
Edward Wheatcroft

Abstract A scoring rule is a function of a probabilistic forecast and a corresponding outcome used to evaluate forecast performance. There is some debate as to which scoring rules are most appropriate for evaluating forecasts of sporting events. This paper focuses on forecasts of the outcomes of football matches. The ranked probability score (RPS) is often recommended since it is ‘sensitive to distance’, that is, it takes into account the ordering in the outcomes (a home win is ‘closer’ to a draw than it is to an away win). In this paper, this reasoning is disputed on the basis that it adds nothing in terms of the usual aims of using scoring rules. A local scoring rule is one that only takes the probability placed on the outcome into consideration. Two simulation experiments are carried out to compare the performance of the RPS, which is non-local and sensitive to distance, the Brier score, which is non-local and insensitive to distance, and the Ignorance score, which is local and insensitive to distance. The Ignorance score outperforms both the RPS and the Brier score, casting doubt on the value of non-locality and sensitivity to distance as properties of scoring rules in this context.
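
The locality and distance-sensitivity properties named in this abstract can be made concrete with a short Python sketch. The function names and the example forecasts are illustrative assumptions, not taken from the paper. The conventions used are the unnormalized multi-category Brier score, the RPS divided by the number of categories minus one, and the Ignorance score in bits; outcomes are ordered home win, draw, away win.

```python
import math


def brier_score(probs, outcome):
    """Sum of squared differences between forecast and the 0/1 outcome vector."""
    return sum((p - (1.0 if i == outcome else 0.0)) ** 2 for i, p in enumerate(probs))


def ranked_probability_score(probs, outcome):
    """Mean squared difference of cumulative forecast and cumulative outcome."""
    cum_forecast = 0.0
    cum_outcome = 0.0
    total = 0.0
    for i, p in enumerate(probs[:-1]):  # the last cumulative term is always zero
        cum_forecast += p
        cum_outcome += 1.0 if i == outcome else 0.0
        total += (cum_forecast - cum_outcome) ** 2
    return total / (len(probs) - 1)


def ignorance_score(probs, outcome):
    """Negative base-2 log of the probability placed on the outcome."""
    return -math.log2(probs[outcome])


HOME = 0  # index of the observed outcome: home win
forecasts = {
    "A": [0.5, 0.4, 0.1],    # remaining mass mostly on the draw
    "B": [0.5, 0.1, 0.4],    # remaining mass mostly on the away win
    "C": [0.5, 0.25, 0.25],  # remaining mass split evenly
}
for name, probs in forecasts.items():
    print(name,
          round(brier_score(probs, HOME), 4),
          round(ranked_probability_score(probs, HOME), 4),
          round(ignorance_score(probs, HOME), 4))
```

All three forecasts place 0.5 on the home win, so the Ignorance score is identical for each (local). The Brier score is the same for A and B but differs for C (non-local, insensitive to distance), while the RPS also separates A from B (sensitive to distance).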


2011
Vol 139 (6)
pp. 1960-1971
Author(s):
Jakob W. Messner
Georg J. Mayr

Abstract Three methods to make probabilistic weather forecasts by using analogs are presented and tested. The basic idea of these methods is that finding similar NWP model forecasts to the current one in an archive of past forecasts and taking the corresponding analyses as prediction should remove all systematic errors of the model. Furthermore, this statistical postprocessing can convert NWP forecasts to forecasts for point locations and easily turn deterministic forecasts into probabilistic ones. These methods are tested in the idealized Lorenz96 system and compared to a benchmark bracket formed by ensemble relative frequencies from direct model output and logistic regression. The analog methods excel at longer lead times.
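
The basic analog idea described above can be sketched in a few lines of Python. This is one simple variant chosen for illustration and is not claimed to match any of the three methods in the paper: forecasts are scalars, similarity is absolute difference, and the function name, the value of `k`, and the toy archive are assumptions.

```python
def analog_probability(current_forecast, past_forecasts, past_analyses, threshold, k=3):
    """Probability of exceeding `threshold`, estimated from the k closest analogs.

    The k past forecasts most similar to the current one are selected, and the
    fraction of their verifying analyses above the threshold is returned.
    """
    order = sorted(
        range(len(past_forecasts)),
        key=lambda i: abs(past_forecasts[i] - current_forecast),
    )
    analog_analyses = [past_analyses[i] for i in order[:k]]
    return sum(1 for a in analog_analyses if a > threshold) / k


# Toy archive: past model forecasts and the analyses that verified them.
past_forecasts = [1.0, 2.0, 3.0, 4.0, 5.0]
past_analyses = [0.5, 2.5, 2.0, 4.5, 6.0]
print(analog_probability(2.9, past_forecasts, past_analyses, threshold=2.2))
```

Because the prediction is built from past analyses rather than from the model output itself, a systematic model error that was present in the archive is carried through the analog selection and does not appear in the resulting probabilities.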


2011
Vol 26 (5)
pp. 664-676
Author(s):
Thierry Dupont
Matthieu Plu
Philippe Caroff
Ghislain Faure

Abstract Several tropical cyclone forecasting centers issue uncertainty information with regard to their official track forecasts, generally using the climatological distribution of position error. However, such methods are not able to convey information that depends on the situation. The purpose of the present study is to assess the skill of the Ensemble Prediction System (EPS) from the European Centre for Medium-Range Weather Forecasts (ECMWF) at measuring the uncertainty of up to 3-day track forecasts issued by the Regional Specialized Meteorological Centre (RSMC) La Réunion in the southwestern Indian Ocean. The dispersion of cyclone positions in the EPS is extracted and translated at the RSMC forecast position. The verification relies on existing methods for probabilistic forecasts that are presently adapted to a cyclone-position metric. First, the probability distribution of forecast positions is compared to the climatological distribution using Brier scores. The probabilistic forecasts have better scores than the climatology, particularly after applying a simple calibration scheme. Second, uncertainty circles are built by fixing the probability at 75%. Their skill at detecting small and large error values is assessed. The circles have some skill for large errors up to the 3-day forecast (and maybe after); but the detection of small radii is skillful only up to 2-day forecasts. The applied methodology may be used to assess and to compare the skill of different probabilistic forecasting systems of cyclone position.
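
As a rough illustration of how an uncertainty circle at a fixed probability might be derived from ensemble dispersion, the Python sketch below takes the ensemble member positions as offsets (in km) from the forecast position and returns the smallest radius that contains 75% of the members. The function name, the planar-offset simplification, and the empirical-quantile rule are assumptions made for this sketch; the abstract does not specify how the circles are constructed.

```python
import math


def uncertainty_radius(member_offsets_km, probability=0.75):
    """Smallest radius around the forecast position containing the given
    fraction of ensemble members (offsets are (dx, dy) pairs in km)."""
    distances = sorted(math.hypot(dx, dy) for dx, dy in member_offsets_km)
    needed = math.ceil(probability * len(distances))  # members that must lie inside
    return distances[needed - 1]


# Toy dispersion: four members, already translated to the forecast position.
offsets = [(3.0, 4.0), (6.0, 8.0), (0.0, 1.0), (0.0, 2.0)]
print(uncertainty_radius(offsets))
```

A situation-dependent radius of this kind grows when the members diverge and shrinks when they agree, which a climatological error distribution cannot do.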


2021
Author(s):
Natalia Korhonen
Otto Hyvärinen
Matti Kämäräinen
Kirsti Jylhä

Severe heatwaves have harmful impacts on ecosystems and society. Early warning of heat waves helps to decrease their harmful impact. Previous research shows that the Extended Range Forecasts (ERF) of the European Centre for Medium-Range Weather Forecasts (ECMWF) have a somewhat higher reforecast skill over Europe for extreme hot summer temperatures than for long-term mean temperatures. It has also been shown that the reforecast skill of the ERFs of the ECMWF was strongly increased by the most severe heat waves (the European heatwave of 2003 and the Russian heatwave of 2010).

Our aim is to be able to estimate the skill of a heat wave forecast at the time the forecast is issued. To that end, we investigated the spatial and temporal reforecast skill of the ERFs of the ECMWF in forecasting hot days (here defined as a day on which the 5-day running mean surface temperature is above its summer 90th percentile) in continental Europe in summers 2000-2019. We used the ECMWF 2-meter temperature reforecasts and verified them against the ERA5 reanalysis. The skill of the hot day reforecasts was estimated by the symmetric extremal dependence index (SEDI), which considers both hit rates and false alarm rates of the hot day forecasts. Further, we investigated how the skill of the heatwave reforecasts depends on the forecast time steps at which the hot days were forecasted. We found that on the mesoscale (horizontal scale of ~500 km) the ERFs of the ECMWF were most skillful in predicting the life cycle of a heat wave (lasting up to 25 days) about a week before its start and during its course. That is, on the mesoscale those reforecasts in which hot day(s) were forecasted to occur during the first 7…11 days were more skillful on lead times up to 25 days than the rest of the heat wave forecasts. This finding is valuable information, e.g., in the energy and health sectors while preparing for a coming heat wave.

The work presented here is part of the research project HEATCLIM (Heat and health in the changing climate) funded by the Academy of Finland.
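
For reference, the SEDI mentioned in this abstract combines the hit rate H and the false alarm rate F as SEDI = [ln F − ln H − ln(1 − F) + ln(1 − H)] / [ln F + ln H + ln(1 − F) + ln(1 − H)]. The Python sketch below computes it from the counts of a 2x2 contingency table. The function name and the example counts are illustrative assumptions and do not come from the study.

```python
import math


def sedi(hits, misses, false_alarms, correct_negatives):
    """Symmetric extremal dependence index from 2x2 contingency counts.

    Undefined (math domain or division error) when the hit rate or the
    false alarm rate is exactly 0 or 1.
    """
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    ln_f = math.log(false_alarm_rate)
    ln_h = math.log(hit_rate)
    ln_1mf = math.log(1.0 - false_alarm_rate)
    ln_1mh = math.log(1.0 - hit_rate)
    return (ln_f - ln_h - ln_1mf + ln_1mh) / (ln_f + ln_h + ln_1mf + ln_1mh)


# Hypothetical hot-day verification counts: H = 0.5, F = 0.1.
print(sedi(hits=5, misses=5, false_alarms=10, correct_negatives=90))
```

The index is zero when the hit rate equals the false alarm rate, which corresponds to a forecast with no ability to discriminate hot days from other days, and it approaches one as H tends to 1 and F tends to 0.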


2011
Vol 139 (2)
pp. 332-350
Author(s):
Charles Jones
Jon Gottschalck
Leila M. V. Carvalho
Wayne Higgins

Abstract Extreme precipitation events are among the most devastating weather phenomena since they are frequently accompanied by loss of life and property. This study uses reforecasts of the NCEP Climate Forecast System (CFS.v1) to evaluate the skill of nonprobabilistic and probabilistic forecasts of extreme precipitation in the contiguous United States (CONUS) during boreal winter for lead times up to two weeks. The CFS model realistically simulates the spatial patterns of extreme precipitation events over the CONUS, although the magnitudes of the extremes in the model are much larger than in the observations. Heidke skill scores (HSS) for forecasts of extreme precipitation at the 75th and 90th percentiles showed that the CFS model has good skill at week 1 and modest skill at week 2. Forecast skill is usually higher when the Madden–Julian oscillation (MJO) is active and has enhanced convection occurring over the Western Hemisphere, Africa, and/or the western Indian Ocean than in quiescent periods. HSS greater than 0.1 extends to lead times of up to two weeks in these situations. Approximately 10%–30% of the CONUS has HSS greater than 0.1 at lead times of 1–14 days when the MJO is active. Probabilistic forecasts for extreme precipitation events at the 75th percentile show improvements over climatology of 0%–40% at 1-day lead and 0%–5% at 7-day leads. The CFS has better skill in forecasting severe extremes (i.e., events exceeding the 90th percentile) at longer leads than moderate extremes (75th percentile). Improvements over climatology between 10% and 30% at leads of 3 days are observed over several areas across the CONUS—especially in California and in the Midwest.
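
The Heidke skill score used above measures the fraction of correct forecasts after removing those expected by chance. A minimal Python sketch for the 2x2 case (event exceeds a percentile threshold or not) follows. The function name and the example counts are assumptions for illustration, and the study may use a different categorization.

```python
def heidke_skill_score(hits, false_alarms, misses, correct_negatives):
    """HSS for a 2x2 contingency table.

    Equivalent to (correct - expected_correct) / (n - expected_correct), where
    expected_correct is the number of correct forecasts expected by chance
    given the marginal totals.
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    numerator = 2.0 * (a * d - b * c)
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    return numerator / denominator


# Hypothetical counts for one grid point and one lead time.
print(heidke_skill_score(hits=30, false_alarms=10, misses=20, correct_negatives=140))
```

A score of 1 indicates a perfect forecast, 0 indicates no improvement over chance, and the 0.1 level quoted in the abstract therefore marks a modest but positive gain.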

