proper scoring rules
Recently Published Documents


TOTAL DOCUMENTS: 55 (FIVE YEARS: 12)
H-INDEX: 14 (FIVE YEARS: 1)

2021 · Vol 8 (24) · pp. 297-301
Author(s): Jonas Brehmer

Proper scoring rules enable decision-theoretically principled comparisons of probabilistic forecasts. New scoring rules can be constructed by identifying the predictive distribution with an element of a parametric family and then applying a known scoring rule. We introduce a condition which ensures propriety in this construction and thereby obtain novel proper scoring rules.
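A familiar instance of this construction, offered here only as an illustration and not necessarily the case treated in the paper, identifies a forecast F with the Gaussian matching its first two moments and then applies the logarithmic score. Up to a positive affine transformation, which preserves propriety, this yields the Dawid-Sebastiani score, proper relative to all forecasts with finite second moment:

```latex
% Dawid--Sebastiani score: logarithmic score applied to the moment-matched Gaussian.
% \mu_F and \sigma_F^2 denote the mean and variance of the forecast F (notation assumed).
\mathrm{DSS}(F, y) \;=\; \frac{(y - \mu_F)^2}{\sigma_F^2} \;+\; 2\log \sigma_F
```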


2021
Author(s): Francesco Serafini, Mark Naylor, Finn Lindgren, Maximilian Werner

Recent years have seen a growth in the diversity of probabilistic earthquake forecasts as well as their adoption in operational settings. The growth of their use demands a deeper look at our ability to rank their performance within a transparent and unified framework. Programs such as the Collaboratory for the Study of Earthquake Predictability (CSEP) have been at the forefront of this effort. Scores are quantitative measures of how well a dataset can be explained by a candidate forecast and allow forecasts to be ranked. A positively oriented score is said to be proper when, on average, the highest score is achieved by the model closest to the data-generating one. Different meanings of "closest" lead to different proper scoring rules. Here, we prove that the Parimutuel Gambling score, used to evaluate the results of the 2009 Italy CSEP experiment, is generally not proper, and that even in the special case where it is proper, it can still be used improperly. We show in detail the possible consequences of using this score for forecast evaluation. Moreover, we show that other well-established scores can be applied to existing studies to calculate new rankings with no requirement for extra information. We extend the analysis to show how much data are required, in principle, to distinguish candidate forecasts and therefore how likely it is that a preference for one forecast can be expressed. This introduces the possibility of survey design with regard to the duration and spatial discretisation of earthquake forecasts. Our findings may contribute to more rigorous statements about the ability to distinguish between the predictive skills of candidate forecasts, in addition to simple rankings.
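In the standard notation (assumed here, not drawn from the abstract itself), the propriety condition described above reads: a positively oriented score S is proper relative to a class of forecasts if the data-generating distribution attains the highest expected score, and strictly proper if it is the unique maximiser.

```latex
% Propriety of a positively oriented scoring rule S:
% G is the data-generating distribution, F any candidate forecast.
\mathbb{E}_{Y \sim G}\!\left[ S(G, Y) \right] \;\ge\; \mathbb{E}_{Y \sim G}\!\left[ S(F, Y) \right]
\quad \text{for all } F,
% with equality only for F = G in the strictly proper case.
```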


Author(s): Hailiang Du

The evaluation of probabilistic forecasts plays a central role in the interpretation and use of forecast systems and in their development. Probabilistic scores (scoring rules) provide statistical measures to assess the quality of probabilistic forecasts. Often, many probabilistic forecast systems are available, while evaluations of their performance are not standardized, with different scoring rules being used to measure different aspects of forecast performance. Even when the discussion is restricted to strictly proper scoring rules, there remains considerable variability between them; indeed, strictly proper scoring rules need not rank competing forecast systems in the same order when none of these systems are perfect. The locality property is explored to further distinguish scoring rules. The nonlocal strictly proper scoring rules considered are shown to have a property that can produce "unfortunate" evaluations. In particular, the fact that the Continuous Ranked Probability Score prefers outcomes close to the median of the forecast distribution, regardless of the probability mass assigned to values at or near the median, raises concerns about its use. The only local strictly proper scoring rule, the logarithmic score, has direct interpretations in terms of probabilities and bits of information. The nonlocal strictly proper scoring rules, on the other hand, lack a meaningful direct interpretation for decision support. The logarithmic score is also shown to be invariant under smooth transformations of the forecast variable, whereas the nonlocal strictly proper scoring rules considered may change their preferences under such transformations. It is therefore suggested that the logarithmic score always be included in the evaluation of probabilistic forecasts.
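A minimal sketch of the locality point, assuming Gaussian forecasts and the standard closed-form CRPS for the normal distribution; the functions and numbers below are illustrative, not taken from the paper. Two forecasts that assign the same density to the observed outcome receive identical logarithmic scores, while their CRPS values differ.

```python
# Illustrative sketch (not from the paper): local vs. nonlocal scoring rules
# for Gaussian forecasts. Both scores are negatively oriented here (smaller is better).
import numpy as np
from scipy.stats import norm

def log_score(mu, sigma, y):
    """Logarithmic score: depends on the forecast only through its density at y (local)."""
    return -norm.logpdf(y, loc=mu, scale=sigma)

def crps_gaussian(mu, sigma, y):
    """CRPS of a N(mu, sigma^2) forecast, closed form (Gneiting & Raftery, 2007)."""
    z = (y - mu) / sigma
    return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))

y = 0.0
# Forecast A: standard normal.
mu_a, sigma_a = 0.0, 1.0
# Forecast B: narrower and shifted, chosen so that its density at y = 0
# equals that of forecast A, i.e. exp(-mu_b**2 / (2 * sigma_b**2)) / sigma_b = 1.
sigma_b = 0.5
mu_b = np.sqrt(2 * sigma_b**2 * np.log(1.0 / sigma_b))

print(log_score(mu_a, sigma_a, y), log_score(mu_b, sigma_b, y))          # identical
print(crps_gaussian(mu_a, sigma_a, y), crps_gaussian(mu_b, sigma_b, y))  # different
```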


2020 · Vol 17 (2) · pp. 115-133
Author(s): Zachary J. Smith, J. Eric Bickel

In this paper, we develop strictly proper scoring rules that may be used to evaluate the accuracy of a sequence of probabilistic forecasts. In practice, when forecasts are submitted for multiple uncertainties, competing forecasts are ranked by their cumulative or average score. Alternatively, one could score the implied joint distributions. We demonstrate that these measures of forecast accuracy disagree under some commonly used rules. Furthermore, and most importantly, we show that forecast rankings can depend on the selected scoring procedure. In other words, under some scoring rules, the relative ranking of probabilistic forecasts does not depend solely on the information content of those forecasts and the observed outcome. Instead, the relative ranking of forecasts is a function of the process by which those forecasts are evaluated. As an alternative, we describe additive and strongly additive strictly proper scoring rules, which have the property that the score for the joint distribution is equal to a sum of scores for the associated marginal and conditional distributions. We give methods for constructing additive rules and demonstrate that the logarithmic score is the only strongly additive rule. Finally, we connect the additive properties of scoring rules with analogous properties for a general class of entropy measures.
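As a concrete illustration of strong additivity (notation assumed here, not quoted from the paper), the chain rule for densities makes the logarithmic score of a joint forecast, written in its negatively oriented form, decompose exactly into a marginal and a conditional score:

```latex
% Logarithmic score of a joint forecast with density p, evaluated at the outcome (x, y):
-\log p(x, y) \;=\; -\log p(x) \;-\; \log p(y \mid x),
% i.e. S_{\log}\bigl(P_{X,Y}, (x, y)\bigr) = S_{\log}(P_X, x) + S_{\log}(P_{Y \mid X = x}, y).
```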


2019 · Vol 117 · pp. 322-341
Author(s): Christopher P. Chambers, Paul J. Healy, Nicolas S. Lambert
