The Area Skill Score Statistic for Evaluating Earthquake Predictability Experiments

Author(s):  
J. Douglas Zechar ◽  
Thomas H. Jordan
2021 ◽  
Author(s):  
Nicola Cortesi ◽  
Verónica Torralba ◽  
Llorenç Lledó ◽  
Andrea Manrique-Suñén ◽  
Nube Gonzalez-Reviriego ◽  
...  

Abstract It is often assumed that weather regimes adequately characterize atmospheric circulation variability. However, regime classifications spanning many months and with a low number of regimes may not satisfy this assumption. The first aim of this study is to test this hypothesis for the Euro-Atlantic region. The second is to extend the assessment of sub-seasonal forecast skill in predicting the frequencies of occurrence of the regimes beyond the winter season. Two regime classifications of four regimes each were obtained from sea level pressure anomalies clustered from October to March and from April to September, respectively. Their spatial patterns were compared with those representing the annual cycle. Results highlight that the two regime classifications are able to reproduce most of the patterns of the annual cycle, except during the transition weeks between the two periods, when patterns of the annual cycle resembling the Atlantic Low regime are not observed in either of the two classifications. Forecast skill of Atlantic Low was found to be similar to that of NAO+, the regime replacing Atlantic Low in the two classifications. Thus, although clustering yearly circulation data in two periods of 6 months each introduces a few deviations from the annual cycle of the regime patterns, it does not negatively affect sub-seasonal forecast skill. Beyond the winter season and the first ten forecast days, sub-seasonal forecasts of ECMWF are still able to achieve weekly frequency correlations of r = 0.5 for some regimes and start dates, including summer ones. ECMWF forecasts beat climatological forecasts in the case of long-lasting regime events, and when measured by the fair continuous ranked probability skill score, but not when measured by the Brier skill score. Thus, more effort is still needed to achieve the minimum skill necessary to develop forecast products based on weather regimes outside the winter season.
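
The Brier skill score named in this abstract compares the mean squared error of probability forecasts against that of a reference forecast. The following is a minimal sketch, not the authors' implementation: it assumes a binary event (a regime occurs or not), uses the sample base rate as the climatological reference, and all function names and data are hypothetical.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes, ref_probs=None):
    """BSS = 1 - BS / BS_ref; positive values beat the reference forecast."""
    if ref_probs is None:
        # Assumption: the climatological reference is the sample base rate.
        base_rate = sum(outcomes) / len(outcomes)
        ref_probs = [base_rate] * len(outcomes)
    bs_ref = brier_score(ref_probs, outcomes)
    if bs_ref == 0:
        raise ValueError("reference Brier score is zero; BSS is undefined")
    return 1.0 - brier_score(probs, outcomes) / bs_ref


# Hypothetical data: did a given regime occur (1) or not (0) in each week?
observed = [1, 0, 0, 1]
forecast = [0.8, 0.2, 0.4, 0.6]
print(brier_skill_score(forecast, observed))  # about 0.6
```

A score of 0 means no improvement over the reference and 1 means a perfect forecast; negative values mean the forecast does worse than the reference.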


2021 ◽  
pp. 1-12
Author(s):  
Matthew van Bommel ◽  
Luke Bornn ◽  
Peter Chow-White ◽  
Chuancong Gao

Box score statistics are the baseline measures of performance for National Collegiate Athletic Association (NCAA) basketball. Between the 2011-2012 and 2015-2016 seasons, NCAA teams performed better at home compared to on the road in nearly all box score statistics across both genders and all three divisions. Using box score data from over 100,000 games spanning the three divisions for both women and men, we examine the factors underlying this discrepancy. The prevalence of neutral location games in the NCAA provides an additional angle through which to examine the gaps in box score statistic performance, which we believe has been underutilized in existing literature. We also estimate a regression model to quantify the home court advantages for box score statistics after controlling for other factors such as number of possessions, and team strength. Additionally, we examine the biases of scorekeepers and referees. We present evidence that scorekeepers tend to have greater home team biases when observing men compared to women, higher divisions compared to lower divisions, and stronger teams compared to weaker teams. Finally, we present statistically significant results indicating referee decisions are impacted by attendance, with larger crowds resulting in greater bias in favor of the home team.


Author(s):  
Hermann Anetzberger ◽  
Stephan Reppenhagen ◽  
Hansjörg Eickhoff ◽  
Franz Josef Seibert ◽  
Bernd Döring ◽  
...  

2013 ◽  
Vol 28 (3) ◽  
pp. 802-814 ◽  
Author(s):  
Timothy W. Armistead

Abstract The paper briefly reviews measures that have been proposed since the 1880s to assess accuracy and skill in categorical weather forecasting. The majority of the measures consist of a single expression, for example, a proportion, the difference between two proportions, a ratio, or a coefficient. Two exemplar single-expression measures for 2 × 2 categorical arrays that chronologically bracket the 130-yr history of this effort—Doolittle's inference ratio i and Stephenson's odds ratio skill score (ORSS)—are reviewed in detail. Doolittle's i is appropriately calculated using conditional probabilities, and the ORSS is a valid measure of association, but both measures are limited in ways that variously mirror all single-expression measures for categorical forecasting. The limitations that variously affect such measures include their inability to assess the separate accuracy rates of different forecast–event categories in a matrix, their sensitivity to the interdependence of forecasts in a 2 × 2 matrix, and the inapplicability of many of them to the general k × k (k ≥ 2) problem. The paper demonstrates that Wagner's unbiased hit rate, developed for use in categorical judgment studies with any k × k (k ≥ 2) array, avoids these limitations while extending the dual-measure Bayesian approach proposed by Murphy and Winkler in 1987.
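
As an illustration of one single-expression measure discussed in this abstract, the odds ratio skill score for a 2 × 2 table can be written as ORSS = (ad − bc) / (ad + bc), where a, b, c, and d are the hits, false alarms, misses, and correct negatives. The sketch below is hypothetical in its function name and counts and is not taken from the paper.

```python
def odds_ratio_skill_score(hits, false_alarms, misses, correct_negatives):
    """ORSS = (ad - bc) / (ad + bc) for a 2 x 2 forecast-event table."""
    ad = hits * correct_negatives
    bc = false_alarms * misses
    if ad + bc == 0:
        raise ValueError("ORSS is undefined when ad + bc is zero")
    return (ad - bc) / (ad + bc)


# Hypothetical counts for 100 yes/no forecasts.
score = odds_ratio_skill_score(hits=30, false_alarms=10, misses=5, correct_negatives=55)
print(round(score, 4))  # 0.9412
```

The score ranges from −1 to 1, with 0 indicating no association between forecasts and events. Because it collapses the table into one number, it cannot report separate accuracy rates per category, which is the limitation the abstract raises.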


1991 ◽  
Vol 36 (1) ◽  
pp. 103-112 ◽  
Author(s):  
Tapas K. Chandra ◽  
Rahul Mukerjee

2010 ◽  
Vol 27 (3) ◽  
pp. 409-427 ◽  
Author(s):  
Kun Tao ◽  
Ana P. Barros

Abstract The objective of spatial downscaling strategies is to increase the information content of coarse datasets at smaller scales. In the case of quantitative precipitation estimation (QPE) for hydrological applications, the goal is to close the scale gap between the spatial resolution of coarse datasets (e.g., gridded satellite precipitation products at resolution L × L) and the high resolution (l × l; L ≫ l) necessary to capture the spatial features that determine spatial variability of water flows and water stores in the landscape. In essence, the downscaling process consists of weaving subgrid-scale heterogeneity over a desired range of wavelengths in the original field. The defining question is, which properties, statistical and otherwise, of the target field (the known observable at the desired spatial resolution) should be matched, with the caveat that downscaling methods be as general as possible and therefore ideally without case-specific constraints and/or calibration requirements? Here, the attention is focused on two simple fractal downscaling methods using iterated function systems (IFS) and fractal Brownian surfaces (FBS) that meet this requirement. The two methods were applied to disaggregate spatially 27 summertime convective storms in the central United States during 2007 at three consecutive times (1800, 2100, and 0000 UTC, thus 81 fields overall) from the Tropical Rainfall Measuring Mission (TRMM) version 6 (V6) 3B42 precipitation product (∼25-km grid spacing) to the same resolution as the NCEP stage IV products (∼4-km grid spacing). Results from bilinear interpolation are used as the control. A fundamental distinction between IFS and FBS is that the latter implies a distribution of downscaled fields and thus an ensemble solution, whereas the former provides a single solution.
The downscaling effectiveness is assessed using fractal measures (the spectral exponent β, fractal dimension D, Hurst coefficient H, and roughness amplitude R) and traditional operational skill scores [false alarm rate (FR), probability of detection (PD), threat score (TS), and Heidke skill score (HSS)], as well as bias and the root-mean-square error (RMSE). The results show that both IFS and FBS fractal interpolation perform well with regard to operational skill scores, and they meet the additional requirement of generating structurally consistent fields. Furthermore, confidence intervals can be directly generated from the FBS ensemble. The results were used to diagnose errors relevant for hydrometeorological applications, in particular a spatial displacement with characteristic length of at least 50 km (2500 km²) in the location of peak rainfall intensities for the cases studied.
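
The operational scores listed in this abstract are all functions of a 2 × 2 contingency table of rain/no-rain forecasts against observations. The sketch below shows the standard textbook definitions under stated assumptions: the abstract does not define its "false alarm rate", so the code assumes the false alarm ratio b / (a + b), and the function name and counts are hypothetical.

```python
def contingency_scores(hits, false_alarms, misses, correct_negatives):
    """Return PD, FR, TS, and HSS for a 2 x 2 forecast-observation table."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d
    # Number of correct forecasts expected by chance, given the marginal totals.
    expected_correct = ((a + b) * (a + c) + (c + d) * (b + d)) / n
    return {
        "PD": a / (a + c),        # probability of detection
        "FR": b / (a + b),        # assumption: false alarm ratio, not b / (b + d)
        "TS": a / (a + b + c),    # threat score (critical success index)
        "HSS": (a + d - expected_correct) / (n - expected_correct),
    }


# Hypothetical counts of grid cells above a rainfall threshold.
scores = contingency_scores(hits=30, false_alarms=10, misses=5, correct_negatives=55)
for name, value in scores.items():
    print(name, round(value, 3))
```

Unlike PD and TS, the HSS subtracts the agreement expected by chance, so a forecast with no skill scores 0 regardless of how common the event is.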


2016 ◽  
Vol 29 (17) ◽  
pp. 6065-6083 ◽  
Author(s):  
Yinghui Liu ◽  
Jeffrey R. Key

Abstract Cloud cover is one of the largest uncertainties in model predictions of the future Arctic climate. Previous studies have shown that cloud amounts in global climate models and atmospheric reanalyses vary widely and may have large biases. However, many climate studies are based on anomalies rather than absolute values, for which biases are less important. This study examines the performance of five atmospheric reanalysis products—ERA-Interim, MERRA, MERRA-2, NCEP R1, and NCEP R2—in depicting monthly mean Arctic cloud amount anomalies against Moderate Resolution Imaging Spectroradiometer (MODIS) satellite observations from 2000 to 2014 and against Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) observations from 2006 to 2014. All five reanalysis products exhibit biases in the mean cloud amount, especially in winter. The Gerrity skill score (GSS) and correlation analysis are used to quantify their performance in terms of interannual variations. Results show that ERA-Interim, MERRA, MERRA-2, and NCEP R2 perform similarly, with annual mean GSSs of 0.36/0.22, 0.31/0.24, 0.32/0.23, and 0.32/0.23 and annual mean correlation coefficients of 0.50/0.51, 0.43/0.54, 0.44/0.53, and 0.50/0.52 against MODIS/CALIPSO, indicating that the reanalysis datasets do exhibit some capability for depicting the monthly mean cloud amount anomalies. There are no significant differences in the overall performance of reanalysis products. They all perform best in July, August, and September and worst in November, December, and January. All reanalysis datasets have better performance over land than over ocean. This study identifies the magnitudes of errors in Arctic mean cloud amounts and anomalies and provides a useful tool for evaluating future improvements in the cloud schemes of reanalysis products.


2014 ◽  
Vol 142 (2) ◽  
pp. 716-738 ◽  
Author(s):  
Craig S. Schwartz ◽  
Zhiquan Liu

Abstract Analyses with 20-km horizontal grid spacing were produced from parallel continuously cycling three-dimensional variational (3DVAR), ensemble square root Kalman filter (EnSRF), and “hybrid” variational–ensemble data assimilation (DA) systems between 0000 UTC 6 May and 0000 UTC 21 June 2011 over a domain spanning the contiguous United States. Beginning 9 May, the 0000 UTC analyses initialized 36-h Weather Research and Forecasting Model (WRF) forecasts containing a large convection-permitting 4-km nest. These 4-km 3DVAR-, EnSRF-, and hybrid-initialized forecasts were compared to benchmark WRF forecasts initialized by interpolating 0000 UTC Global Forecast System (GFS) analyses onto the computational domain. While important differences regarding mean state characteristics of the 20-km DA systems were noted, verification efforts focused on the 4-km precipitation forecasts. The 3DVAR-, hybrid-, and EnSRF-initialized 4-km precipitation forecasts performed similarly regarding general precipitation characteristics, such as timing of the diurnal cycle, and all three forecast sets had high precipitation biases at heavier rainfall rates. However, meaningful differences emerged regarding precipitation placement as quantified by the fractions skill score. For most forecast hours, the hybrid-initialized 4-km precipitation forecasts were better than the EnSRF-, 3DVAR-, and GFS-initialized forecasts, and the improvement was often statistically significant at the 95th percentile. These results demonstrate the potential of limited-area continuously cycling hybrid DA configurations and suggest additional hybrid development is warranted.
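
The fractions skill score used here to quantify precipitation placement compares the fraction of grid points exceeding a threshold within a neighborhood of each point, rather than requiring point-by-point matches. The sketch below is a minimal pure-Python illustration, not the authors' code: it assumes a square neighborhood, treats points outside the grid as non-exceeding, and uses hypothetical names and data.

```python
def neighborhood_fractions(binary_field, half_width):
    """Fraction of exceedance points in a (2*half_width+1)^2 window around each cell."""
    rows, cols = len(binary_field), len(binary_field[0])
    window_area = (2 * half_width + 1) ** 2
    fractions = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0
            for ii in range(i - half_width, i + half_width + 1):
                for jj in range(j - half_width, j + half_width + 1):
                    # Assumption: cells outside the grid count as zero.
                    if 0 <= ii < rows and 0 <= jj < cols:
                        total += binary_field[ii][jj]
            fractions[i][j] = total / window_area
    return fractions


def fractions_skill_score(forecast, observed, threshold, half_width):
    """FSS = 1 - sum((Pf - Po)^2) / (sum(Pf^2) + sum(Po^2))."""
    to_binary = lambda field: [[1 if v >= threshold else 0 for v in row] for row in field]
    pf = neighborhood_fractions(to_binary(forecast), half_width)
    po = neighborhood_fractions(to_binary(observed), half_width)
    pairs = [(f, o) for row_f, row_o in zip(pf, po) for f, o in zip(row_f, row_o)]
    reference = sum(f * f + o * o for f, o in pairs)
    if reference == 0:
        raise ValueError("FSS is undefined when neither field exceeds the threshold")
    return 1.0 - sum((f - o) ** 2 for f, o in pairs) / reference


# Hypothetical 4 x 4 rainfall fields: the forecast cell is displaced by one grid point.
fcst = [[0, 0, 0, 0], [0, 5, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
obs = [[0, 0, 0, 0], [0, 0, 5, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(fractions_skill_score(fcst, obs, threshold=1, half_width=0))  # 0.0
print(fractions_skill_score(fcst, obs, threshold=1, half_width=1))  # about 0.667
```

The example shows why the score suits placement errors: a one-cell displacement earns no credit at grid scale but substantial credit once a 3 × 3 neighborhood is allowed.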


2013 ◽  
Vol 141 (10) ◽  
pp. 3477-3497 ◽  
Author(s):  
Mingyue Chen ◽  
Wanqiu Wang ◽  
Arun Kumar

Abstract An analysis of lagged ensemble seasonal forecasts from the National Centers for Environmental Prediction (NCEP) Climate Forecast System, version 2 (CFSv2), is presented. The focus of the analysis is on the construction of lagged ensemble forecasts with increasing lead time (thus allowing use of larger ensemble sizes) and its influence on seasonal prediction skill. Predictions of seasonal means of sea surface temperature (SST), 200-hPa height (z200), precipitation, and 2-m air temperature (T2m) over land are analyzed. Measures of prediction skill include deterministic (anomaly correlation and mean square error) and probabilistic [rank probability skill score (RPSS)]. The results show that for a fixed lead time, and as one would expect, the skill of seasonal forecast improves as the ensemble size increases, while for a fixed ensemble size the forecast skill decreases as the lead time becomes longer. However, when a forecast is based on a lagged ensemble, there exists an optimal lagged ensemble time (OLET) when the positive influence of increasing ensemble size and the negative influence of an increasing lead time result in a maximum in seasonal prediction skill. The OLET is shown to depend on the geographical location and variable. For precipitation and T2m, OLET is relatively longer and skill gain is larger than that for SST and tropical z200. OLET is also dependent on the skill measure, with RPSS having the longest OLET. Results of this analysis will be useful in providing guidelines on the design of seasonal prediction systems and in understanding the relative merits of different configurations.
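
The rank probability skill score mentioned in this abstract extends the Brier score to ordered categories by comparing cumulative forecast and observed probabilities. The sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes three equiprobable (tercile) categories for the climatological reference, takes the ratio of mean scores, and uses hypothetical names and data.

```python
from itertools import accumulate


def ranked_probability_score(forecast_probs, observed_category):
    """Sum of squared differences between cumulative forecast and observed probabilities."""
    cum_forecast = list(accumulate(forecast_probs))
    # The observed cumulative distribution steps from 0 to 1 at the observed category.
    cum_observed = [1.0 if k >= observed_category else 0.0 for k in range(len(forecast_probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))


def rank_probability_skill_score(forecasts, observed_categories, climatology):
    """RPSS = 1 - mean(RPS_forecast) / mean(RPS_climatology)."""
    n = len(forecasts)
    rps_fcst = sum(ranked_probability_score(f, o)
                   for f, o in zip(forecasts, observed_categories)) / n
    rps_clim = sum(ranked_probability_score(climatology, o)
                   for o in observed_categories) / n
    if rps_clim == 0:
        raise ValueError("climatological RPS is zero; RPSS is undefined")
    return 1.0 - rps_fcst / rps_clim


# Hypothetical tercile forecasts (below, near, above normal) for two seasons.
forecasts = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
observed = [0, 2]  # index of the category that verified
climatology = [1 / 3, 1 / 3, 1 / 3]  # assumption: equiprobable terciles
print(round(rank_probability_skill_score(forecasts, observed, climatology), 3))  # 0.586
```

Because the score works on cumulative probabilities, a forecast that puts weight on a category adjacent to the observed one is penalized less than one that puts it two categories away.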

