A “meta” analysis of the Fractions Skill Score: The limiting case and implications for aggregation

Author(s):  
M.P. Mittermaier

Abstract The Fractions Skill Score (FSS) is arguably one of the most popular spatial verification metrics in use today. The fractions of grid points exceeding a threshold within forecast and observed field neighbourhoods are compared to compute the score. By definition a perfect forecast has an FSS of 1, and a “no skill” forecast has a score of 0. It is shown that the denominator defines the score’s characteristics. The FSS is undefined for instances where neither the forecast nor the observed field exceeds a threshold. In the limiting case, the FSS for a perfect null (zero) forecast is also undefined, unless a threshold of ≥ 0 is used, in which case it would be 1 (i.e. perfect). Furthermore, the FSS is 0 if either the forecast or the observed field does not exceed a threshold. This symmetry means it cannot differentiate between what are traditionally referred to as false alarms and misses; supplementary information is required. The FSS is greater than 0 if and only if there are values exceeding a given threshold in both the forecast and the observed field. The magnitude of an overall score computed over many forecasts is sensitive to the pooling method. Zero scores are non-trivial: excluding them implies excluding all situations associated with false alarms or misses. Omitting near-zero scores is a more credible decision, but only if it can be proven that these are related to spurious artefacts in the observed field. To avoid ambiguity, for most applications and purposes the components of the FSS should be aggregated separately when computing an overall score.
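
The limiting cases and the pooling issue described in this abstract can be illustrated with a short Python sketch. The abstract does not state the formula; the sketch assumes the commonly used form FSS = 1 - Σ(Pf - Po)² / (ΣPf² + ΣPo²), where Pf and Po are the neighbourhood fractions of the forecast and observed fields. Strict exceedance (value > threshold), square neighbourhoods with zero padding at the domain edge, and all function names are assumptions made for illustration, not the author's implementation.

```python
from typing import List, Optional, Sequence, Tuple

Field = List[List[float]]


def neighbourhood_fractions(field: Field, threshold: float, half_width: int) -> Field:
    """Fraction of points strictly exceeding `threshold` in a square window
    of side 2*half_width+1 centred on each grid point.
    Points outside the domain count as non-exceeding (zero padding, an assumption)."""
    rows, cols = len(field), len(field[0])
    binary = [[1.0 if value > threshold else 0.0 for value in row] for row in field]
    window_size = (2 * half_width + 1) ** 2
    fractions = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            total = 0.0
            for di in range(-half_width, half_width + 1):
                for dj in range(-half_width, half_width + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        total += binary[ii][jj]
            out_row.append(total / window_size)
        fractions.append(out_row)
    return fractions


def fss_components(fcst: Field, obs: Field, threshold: float,
                   half_width: int) -> Tuple[float, float]:
    """Return (numerator, denominator) of the assumed FSS formula."""
    pf = neighbourhood_fractions(fcst, threshold, half_width)
    po = neighbourhood_fractions(obs, threshold, half_width)
    numerator = 0.0
    denominator = 0.0
    for row_f, row_o in zip(pf, po):
        for a, b in zip(row_f, row_o):
            diff = a - b
            numerator += diff * diff
            denominator += a * a + b * b
    return numerator, denominator


def fss_from_components(numerator: float, denominator: float) -> Optional[float]:
    """FSS, or None when the denominator is zero (score undefined)."""
    if denominator == 0.0:
        return None
    return 1.0 - numerator / denominator


def pooled_fss(components: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Aggregate the components over many cases first, then form one score."""
    total_num = sum(num for num, _ in components)
    total_den = sum(den for _, den in components)
    return fss_from_components(total_num, total_den)
```

With this sketch, two all-zero fields return `None` (undefined), a case in which only one of the two fields exceeds the threshold returns 0 whether it is a false alarm or a miss, and pooling the components over several cases generally gives a different overall value from averaging the per-case scores.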

Atmosphere ◽  
2020 ◽  
Vol 11 (11) ◽  
pp. 1166
Author(s):  
Hsin-Hung Lin ◽  
Chih-Chien Tsai ◽  
Jia-Chyi Liou ◽  
Yu-Chun Chen ◽  
Chung-Yi Lin ◽  
...  

This study utilized a radar echo extrapolation system, a high-resolution numerical model with radar data assimilation, and three blending schemes including a new empirical one, called the extrapolation adjusted by model prediction (ExAMP), to carry out 150 min reflectivity nowcasting experiments for various heavy rainfall events in Taiwan in 2019. ExAMP features full trust in the pattern of the extrapolated reflectivity with intensity adjustable by numerical model prediction. The spatial performance for two contrasting events shows that the ExAMP scheme outperforms the others for the more accurate prediction of both strengthening and weakening processes. The statistical skill for all the sampled events shows that the nowcasts by ExAMP and the extrapolation system obtain the lowest and second lowest root mean square errors at all lead times, respectively. In terms of threat scores and bias scores above certain reflectivity thresholds, the ExAMP nowcast may have more grid points of misses for high reflectivity in comparison to extrapolation, but serious overestimation among the points of hits and false alarms is the least likely to happen with the new scheme. Moreover, the event type does not change the performance ranking of the five methods, all of which have the highest predictability for a typhoon event and the lowest for local thunderstorm events.


2018 ◽  
Vol 22 (10) ◽  
pp. 5125-5141 ◽  
Author(s):  
Arun Ravindranath ◽  
Naresh Devineni ◽  
Upmanu Lall ◽  
Paulina Concha Larrauri

Abstract. Water risk management is a ubiquitous challenge faced by stakeholders in the water or agricultural sector. We present a methodological framework for forecasting water storage requirements and present an application of this methodology to risk assessment in India. The application focused on forecasting crop water stress for potatoes grown during the monsoon season in the Satara district of Maharashtra. Pre-season large-scale climate predictors used to forecast water stress were selected based on an exhaustive search method that evaluates for highest ranked probability skill score and lowest root-mean-squared error in a leave-one-out cross-validation mode. Adaptive forecasts were made in the years 2001 to 2013 using the identified predictors and a non-parametric k-nearest neighbors approach. The accuracy of the adaptive forecasts (2001–2013) was judged based on directional concordance and contingency metrics such as hit/miss rate and false alarms. Based on these criteria, our forecasts were correct 9 out of 13 times, with two misses and two false alarms. The results of these drought forecasts were compared with precipitation forecasts from the Indian Meteorological Department (IMD). We assert that it is necessary to couple informative water stress indices with an effective forecasting methodology to maximize the utility of such indices, thereby optimizing water management decisions.


2019 ◽  
Author(s):  
Lerato E Magosi ◽  
Anuj Goel ◽  
Jemma C Hopewell ◽  
Martin Farrall

Abstract Motivation Common small-effect genetic variants that contribute to human complex traits and disease are typically identified using traditional fixed-effect (FE) meta-analysis methods. However, the power to detect genetic associations under FE models deteriorates with increasing heterogeneity, so that some small-effect heterogeneous loci might go undetected. A modified random-effects meta-analysis approach (RE2) was previously developed that is more powerful than traditional fixed and random-effects methods at detecting small-effect heterogeneous genetic associations; the method was updated (RE2C) to identify small-effect heterogeneous variants overlooked by traditional fixed-effect meta-analysis. Here, we re-appraise a large-scale meta-analysis of coronary disease with RE2C to search for small-effect genetic signals potentially masked by heterogeneity in a FE meta-analysis. Results Our application of RE2C suggests a high sensitivity but low specificity of this approach for discovering small-effect heterogeneous genetic associations. We recommend that reports of small-effect heterogeneous loci discovered with RE2C are accompanied by forest plots and standardized predicted random-effects statistics to reveal the distribution of genetic effect estimates across component studies of meta-analyses, highlighting overly influential outlier studies with the potential to inflate genetic signals. Availability and implementation Scripts to calculate standardized predicted random-effects statistics and generate forest plots are available in the getspres R package from https://magosil86.github.io/getspres/. Supplementary information Supplementary data are available at Bioinformatics online.


2009 ◽  
Vol 24 (6) ◽  
pp. 1457-1471 ◽  
Author(s):  
Caren Marzban ◽  
Scott Sandgathe ◽  
Hilary Lyons ◽  
Nicholas Lederer

Abstract Three spatial verification techniques are applied to three datasets. The datasets consist of a mixture of real and artificial forecasts, and corresponding observations, designed to aid in better understanding the effects of global (i.e., across the entire field) displacement and intensity errors. The three verification techniques, each based on well-known statistical methods, have little in common and, so, present different facets of forecast quality. It is shown that a verification method based on cluster analysis can identify “objects” in a forecast and an observation field, thereby allowing for object-oriented verification in the sense that it considers displacement, missed forecasts, and false alarms. A second method compares the observed and forecast fields, not in terms of the objects within them, but in terms of the covariance structure of the fields, as summarized by their variogram. The last method addresses the agreement between the two fields by inferring the function that maps one to the other. The map—generally called optical flow—provides a (visual) summary of the “difference” between the two fields. A further summary measure of that map is found to yield useful information on the distortion error in the forecasts.


Author(s):  
Alan E Murphy ◽  
Brian M Schilder ◽  
Nathan G Skene

Abstract Motivation Genome-wide association studies (GWAS) summary statistics have popularised and accelerated genetic research. However, a lack of standardisation of the file formats used has proven problematic when running secondary analysis tools or performing meta-analysis studies. Results To address this issue, we have developed MungeSumstats, a Bioconductor R package for the standardisation and quality control of GWAS summary statistics. MungeSumstats can handle the most common summary statistic formats, including variant call format (VCF), producing a reformatted, standardised, tabular summary statistic file, VCF or R native data object. Availability MungeSumstats is available on Bioconductor (v 3.13) and can also be found on GitHub at: https://neurogenomics.github.io/MungeSumstats Supplementary information The analysis deriving the most common summary statistic formats is available at: https://al-murphy.github.io/SumstatFormats


Water ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 349 ◽  
Author(s):  
Mohamed Salem Nashwan ◽  
Shamsuddin Shahid ◽  
Xiaojun Wang

This study assessed the uncertainty in the spatial pattern of rainfall trends in six widely used monthly gridded rainfall datasets for 1979–2010. Bangladesh is considered as the case study area where changes in rainfall are the highest concern due to global warming-induced climate change. The evaluation was based on the ability of the gridded data to estimate the spatial patterns of the magnitude and significance of annual and seasonal rainfall trends estimated using Mann–Kendall (MK) and modified MK (mMK) tests at 34 gauges. A set of statistical indices including Kling–Gupta efficiency, modified index of agreement (md), skill score (SS), and Jaccard similarity index (JSI) were used. The results showed a large variation in the spatial patterns of rainfall trends obtained using different gridded datasets. Global Precipitation Climatology Centre (GPCC) data was found to be the most suitable rainfall data for the assessment of annual and seasonal rainfall trends in Bangladesh which showed a JSI, md, and SS of 22%, 0.61, and 0.73, respectively, when compared with the observed annual trend. Assessment of long-term trend in rainfall (1901–2017) using mMK test revealed no change in annual rainfall and changes in seasonal rainfall only at a few grid points in Bangladesh over the last century.


2014 ◽  
Vol 29 (6) ◽  
pp. 1451-1472 ◽  
Author(s):  
Jamie K. Wolff ◽  
Michelle Harrold ◽  
Tressa Fowler ◽  
John Halley Gotway ◽  
Louisa Nance ◽  
...  

Abstract While traditional verification methods are commonly used to assess numerical model quantitative precipitation forecasts (QPFs) using a grid-to-grid approach, they generally offer little diagnostic information or reasoning behind the computed statistic. On the other hand, advanced spatial verification techniques, such as neighborhood and object-based methods, can provide more meaningful insight into differences between forecast and observed features in terms of skill with spatial scale, coverage area, displacement, orientation, and intensity. To demonstrate the utility of applying advanced verification techniques to mid- and coarse-resolution models, the Developmental Testbed Center (DTC) applied several traditional metrics and spatial verification techniques to QPFs provided by the Global Forecast System (GFS) and operational North American Mesoscale Model (NAM). Along with frequency bias and Gilbert skill score (GSS) adjusted for bias, both the fractions skill score (FSS) and Method for Object-Based Diagnostic Evaluation (MODE) were utilized for this study with careful consideration given to how these methods were applied and how the results were interpreted. By illustrating the types of forecast attributes appropriate to assess with the spatial verification techniques, this paper provides examples of how to obtain advanced diagnostic information to help identify what aspects of the forecast are or are not performing well.


2018 ◽  
Author(s):  
Arun Ravindranath ◽  
Naresh Devineni ◽  
Upmanu Lall ◽  
Paulina Concha Larrauri

Abstract. Water risk management is perhaps the most ubiquitous challenge a stakeholder in the water or agricultural sector faces. We present a methodological framework for forecasting water storage requirements and present an application of this methodology to risk assessment in India. The application focused on forecasting crop water stress for potatoes grown during the monsoon season in the Satara district of Maharashtra. Pre-season large-scale climate predictors used to forecast water stress were selected based on an exhaustive search method that evaluates for highest Rank Probability Skill Score and lowest Mean Squared Error in a leave-one-out cross validation mode. Adaptive forecasts were made over the years 2001 through 2013 using the identified predictors and a semi-parametric k-nearest neighbors approach. The accuracy of the adaptive forecasts (2001–2013) was judged based on directional concordance and contingency metrics such as hit/miss rate and false alarms. Based on these criteria, our forecasts were correct nine out of thirteen times, with two misses and two false alarms. The results of these drought forecasts were compared with precipitation forecasts from the Indian Meteorological Department (IMD). We assert that it is necessary to couple informative water stress/risk indices with an effective forecasting methodology to maximize the utility of such indices, thereby optimizing water management decisions.


2014 ◽  
Vol 6 (2) ◽  
pp. 288-299 ◽  
Author(s):  
K. Srinivasa Raju ◽  
D. Nagesh Kumar

Eleven general circulation models/global climate models (GCMs) – BCCR-BCCM2.0, INGV-ECHAM4, GFDL2.0, GFDL2.1, GISS, IPSL-CM4, MIROC3, MRI-CGCM2, NCAR-PCMI, UKMO-HADCM3 and UKMO-HADGEM1 – are evaluated for Indian climate conditions using the performance indicator, skill score (SS). Two climate variables, temperature T (at three levels, i.e. 500, 700, 850 mb) and precipitation rate (Pr) are considered resulting in four SS-based evaluation criteria (T500, T700, T850, Pr). The multicriterion decision-making method, technique for order preference by similarity to an ideal solution, is applied to rank 11 GCMs. Efforts are made to rank GCMs for the Upper Malaprabha catchment and two river basins, namely, Krishna and Mahanadi (covered by 17 and 15 grids of size 2.5° × 2.5°, respectively). Similar efforts are also made for India (covered by 73 grid points of size 2.5° × 2.5°) for which an ensemble of GFDL2.0, INGV-ECHAM4, UKMO-HADCM3, MIROC3, BCCR-BCCM2.0 and GFDL2.1 is found to be suitable. It is concluded that the proposed methodology can be applied to similar situations with ease.


2017 ◽  
Vol 56 (8) ◽  
pp. 2335-2352 ◽  
Author(s):  
Peter Ukkonen ◽  
Agostino Manzato ◽  
Antti Mäkelä

Abstract This work evaluates numerous thunderstorm predictors and investigates the use of artificial neural networks (ANNs) for identifying occurrences of thunderstorms in reanalysis data. Environmental conditions favorable for deep, moist convection are derived from 6-hourly ERA-Interim reanalyses, while thunderstorm occurrence in the following 6 h over Finland is derived from lightning location data. By taking advantage of the consistency and large sample size (14 summers) provided by the reanalysis, complex multivariate models can be trained for a robust estimation of convective weather events from model data. This and other methods are used to yield information on the most effective convective predictors in a multivariate setting, which can also benefit the forecasting community. The best ANN found uses 15 inputs and received a Heidke skill score (HSS) of 0.51 on an independent test sample. This is a substantial improvement over the best predictor when used alone, the most unstable lifted index (MULI) with HSS = 0.40, the multivariate model having fewer false alarms in particular. After MULI, the most important ANN input was relative humidity near 700 hPa. Dry air aloft was associated with significantly lower thunderstorm probability and flash density regardless of convective available potential energy (CAPE). Other important parameters for thunderstorm development were vertical velocity and low-level θe advection. Finally, the Peirce skill score indicates a clear meridional gradient in skill for categorical forecasts, with higher skill in northern Finland. This analysis suggests that the difference in skill is real and associated with a steeper thunderstorm probability curve in the north, but further studies are needed for a physical explanation.
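
For readers unfamiliar with the two categorical scores named in this abstract, the following Python sketch shows the standard 2×2 contingency-table definitions of the Heidke and Peirce skill scores. It is only an illustration: the function names are hypothetical, the counts in the usage note are made up, and nothing here reproduces the authors' data or code.

```python
from typing import Optional


def heidke_skill_score(hits: int, false_alarms: int,
                       misses: int, correct_negatives: int) -> Optional[float]:
    """HSS = 2(ad - bc) / [(a + c)(c + d) + (a + b)(b + d)],
    with a = hits, b = false alarms, c = misses, d = correct negatives.
    Returns None when the denominator is zero."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    if denominator == 0:
        return None
    return 2.0 * (a * d - b * c) / denominator


def peirce_skill_score(hits: int, false_alarms: int,
                       misses: int, correct_negatives: int) -> Optional[float]:
    """PSS = hit rate - false alarm rate = a/(a + c) - b/(b + d).
    Returns None when there are no observed events or no observed non-events."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    if (a + c) == 0 or (b + d) == 0:
        return None
    return a / (a + c) - b / (b + d)
```

As a usage check with made-up counts, a perfect set of forecasts such as `heidke_skill_score(10, 0, 0, 90)` gives 1, while a table with equal counts in all four cells gives 0 for both scores.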

