Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores

2019 ◽  
Vol 23 (10) ◽  
pp. 4323-4331 ◽  
Author(s):  
Wouter J. M. Knoben ◽  
Jim E. Freer ◽  
Ross A. Woods

Abstract. A traditional metric used in hydrology to summarize model performance is the Nash–Sutcliffe efficiency (NSE). Increasingly an alternative metric, the Kling–Gupta efficiency (KGE), is used instead. When NSE is used, NSE = 0 corresponds to using the mean flow as a benchmark predictor. The same reasoning is applied in various studies that use KGE as a metric: negative KGE values are viewed as bad model performance, and only positive values are seen as good model performance. Here we show that using the mean flow as a predictor does not result in KGE = 0, but instead KGE = 1 − √2 ≈ −0.41. Thus, KGE values greater than −0.41 indicate that a model improves upon the mean flow benchmark – even if the model's KGE value is negative. NSE and KGE values cannot be directly compared, because their relationship is non-unique and depends in part on the coefficient of variation of the observed time series. Therefore, modellers who use the KGE metric should not let their understanding of NSE values guide them in interpreting KGE values and instead develop new understanding based on the constitutive parts of the KGE metric and the explicit use of benchmark values to compare KGE scores against. More generally, a strong case can be made for moving away from ad hoc use of aggregated efficiency metrics and towards a framework based on purpose-dependent evaluation metrics and benchmarks that allows for more robust model adequacy assessment.
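
The mean-flow benchmark values quoted in the abstract can be checked numerically. The following is a minimal sketch, not the authors' code: the function names `nse` and `kge` and the flow values are hypothetical, and the correlation of a constant simulation (which is formally undefined) is set to zero by assumption, which is the convention that reproduces KGE = 1 − √2.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / (sum of squared deviations of obs)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(obs, sim):
    """Kling-Gupta efficiency from correlation r, variability ratio alpha, bias ratio beta."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    alpha = sim.std() / obs.std()    # ratio of standard deviations
    beta = sim.mean() / obs.mean()   # ratio of means
    if sim.max() == sim.min():
        # Assumption: a constant simulation has no defined correlation; use r = 0.
        r = 0.0
    else:
        r = np.corrcoef(obs, sim)[0, 1]
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Hypothetical observed flows, used only for illustration.
obs = np.array([1.2, 3.4, 2.2, 8.9, 4.1, 0.7])
mean_flow = np.full_like(obs, obs.mean())  # the mean-flow benchmark "model"

print("NSE of mean flow:", nse(obs, mean_flow))
print("KGE of mean flow:", kge(obs, mean_flow))
print("1 - sqrt(2):     ", 1 - np.sqrt(2))
```

With the mean flow as the simulation, alpha is 0, beta is 1 and r is taken as 0, so the KGE expression reduces to 1 − √2 regardless of the observed values, whereas the NSE is 0.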

2019 ◽  
Author(s):  
Wouter J. M. Knoben ◽  
Jim E. Freer ◽  
Ross A. Woods

Abstract. A traditional metric used in hydrology to summarize model performance is the Nash-Sutcliffe Efficiency (NSE). Increasingly an alternative metric, the Kling-Gupta Efficiency (KGE), is used instead. When NSE is used, NSE = 0 corresponds to using the mean flow as a benchmark predictor. The same reasoning is applied in various studies that use KGE as a metric: negative KGE values are often viewed in the literature as bad model performance and positive values are seen as good model performance. Here we show that using the mean flow as a predictor does not result in KGE = 0, but instead KGE = 1−√2 ≈ −0.41. Thus, KGE values greater than −0.41 indicate that a model improves upon the mean flow benchmark – even if the model's KGE value is negative. NSE and KGE values cannot be directly compared, because their relationship is non-unique and depends in part on the coefficient of variation of the observed time series. Therefore, we argue that modellers should not let their understanding of NSE values guide them in interpreting KGE values and instead develop new understanding based on the constitutive parts of the KGE metric and the explicit use of benchmark values to compare KGE scores against.
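
Why the NSE–KGE relationship depends on the coefficient of variation of the observations can be seen from the standard decomposition of the mean squared error. The sketch below is offered as an illustration rather than the paper's own derivation; the symbols (r for linear correlation, α for the ratio of simulated to observed standard deviation, β for the ratio of means, β_n for the bias normalized by the observed standard deviation, CV_o for the observed coefficient of variation) are notation assumed here.

```latex
\begin{align}
\mathrm{NSE} &= 2\alpha r - \alpha^{2} - \beta_n^{2},
  & \beta_n &= \frac{\mu_s - \mu_o}{\sigma_o}, \\
\mathrm{KGE} &= 1 - \sqrt{(r-1)^{2} + (\alpha-1)^{2} + (\beta-1)^{2}},
  & \beta &= \frac{\mu_s}{\mu_o}, \\
\beta_n &= \frac{\mu_o}{\sigma_o}\,(\beta - 1) = \frac{\beta - 1}{\mathrm{CV}_o},
  & \mathrm{CV}_o &= \frac{\sigma_o}{\mu_o}.
\end{align}
```

The two scores share r and α but weight the bias differently: NSE sees it through β_n, KGE through β, and the two are linked only via CV_o. For the mean-flow benchmark (α = 0, β = 1, and r taken as 0), the expressions give NSE = 0 and KGE = 1 − √2.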


Author(s):  
D. Prandle

An estimate is made of the mean value of residual flow through the Dover Strait for each month over the 24-year period from 1949 to 1972. The estimates are based on results from a modelling investigation by Prandle (1978) where it was shown that the residual flow consists of three components: (a) a tidal residual, (b) a wind-driven residual and (c) a flow due to a long-term gradient in mean sea level. Components (a) and (c) are assumed to be constant and the value of (b) is deduced using wind data recorded by Dutch Light Vessels located in the southern North Sea. The mean flow over the whole period amounts to 155 × 10³ m³ s⁻¹ into the North Sea with a maximum value of 364 × 10³ m³ s⁻¹ and a minimum of −15 × 10³ m³ s⁻¹ (out of the North Sea). One notable feature of the complete time series is the surprisingly small variation in the annual mean flows; perhaps this stability in the annual flow is of significance to the marine biology of the area. The validity of the computed time series is established by reference to comparable data including a 9-year record, from cross-channel submarine cables, of the potential induced by the flow of water through the Earth's magnetic field. Additional comparisons are also made with the results of a previous study of daily-mean flows.


2019 ◽  
Vol 30 (8) ◽  
pp. 3985-4011
Author(s):  
Nikhil Kalkote ◽  
Ashutosh Kumar ◽  
Ashwani Assam ◽  
Vinayak Eswaran

Purpose The purpose of this paper is to study the predictive capability of the recently proposed length scale-based two-equation k-kL model for external aerodynamic flows such as those encountered in high-lift devices. Design/methodology/approach The two-equation k-kL model solves the transport equations of turbulent kinetic energy (TKE) and the product of TKE and the integral length scale to obtain the effect of turbulence on the mean flow field. In theory, the use of a governing equation for the length scale (kL) along with the TKE promises applicability in a wide range of applications in both free-shear and wall-bounded flows with eddy-resolving capability. Findings The model is implemented in an in-house unstructured grid computational fluid dynamics solver to investigate its performance for airfoils in difficult-to-predict situations, including stalling and separation. The numerical findings show that the model handles the complex flow physics of external aerodynamic computations well. Originality/value The model performance is studied for stationary turbulent external aerodynamic flows, using five different airfoils, including two multi-element airfoils in high-lift configurations which, to the authors' knowledge, have not been simulated with the k-kL model until now.


2018 ◽  
Vol 5 (4) ◽  
pp. e63 ◽  
Author(s):  
Antoine Nzeyimana ◽  
Kate EA Saunders ◽  
John R Geddes ◽  
Patrick E McSharry

Background Depression in people with bipolar disorder is a major cause of long-term disability, possibly leading to early mortality, and few safe and effective therapies currently exist. Although existing monotherapies such as quetiapine have limited proven efficacy and practical tolerability, treatment combinations may lead to improved outcomes. Lamotrigine is an anticonvulsant currently licensed for the prevention of depressive relapses in individuals with bipolar disorder. A double-blinded randomized placebo-controlled trial (comparative evaluation of Quetiapine-Lamotrigine [CEQUEL] study) was conducted to evaluate the efficacy of lamotrigine plus quetiapine versus quetiapine monotherapy in patients with bipolar type I or type II disorders. Objective Because the original CEQUEL study found significant depressive symptom improvements, the objective of this study was to reanalyze CEQUEL data and determine an unbiased classification accuracy for active lamotrigine versus placebo. We also wanted to establish the time it took for the drug to provide statistically significant outcomes. Methods Between October 21, 2008 and April 27, 2012, 202 participants from 27 sites in the United Kingdom were randomly assigned to two treatments: lamotrigine (n=101) and placebo (n=101). The primary variable used for estimating depressive symptoms was based on the Quick Inventory of Depressive Symptomatology—self report version 16 (QIDS-SR16). The original CEQUEL study findings were confirmed by performing a t test and linear regression. Multiple features were computed from the QIDS-SR16 time series; different linear and nonlinear binary classifiers were trained to distinguish between the two groups. Various feature-selection techniques were used to select a feature set with the greatest explanatory power; a 10-fold cross-validation was used. Results From weeks 10 to 14, the mean difference in QIDS-SR16 ratings between the groups was −1.6317 (P=.09; sample size=81, 77; 95% CI −0.2403 to 3.5036). 
From weeks 48 to 52, the mean difference was −2.0032 (P=.09; sample size=54, 48; 95% CI −0.3433 to 4.3497). The coefficient of variation (σ/μ) and detrended fluctuation analysis (DFA) exponent alpha had the greatest explanatory power. The out-of-sample classification accuracy for the 138 participants who reported more than 10 times after week 12 was 62%. A consistent classification accuracy higher than the no-information benchmark was obtained in week 44. Conclusions Adding lamotrigine to quetiapine treatment decreased depressive symptoms in patients with bipolar disorder. Our classification model suggested that lamotrigine increased the coefficient of variation in the QIDS-SR16 scores. The lamotrigine group also tended to have a lower DFA exponent, implying a substantial temporal instability in the time series. The performance of the model over time suggested that a trial of at least 44 weeks was required to achieve consistent results. The selected model confirmed the original CEQUEL study findings and helped in understanding the temporal dynamics of bipolar depression during treatment. Trial Registration EudraCT Number 2007-004513-33; https://www.clinicaltrialsregister.eu/ctr-search/trial/2007-004513-33/GB (Archived by WebCite at http://www.webcitation.org/73sNaI29O).
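
The two features singled out in the abstract, the coefficient of variation (σ/μ) and the DFA exponent alpha, can be computed from a time series as sketched below. This is an illustrative first-order DFA under stated assumptions, not the study's implementation: the function names, the window sizes and the synthetic test series are all hypothetical, and the abstract does not say which DFA variant or settings were used.

```python
import numpy as np

def coefficient_of_variation(x):
    """Population standard deviation divided by the mean (sigma / mu)."""
    x = np.asarray(x, dtype=float)
    return x.std() / x.mean()

def dfa_alpha(x, window_sizes):
    """First-order detrended fluctuation analysis exponent (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())          # integrated, mean-removed series
    fluctuations = []
    for n in window_sizes:
        n_windows = len(profile) // n
        segments = profile[: n_windows * n].reshape(n_windows, n)
        t = np.arange(n)
        sq_resid = []
        for seg in segments:
            slope, intercept = np.polyfit(t, seg, 1)   # local linear trend
            sq_resid.append(np.mean((seg - (slope * t + intercept)) ** 2))
        fluctuations.append(np.sqrt(np.mean(sq_resid)))  # RMS fluctuation F(n)
    # alpha is the slope of log F(n) against log n
    alpha, _ = np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)
    return alpha

# Synthetic series with known behaviour (not trial data).
rng = np.random.default_rng(0)
white_noise = rng.normal(size=2048)
sizes = [8, 16, 32, 64, 128]
alpha_noise = dfa_alpha(white_noise, sizes)            # uncorrelated noise: near 0.5
alpha_walk = dfa_alpha(np.cumsum(white_noise), sizes)  # random walk: near 1.5
cv_example = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])

print(alpha_noise, alpha_walk, cv_example)
```

A lower exponent corresponds to a less persistent, more erratic series, which is the sense in which the abstract links a lower DFA exponent to temporal instability.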


2004 ◽  
Vol 155 (5) ◽  
pp. 142-145 ◽  
Author(s):  
Claudio Defila

The record-breaking heatwave of 2003 also had an impact on the vegetation in Switzerland. To examine its influence, seven phenological late spring and summer phases were evaluated together with six phases in the autumn from a selection of stations. 30% of the 122 chosen phenological time series in late spring and summer phases set a new record (earliest arrival). The proportion of very early arrivals is very high and the mean deviation from the norm is between 10 and 20 days. The situation was less extreme in autumn, where 20% of the 103 time series chosen set a new record. The majority of the phenological arrivals were found in the class «normal» but the class «very early» is still well represented. The mean precocity lies between five and twenty days. As far as the leaf shedding of the beech is concerned, there was even a slight delay of around six days. The evaluation serves to show that the heatwave of 2003 strongly influenced the phenological events of summer and spring.


2018 ◽  
Vol 5 (01) ◽  
Author(s):  
TAPAN K. KHURA ◽  
H. L. KUSHWAHA ◽  
SATISH D LANDE ◽  
P. K. SAHOO ◽  
INDRA L. KUSHWAHA

Floriculture is an age-old farming activity in India with immense potential for generating self-employment and income for farmers. However, the cost of cultivating flowers is high compared with cereal crops. The low level of mechanization of field operations is one of the foremost reasons for the higher cost of cultivation. As most Indian farmers are marginal and small, a need for a manually operated gladiolus planter was felt. The geometric properties of gladiolus corms were determined for designing the seed metering system and seed hopper of the planter. The planter was evaluated in the field when pulled by two persons as a power source and guided by a third person. The coefficient of variation and the highest deviation from the mean spacing were 12.93% and 2.65 cm, respectively. The maximum coefficient of uniformity of 90.59% was observed for a nominal corm spacing of 15 cm at a forward speed of 0.56 km h⁻¹. The average miss percentage was 2.65 and 2.25 for nominal corm spacings of 15 and 20 cm, respectively. The multiple index was zero for the two levels of corm spacing and forward speed of operation. The quality of feed index (QFI) ranged from 97.2 to 97.9 percent. The average field capacity of the planter was 0.02 ha h⁻¹. The average draft requirement of the planter was 821 ± 50.3 N.
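
The spacing metrics reported above can be computed from measured plant-to-plant spacings as in the following sketch. It assumes the thresholds commonly used in planter evaluation (a spacing of at most 0.5 times the nominal value counts as a multiple, more than 1.5 times as a miss, anything in between as correctly fed); the abstract does not state which definitions the authors used, and the function name and spacing values here are hypothetical.

```python
import numpy as np

def spacing_indices(spacings_cm, nominal_cm):
    """Miss index, multiple index, quality of feed index and CV, all in percent."""
    s = np.asarray(spacings_cm, dtype=float)
    ratio = s / nominal_cm
    multiple = ratio <= 0.5          # assumed threshold: two corms dropped too close
    miss = ratio > 1.5               # assumed threshold: a corm was skipped
    fed_ok = ~(multiple | miss)      # spacing within the acceptable band
    return {
        "multiple_index_pct": 100.0 * multiple.mean(),
        "miss_index_pct": 100.0 * miss.mean(),
        "qfi_pct": 100.0 * fed_ok.mean(),
        # CV over all measured spacings (sample standard deviation / mean)
        "cv_pct": 100.0 * s.std(ddof=1) / s.mean(),
    }

# Hypothetical field measurements for a nominal spacing of 15 cm.
measured = [14, 16, 15, 13, 31, 15, 17, 14, 6, 15]
result = spacing_indices(measured, nominal_cm=15)
print(result)
```

By construction the three indices sum to 100 percent, so a QFI near 97 to 98 percent with a zero multiple index implies a miss index of roughly 2 to 3 percent, consistent with the figures in the abstract.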


1985 ◽  
Vol 50 (11) ◽  
pp. 2396-2410
Author(s):  
Miloslav Hošťálek ◽  
Ivan Fořt

The study describes a method of modelling axial-radial circulation in a tank with an axial impeller and radial baffles. The proposed model is based on the analytical solution of the equation for vortex transport in the mean flow of turbulent liquid. The obtained vortex flow model is tested against the results of experiments carried out in a tank of diameter 1 m with a bottom in the shape of a truncated cone, as well as against published data for a vessel of diameter 0.29 m with a flat bottom. Though the model equations are expressed in a simple form, good qualitative and even quantitative agreement of the model with reality is found. Apart from its simplicity, the model has other advantages: the minimum number of experimental data necessary for the completion of boundary conditions and the integral nature of these data.


2009 ◽  
Vol 27 (1) ◽  
pp. 1-30 ◽  
Author(s):  
P. Prikryl ◽  
V. Rušin ◽  
M. Rybanský

Abstract. A sun-weather correlation, namely the link between solar magnetic sector boundary passage (SBP) by the Earth and upper-level tropospheric vorticity area index (VAI), that was found by Wilcox et al. (1974) and shown to be statistically significant by Hines and Halevy (1977) is revisited. A minimum in the VAI one day after SBP followed by an increase a few days later was observed. Using the ECMWF ERA-40 re-analysis dataset for the original period from 1963 to 1973 and extending it to 2002, we have verified what has become known as the "Wilcox effect" for the Northern as well as the Southern Hemisphere winters. The effect persists through years of high and low volcanic aerosol loading except for the Northern Hemisphere at 500 mb, when the VAI minimum is weak during the low aerosol years after 1973, particularly for sector boundaries associated with south-to-north reversals of the interplanetary magnetic field (IMF) BZ component. The "disappearance" of the Wilcox effect was found previously by Tinsley et al. (1994) who suggested that enhanced stratospheric volcanic aerosols and changes in air-earth current density are necessary conditions for the effect. The present results indicate that the Wilcox effect does not require high aerosol loading to be detected. The results are corroborated by a correlation with coronal holes where the fast solar wind originates. Ground-based measurements of the green coronal emission line (Fe XIV, 530.3 nm) are used in the superposed epoch analysis keyed by the times of sector boundary passage to show a one-to-one correspondence between the mean VAI variations and coronal holes. The VAI is modulated by high-speed solar wind streams with a delay of 1–2 days. The Fourier spectra of VAI time series show peaks at periods similar to those found in the solar corona and solar wind time series. In the modulation of VAI by solar wind the IMF BZ seems to control the phase of the Wilcox effect and the depth of the VAI minimum. 
The mean VAI response to SBP associated with the north-to-south reversal of BZ is leading by up to 2 days the mean VAI response to SBP associated with the south-to-north reversal of BZ. For the latter, less geoeffective events, the VAI minimum deepens (with the above exception of the Northern Hemisphere low-aerosol 500-mb VAI) and the VAI maximum is delayed. The phase shift between the mean VAI responses obtained for these two subsets of SBP events may explain the reduced amplitude of the overall Wilcox effect. In a companion paper, Prikryl et al. (2009) propose a new mechanism to explain the Wilcox effect, namely that solar-wind-generated auroral atmospheric gravity waves (AGWs) influence the growth of extratropical cyclones. It is also observed that severe extratropical storms, explosive cyclogenesis and significant sea level pressure deepenings of extratropical storms tend to occur within a few days of the arrival of high-speed solar wind. These observations are discussed in the context of the proposed AGW mechanism as well as the previously suggested atmospheric electrical current (AEC) model (Tinsley et al., 1994), which requires the presence of stratospheric aerosols for a significant (Wilcox) effect.
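
The superposed epoch analysis mentioned in the abstract averages a time series over windows aligned on a set of key times (here, sector boundary passages). The sketch below shows the basic technique on synthetic data only; the function name, window length, event spacing and the imposed dip one day after each key time are assumptions made for illustration, not values taken from the study.

```python
import numpy as np

def superposed_epoch(series, key_indices, lag_before, lag_after):
    """Average `series` over windows centred on each key index.

    Returns the lags (in samples) and the composite mean at each lag.
    Key times whose window would run off either end of the series are skipped.
    """
    series = np.asarray(series, dtype=float)
    lags = np.arange(-lag_before, lag_after + 1)
    windows = [
        series[k - lag_before : k + lag_after + 1]
        for k in key_indices
        if k - lag_before >= 0 and k + lag_after < len(series)
    ]
    return lags, np.mean(windows, axis=0)

# Synthetic daily index: noise plus a dip one day after each key time.
rng = np.random.default_rng(1)
daily_index = rng.normal(0.0, 0.5, size=640)
key_days = np.arange(20, 620, 20)        # hypothetical key times, 30 events
daily_index[key_days + 1] -= 2.0         # imposed minimum at lag +1

lags, composite = superposed_epoch(daily_index, key_days, lag_before=5, lag_after=5)
lag_of_minimum = lags[np.argmin(composite)]
print(lag_of_minimum)
```

Averaging over many events suppresses noise that is unrelated to the key times, so a response that recurs at a fixed lag stands out in the composite even when it is hard to see in any single event.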

