Ideal point error for model assessment

2012 ◽  
Vol 9 (2) ◽  
pp. 1671-1698
Author(s):  
C. W. Dawson ◽  
R. J. Abrahart ◽  
A. Y. Shamseldin ◽  
N. J. Mount

Abstract. When analysing the performance of hydrological models, researchers use a number of diverse statistics. Although some statistics appear to be used more regularly in such analyses than others, there is a distinct lack of consistency in evaluation, making studies undertaken by different authors or performed at different locations difficult to compare in a meaningful manner. Moreover, even within individual reported case studies, substantial contradictions are found to occur between one measure of performance and another. In this paper we examine the Ideal Point Error (IPE) metric, a recently introduced measure of model performance that integrates a number of recognised metrics in a logical way. Having a single, integrated measure of performance is appealing as it should permit more straightforward model inter-comparisons. However, IPE relies on the adoption of a consistent and recognised benchmarking system. This paper examines one potential option for benchmarking IPE: the use of "persistence scenarios".

2012 ◽  
Vol 16 (8) ◽  
pp. 3049-3060 ◽  
Author(s):  
C. W. Dawson ◽  
N. J. Mount ◽  
R. J. Abrahart ◽  
A. Y. Shamseldin

Abstract. When analysing the performance of hydrological models in river forecasting, researchers use a number of diverse statistics. Although some statistics appear to be used more regularly in such analyses than others, there is a distinct lack of consistency in evaluation, making studies undertaken by different authors or performed at different locations difficult to compare in a meaningful manner. Moreover, even within individual reported case studies, substantial contradictions are found to occur between one measure of performance and another. In this paper we examine the ideal point error (IPE) metric – a recently introduced measure of model performance that integrates a number of recognised metrics in a logical way. Having a single, integrated measure of performance is appealing as it should permit more straightforward model inter-comparisons. However, this is reliant on a transferrable standardisation of the individual metrics that are combined to form the IPE. This paper examines one potential option for standardisation: the use of naive model benchmarking.
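The abstract does not state the IPE formula, so the following is only a minimal sketch of the general idea under stated assumptions: each error metric is standardised by dividing it by the value a naive persistence forecast achieves, and the IPE is taken as the root mean square distance of those ratios from an ideal point at zero. The choice of RMSE and MAE as the combined metrics, and every function name, are hypothetical and not taken from the paper.

```python
import math

def rmse(obs, sim):
    """Root mean squared error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def persistence_forecast(obs, lead=1):
    """Naive benchmark: the forecast for time t is the observation at t - lead."""
    return obs[:-lead]

def ideal_point_error(obs, sim, lead=1, metrics=(rmse, mae)):
    """Assumed form of IPE: RMS distance of benchmark-standardised metrics from 0."""
    # Align the model and the benchmark on the period the naive forecast covers.
    target = obs[lead:]
    model = sim[lead:]
    naive = persistence_forecast(obs, lead)
    # A ratio of 0 is the ideal point; 1 means "no better than the naive model".
    # The division fails if the naive forecast is perfect (e.g. a constant series).
    ratios = [m(target, model) / m(target, naive) for m in metrics]
    return math.sqrt(sum(r ** 2 for r in ratios) / len(ratios))
```

Under these assumptions a perfect model scores 0 and a model that merely repeats the previous observation scores 1, which is what makes the value comparable across catchments with different flow magnitudes.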


2018 ◽  
Vol 20 (6) ◽  
pp. 1387-1400
Author(s):  
Yiqun Sun ◽  
Weimin Bao ◽  
Peng Jiang ◽  
Xuying Wang ◽  
Chengmin He ◽  
...  

Abstract. The dynamic system response curve (DSRC) has its origin in correcting model variables of hydrologic models to improve the accuracy of flood prediction. The DSRC method can lead to unstable performance since the least squares (LS) method, employed by DSRC to estimate the errors, often breaks down for ill-posed problems. A previous study has shown that under certain assumptions the DSRC method can be regarded as a specific form of the numerical solution of the Fredholm equation of the first kind, which is a typical ill-posed problem. This paper introduces the truncated singular value decomposition (TSVD) to propose an improved version of the DSRC method (TSVD-DSRC). The proposed method is extended to correct the initial conditions of a conceptual hydrological model. The usefulness of the proposed method is first demonstrated via a synthetic case study where the perturbed initial conditions, the true initial conditions, and the corrected initial conditions are all precisely known. Then the proposed method is used in two real basins. The results, measured by two different criteria, clearly demonstrate that correcting the initial conditions of hydrological models significantly improves the model performance. Similarly good results are obtained for the real case studies.
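To illustrate why truncation helps with an ill-posed least squares problem, here is a minimal generic TSVD solver using NumPy. It is a sketch of the standard technique only, not the TSVD-DSRC method itself; the function name and the choice of truncation level `k` are assumptions.

```python
import numpy as np

def tsvd_solve(A, b, k):
    """Least squares solution of A x ~ b keeping only the k largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    # Project b onto the leading k left singular vectors, divide by the
    # corresponding singular values, and map back through the right vectors.
    coeffs = (U[:, :k].T @ b) / s[:k]
    return Vt[:k].T @ coeffs

# A nearly singular system: the second singular value is tiny, so the
# untruncated solution amplifies the second component of b enormously.
A = np.array([[1.0, 0.0], [0.0, 1e-8]])
b = np.array([1.0, 1e-3])
x_full = tsvd_solve(A, b, k=2)   # ordinary least squares solution
x_trunc = tsvd_solve(A, b, k=1)  # small singular value discarded
```

Dropping the small singular value trades a little bias for a much more stable estimate, which is the usual motivation for regularising this kind of problem.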


2017 ◽  
Vol 28 (1) ◽  
pp. 309-320 ◽  
Author(s):  
Scott Powers ◽  
Valerie McGuire ◽  
Leslie Bernstein ◽  
Alison J Canchola ◽  
Alice S Whittemore

Personal predictive models for disease development play important roles in chronic disease prevention. The performance of these models is evaluated by applying them to the baseline covariates of participants in external cohort studies, with model predictions compared to subjects' subsequent disease incidence. However, the covariate distribution among participants in a validation cohort may differ from that of the population for which the model will be used. Since estimates of predictive model performance depend on the distribution of covariates among the subjects to which the model is applied, such differences can cause misleading estimates of model performance in the target population. We propose a method for addressing this problem by weighting the cohort subjects to make their covariate distribution better match that of the target population. Simulations show that the method provides accurate estimates of model performance in the target population, while unweighted estimates may not. We illustrate the method by applying it to evaluate an ovarian cancer prediction model targeted to US women, using cohort data from participants in the California Teachers Study. The methods can be implemented using the open-source R package RMAP (Risk Model Assessment Package), available at http://stanford.edu/~ggong/rmap/ .
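The reweighting idea can be sketched with simple stratum-based weights, where each cohort subject is weighted by the ratio of the target population share to the cohort share of that subject's covariate stratum. This is an assumed, simplified illustration and not the RMAP implementation; the function names and the toy data are hypothetical.

```python
from collections import Counter

def stratum_weights(cohort_strata, target_shares):
    """Weight each subject by target share / cohort share of its stratum."""
    n = len(cohort_strata)
    cohort_shares = {s: c / n for s, c in Counter(cohort_strata).items()}
    return [target_shares[s] / cohort_shares[s] for s in cohort_strata]

def weighted_mean(values, weights):
    """Weighted average, e.g. of observed outcomes or per-subject errors."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Toy cohort: three younger subjects and one older subject (made-up data),
# while the target population is assumed to be split evenly.
strata = ["young", "young", "young", "old"]
outcomes = [0, 0, 1, 1]
weights = stratum_weights(strata, {"young": 0.5, "old": 0.5})
incidence_unweighted = sum(outcomes) / len(outcomes)
incidence_weighted = weighted_mean(outcomes, weights)
```

In this toy case the unweighted incidence is 0.5, while the weighted estimate is 2/3, matching what an evenly split target population would show given the within-stratum rates.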


Risks ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. 204
Author(s):  
Chamay Kruger ◽  
Willem Daniel Schutte ◽  
Tanja Verster

This paper proposes a methodology that utilises model performance as a metric to assess the representativeness of external or pooled data when it is used by banks in regulatory model development and calibration. There is currently no formal methodology to assess representativeness. The paper provides a review of existing regulatory literature on the requirements of assessing representativeness and emphasises that both qualitative and quantitative aspects need to be considered. We present a novel methodology, apply it to two case studies, and compare it with the Multivariate Prediction Accuracy Index. The first case study investigates whether a pooled data source from Global Credit Data (GCD) is representative when considering the enrichment of internal data with pooled data in the development of a regulatory loss given default (LGD) model. The second case study differs from the first by illustrating which other countries in the pooled data set could be representative when enriching internal data during the development of an LGD model. Using these case studies as examples, our proposed methodology provides users with a generalised framework to identify subsets of the external data that are representative of their country's or bank's data, making the results general and universally applicable.


Author(s):  
Yuqiao Yang ◽  
Xiaoqiang Lin ◽  
Geng Lin ◽  
Zengfeng Huang ◽  
Changjian Jiang ◽  
...  

In this paper, we explore learning representations of legislation and legislators for the prediction of roll call results. The most popular approach to this task is the ideal point model, which relies on historical voting information to learn representations of legislators. It largely ignores the context information of the legislative data. We therefore propose to incorporate context information to learn dense representations for both legislators and legislation. For legislators, we incorporate the relations among them via graph convolutional neural networks (GCN). For legislation, we utilize its narrative description via recurrent neural networks (RNN). In order to align the two kinds of representations in the same vector space, we introduce a triplet loss for joint training. Experimental results on a self-constructed dataset show the effectiveness of our model for roll call prediction compared to some state-of-the-art baselines.
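A triplet loss of the kind mentioned here can be sketched in a few lines. The abstract does not say how triplets are formed, so the pairing used below (a legislator embedding as anchor, a supported bill as positive, an opposed bill as negative), the Euclidean distance, and the margin value are all assumptions for illustration.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical 2-d embeddings living in one shared vector space.
legislator = [0.0, 0.0]
bill_supported = [0.0, 1.0]
bill_opposed = [3.0, 4.0]
loss_ok = triplet_loss(legislator, bill_supported, bill_opposed)   # already separated
loss_bad = triplet_loss(legislator, bill_opposed, bill_supported)  # wrong way round
```

The loss is zero once the positive item is closer to the anchor than the negative item by at least the margin, so minimising it pulls the two kinds of embedding into a common, comparable space.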


Author(s):  
Rodric Mérimé Nonki ◽  
André Lenouo ◽  
Christopher J. Lennard ◽  
Raphael M. Tshimanga ◽  
Clément Tchawoua

Abstract. Potential evapotranspiration (PET) plays a crucial role in water management, including irrigation systems design and management. It is an essential input to hydrological models. Direct measurement of PET is difficult, time-consuming and costly; therefore, a number of different methods are used to compute this variable. This study compares the two sensitivity analysis approaches generally used to assess the impact of PET on hydrological model performance. We conducted the study in the Upper Benue River Basin (UBRB), located in northern Cameroon, using two lumped-conceptual rainfall-runoff models and nineteen PET estimation methods. A Monte-Carlo procedure was implemented to calibrate the hydrological models for each PET input while considering similar objective functions. Although there were notable differences between PET estimation methods, the hydrological models' performance was satisfactory for each PET input in the calibration and validation periods. The optimized model parameters were significantly affected by the PET inputs, especially the parameter responsible for transforming PET into actual ET. The hydrological models' performance was insensitive to the PET input under the dynamic sensitivity approach, while it was significantly affected under the static sensitivity approach. This means that the over- or under-estimation of PET is compensated by the model parameters during model recalibration. The model performance was insensitive to rescaling of the PET input for both the dynamic and static sensitivity approaches. These results demonstrate that the effect of the PET input on model performance necessarily depends on the sensitivity analysis approach used and suggest that the dynamic approach is more effective for hydrological modeling perspectives.


2017 ◽  
Vol 47 ◽  
pp. 113-144

At the outset of Plato's Timaeus, Socrates briefly recalls the discussion of the ideal state which he had the day before with his companions (Tim. 17c1–19b2). Looking back at it, he experiences what people often experience when they see beautiful creatures in repose: he wants to see them in motion (19b3–c2). This is precisely the goal of the present chapter. The previous one has provided a general overview of several essential themes and characteristics of the Parallel Lives. Now, it is time to see them ‘in motion’.


Author(s):  
Alexander Mansutti Rodríguez

Mansutti adopts Needham’s scheme of distinguishing three analytical levels: the jural rules, the statistical-behavioural, and the categorical. He includes a computer simulation to gain time depth in his model of Piaroa kinship and marriage, which demonstrates that the exceptions to the marriage rules are not residual and inexplicable but are necessary to maintain the ideal model anticipated by the formal rules. Although violations of the rules are motivated by personal desires and not a desire to save the formal system, they are in fact necessary to its preservation. Moreover, he employs Bourdieu’s distinction between official and private kinship, and illustrates his approach with apposite case studies. For instance, he describes how personal interests can be transmuted into community interests, and how genealogical relationships between two people in small-scale societies can be “read” along different routes with telling results.


2018 ◽  
Vol 22 (4) ◽  
pp. 2163-2185 ◽  
Author(s):  
Stefan Liersch ◽  
Julia Tecklenburg ◽  
Henning Rust ◽  
Andreas Dobler ◽  
Madlen Fischer ◽  
...  

Abstract. Climate simulations are the fuel to drive hydrological models that are used to assess the impacts of climate change and variability on hydrological parameters, such as river discharges, soil moisture, and evapotranspiration. Unlike with cars, where we know which fuel the engine requires, we never know in advance what unexpected side effects might be caused by the fuel we feed our models with. Sometimes we increase the fuel's octane number (bias correction) to achieve better performance and find out that the model behaves differently but not always as was expected or desired. This study investigates the impacts of projected climate change on the hydrology of the Upper Blue Nile catchment using two model ensembles consisting of five global CMIP5 Earth system models and 10 regional climate models (CORDEX Africa). WATCH forcing data were used to calibrate an eco-hydrological model and to bias-correct both model ensembles using slightly differing approaches. On the one hand it was found that the bias correction methods considerably improved the performance of average rainfall characteristics in the reference period (1970–1999) in most of the cases. This also holds true for non-extreme discharge conditions between Q20 and Q80. On the other hand, bias-corrected simulations tend to overemphasize magnitudes of projected change signals and extremes. A general weakness of both uncorrected and bias-corrected simulations is the rather poor representation of high and low flows and their extremes, which were often deteriorated by bias correction. This inaccuracy is a crucial deficiency for regional impact studies dealing with water management issues and it is therefore important to analyse model performance and characteristics and the effect of bias correction, and eventually to exclude some climate models from the ensemble. 
However, the multi-model means of all ensembles project increasing average annual discharges in the Upper Blue Nile catchment and a shift in seasonal patterns, with decreasing discharges in June and July and increasing discharges from August to November.
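The abstract does not specify which bias correction methods were applied. As a generic illustration only, the sketch below shows empirical quantile mapping, a commonly used approach chosen here as an assumption; the function name, the number of quantiles, and the synthetic data are hypothetical.

```python
import numpy as np

def quantile_map(sim_ref, obs_ref, sim_future, n_quantiles=99):
    """Map simulated values onto the observed distribution of the reference period."""
    probs = np.linspace(0.01, 0.99, n_quantiles)
    sim_q = np.quantile(sim_ref, probs)
    obs_q = np.quantile(obs_ref, probs)
    # np.interp clamps inputs outside [sim_q[0], sim_q[-1]], so values beyond
    # the reference range are not extrapolated in this simple form.
    return np.interp(sim_future, sim_q, obs_q)

# Synthetic reference period: the simulation is biased by a constant +2.
obs_ref = np.linspace(0.0, 10.0, 101)
sim_ref = obs_ref + 2.0
corrected_mid = quantile_map(sim_ref, obs_ref, np.array([7.0]))
corrected_extreme = quantile_map(sim_ref, obs_ref, np.array([20.0]))
```

In this synthetic case a mid-range value of 7.0 is corrected to 5.0, while an out-of-range value of 20.0 is simply capped near the largest reference quantile, which shows how a correction that works for average conditions can behave quite differently for extremes.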

