Assessing the Performance of Random Forests for Modeling Claim Severity in Collision Car Insurance

Risks ◽  
2021 ◽  
Vol 9 (3) ◽  
pp. 53
Author(s):  
Yves Staudt ◽  
Joël Wagner

For calculating non-life insurance premiums, actuaries traditionally rely on separate severity and frequency models using covariates to explain the claims loss exposure. In this paper, we focus on the claim severity. First, we build two reference models, a generalized linear model and a generalized additive model, relying on a log-normal distribution of the severity and including the most significant factors. Thereby, we relate the continuous variables to the response in a nonlinear way. In the second step, we tune two random forest models, one for the claim severity and one for the log-transformed claim severity, where the latter requires a transformation of the predicted results. We compare the prediction performance of the different models using the relative error, the root mean squared error and the goodness-of-lift statistics in combination with goodness-of-fit statistics. In our application, we rely on a dataset of a Swiss collision insurance portfolio covering the loss exposure of the period from 2011 to 2015, and including observations from 81 309 settled claims with a total amount of CHF 184 million. In the analysis, we use the data from 2011 to 2014 for training and from 2015 for testing. Our results indicate that a log-normal transformation of the severity does not lead to performance gains with random forests. However, random forests with a log-normal transformation are the favorite choice for explaining right-skewed claims. Finally, when considering all indicators, we conclude that the generalized additive model has the best overall performance.
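
The abstract notes that predictions made on the log scale must be transformed back to the original scale. The sketch below illustrates one common correction under a log-normal assumption, together with the root mean squared error; it is not necessarily the transformation used by the authors, and the function names and sample numbers are hypothetical.

```python
import math


def back_transform_lognormal(mu_log, sigma_log):
    """Expected value on the original scale, assuming log(Y) ~ Normal(mu_log, sigma_log^2).

    Assumption: the standard log-normal mean exp(mu + sigma^2 / 2) is used as the
    correction; the paper's own back-transformation may differ.
    """
    return math.exp(mu_log + 0.5 * sigma_log ** 2)


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical log-scale prediction for one claim (illustrative values only).
naive_prediction = math.exp(7.0)                         # ignores the log-scale variance
corrected_prediction = back_transform_lognormal(7.0, 1.0)
```

Simply exponentiating a log-scale prediction targets the median rather than the mean of a log-normal variable, so the naive value is always smaller than the corrected one whenever the log-scale variance is positive.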

2018 ◽  
Vol 18 (5-6) ◽  
pp. 483-504 ◽  
Author(s):  
Marius Ötting ◽  
Roland Langrock ◽  
Christian Deutscher

Recent years have seen several match-fixing scandals in soccer. In order to avoid match-fixing, existing literature and fraud detection systems primarily focus on analysing betting odds provided by bookmakers. In our work, we suggest analysing not only the odds but also the total volume placed on bets, thereby making use of more of the information available. As a case study for our method, we consider the second division in Italian soccer, Serie B, since for this league it has effectively been proven that some matches were fixed, such that to some extent we can ground-truth our approach. For the betting volume data, we use a flexible generalized additive model for location, scale and shape (GAMLSS), with log-normal response, to account for the various complex patterns present in the data. For the betting odds, we use a GAMLSS with bivariate Poisson response to model the number of goals scored by both teams, and to subsequently derive the corresponding odds. We then conduct outlier detection in order to flag suspicious matches. Our results indicate that monitoring both betting volumes and betting odds can lead to more reliable detection of suspicious matches.


2021 ◽  
Vol 11 ◽  
pp. 34-41
Author(s):  
N. Vivekanandan

Assessment of low-flow is an important aspect of water quality management, reservoir storage design, determining minimum release policy and safe surface water withdrawals. For this purpose, the annual minimum d-day average flow is the generally adopted measure for characterizing the low-flow in a stream; it is obtained by averaging the flow with a moving average over 'd' consecutive days, viz., 7, 10, 14 and 30 days. This paper presents a comparison of three probability distributions, namely Generalized Extreme Value, 2-parameter Log Normal (LN2) and Weibull, adopted in the estimation of low-flow for the river Cauvery at the Kollegal gauging site. The parameters are determined by three methods, viz., method of moments, maximum likelihood method and L-Moments (LMO), and are used for the estimation of low-flow. The adequacy of the probability distributions fitted in the low-flow frequency analysis is evaluated by quantitative assessment through Goodness-of-Fit (viz., Chi-Square and Kolmogorov-Smirnov) and diagnostic (viz., correlation coefficient and root mean squared error) tests, and by qualitative assessment using the fitted curves of the estimated low-flow. The results of the quantitative and qualitative assessments indicate that LN2 (LMO) is better suited than the other two distributions for the estimation of 7-, 10-, 14- and 30-day low-flows for the river Cauvery at the Kollegal site.
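
The d-day averaging described above can be sketched as follows. This is a minimal illustration, not the author's code: the function name is hypothetical, the sample flows are made up, and the input is assumed to be one year of daily flows with no missing values.

```python
def annual_min_d_day_flow(daily_flows, d):
    """Minimum of the d-day moving-average flow over one year of daily values.

    Assumption: daily_flows is a complete, gap-free sequence for a single year.
    """
    if d < 1 or d > len(daily_flows):
        raise ValueError("d must be between 1 and the number of daily values")
    # Running sum of the current d-day window, updated in O(1) per step.
    window_sum = sum(daily_flows[:d])
    min_sum = window_sum
    for i in range(d, len(daily_flows)):
        window_sum += daily_flows[i] - daily_flows[i - d]
        min_sum = min(min_sum, window_sum)
    return min_sum / d


# Made-up daily flows for a short illustrative record.
sample_flows = [10, 8, 6, 4, 5, 7, 9]
three_day_low_flow = annual_min_d_day_flow(sample_flows, 3)
```

Repeating the calculation for each year of record yields the annual minimum series to which distributions such as LN2 or Weibull are then fitted.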


2020 ◽  
Vol 9 (1) ◽  
pp. 84-88
Author(s):  
Govinda Prasad Dhungana ◽  
Laxmi Prasad Sapkota

Hemoglobin level is a continuous variable, so it can be modeled by a theoretical two-parameter probability distribution such as the Normal, Log-normal, Gamma or Weibull distribution. In the bar diagram, the observed and expected frequencies differ least for the Normal distribution. Similarly, the calculated value of the chi-square goodness-of-fit statistic is lowest for the Normal distribution. Furthermore, the plot of the PDF of the Normal distribution covers a larger area of the histogram than that of any other distribution. Hence, the Normal distribution is the best fit for predicting future hemoglobin levels.
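
For reference, the chi-square goodness-of-fit statistic compared across the candidate distributions can be computed as in the sketch below. The function name and the frequencies are hypothetical and are not taken from the study.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum over bins of (O - E)^2 / E."""
    if len(observed) != len(expected) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up binned frequencies (illustrative only).
observed_counts = [10, 20, 30]
expected_counts = [15, 20, 25]
statistic = chi_square_statistic(observed_counts, expected_counts)
```

A smaller statistic means the expected frequencies under the fitted distribution lie closer to the observed ones, which is the sense in which the Normal distribution is reported as the better fit.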


2021 ◽  
Vol 12 (3) ◽  
pp. 102
Author(s):  
Jaouad Khalfi ◽  
Najib Boumaaz ◽  
Abdallah Soulmani ◽  
El Mehdi Laadissi

The Box–Jenkins model is a polynomial model that uses transfer functions to express relationships between input, output, and noise for a given system. In this article, we present a Box–Jenkins linear model for a lithium-ion battery cell for use in electric vehicles. The model parameter identifications are based on automotive drive-cycle measurements. The proposed model prediction performance is evaluated using the goodness-of-fit criteria and the mean squared error between the Box–Jenkins model and the measured battery cell output. A simulation confirmed that the proposed Box–Jenkins model could adequately capture the battery cell dynamics for different automotive drive cycles and reasonably predict the actual battery cell output. The goodness-of-fit value shows that the Box–Jenkins model matches the battery cell data by 86.85% in the identification phase, and 90.83% in the validation phase for the LA-92 driving cycle. This work demonstrates the potential of using a simple and linear model to predict the battery cell behavior based on a complex identification dataset that represents the actual use of the battery cell in an electric vehicle.


2019 ◽  
Vol 7 (1) ◽  
pp. 1597956
Author(s):  
Carlos Valencia ◽  
Sergio Cabrales ◽  
Laura Garcia ◽  
Juan Ramirez ◽  
Diego Calderona ◽  
...  

2020 ◽  
Vol 11 (1) ◽  
pp. 203
Author(s):  
Primož Jelušič ◽  
Andrej Ivanič ◽  
Samo Lubej

Efforts were made to predict and evaluate blast-induced ground vibrations and frequencies using an adaptive network-based fuzzy inference system (ANFIS), which has a fast-learning capability and the ability to capture the non-linear response during the blasting process. For this purpose, the ground vibrations generated by the blast in a tunnel tube were monitored at a residential building located directly above the tunnel tube. To investigate the usefulness of this approach, the prediction by the ANFIS was also compared to those by three of the most commonly used vibration predictors. The efficiency criteria chosen for the comparison between the predicted and actual data were the sum of squares due to error (SSE), the root mean squared error (RMSE), and the goodness of fit (R-squared and adjusted R-squared). The results show that the ANFIS prediction model performs better than the commonly used predictors.
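
The efficiency criteria named above can be computed as in the following sketch. It is an illustration under stated assumptions rather than the authors' implementation: the function name and the sample values are hypothetical, and RMSE is taken here as the square root of SSE divided by the number of observations (some tools divide by the residual degrees of freedom instead).

```python
import math


def fit_metrics(actual, predicted, n_predictors):
    """Return SSE, RMSE, R-squared and adjusted R-squared for a set of predictions."""
    n = len(actual)
    if n != len(predicted):
        raise ValueError("actual and predicted must have equal length")
    if n - n_predictors - 1 <= 0:
        raise ValueError("too few observations for the number of predictors")
    mean_actual = sum(actual) / n
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    if sst == 0:
        raise ValueError("actual values are constant; R-squared is undefined")
    r_squared = 1 - sse / sst
    # The adjustment penalizes additional predictors.
    adjusted = 1 - (1 - r_squared) * (n - 1) / (n - n_predictors - 1)
    return {
        "sse": sse,
        "rmse": math.sqrt(sse / n),  # assumption: divide by n, not degrees of freedom
        "r_squared": r_squared,
        "adjusted_r_squared": adjusted,
    }


# Made-up measured and predicted values (illustrative only).
metrics = fit_metrics([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0], n_predictors=1)
```

Lower SSE and RMSE and higher R-squared values indicate a closer match between predicted and measured data, which is the basis on which the predictors are ranked.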

