Classification Ratemaking Using Decision Tree in the Insurance Market of Bosnia and Herzegovina

2020 ◽  
Vol 15 (2) ◽  
pp. 124-139
Author(s):  
Amela Omerašević ◽  
Jasmina Selimović

Abstract. This paper investigates the impact of risk classification on non-life insurance ratemaking with particular reference to Bosnia and Herzegovina (BiH). The research is based on a sample of over eighteen thousand insurance policies for passenger vehicles collected over the period 2015-2020. In our empirical investigation we develop a standard risk model based on Poisson generalized linear models (GLM) for claim frequency estimation and Gamma GLM for claim severity estimation. The analysis reveals that GLM does not provide reliable parameter estimates for multi-level factor (MLF) categorical predictors. Although GLM is a widely used method to determine insurance premiums, improving GLM with the data mining methods identified in this paper may solve practical challenges for risk models. The popularity of data mining methods in the actuarial community has been growing in recent years due to their efficiency and precision. These models are recommended for consideration in BiH and the South East European region in general.
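As a rough illustration of the frequency-severity structure described above, the sketch below computes claim frequency, claim severity and pure premium per level of a single categorical rating factor. It relies on a standard result: in a log-link Poisson GLM with an exposure offset and one categorical predictor, the fitted frequency for each level equals claims divided by exposure, and the analogous Gamma GLM returns the average claim amount for each level. The records, level names and the function `one_factor_estimates` are hypothetical and are not taken from the paper's data.

```python
from collections import defaultdict

# Hypothetical policy records:
# (level of a categorical rating factor, exposure in years, claim count, total claim amount)
policies = [
    ("A", 1.0, 0, 0.0),
    ("A", 0.5, 1, 800.0),
    ("A", 1.0, 0, 0.0),
    ("B", 1.0, 1, 1200.0),
    ("B", 1.0, 1, 600.0),
    ("C", 0.25, 1, 5000.0),  # sparse level: a single short policy
]


def one_factor_estimates(records):
    """Per-level frequency, severity and pure premium for one categorical factor.

    These closed-form values coincide with the fitted means of a one-factor
    log-link Poisson GLM (frequency) and Gamma GLM (severity).
    """
    totals = defaultdict(lambda: [0.0, 0, 0.0])  # exposure, claims, amount
    for level, exposure, n_claims, amount in records:
        totals[level][0] += exposure
        totals[level][1] += n_claims
        totals[level][2] += amount

    estimates = {}
    for level, (exposure, n_claims, amount) in totals.items():
        frequency = n_claims / exposure                    # claims per policy-year
        severity = amount / n_claims if n_claims else 0.0  # average cost per claim
        estimates[level] = {
            "exposure": exposure,
            "frequency": frequency,
            "severity": severity,
            "pure_premium": frequency * severity,
        }
    return estimates


estimates = one_factor_estimates(policies)
for level in sorted(estimates):
    print(level, estimates[level])
```

Level C hints at why sparse levels of a multi-level factor are troublesome: one claim on a quarter-year of exposure yields a frequency of 4.0 claims per policy-year, an estimate that rests on a single policy.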

2016 ◽  
Vol 48 (1) ◽  
pp. 25-53 ◽  
Author(s):  
Patrizia Gigante ◽  
Liviana Picech ◽  
Luciano Sigalotti

Abstract. We consider a Tweedie compound Poisson regression model with fixed and random effects to describe the payment numbers and the incremental payments, jointly, in claims reserving. The parameter estimates are obtained within the framework of hierarchical generalized linear models by applying the h-likelihood approach. Regression structures are allowed for the means and also for the dispersions. Predictions and prediction errors of the claims reserves are evaluated. Through the parameters of the distributions of the random effects, some external information (e.g. a development pattern of industry-wide data) can be incorporated into the model. A numerical example shows the impact of external data on the reserve and prediction error evaluations.
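The link between payment numbers and incremental payments can be made concrete through the standard compound Poisson-Gamma representation of a Tweedie variable with power parameter between 1 and 2. The sketch below maps a mean, dispersion and power to a Poisson rate (number of payments) and Gamma shape and scale (size of each payment), then checks the implied moments. The function names and the numeric inputs are illustrative assumptions; the random effects and h-likelihood estimation described in the abstract are not covered.

```python
import math


def tweedie_to_compound_poisson(mu, phi, p):
    """Map Tweedie (mean mu, dispersion phi, power 1 < p < 2) to
    Poisson rate, Gamma shape and Gamma scale of the compound sum."""
    if not 1.0 < p < 2.0:
        raise ValueError("the compound Poisson case requires 1 < p < 2")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))   # expected number of payments
    shape = (2.0 - p) / (p - 1.0)               # Gamma shape of one payment
    scale = phi * (p - 1.0) * mu ** (p - 1.0)   # Gamma scale of one payment
    return lam, shape, scale


def compound_moments(lam, shape, scale):
    """Mean, variance and probability of zero for a Poisson sum of Gamma payments."""
    mean = lam * shape * scale
    variance = lam * shape * (shape + 1.0) * scale ** 2  # lam * E[X^2]
    prob_zero = math.exp(-lam)                           # no payments at all
    return mean, variance, prob_zero


# Illustrative values only (not from the paper).
mu, phi, p = 100.0, 2.0, 1.5
lam, shape, scale = tweedie_to_compound_poisson(mu, phi, p)
mean, variance, prob_zero = compound_moments(lam, shape, scale)
print(lam, shape, scale)
print(mean, variance, prob_zero)
```

The recovered mean equals `mu` and the variance equals `phi * mu ** p`, which is the Tweedie variance function; the positive probability of zero corresponds to cells with no payments.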


2018 ◽  
Vol 28 ◽  
pp. 01027
Author(s):  
Leszek Ośródka ◽  
Ewa Krajny ◽  
Marek Wojtylak

The paper presents an attempt to use selected data mining methods to determine the influence of a complex of meteorological conditions on the concentrations of PM10 (PM2.5), using the example of the regions of Silesia and Northern Moravia. The collection of standard meteorological data has been supplemented by increments and derivatives of measurable weather elements such as the vertical pseudo-gradient of air temperature. The main objective was to develop a universal methodology for the assessment of these impacts, i.e. one that would be independent of the analysed pollution. The probability of occurrence (at a given location) of the assumed concentration level as exceeding the value of the specified distributional quintile was adopted as the discriminant of the incidence. As a result of the analyses conducted, incidences of elevated concentrations of air pollution particulate matter PM10 have been identified and the types of weather responsible for the emergence of such situations have also been determined.


2021 ◽  
Vol 66 (229) ◽  
pp. 7-35
Author(s):  
Snjezana Brkic ◽  
Radovan Kastratovic ◽  
Mirela Abidovic-Salkica

The paper aims to identify patterns and country-specific determinants of intra-industry trade (IIT) in agri-food products between Bosnia and Herzegovina (BiH) and other CEFTA 2006 parties in the period 2008-2018. The purpose of the paper is to contribute to filling the gap in the empirical literature on IIT of the South East European countries, especially in regard to non-manufacturing sectors. To investigate IIT intensity and structure, the analysis employed Grubel-Lloyd indices and GHM methodology based on relative unit values. In order to examine the impact of various determinants on IIT in agri-food products, a random-effects Heckman selection model was estimated, following a sector-level approach in the analysis. The analysis indicates a lower level of IIT than expected and a strong dominance of its vertical type in all BiH bilateral relations within CEFTA 2006. The empirical results also suggest that the major determinants positively affecting IIT in agri-food products include the size of the trading economies, the similarity in their ethnic structure, membership in the common regional trade agreement, and common borders. By contrast, the results indicate that IIT is negatively affected by differences between the trading economies in terms of productivity and gross domestic product per capita.
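For readers unfamiliar with the two measures named above, a minimal sketch follows. The Grubel-Lloyd index for a product is 1 - |X - M| / (X + M), where X and M are exports and imports. The horizontal versus vertical split compares the ratio of export to import unit values with a band around 1; the band width of 0.15 used here is a common choice in the literature and is an assumption, since the abstract does not state the threshold. Function names and the trade figures are hypothetical.

```python
def grubel_lloyd(exports, imports):
    """Grubel-Lloyd index: 1 means fully intra-industry, 0 means one-way trade."""
    total = exports + imports
    if total == 0:
        raise ValueError("index is undefined when there is no trade")
    return 1.0 - abs(exports - imports) / total


def iit_type(unit_value_exports, unit_value_imports, alpha=0.15):
    """Classify IIT by relative unit values.

    alpha is an assumed band width (0.15 is a commonly used value):
    ratios inside [1 - alpha, 1 + alpha] count as horizontal IIT,
    ratios outside it as vertical IIT.
    """
    ratio = unit_value_exports / unit_value_imports
    if 1.0 - alpha <= ratio <= 1.0 + alpha:
        return "horizontal"
    return "vertical"


# Hypothetical bilateral flows for one product group.
print(grubel_lloyd(75.0, 25.0))   # partly overlapping trade
print(grubel_lloyd(100.0, 0.0))   # pure one-way trade
print(iit_type(2.0, 1.0))         # large quality or price gap
```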


Author(s):  
Dewi Shintya Lumbansiantar

Natural disasters are natural events that are difficult to avoid. Their exact impact, which can include fatalities, damage to the social environment, property losses, and disturbance to the community, is difficult to estimate even though they are very likely to occur. Disasters that often occur in Indonesia include floods, landslides, tsunamis, earthquakes and volcanic eruptions. The lack of relief supplies provided by the Indonesian Red Cross (PMI) was caused by the absence of data on the assistance needed. It is therefore necessary to analyze data on past natural disasters in order to predict the impact of future ones. The amount of assistance needed can be predicted using data mining techniques, so this study aims to analyze natural disaster data using the J48 algorithm. The analysis is carried out with RapidMiner, and the results take the form of a decision tree.
Keywords: Data Mining, Natural Disaster Data, J48 Algorithm
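J48 is the name Weka gives to its implementation of the C4.5 decision tree learner, which chooses splits by gain ratio. The sketch below shows only that split criterion on a toy table; pruning, numeric attributes and missing values, which C4.5 also handles, are left out. The attribute names, records and function names are hypothetical and do not come from the study's data.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return sum(-(c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting rows on attribute with respect to target."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])

    n = len(rows)
    # Weighted entropy of the target after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalises attributes with many small branches.
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical disaster records with an aid level to predict.
records = [
    {"disaster": "flood", "region": "north", "aid": "high"},
    {"disaster": "flood", "region": "south", "aid": "high"},
    {"disaster": "landslide", "region": "north", "aid": "low"},
    {"disaster": "landslide", "region": "south", "aid": "low"},
]

for attribute in ("disaster", "region"):
    print(attribute, gain_ratio(records, attribute, "aid"))
```

In this toy table the disaster type separates the aid levels perfectly while the region carries no information, so a C4.5-style learner would split on the disaster type first.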


Blood ◽  
2010 ◽  
Vol 116 (21) ◽  
pp. 2973-2973
Author(s):  
Brian Van Ness ◽  
Majda Haznadar ◽  
Gang Fang ◽  
Wen Wang ◽  
Vanja Paunic ◽  
...  

Abstract 2973 Disease risk and therapeutic outcomes are impacted by both tumor heterogeneity and germline variations found in the population. Multiple myeloma (MM) shows significant heterogeneity in genetic aberrations in tumor cells, which, together with inherited polymorphisms, affects disease risk and therapeutic response. In order to identify the impact of genetic variations (SNPs) on MM we have developed a Bank On A Cure (BOAC) platform for examining 3404 SNPs, selected in 983 genes associated with pathways affecting cellular functions important in cancer. Using SNP data sets we sought to identify genetic interactions, beyond single univariate association analysis. The challenge was to use data mining methods that take into account relatively small cohorts of patients, in which false discovery rates typically exceed the power of the study. We report results from using novel computational approaches that efficiently identify higher order SNP interactions associated with disease risk as well as survival outcomes, while minimizing the false discovery rate. The BOAC SNP panel was used to develop a database on 143 patients selected for short (<1yr) versus long (>3yr) survival in ECOG 9486 and SWOG 9321, as well as 247 newly diagnosed patients and an equal number of controls for disease risk analysis. One algorithm developed employs a discriminative pattern mining approach in which defined pathway sets of SNPs are used in combination testing. A second algorithm identified SNPs that had some association with outcome (survival or disease status) but demonstrated a significant increase in association when examined in combinations; we refer to this as a p-value jump association.
Variations in genes associated with cell cycle, apoptosis, drug metabolism, stress response and immunity reached very low p-values, and survived multiple comparison testing when analyzed in combinations associated with both survival (PFS) predictions as well as analysis of case-control disease risk. Some of the key genetic variations identified in various combinations, included: PTRB, PTEN, CDK5, XRCC4, GSTA4, GPX, DYPD, PCNA, CYP4F2, VEGF, PON1, ALK, and BAG3. The data mining methods and algorithms used, and specific combinations associated with risk and survival, will be presented. These results are being further validated in new cohorts, and functional implications of identified genetic variants are being investigated in HapMap cell lines. Disclosures: No relevant conflicts of interest to declare.


Methodology ◽  
2015 ◽  
Vol 11 (3) ◽  
pp. 89-99 ◽  
Author(s):  
Leslie Rutkowski ◽  
Yan Zhou

Abstract. Given a consistent interest in comparing achievement across sub-populations in international assessments such as TIMSS, PIRLS, and PISA, it is critical that sub-population achievement is estimated reliably and with sufficient precision. As such, we systematically examine the limitations of the current estimation methods used by these programs. Using a simulation study along with empirical results from the 2007 cycle of TIMSS, we show that a combination of missing and misclassified data in the conditioning model induces biases in sub-population achievement estimates, the magnitude and degree of which can be readily explained by data quality. Importantly, estimated biases in sub-population achievement are limited to the conditioning variable with poor-quality data, while other sub-population achievement estimates are unaffected. Findings are generally in line with theory on missing and error-prone covariates. The current research adds to a small body of literature that has noted some of the limitations of sub-population estimation.


Author(s):  
I.M. Burykin ◽  
G.N. Aleeva ◽  
R.Kh. Khafizianova ◽  
...  

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Gerhard Müller ◽  
Manuela Bombana ◽  
Monika Heinzel-Gutenbrenner ◽  
Nikolaus Kleindienst ◽  
Martin Bohus ◽  
...  

Abstract
Background: Mental disorders are related to high individual suffering and significant socio-economic burdens. However, it remains unclear to what extent self-reported mental distress is related to individuals' days of incapacity to work and their medical costs. This study aims to investigate the impact of self-reported mental distress on specific and non-specific days of incapacity to work and specific and non-specific medical costs over a two-year span.
Method: Within a longitudinal research design, 2287 study participants' mental distress was assessed using the Hospital Anxiety and Depression Scale (HADS). HADS scores were included as predictors in generalized linear models with a Tweedie distribution and a log link function to predict participants' days of incapacity to work and medical costs, retrieved from their health insurance routine data, during the following two-year period.
Results: Current mental distress was found to be significantly related to the number of specific days absent from work and to medical costs. Compared to participants classified as no cases by the HADS (2.6 days), severe case participants showed 27.3 times as many specific days of incapacity to work in the first year (72 days) and 10.3 times as many days in the second year (44 days), and incurred 11.4 times the medical costs in the first year (2272 EUR) and 6.2 times the medical costs in the second year (1319 EUR). The relationship of mental distress to non-specific days of incapacity to work and non-specific medical costs was also significant, but was mainly driven by specific absent days and specific medical costs. Our results also indicate that the prevalence of presenteeism is high: 42% of individuals continued to go to work despite severe mental distress.
Conclusions: Our results show that self-reported mental distress, assessed by the HADS, is highly related to the days of incapacity to work and medical costs in the two-year period. Reducing mental distress by improving preventive structures for at-risk populations and increasing access to evidence-based treatments for individuals with mental disorders might, therefore, pay for itself and could help to reduce public costs.

