ACTUARIAL APPLICATIONS OF WORD EMBEDDING MODELS

2019 ◽  
Vol 50 (1) ◽  
pp. 1-24
Author(s):  
Gee Y Lee ◽  
Scott Manski ◽  
Tapabrata Maiti

In insurance analytics, textual descriptions of claims are often discarded because traditional empirical analyses require numeric descriptor variables. This paper demonstrates how textual data can be easily used in insurance analytics. Using the concept of word similarities, we illustrate how to extract variables from text and incorporate them into claims analyses using standard generalized linear models or generalized additive regression models. This procedure is applied to the Wisconsin Local Government Property Insurance Fund (LGPIF) data to demonstrate how insurance claims management and risk mitigation procedures can be improved. We illustrate two applications. First, we show how the claims classification problem can be solved using textual information. Second, we analyze the relationship between risk metrics and the probability of large losses. We obtain good results for both applications, in which short textual descriptions of insurance claims are used for the extraction of features.
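The idea of turning word similarities into numeric variables can be sketched as follows. This is a minimal illustration, not the authors' procedure: the three-dimensional vectors in `WORD_VECTORS`, the keyword list, and the function names are all hypothetical, and a real analysis would load vectors from a pre-trained embedding model before passing the resulting features to a regression model.

```python
import math

# Toy 3-dimensional word vectors with made-up values, for illustration only.
WORD_VECTORS = {
    "fire": [0.9, 0.1, 0.0],
    "smoke": [0.8, 0.2, 0.1],
    "water": [0.1, 0.9, 0.0],
    "pipe": [0.2, 0.8, 0.1],
    "vandalism": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_features(description, keywords):
    """One feature per keyword: the largest cosine similarity between the
    keyword and any word of the description that has a known vector."""
    tokens = [t for t in description.lower().split() if t in WORD_VECTORS]
    features = {}
    for keyword in keywords:
        sims = [cosine(WORD_VECTORS[t], WORD_VECTORS[keyword]) for t in tokens]
        features[keyword] = max(sims) if sims else 0.0
    return features


# A short claim description becomes a small numeric feature vector.
features = similarity_features("smoke damage in kitchen",
                               ["fire", "water", "vandalism"])
```

Each claim then carries one numeric column per keyword, which can enter a generalized linear or additive model like any other covariate.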

Author(s):  
О.Ю. Бушуева

Common and often comorbid cardio- and cerebrovascular diseases (CCVD), including arterial hypertension (AH), ischemic heart disease (IHD), and cerebral stroke (CS), are the leading cause of death worldwide. Oxidative stress has many pathological effects on vascular homeostasis and is currently regarded as one of the common mechanisms in the development of CCVD. The aim of the study was to investigate the association of single nucleotide polymorphisms of the redox-homeostasis genes rs2070424 SOD1, rs4880 SOD2, rs769214 CAT, rs713041 GPX4, rs41303970 GCLM, rs17883901 GCLC, rs854560 PON1, rs7493 PON2, rs1695 GSTP1, and rs2266782 FMO3 with the development of isolated and comorbid forms of CCVD. A total of 2702 unrelated individuals of Slavic origin were included in the study. The patient group comprised 1815 subjects with various CCVD and their combinations: isolated AH (IAH); isolated IHD (IIHD); combination of AH and IHD (AH+IHD); CS with underlying AH (AH+CS); and comorbid cardio- and cerebrovascular pathology (AH+IHD+CS). From the total sample of healthy individuals (N=887), five control groups were formed, each matched by sex and age to one of the disease groups. SNP genotyping was performed by real-time PCR with allelic discrimination using TaqMan probes. A log-additive regression model was used to analyze the associations of genotypes with disease development. All calculations were performed relative to the minor allele and adjusted for sex and age. SNP rs1695 GSTP1 was associated exclusively with IAH (OR=1.19, 95%CI=1.01-1.39, P=0.034). SNP rs7493 PON2 was associated with the development of all studied comorbid CCVD: AH+IHD (adjOR=1.32, adj95%CI=1.07-1.63, adjP=0.01); AH+CS (adjOR=1.79, adj95%CI=1.45-2.21, adjP<0.0001); AH+IHD+CS (adjOR=1.51, adj95%CI=1.09-2.09, adjP=0.01), as well as with shortening of prothrombin time (adjDifference=-0.35; adjP=0.01). SNP rs2266782 FMO3 was associated with the AH+CS phenotype (adjOR=1.24, adj95%CI=1.02-1.51, adjP=0.03) and with a lower age at manifestation of CS (adjDifference=-2.31; adjP=0.03). Thus, single nucleotide polymorphisms of redox-homeostasis genes may represent an important genetic component in the differentiation of cardio- and cerebrovascular phenotypes.
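For readers unfamiliar with log-additive coding, the sketch below shows how it is commonly set up: each genotype is coded as the number of minor alleles (0, 1, or 2), and the fitted per-allele log-odds coefficient is exponentiated to give an odds ratio with a Wald-type interval. The function names, the allele letters, and the numbers in the usage lines are illustrative assumptions and are not taken from the study.

```python
import math


def genotype_to_dosage(genotype, minor_allele):
    """Code a genotype such as 'AG' as its count of minor alleles (0, 1, or 2)."""
    return sum(1 for allele in genotype if allele == minor_allele)


def odds_ratio_with_ci(beta, se, z=1.96):
    """Turn a per-allele log-odds coefficient and its standard error into
    an odds ratio with an approximate 95% Wald confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical genotypes at one SNP where 'G' is assumed to be the minor allele.
dosages = [genotype_to_dosage(g, "G") for g in ["AA", "AG", "GG"]]

# Hypothetical regression output: beta = ln(2), standard error = 0.1.
odds_ratio, lower, upper = odds_ratio_with_ci(math.log(2), 0.1)
```

Under this coding the odds ratio describes the multiplicative change in odds for each additional copy of the minor allele, which is why the reported associations are expressed relative to the minor allele.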


2020 ◽  
Vol 52 (3) ◽  
pp. 214-225
Author(s):  
ChiaKo Hung ◽  
Morgen S. Johansen ◽  
Jennifer Kagan ◽  
David Lee ◽  
Helen H. Yu

This essay provides a reflective commentary on Hawai’i’s unconventional response of employing a volunteer workforce of public servants to process an unprecedented backlog of unemployment insurance claims triggered by the COVID-19 pandemic. Although efforts are still ongoing, this essay applies volunteerism and public service motivation as a framework to explain why public servants would serve in a voluntary capacity at another public agency. The intent of this essay is to spur conversation on how public servants are further stepping up to the frontlines during times of crisis, as well as to expand knowledge on the relationship between volunteerism and public service motivation.


Author(s):  
Maurizio Romano ◽  
Francesco Mola ◽  
Claudio Conversano

The importance of Word of Mouth is growing day by day in many areas. This phenomenon is evident in everyday life, e.g., in the rise of influencers and social media managers. If more people discuss specific products positively, then even more people are encouraged to buy them, and vice versa. This effect depends directly on the relationship between the potential customer and the reviewer. Moreover, given the negative reporting bias, Word of Mouth analysis is clearly of interest in many fields. We propose an algorithm to extract the sentiment from a natural language text corpus. Combining Neural Networks, which have high predictive power but are harder to interpret, with simpler but informative models allows us to quantify a sentiment with a numeric value and to predict whether a sentence has a positive or negative sentiment. The assessment of an objective quantity improves the interpretation of the results in many fields. For example, it is possible to identify specific crucial sectors that require intervention, improving the company's services while also finding the strengths of the company itself (useful for advertising campaigns). Moreover, because time information is usually available in textual data of web origin, trends in macro and micro topics can also be analyzed. After showing how to properly reduce the dimensionality of the textual data with a data-cleaning phase, we show how to combine Word Embedding, K-Means clustering, SentiWordNet, and the Threshold-based Naïve Bayes classifier. We apply this method to Booking.com and TripAdvisor.com data, analyzing the sentiment of people who discuss a particular issue and providing an example of customer satisfaction analysis.
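The step of quantifying sentiment as a number and then classifying by a cut-off can be illustrated with a simplified stand-in. The sketch below is a plain Naive Bayes log-odds score with Laplace smoothing and a threshold; it is not the authors' Threshold-based Naïve Bayes classifier, it ignores class priors, and the toy reviews and function names are invented for illustration.

```python
import math
from collections import Counter


def train_log_odds(positive_docs, negative_docs):
    """Per-word log-likelihood ratio log P(w|pos) - log P(w|neg),
    estimated with add-one (Laplace) smoothing."""
    pos_counts = Counter(w for d in positive_docs for w in d.lower().split())
    neg_counts = Counter(w for d in negative_docs for w in d.lower().split())
    vocab = set(pos_counts) | set(neg_counts)
    pos_total = sum(pos_counts.values()) + len(vocab)
    neg_total = sum(neg_counts.values()) + len(vocab)
    return {
        w: math.log((pos_counts[w] + 1) / pos_total)
        - math.log((neg_counts[w] + 1) / neg_total)
        for w in vocab
    }


def sentiment_score(text, log_odds):
    """Numeric sentiment: sum of word log-odds; unseen words contribute 0."""
    return sum(log_odds.get(w, 0.0) for w in text.lower().split())


def classify(text, log_odds, threshold=0.0):
    """Label a text positive when its score exceeds the threshold."""
    return "positive" if sentiment_score(text, log_odds) > threshold else "negative"


# Invented toy reviews, two per class.
weights = train_log_odds(
    ["clean room friendly staff", "great location clean"],
    ["dirty room rude staff", "noisy dirty"],
)
label = classify("clean friendly", weights)
```

Because the score is a sum of per-word contributions, the words that drive a prediction can be read off directly, which is the kind of interpretability the simpler model contributes to the combined approach.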


2021 ◽  
Author(s):  
Thomas Gläßle ◽  
Kerstin Rau ◽  
Thomas Scholten ◽  
Philipp Hennig

Gaussian Processes provide a theoretically well-understood regression framework that is widely used in the context of Digital Soil Mapping. Among the reasons to use Gaussian Process Regression (GPR) are its interpretability, its built-in support for uncertainty quantification, and its ability to handle unevenly spaced and correlated training samples through a user-specified covariance kernel. The base case of GPR is performed with covariance models that are specified functions of Euclidean distance. In order to incorporate information other than the relative positions, regression-kriging extends GPR by an additive regression model of choice, and co-kriging considers a covariance model between covariates and the target variable. In this work, we use the alternative approach of incorporating topographic information directly into the kernel function by use of a non-Euclidean, non-stationary distance function. In particular, we devise kernels based on a path of least effort, where effort is locally specified as a function constructed from prior knowledge. It can, for example, be derived from local topographic variables. We demonstrate that our candidate models improve prediction accuracy over the base model. This shows that domain knowledge can be integrated into the model by means of handcrafted kernel functions. The approach is not per se restricted to topographic variables, but could be used for any covariate quantity that is available at output resolution.
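One way to picture a kernel built on a path of least effort is sketched below. It is an assumption-laden illustration rather than the authors' construction: effort is given on a small hypothetical grid, the least-effort distance is found with Dijkstra's algorithm on a 4-connected lattice, and the distance is plugged into a squared-exponential form. Note that substituting a non-Euclidean distance into this form does not by itself guarantee a positive semi-definite kernel, so a real model would need to check or enforce that property.

```python
import heapq
import math


def least_effort_distance(effort, start, goal):
    """Dijkstra on a 4-connected grid. Moving between neighbouring cells
    costs the mean of their effort values, so the distance is symmetric."""
    rows, cols = len(effort), len(effort[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), math.inf):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + 0.5 * (effort[r][c] + effort[nr][nc])
                if new_cost < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return math.inf


def path_kernel(effort, a, b, length_scale=2.0):
    """Squared-exponential form applied to the least-effort distance."""
    d = least_effort_distance(effort, a, b)
    return math.exp(-0.5 * (d / length_scale) ** 2)


# Hypothetical effort grid: the middle column acts as a costly ridge.
EFFORT = [
    [1.0, 9.0, 1.0],
    [1.0, 9.0, 1.0],
    [1.0, 1.0, 1.0],
]

same_side = path_kernel(EFFORT, (0, 0), (2, 0))     # no ridge in between
across_ridge = path_kernel(EFFORT, (0, 0), (0, 2))  # path detours around the ridge
```

Both pairs of cells are two units apart in Euclidean terms, yet the pair separated by the ridge receives a lower covariance, which is how prior knowledge about the terrain enters the model.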


Animals ◽  
2020 ◽  
Vol 10 (12) ◽  
pp. 2222
Author(s):  
Meredith Chapman ◽  
Matthew Thomas ◽  
Kirrilly Thompson

The equestrian industry reports high rates of serious injuries, illness and fatalities when compared to other high-risk sports and work environments. To address these ongoing safety concerns, a greater understanding of the relationship between human risk perception, values and safety behaviours is required. This paper presents results from an international survey that explored relationships between respondents’ willingness to take risks during daily activities and their perceptions of risk and behaviours during horse-related interactions. Respondents’ comments around risk management principles and safety-first inspirations were also analysed. We examined what humans think about hazardous situations or activities and how they manage risk with suitable controls. Analysis identified three important findings. First, safe behaviours around horses were associated with safety training (formal and/or informal). Second, unsafe behaviours around horses were associated with higher levels of equestrian experience as well as income from horse-related work. Finally, findings revealed a general acceptance of danger and imminent injury during horse interactions. This may explain why some respondents de-emphasised or ‘talked down’ the importance of safety-first principles. In this paper we predominantly report quantitative findings on respondents’ self-reported safety behaviours and their general and horse-related risk perceptions despite injury or illness. We discuss the benefits of improved safety-first principles such as training, risk assessments and rider-horse matching, together with enriched safety communications, to enhance risk mitigation during human–horse interactions.

