Application of bivariate negative binomial regression model in analysing insurance count data

2017 ◽  
Vol 11 (2) ◽  
pp. 390-411 ◽  
Author(s):  
Feng Liu ◽  
David Pitt

Abstract In this paper we analyse insurance claim frequency data using the bivariate negative binomial regression (BNBR) model. We use general insurance data on claims from simple third-party liability insurance and comprehensive insurance. We find that bivariate regression, with its capacity for modelling correlation between the two observed claim counts, provides both a superior fit and superior out-of-sample prediction compared with the more common practice of fitting univariate negative binomial regression models separately to each claim type. Noting the complexity of BNBR models and their potential for a large number of parameters, we explore the use of model shrinkage methodology, namely the least absolute shrinkage and selection operator (Lasso) and ridge regression. We find that models estimated using shrinkage methods outperform the ordinary likelihood-based models when used to make out-of-sample predictions. We find that the Lasso performs better than ridge regression as a method of shrinkage.
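To make the shrinkage idea concrete, the following is a minimal sketch of a penalized negative log-likelihood. It assumes a univariate NB2 model with a log link (not the bivariate model of the paper), and the names `nb2_logpmf`, `penalized_nll`, `alpha` and `lam` are illustrative choices, not the authors' notation. The Lasso adds a penalty on the absolute values of the slope coefficients, ridge regression on their squares; the intercept is left unpenalized.

```python
import math

def nb2_logpmf(y, mu, alpha):
    """Log-probability of count y under NB2: mean mu, variance mu + alpha * mu**2."""
    r = 1.0 / alpha  # NB "size" parameter
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def penalized_nll(beta, alpha, X, y, lam, penalty="lasso"):
    """Negative NB2 log-likelihood plus a shrinkage penalty on the slopes.

    beta[0] is the intercept (unpenalized); beta[1:] are the slopes.
    X is a list of covariate rows, y a list of observed counts.
    """
    nll = 0.0
    for row, count in zip(X, y):
        eta = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        nll -= nb2_logpmf(count, math.exp(eta), alpha)  # log link: mu = exp(eta)
    slopes = beta[1:]
    if penalty == "lasso":
        pen = sum(abs(b) for b in slopes)   # L1 penalty
    elif penalty == "ridge":
        pen = sum(b * b for b in slopes)    # L2 penalty
    else:
        raise ValueError("penalty must be 'lasso' or 'ridge'")
    return nll + lam * pen

# Hypothetical toy data, for illustration only.
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
y = [0, 2, 1, 3]
beta = [0.1, 0.5, -0.25]
print(penalized_nll(beta, 0.5, X, y, lam=1.0, penalty="lasso"))
print(penalized_nll(beta, 0.5, X, y, lam=1.0, penalty="ridge"))
```

In practice the objective would be minimized over `beta` (and `alpha`) with `lam` chosen by cross-validation; that step is omitted here.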

2019 ◽  
Vol 25 (110) ◽  
pp. 466
Author(s):  
سهيل نجم عبود ◽  
ايناس صلاح خورشيد

This paper discusses a biased estimator for the negative binomial regression model, known as the Liu estimator. The estimator is used to reduce variance and to overcome multicollinearity among the explanatory variables. Other estimators are also considered, namely the ridge regression estimator and the maximum likelihood estimator. The paper aims to compare the Liu estimator theoretically with the maximum likelihood and ridge regression estimators using the mean squared error (MSE) criterion, since the variance of the maximum likelihood estimator (MLE) is inflated when the explanatory variables are multicollinear. A Monte Carlo simulation was designed to evaluate the performance of the estimators under the MSE criterion. The simulation results show the value of the Liu estimator and its superiority over the ridge regression (RR) and maximum likelihood (MLE) estimators when the number of explanatory variables is p=5 and the sample size is n=100. When the number of explanatory variables is p=3, for all sample sizes, and when p=5 for all sample sizes except n=100, the ridge regression (RR) method performs best.
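For orientation, the three estimators being compared can be written as follows. These are the forms commonly given in the literature for negative binomial regression, stated here as an assumption since the abstract does not show the paper's own notation: $\hat{W}$ is the diagonal weight matrix evaluated at the maximum likelihood estimate, $k \ge 0$ is the ridge parameter, and $0 < d < 1$ is the Liu parameter.

```latex
\begin{align}
\hat{\beta}_{\mathrm{RR}}(k) &= \left(X^{\top}\hat{W}X + kI\right)^{-1} X^{\top}\hat{W}X \,\hat{\beta}_{\mathrm{ML}}, \\
\hat{\beta}_{\mathrm{Liu}}(d) &= \left(X^{\top}\hat{W}X + I\right)^{-1} \left(X^{\top}\hat{W}X + dI\right) \hat{\beta}_{\mathrm{ML}}, \\
\mathrm{MSE}(\hat{\beta}) &= \mathbb{E}\!\left[(\hat{\beta} - \beta)^{\top}(\hat{\beta} - \beta)\right].
\end{align}
```

Setting $k = 0$ or $d = 1$ recovers the maximum likelihood estimator, so both are shrunken versions of $\hat{\beta}_{\mathrm{ML}}$.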


2018 ◽  
Vol 24 (109) ◽  
pp. 515
Author(s):  
سهيل نجم عبود ◽  
ايناس صلاح خورشيد

Multicollinearity is a common problem that largely concerns the intercorrelation among explanatory variables, and it arises especially in economics and applied research. It has a negative effect on the regression model, such as inflated variance and unstable parameter estimates when ordinary least squares (OLS) estimators are used. For this reason, other methods were used to estimate the parameters of the negative binomial model, including the ridge regression estimator and the Liu-type estimator. The negative binomial regression model is regarded as a nonlinear regression model, or as part of the generalized exponential family, and it is the basic framework for analysing count data. It is used as an alternative to the Poisson model when there is overdispersion, that is, when the variance of the response variable (Y) is greater than its mean. A Monte Carlo simulation study was designed to compare the ridge regression estimator and the Liu-type estimator using the mean squared error (MSE) criterion. The simulation results show that the Liu-type estimator is better than the ridge regression estimator, since its MSE was lower in its third and fourth estimation forms.
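The overdispersion condition mentioned above (variance of the response greater than its mean) can be checked informally before choosing between Poisson and negative binomial models. The sketch below is a rough diagnostic only, not a formal test; the function name `dispersion_ratio` and the sample counts are hypothetical.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Sample variance divided by sample mean.

    Under a Poisson model the ratio is close to 1; values well above 1
    suggest overdispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return variance(counts) / m

# Hypothetical illustrative counts, not data from the paper.
counts = [0, 0, 1, 1, 2, 3, 8, 15]
ratio = dispersion_ratio(counts)
print(f"variance / mean = {ratio:.2f}")
```

This ignores covariates; with a fitted regression model the comparison would be made conditionally on the explanatory variables.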


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Hai-Yang Zhang ◽  
An-Ran Zhang ◽  
Qing-Bin Lu ◽  
Xiao-Ai Zhang ◽  
Zhi-Jie Zhang ◽  
...  

Abstract Background COVID-19 has impacted populations around the world, with the fatality rate varying dramatically across countries. Selenium, one of the important micronutrients implicated in viral infections, has been suggested to play a role. Methods An ecological study was performed to assess the association between COVID-19-related fatality and the selenium content of both crops and topsoil in China. Results In total, 14,045 COVID-19 cases reported from 147 cities between 8 December 2019 and 13 December 2020 were included. Based on selenium content in crops, the case fatality rates (CFRs) gradually increased from 1.17% in non-selenium-deficient areas, to 1.28% in moderate-selenium-deficient areas, and further to 3.16% in severe-selenium-deficient areas (P = 0.002). Based on selenium content in topsoil, the CFRs gradually increased from 0.76% in non-selenium-deficient areas, to 1.70% in moderate-selenium-deficient areas, and further to 1.85% in severe-selenium-deficient areas (P < 0.001). The zero-inflated negative binomial regression model showed a significantly higher fatality risk in cities with severe selenium deficiency in crops than in non-selenium-deficient cities, with an incidence rate ratio (IRR) of 3.88 (95% CI: 1.21–12.52). This was further confirmed by regression fitting the association between the CFR of COVID-19 and selenium content in topsoil, with an IRR of 2.38 (95% CI: 1.14–4.98) for moderate-selenium-deficient cities and 3.06 (95% CI: 1.49–6.27) for severe-selenium-deficient cities. Conclusions Regional selenium deficiency might be related to an increased CFR of COVID-19. Future studies are needed to explore the associations between selenium status and disease outcome at the individual level.
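As a reading aid for the reported IRRs: in a count regression with a log link, an IRR is the exponential of a coefficient, and a Wald interval is symmetric on the log scale. The sketch below assumes the reported intervals are Wald intervals with z = 1.96, which the abstract does not state; the helper names `irr_with_ci` and `se_from_ci` are hypothetical. Only the IRR of 3.88 and its interval 1.21–12.52 come from the abstract.

```python
import math

def irr_with_ci(beta, se, z=1.96):
    """Return (IRR, lower, upper) for a log-link coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower, upper, z=1.96):
    """Back out the log-scale standard error implied by a Wald interval for an IRR."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Reported for severe selenium deficiency in crops: IRR 3.88 (1.21-12.52).
se = se_from_ci(1.21, 12.52)          # implied SE, under the Wald assumption
irr, low, high = irr_with_ci(math.log(3.88), se)
print(f"IRR {irr:.2f}, interval {low:.2f} to {high:.2f}, implied SE {se:.3f}")
```

The reconstructed interval agrees with the published one only up to rounding, which is consistent with, but does not prove, the Wald assumption.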


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Ahmed Nabil Shaaban ◽  
Bárbara Peleteiro ◽  
Maria Rosario O. Martins

Abstract Background This study offers a comprehensive approach to analysing the complexly distributed length of stay among HIV admissions in Portugal. Objective To provide an illustration of statistical techniques for analysing count data using longitudinal predictors of length of stay among HIV hospitalizations in Portugal. Method A total of 26,505 discharges registered in Portuguese National Health Service (NHS) facilities between January 2009 and December 2017, classified under the Major Diagnostic Category (MDC) created for patients with HIV infection, with HIV/AIDS as a main or secondary cause of admission, were used to predict length of stay among HIV hospitalizations in Portugal. Several strategies were applied to select the best-fitting count model among the Poisson regression model, the zero-inflated Poisson model, the negative binomial regression model, and the zero-inflated negative binomial regression model. A random hospital effects term was incorporated into the negative binomial model to examine the dependence between observations within the same hospital. A multivariable analysis was performed to assess the effect of covariates on length of stay. Results The median length of stay in our study was 11 days (interquartile range: 6–22). Statistical comparisons among the count models revealed that the random-effects negative binomial models provided the best fit to the observed data. Admissions among males and admissions associated with TB infection, pneumocystis, cytomegalovirus, candidiasis, toxoplasmosis, or mycobacterium disease showed a highly significant increase in length of stay. Clear trends were observed in which a higher number of diagnoses or procedures led to a significantly longer length of stay. The random-effects term included in our model, which refers to unexplained factors specific to each hospital, revealed clear differences in quality among the hospitals included in our study. Conclusions This study provides a comprehensive approach to address unique problems associated with the prediction of length of stay among HIV patients in Portugal.
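Among the candidate models listed above, the zero-inflated variants differ from their base models only in how they treat zeros. A minimal sketch of the zero-inflated Poisson probability mass function follows; the parameter names `lam` (Poisson mean) and `pi` (probability of a structural zero) are illustrative, and the values used are hypothetical rather than estimates from the study.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of count k with mean lam, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: with probability pi the count is a structural zero,
    otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

# Hypothetical parameter values, for illustration only.
print(zip_pmf(0, 3.0, 0.2), poisson_pmf(0, 3.0))
```

The zero-inflated negative binomial model has the same mixture structure with the Poisson component replaced by a negative binomial one.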


Author(s):  
Hitesh Chawla ◽  
Megat-Usamah Megat-Johari ◽  
Peter T. Savolainen ◽  
Christopher M. Day

The objectives of this study were to assess the in-service safety performance of roadside culverts and evaluate the potential impacts of installing various safety treatments to mitigate the severity of culvert-involved crashes. Such crashes were identified using standard fields on police crash report forms, as well as through a review of pertinent keywords from the narrative section of these forms. These crashes were then linked to the nearest cross-drainage culvert, which was associated with the nearest road segment. A negative binomial regression model was then estimated to discern how the risk of culvert-involved crashes varied as a function of annual average daily traffic, speed limit, number of travel lanes, and culvert size and offset. The second stage of the analysis involved the use of the Roadside Safety Analysis Program to estimate the expected crash costs associated with various design contexts. A series of scenarios were evaluated, culminating in guidance as to the most cost-effective treatments for different combinations of roadway geometric and traffic characteristics. The results of this study provide an empirical model that can be used to predict the risk of culvert-involved crashes under various scenarios. The findings also suggest that the installation of safety grates on culvert openings provides a promising alternative for most of the cases where the culvert is located within the clear zone. In general, a guardrail is recommended when adverse conditions are present or when other treatments are not feasible at a specific location.


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0254479
Author(s):  
Ta-Chien Chan ◽  
Jia-Hong Tang ◽  
Cheng-Yu Hsieh ◽  
Kevin J. Chen ◽  
Tsan-Hua Yu ◽  
...  

Background Sentinel physician surveillance in communities has played an important role in detecting early signs of epidemics. The traditional approach is to let primary care physicians voluntarily and actively report diseases to the health department on a weekly basis. However, this is labor-intensive work, and the spatio-temporal resolution of the resulting surveillance data is coarse. In this study, we built a clinic-based enhanced sentinel surveillance system named “Sentinel plus”, which was designed for sentinel clinics and community hospitals to monitor 23 syndromic groups in Taipei City, Taiwan. The definitions of those syndromic groups were based on ICD-10 diagnoses from physicians. Methods Daily ICD-10 counts of two syndromic groups, ILI and EV-like syndromes, in Taipei City were extracted from Sentinel plus. A negative binomial regression model coupled with lag structure functions was used to examine the short-term association between ICD counts and meteorological variables. After fitting the negative binomial regression model, residuals were further rescaled to Pearson residuals. We then monitored these daily standardized Pearson residuals for any aberrations from July 2018 to October 2019. Results The results showed that daily average temperature was significantly negatively associated with numbers of ILI syndromes. The ozone and PM2.5 concentrations were significantly positively associated with ILI syndromes. In addition, daily minimum temperature and the ozone and PM2.5 concentrations were significantly negatively associated with the EV-like syndromes. The aberration signals detected from clinics for ILI and EV-like syndromes preceded the epidemic period based on outpatient surveillance defined by the Taiwan CDC. Conclusions This system not only provides warning signals to the local health department for managing the risks but also reminds medical practitioners to be vigilant toward susceptible patients. The near real-time surveillance can help decision makers evaluate their policy on a timely basis.
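The residual-monitoring step can be sketched as follows. This assumes the NB2 variance function, mu + alpha * mu**2, and a simple fixed alert threshold; the abstract specifies neither, so `alpha`, `threshold` and the function names are hypothetical, and the counts are made up for illustration.

```python
import math

def nb_pearson_residual(y, mu, alpha):
    """Pearson residual: (observed - fitted) / sqrt(model variance),
    with NB2 variance mu + alpha * mu**2."""
    return (y - mu) / math.sqrt(mu + alpha * mu ** 2)

def flag_aberrations(observed, fitted, alpha, threshold=3.0):
    """Return the indices of days whose Pearson residual exceeds the threshold."""
    return [i for i, (y, mu) in enumerate(zip(observed, fitted))
            if nb_pearson_residual(y, mu, alpha) > threshold]

# Hypothetical daily counts and fitted means, not data from the study.
observed = [5, 40, 6]
fitted = [5.0, 6.0, 6.0]
print(flag_aberrations(observed, fitted, alpha=0.1))
```

Only upward deviations are flagged here, on the assumption that the interest is in unexpected increases in visits; a two-sided rule would compare the absolute residual with the threshold.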

