LRMoE: An R Package for Flexible Actuarial Loss Modelling Using Mixture of Experts Regression Model

2020 ◽  
Author(s):  
Spark C. Tseung ◽  
Andrei Badescu ◽  
Tsz Chai Fung ◽  
Xiaodong Sheldon Lin
2021 ◽  
pp. 1-17
Author(s):  
Sen Hu ◽  
T. Brendan Murphy ◽  
Adrian O’Hagan

Abstract The mvClaim package in R provides flexible frameworks for modelling multivariate insurance claim severity. The current version of the package implements a parsimonious mixture of experts (MoE) model family with bivariate gamma distributions, as introduced in Hu et al., and a finite mixture of copula regressions within the MoE framework, as in Hu & O’Hagan. This paper briefly presents the theory behind the modelling approach and describes in detail how to use the models in the package. This package is hosted on GitHub at https://github.com/senhu/.


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Yong Wu ◽  
Qigai Yin ◽  
Xiaobao Zhang ◽  
Pin Zhu ◽  
Hengfei Luan ◽  
...  

Background. Sepsis is a systemic inflammatory syndrome caused by infection, with high incidence and mortality. Although long noncoding RNAs (lncRNAs) have been identified as closely involved in many inflammatory diseases, little is known about their role in pediatric septic shock. Methods. We downloaded the mRNA profiles GSE13904 and GSE4607. GSE13904 includes 106 blood samples from pediatric patients with septic shock and 18 healthy control samples; GSE4607 includes 69 blood samples from pediatric patients with septic shock and 15 healthy control samples. Differentially expressed lncRNAs were identified with the limma R package, and GO term and KEGG pathway enrichment analyses were performed with the clusterProfiler R package. The protein-protein interaction (PPI) network was constructed from the STRING database using the targets of the differentially expressed lncRNAs. The MCODE plug-in of Cytoscape was used to screen significant clustering modules composed of key genes. Finally, stepwise regression analysis was performed to select the optimal lncRNAs and construct a logistic regression model, and the ROC curve was used to evaluate the accuracy of the model. Results. A total of 13 lncRNAs were identified that differed significantly between the septic shock group and the control group in both data sets. From the 18 targets of the differentially expressed lncRNAs, we identified several inflammatory and immune response-related pathways. In addition, several target mRNAs were predicted to be potentially involved in the occurrence of septic shock. The logistic regression model built on the two optimal lncRNAs, THAP9-AS1 and TSPOAP1-AS1, could efficiently separate septic shock samples from normal controls. Conclusion. In summary, a predictive model based on the lncRNAs THAP9-AS1 and TSPOAP1-AS1 provides novel insights for diagnostic research on septic shock.
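The final evaluation step described above (a logistic model scored with a ROC curve) can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names, coefficients, and toy labels below are hypothetical, and the area under the ROC curve is computed with the standard pairwise (Mann-Whitney) formulation rather than any particular package.

```python
import math


def logistic_probability(features, intercept, coefficients):
    """Predicted probability from a logistic model (coefficients are hypothetical)."""
    linear = intercept + sum(c * f for c, f in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


def roc_auc(labels, scores):
    """Area under the ROC curve as the fraction of (case, control) pairs
    in which the case receives the higher score; ties count as one half."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = septic shock, 0 = control; scores are model probabilities.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 1.0 means every case outranks every control, while 0.5 corresponds to a model that ranks no better than chance.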


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e8192 ◽  
Author(s):  
Gökhan Karakülah ◽  
Nazmiye Arslan ◽  
Cihangir Yandım ◽  
Aslı Suner

Introduction Recent studies highlight the crucial regulatory roles of transposable elements (TEs) on proximal gene expression in distinct biological contexts such as disease and development. However, computational tools for extracting potential associations between TEs and proximal gene expression from RNA-sequencing data are still missing. Implementation Herein, we developed a novel R package, based on a linear regression model, for studying the potential influence of TE species on proximal gene expression in a given RNA-sequencing data set. Our R package, named TEffectR, makes use of publicly available RepeatMasker TE and Ensembl gene annotations as well as several functions from other R packages. It calculates total read counts of TEs from sorted and indexed genome-aligned BAM files provided by the user, and determines statistically significant relations between TE expression and the transcription of nearby genes under diverse biological conditions. Availability TEffectR is freely available at https://github.com/karakulahg/TEffectR along with a handy tutorial, exemplified by the analysis of RNA-sequencing data from normal and tumour tissue specimens obtained from breast cancer patients.


2021 ◽  
pp. 1-22
Author(s):  
Spark C. Tseung ◽  
Andrei L. Badescu ◽  
Tsz Chai Fung ◽  
X. Sheldon Lin

Abstract This paper introduces a new Julia package, LRMoE, statistical software tailor-made for actuarial applications, which allows actuarial researchers and practitioners to model and analyse insurance loss frequencies and severities using the Logit-weighted Reduced Mixture-of-Experts (LRMoE) model. LRMoE offers several distinctive features that are motivated by actuarial applications and mostly cannot be achieved using existing packages for mixture models. Key features include a wider coverage of frequency and severity distributions and their zero inflation, the flexibility to vary classes of distributions across components, parameter estimation under data censoring and truncation, and a collection of insurance ratemaking and reserving functions. The package also provides several model evaluation and visualisation functions to help users analyse the performance of the fitted model and interpret it in insurance contexts.
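As an illustration of the model structure named above, the following Python sketch evaluates a logit-weighted mixture density with experts from different distribution classes. It is not the LRMoE package API: all function names and parameter values are hypothetical, and it assumes the usual reading of the model in which covariates enter only through the logit (softmax) gating weights while the expert distributions themselves do not depend on covariates.

```python
import math


def softmax(scores):
    """Numerically stable softmax; returns weights that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gamma_pdf(y, shape, scale):
    """Gamma density, evaluated on the log scale for stability."""
    if y <= 0:
        return 0.0
    log_pdf = ((shape - 1) * math.log(y) - y / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def lognormal_pdf(y, mu, sigma):
    """Lognormal density."""
    if y <= 0:
        return 0.0
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (y * sigma * math.sqrt(2 * math.pi))


def gating_weights(x, alphas):
    """Logit gating: one coefficient vector per component (hypothetical values)."""
    return softmax([sum(a * xi for a, xi in zip(alpha, x)) for alpha in alphas])


def mixture_density(y, x, alphas, experts):
    """Covariate-dependent weights combined with covariate-free expert densities."""
    weights = gating_weights(x, alphas)
    return sum(w * pdf(y) for w, pdf in zip(weights, experts))


# Two experts from different families: a gamma and a lognormal severity.
experts = [lambda y: gamma_pdf(y, 2.0, 500.0),
           lambda y: lognormal_pdf(y, 8.0, 1.0)]
alphas = [[0.0, 0.0], [-1.0, 0.5]]  # first component used as reference
density_at_1000 = mixture_density(1000.0, [1.0, 3.0], alphas, experts)
```

Mixing a gamma with a lognormal component mirrors the abstract's point about varying the class of distribution across components; zero inflation, censoring, and truncation are left out of this sketch.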


2020 ◽  
Author(s):  
Ben Artin ◽  
Daniel Weinberger ◽  
Virginia E. Pitzer ◽  
Joshua L Warren

There is often a need to estimate the characteristics of epidemics or seasonality from infectious disease data. For instance, accurately estimating the start and end date of respiratory syncytial virus (RSV) epidemics can be used to optimize the initiation of prophylactic medication. Many widely used methods for describing these characteristics begin with a regression model fit to a time series of disease incidence. The fitted model is then often used to calculate the quantities of interest. Calculating these quantities from the fitted regression model typically involves combining different components of the fitted model, and consequently only point estimates (rather than measures of uncertainty) of those quantities can be obtained in a straightforward way. Motivated by attempts to estimate the optimal timing of prophylaxis for RSV, we developed a general method for obtaining confidence intervals for characteristics of seasonal and sporadic infectious disease outbreaks. To do this, we use multivariate sampling of a generalized additive model with penalized basis splines. Our approach provides robust confidence intervals regardless of the complexity of the calculations of the outcome measures, and it generalizes to other systems (including outbreaks of other infectious diseases). Here we present our general approach, its application to RSV, and an R package that provides a convenient interface for conducting and validating this type of analysis in other areas.

