tweedie distribution
Recently Published Documents



2022, Vol 8
Author(s): Xiangyu Long, Rong Wan, Zengguang Li, Dong Wang, Pengbo Song, ...

A fishery-independent survey can provide detailed information for fishery assessment and management. However, the sampling design for ichthyoplankton surveys in estuarine areas is still poorly understood. In this study, we developed six stratified schemes with various sample sizes to find cost-efficient sampling designs for monitoring Coilia mystus ichthyoplankton in the Yangtze Estuary. A generalized additive model (GAM) with the Tweedie distribution was used to quantify the “true” distribution of C. mystus eggs and larvae, based on data from the fishery-independent survey in 2019–2020. The performance of the sampling designs was evaluated by relative estimation error (REE), relative bias (RB), and coefficient of variation (CV). The results indicated that appropriate stratification, with intra-stratum homogeneity and inter-stratum heterogeneity, could improve precision. The strata should be divided not only between the North Branch and the South Branch but also between river and sea. Using at least two strata in the South Branch also improved performance. Sample sizes of 45–55 were considered the cost-efficient range. Compared with other monitoring programs, monitoring ichthyoplankton in the estuary required more complex stratification and higher-resolution sampling. The design approach and optimization methodology in our study can serve as a reference for ichthyoplankton sampling designs in estuarine areas.
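The three evaluation criteria named in the abstract can be sketched as follows. The abstract does not give the formulas, so the definitions below are the ones commonly used in sampling-design simulation studies and are an assumption here; `design_metrics` and the sample numbers are hypothetical.

```python
import math
import statistics


def design_metrics(estimates, true_value):
    """Return (REE, RB, CV) in percent for repeated estimates of one quantity.

    Assumed definitions (not taken from the abstract):
      REE = 100 * sqrt(mean((estimate - true)^2)) / true
      RB  = 100 * (mean(estimate) - true) / true
      CV  = 100 * sd(estimate) / mean(estimate)
    """
    n = len(estimates)
    mean_est = statistics.fmean(estimates)
    # REE: root mean squared error relative to the "true" value
    ree = 100 * math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n) / true_value
    # RB: signed deviation of the mean estimate from the "true" value
    rb = 100 * (mean_est - true_value) / true_value
    # CV: sample standard deviation relative to the mean estimate
    cv = 100 * statistics.stdev(estimates) / mean_est
    return ree, rb, cv


# Hypothetical abundance estimates from four simulated surveys of one design
ree, rb, cv = design_metrics([9.0, 11.0, 10.0, 12.0], true_value=10.0)
```

Under these definitions a design with lower REE and CV and an RB closer to zero would be preferred.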


2021, Vol 3 (2), pp. 115-127
Author(s): Tri Andika Julia Putra, Donny Citra Lesmana, I Gusti Putu Purnaba

Abstract: An important task for an actuary is to determine the appropriate premium for each customer, given differing risks and characteristics. Many variables can affect the premium, so actuaries must identify the variables that significantly affect it. The purpose of this study is to determine the variables that affect the pure premium, using a mixed distribution within Generalized Linear Models (GLM), and to determine an appropriate premium model based on those variables. GLM is one of the statistical analyses that can be used to model insurance premiums. It extends the classical regression model by allowing several data distributions, but it is limited to distributions in the exponential family. In the GLM model, the premium is obtained by multiplying the conditional expected value of the claim frequency by the conditional expected value of the claim cost. The analysis shows that both claim frequency and claim size follow the Tweedie distribution. From the two models, the variables affecting the pure premium are the number of children, monthly income, marital status, education, occupation, vehicle use, the bluebook value paid, and the customer's vehicle type. This shows that the GLM is a representative and useful model for insurance companies.
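The frequency-times-severity construction of the pure premium can be illustrated with the compound Poisson-gamma form of the Tweedie distribution, which holds for a power parameter 1 < p < 2. This is a minimal sketch of that standard identity, not the authors' fitted model; the function name and the numeric values are hypothetical.

```python
import math


def tweedie_to_compound_poisson(mu, phi, p):
    """Map Tweedie (mean mu, dispersion phi, power p) to Poisson-gamma parameters.

    Valid only for 1 < p < 2, where a Tweedie variable is a Poisson number of
    gamma-distributed claims.
    """
    if not 1.0 < p < 2.0:
        raise ValueError("compound Poisson-gamma form requires 1 < p < 2")
    lam = mu ** (2 - p) / (phi * (2 - p))   # expected number of claims
    shape = (2 - p) / (p - 1)               # gamma shape of a single claim
    scale = phi * (p - 1) * mu ** (p - 1)   # gamma scale of a single claim
    return lam, shape, scale


# Hypothetical policy: mean annual loss 200, dispersion 50, power 1.5
lam, shape, scale = tweedie_to_compound_poisson(mu=200.0, phi=50.0, p=1.5)
expected_frequency = lam
expected_severity = shape * scale
pure_premium = expected_frequency * expected_severity  # recovers mu
prob_no_claim = math.exp(-lam)                         # point mass at zero
variance = lam * shape * scale ** 2 * (1 + shape)      # equals phi * mu ** p
```

The product of expected frequency and expected severity returns the Tweedie mean, which is why a single Tweedie GLM can stand in for separate frequency and severity models when 1 < p < 2.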


2021
Author(s): Himel Mallick, Suvo Chatterjee, Shrabanti Chowdhury, Saptarshi Chatterjee, Ali Rahnavard, ...

Summary: The performance of computational methods and software for identifying differentially expressed genes in single-cell RNA-sequencing (scRNA-seq) data has been shown to depend on several factors, including the choice of normalization method and the choice of experimental platform (or library preparation protocol) used to profile gene expression in individual cells. Currently, it is up to the practitioner to choose the most appropriate differential expression (DE) method from the more than 100 DE tools available to date, each relying on its own assumptions to model scRNA-seq data. Here, we propose to use generalized linear models with the Tweedie distribution, which can flexibly capture the large dynamic range of observed scRNA-seq data across experimental platforms, induced by heavy tails, sparsity, or different count distributions, to model the technological variability in scRNA-seq expression profiles. We also propose a zero-inflated Tweedie model that allows the probability mass at zero to exceed that of a traditional Tweedie distribution, to model zero-inflated scRNA-seq data with excessive zero counts. Using both synthetic and published plate- and droplet-based scRNA-seq datasets, we performed a systematic benchmark evaluation of more than 10 representative DE methods and demonstrate that our method (Tweedieverse) outperforms state-of-the-art DE approaches across experimental platforms in terms of statistical power and false discovery rate control. Our open-source software (R package) is available at https://github.com/himelmallick/Tweedieverse.
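The idea of letting the zero mass exceed that of an ordinary Tweedie distribution can be sketched with a simple two-part mixture. The mixture form, the function names, and the parameter values below are assumptions for illustration; the parameterization used in the Tweedieverse package may differ.

```python
import math


def tweedie_zero_mass(mu, phi, p):
    """P(Y = 0) for a Tweedie variable with 1 < p < 2 (compound Poisson-gamma)."""
    if not 1.0 < p < 2.0:
        raise ValueError("a point mass at zero exists only for 1 < p < 2")
    lam = mu ** (2 - p) / (phi * (2 - p))  # Poisson rate of the compound form
    return math.exp(-lam)


def zero_inflated_zero_mass(pi_zero, mu, phi, p):
    """P(Y = 0) when an extra zero is drawn with probability pi_zero (assumed mixture)."""
    if not 0.0 <= pi_zero <= 1.0:
        raise ValueError("pi_zero must be a probability")
    return pi_zero + (1.0 - pi_zero) * tweedie_zero_mass(mu, phi, p)


# Hypothetical gene: mean expression 2.0, dispersion 1.0, power 1.5
base = tweedie_zero_mass(mu=2.0, phi=1.0, p=1.5)
inflated = zero_inflated_zero_mass(pi_zero=0.3, mu=2.0, phi=1.0, p=1.5)
```

With `pi_zero = 0` the mixture reduces to the ordinary Tweedie zero mass, so the plain model is a special case of the zero-inflated one.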


2020, Vol 4 (3), pp. 473-483
Author(s): Riza Indriani Rakhmalia, Agus M Soleh, Bagus Sartono

Rainfall prediction is one of the most challenging problems of the last century. Statistical downscaling is one of the most frequently used rainfall estimation techniques. The goal of this paper is to develop cluster-wise regression models for rainfall data that follow a Tweedie distribution. The predictor variables were precipitation from the Climate Forecast System Reanalysis (CFSR) version 2, and the response variable was rainfall recorded by BMKG. Data were collected from January 2010 to December 2019 at the Bogor, Citeko, Jatiwangi, and Bandung rain posts. The best result of this study is a cluster-wise regression model with 4 clusters, using the Tweedie distribution at each rain post. The best model was evaluated by the root mean square error of prediction (RMSEP). The RMSEP value is 17.11 at the Bogor rain post (three clusters), 14.85 at Citeko (two clusters), 15.26 at Jatiwangi (three clusters), and 14.33 at Bandung (two clusters). The model was able to fit and cluster daily rainfall data well.
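The RMSEP criterion used to compare the models can be computed as below. This is a minimal sketch of the usual definition, the square root of the mean squared difference between observed and predicted values on held-out data; the function name and the rainfall numbers are hypothetical.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction over paired observations."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical daily rainfall (mm): observed versus model prediction
score = rmsep([0.0, 3.0, 10.0], [1.0, 3.0, 8.0])
```

A lower value indicates predictions closer to the observed rainfall, in the same units as the response.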


Author(s): Ngugi Mwenda, Ruth Nduati, Mathew Kosgey, Gregory Kerich

Background: Distance to a health facility for inpatient care in developing countries has been a major hindrance to achieving Sustainable Development Goal three. The United Nations encourages countries to research access to inpatient care so that health policies can be based on data. Methods: Data on four hundred and eighty-one participants of all ages from forty-seven counties in Kenya who sought inpatient care in 2018 were analyzed. Distance to a health facility was captured as a continuous variable and was self-reported by the respondent. The response exhibited a discrete mass at zero together with a continuous component, so a Tweedie distribution was adopted for modelling. Because clustered data are correlated, we used the Generalized Estimating Equations approach with an exchangeable correlation structure. Since no standard software was available to analyze this problem, we developed R functions. We assessed the best model fit using the QICu and criteria, in which the lowest value for the former and the highest for the latter are preferred. Findings: Differences in employment, ability to pay for the service, and household size are associated with the distance covered to access government facilities. Interpretation: Poor people tend to have large households and are more likely to live in rural areas and slums, and are thus forced to travel long distances to access inpatient care. Compared with the unemployed, the employed may have better socio-economic status and may live within reach of inpatient health facilities, and therefore travel shorter distances to access them. Longer distances are associated with high payments, which suggests some form of specialized treatment for complex medical cases that are expensive to treat.
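The exchangeable working correlation mentioned in the methods assumes that every pair of observations within a cluster shares the same correlation. A minimal sketch of that matrix follows; the function name and the example values are hypothetical, and this does not reproduce the authors' R functions.

```python
def exchangeable_correlation(cluster_size, alpha):
    """Working correlation matrix with 1 on the diagonal and alpha elsewhere."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    # The matrix is positive definite only for -1/(n - 1) < alpha < 1
    if cluster_size > 1 and not -1.0 / (cluster_size - 1) < alpha < 1.0:
        raise ValueError("alpha outside the valid range for this cluster size")
    return [
        [1.0 if i == j else alpha for j in range(cluster_size)]
        for i in range(cluster_size)
    ]


# Hypothetical county cluster with three respondents and correlation 0.2
R = exchangeable_correlation(3, 0.2)
```

A single parameter describes the whole within-cluster dependence, which keeps the structure estimable even when clusters are small.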


2020, Vol 34 (04), pp. 4699-4706
Author(s): Tianbo Li, Yiping Ke

Self-exciting event sequences, in which the occurrence of an event increases the probability of triggering subsequent ones, are common in many disciplines. In this paper, we propose a Bayesian model called Tweedie-Hawkes Processes (THP), which can model outbreaks of events and identify the dominant factors behind them. THP leverages the Tweedie distribution to capture various excitation effects. A variational EM algorithm is developed for model inference. Some theoretical properties of THP are discussed, including sub-criticality, convergence of the learning algorithm, and the kernel selection method. Applications to epidemiology and information diffusion analysis demonstrate the versatility of our model across disciplines. Evaluations on real-world datasets show that THP outperforms state-of-the-art baselines in forecasting future events.
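The self-exciting behaviour and the sub-criticality condition can be illustrated with a plain Hawkes process. The sketch below assumes an exponential kernel and fixed parameters; it is not the THP model, and the function names and values are hypothetical.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity: mu + sum of alpha * exp(-beta * (t - t_i)) for t_i < t."""
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in event_times if t_i < t
    )
    return mu + excitation


def is_subcritical(alpha, beta):
    """Each event triggers alpha / beta offspring on average; below 1 the process is stable."""
    return alpha / beta < 1.0


# Hypothetical parameters: baseline 0.5, jump 0.8, decay 1.0, one past event at t = 0
quiet = hawkes_intensity(1.0, [], mu=0.5, alpha=0.8, beta=1.0)
excited = hawkes_intensity(1.0, [0.0], mu=0.5, alpha=0.8, beta=1.0)
```

Each past event raises the intensity by `alpha` and the effect decays at rate `beta`, so recent events make further events more likely.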


2020, Vol 137, pp. 105456
Author(s): Dibakar Saha, Priyanka Alluri, Eric Dumbaugh, Albert Gan

Author(s): Rezzy Eko CARAKA, Rung Ching CHEN, Toni TOHARUDIN, Isma Dwi KURNIAWAN, S Asmawati, ...

2019, Vol 164, pp. 146-162
Author(s): M.D. Jiménez-Gamero, M.V. Alba-Fernández