EM Estimation for Zero- and k-Inflated Poisson Regression Model

Computation ◽  
2021 ◽  
Vol 9 (9) ◽  
pp. 94
Author(s):  
Monika Arora ◽  
N. Rao Chaganty

Count data with excess zeros are common in healthcare, medical, and scientific studies, and numerous articles show how to fit Poisson and other models that account for the excess zeros. In many situations, however, the frequency of another count k is also higher than expected. The zero- and k-inflated Poisson (ZkIP) distribution is appropriate in such situations. The ZkIP distribution is a mixture of a Poisson distribution and degenerate distributions at the points zero and k. In this article, we study the fundamental properties of this mixture distribution. Using a stochastic representation, we provide details for obtaining parameter estimates of the ZkIP regression model for a given data set using the Expectation–Maximization (EM) algorithm. We derive the standard errors of the EM estimates by computing the complete, missing, and observed data information matrices. We present the analysis of two real-life data sets using the methods outlined in the paper.
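The mixture structure and the EM iteration described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an intercept-only model (no covariates, so a single Poisson mean `lam`), a known inflation point `k > 0`, and data containing at least one positive count. The function names (`zkip_pmf`, `em_zkip`) and starting values are hypothetical choices made for this illustration, and standard errors are not computed.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability mass at y, computed on the log scale for stability."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def zkip_pmf(y, k, pi0, pik, lam):
    """ZkIP mass: point masses at 0 and k mixed with a Poisson(lam) component."""
    p = (1.0 - pi0 - pik) * poisson_pmf(y, lam)
    if y == 0:
        p += pi0
    if y == k:
        p += pik
    return p


def zkip_loglik(ys, k, pi0, pik, lam):
    """Observed-data log-likelihood of the intercept-only ZkIP model."""
    return sum(math.log(zkip_pmf(y, k, pi0, pik, lam)) for y in ys)


def em_zkip(ys, k, pi0=0.1, pik=0.1, max_iter=2000, tol=1e-12):
    """EM for an intercept-only ZkIP model (assumption: no covariates, k > 0)."""
    n = len(ys)
    n0 = sum(1 for y in ys if y == 0)
    nk = sum(1 for y in ys if y == k)
    total = sum(ys)
    lam = max(total / n, 1e-6)  # start from the sample mean
    for _ in range(max_iter):
        pois_w = 1.0 - pi0 - pik
        # E-step: posterior probability that an observed 0 (or k) came from
        # the degenerate component rather than from the Poisson component.
        w0 = pi0 / (pi0 + pois_w * poisson_pmf(0, lam))
        wk = pik / (pik + pois_w * poisson_pmf(k, lam))
        e0, ek = n0 * w0, nk * wk
        # M-step: expected component proportions and weighted Poisson mean.
        new_pi0, new_pik = e0 / n, ek / n
        new_lam = (total - k * ek) / (n - e0 - ek)
        change = abs(new_pi0 - pi0) + abs(new_pik - pik) + abs(new_lam - lam)
        pi0, pik, lam = new_pi0, new_pik, new_lam
        if change < tol:
            break
    return pi0, pik, lam
```

Calling `em_zkip(counts, k=3)` on a list of non-negative integer counts returns the two inflation weights and the Poisson mean. Extending the sketch to the regression setting would replace the closed-form M-step with weighted fits for the linear predictors.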

Author(s):  
Serge P. Hoogendoorn ◽  
Piet H. L. Bovy

Recently, a new statistical procedure was developed that enables fast, accurate, and robust estimation of composite headway distributions, such as Branston’s generalized queueing model (GQM). Until now, the new procedure had only been applied to aggregate vehicular flow. In this paper, the estimation procedure is extended to headway observations segregated according to vehicle type and period of the day. Consequently, the parameters of a new mixed-vehicle-type headway distribution model based on Branston’s headway model can be estimated. Distinction of vehicle type and sample periods provides additional insight into the plausibility of the headway distributions and parameter values, as well as into the car-following behavior of the distinct vehicle classes varying across the different periods. The estimation procedure was applied to traffic data collected on a two-lane rural road in the Netherlands. Comparison of the estimated headway distributions with real-life data shows that headway distributions can be realistically replicated with the Pearson-III-based mixed-vehicle-type GQM. Interpretable differences between the morning, noon, and evening sample periods and between passenger cars, unarticulated trucks, and articulated trucks are found. In addition, passenger-car equivalents for both articulated trucks and unarticulated trucks were determined from the parameter estimates.


2013 ◽  
Vol 4 (2) ◽  
Author(s):  
Yan-Xia Lin ◽  
Phillip Wise

This paper considers the scenario in which all data entries in a confidentialised unit record file are masked by multiplicative noise, regardless of whether the unit records are sensitive and regardless of whether the masked variables are dependent or independent variables in the underlying regression analysis. A technique is introduced to estimate, from the masked data, the parameters of a regression model that would originally have been fitted to the unmasked data. Several simulation studies and a real-life data application are presented.


2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Adewale F. Lukman ◽  
B. M. Golam Kibria ◽  
Kayode Ayinde ◽  
Segun L. Jegede

Motivated by the ridge regression (Hoerl and Kennard, 1970) and Liu (1993) estimators, this paper proposes a modified Liu estimator to solve the multicollinearity problem for the linear regression model. This modification places the estimator in the class of ridge and Liu estimators with a single biasing parameter. Theoretical comparisons, a real-life application, and simulation results show that it consistently dominates the usual Liu estimator. Under some conditions, it performs better than the ridge regression estimators in the sense of smaller MSE. Two real-life data sets are analyzed to illustrate the findings of the paper, and the performance of the estimators is assessed by MSE and the mean squared prediction error. The application results agree with the theoretical and simulation results.
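For orientation, the two baseline estimators named in this abstract can be written in a few lines. The sketch below implements only the standard ridge estimator, (X'X + kI)^{-1} X'y, and the standard Liu (1993) estimator, (X'X + I)^{-1}(X'y + d b_OLS); the paper's modified Liu estimator is not reproduced here. It assumes a design matrix without an intercept column (for example, already centred and scaled) and uses a small pure-Python solver; all function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gram(xs):
    """Return X'X for a design matrix given as a list of rows."""
    p = len(xs[0])
    return [[sum(row[i] * row[j] for row in xs) for j in range(p)] for i in range(p)]


def cross(xs, ys):
    """Return X'y."""
    p = len(xs[0])
    return [sum(row[i] * y for row, y in zip(xs, ys)) for i in range(p)]


def shifted(s, c):
    """Return S + c I."""
    p = len(s)
    return [[s[i][j] + (c if i == j else 0.0) for j in range(p)] for i in range(p)]


def ols(xs, ys):
    """Ordinary least squares: (X'X)^{-1} X'y."""
    return solve(gram(xs), cross(xs, ys))


def ridge(xs, ys, k):
    """Ridge estimator (Hoerl and Kennard, 1970): (X'X + k I)^{-1} X'y."""
    return solve(shifted(gram(xs), k), cross(xs, ys))


def liu(xs, ys, d):
    """Liu (1993) estimator: (X'X + I)^{-1} (X'y + d * beta_OLS)."""
    beta = ols(xs, ys)
    rhs = [v + d * b for v, b in zip(cross(xs, ys), beta)]
    return solve(shifted(gram(xs), 1.0), rhs)
```

Setting `k = 0` in `ridge` or `d = 1` in `liu` recovers the OLS solution, which is a convenient sanity check; values `k > 0` or `0 < d < 1` shrink the coefficient vector.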


MATEMATIKA ◽  
2018 ◽  
Vol 34 (2) ◽  
pp. 365-380
Author(s):  
Sunday Samuel Bako ◽  
Mohd Bakri Adam ◽  
Anwar Fitrianto

Recent studies have shown that independent and identically distributed Gaussian random variables are not suitable for modelling extreme values observed during extremal events. Moreover, many real-life data on extreme values are dependent and stationary rather than independent and identically distributed. We propose a stationary autoregressive (AR) process with Gumbel-distributed innovations and characterise the short-term dependence among maxima of an AR process over a range of sample sizes with varying degrees of dependence. We compute maximum likelihood estimates of the parameters of the Gumbel-AR process and its residuals, and evaluate the performance of the parameter estimates. The AR process is fitted to the Gumbel-generalised Pareto distribution (Gumbel-GPD), and we evaluate the performance of the parameter estimates fitted to the cluster maxima and the original series. Ignoring the effect of dependence leads to overestimation of the location parameter of the Gumbel-AR(1) process. Estimating the location parameter of the AR process from the residuals gives a better estimate. Estimates of the scale parameter perform marginally better for the original series than for the residuals. The degree of clustering increases as dependence is enhanced for the AR process. The Gumbel-AR(1) fitted to the Gumbel-GPD shows that the estimates of the scale and shape parameters fitted to the cluster maxima perform better as sample size increases; however, ignoring the effect of dependence leads to underestimation of the scale parameter. The shape parameter of the original series gives a superior estimate compared to the threshold excesses fitted to the Gumbel-GPD.
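A stationary AR(1) process with Gumbel innovations, as described in this abstract, can be simulated with the standard library alone. The sketch below assumes the recursion X_t = phi * X_{t-1} + e_t with i.i.d. Gumbel(loc, scale) innovations drawn by inverse-CDF sampling; the parameterisation, burn-in length, and function names are assumptions for illustration, and no estimation step from the paper is reproduced.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # mean of a standard Gumbel variable


def gumbel_quantile(u, loc=0.0, scale=1.0):
    """Inverse CDF of the Gumbel (maximum) distribution for u in (0, 1)."""
    return loc - scale * math.log(-math.log(u))


def simulate_gumbel_ar1(n, phi, loc=0.0, scale=1.0, burn_in=500, seed=0):
    """Simulate X_t = phi * X_{t-1} + e_t with i.i.d. Gumbel(loc, scale) e_t."""
    if not -1.0 < phi < 1.0:
        raise ValueError("stationarity requires |phi| < 1")
    rng = random.Random(seed)
    x = (loc + EULER_GAMMA * scale) / (1.0 - phi)  # start at the stationary mean
    series = []
    for t in range(n + burn_in):
        u = rng.random()
        while u == 0.0:  # avoid log(0); random() never returns 1.0
            u = rng.random()
        x = phi * x + gumbel_quantile(u, loc, scale)
        if t >= burn_in:  # discard the burn-in segment
            series.append(x)
    return series


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation, which estimates phi for an AR(1) process."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den
```

For a long simulated series, the sample mean should be close to (loc + 0.5772 * scale) / (1 - phi) and the lag-1 autocorrelation close to `phi`, which gives a quick check that the dependence structure is as intended.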


2021 ◽  
Vol 2021 ◽  
pp. 1-19
Author(s):  
Muhammad Tahir ◽  
Ibrahim M. Almanjahie ◽  
Muhammad Abid ◽  
Ishfaq Ahmad

In this study, we model a heterogeneous population using a three-component mixture of Pareto distributions under type I censoring. In particular, we study some statistical properties (such as various entropies, different inequality indices, and order statistics) of the three-component mixture distribution. Both ML estimation and Bayesian estimation of the mixture parameters are performed. For the ML estimation, we use the Newton-Raphson method. To derive the posterior distributions and the Bayes estimators, different noninformative priors are assumed. Furthermore, we discuss Bayesian predictive intervals. We present a detailed simulation study to compare the ML estimates and Bayes estimates. Moreover, we evaluate the performance of the different estimates for various sample sizes, mixing weights, and test termination times (a fixed point of time after which all other tests are dismissed). A real-life data application is also part of this study.
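The likelihood that ML estimation maximises under type I censoring can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: it uses the classical Pareto type I density with a known common scale `xm`, treats every unit still surviving at the termination time `t_end` as right-censored, and leaves the numerical maximisation (Newton-Raphson in the paper) out. All names are hypothetical.

```python
import math


def pareto_pdf(x, alpha, xm=1.0):
    """Pareto type I density with shape alpha and (assumed known) scale xm."""
    return alpha * xm ** alpha / x ** (alpha + 1.0) if x >= xm else 0.0


def pareto_sf(x, alpha, xm=1.0):
    """Pareto type I survival function P(X > x)."""
    return (xm / x) ** alpha if x >= xm else 1.0


def mixture_loglik(observed, n_censored, t_end, weights, alphas, xm=1.0):
    """Log-likelihood of a finite Pareto mixture under type I censoring.

    observed   : failure times recorded before t_end
    n_censored : number of units still surviving at t_end
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing weights must sum to 1")
    ll = 0.0
    for x in observed:
        # Uncensored units contribute the mixture density.
        ll += math.log(sum(w * pareto_pdf(x, a, xm) for w, a in zip(weights, alphas)))
    if n_censored:
        # Censored units contribute the mixture survival probability at t_end.
        surv = sum(w * pareto_sf(t_end, a, xm) for w, a in zip(weights, alphas))
        ll += n_censored * math.log(surv)
    return ll
```

With three weights and three shape parameters this gives the three-component case; passing identical shapes collapses the mixture to a single Pareto distribution, which is a useful consistency check.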


2015 ◽  
Vol 72 (6) ◽  
pp. 1834-1847 ◽  
Author(s):  
Hugues P. Benoît ◽  
Connor W. Capizzano ◽  
Ryan J. Knotek ◽  
David B. Rudders ◽  
James A. Sulikowski ◽  
...  

Conservation concerns and new management policies such as the implementation of ecosystem-based approaches to fisheries management are motivating an increasing need for estimates of mortality associated with commercial fishery discards and released fish from recreational fisheries. Traditional containment studies and emerging techniques using electronic tags on fish released to the wild are producing longitudinal mortality-time data from which discard or release mortalities can be estimated, but where there may also be a need to account analytically for other sources of mortality. In this study, we present theoretical and empirical arguments for a parametric mixture-distribution model for discard mortality data. We show, analytically and using case studies for Atlantic cod (Gadus morhua), American plaice (Hippoglossoides platessoides), and winter skate (Leucoraja ocellata), how this model can easily be generalized to incorporate different characteristics of discard mortality data such as distinct capture, post-release and natural mortalities, and delayed mortality onset. In simulations over a range of conditions, the model provided reliable parameter estimates for cases involving both discard and natural mortality. These results support this modelling approach, indicating that it is well suited for data from studies in which fish are released to their natural environment. The model was found to be less reliable in simulations when there was a delay in discard mortality onset, though such an effect appears only in a minority of existing discard mortality studies. Overall, the model provides a flexible framework in which to analyse discard mortality data and to produce reliable scientific advice on discard mortality rates and possibilities for mitigation.

