A Command for Fitting Mixture Regression Models for Bounded Dependent Variables Using the Beta Distribution

Author(s):  
Laura A. Gray ◽  
Mónica Hernández Alava

In this article, we describe the betamix command, which fits mixture regression models for dependent variables bounded in an interval. The model is a generalization of the truncated inflated beta regression model introduced in Pereira, Botter, and Sandoval (2012, Communications in Statistics—Theory and Methods 41: 907–919) and the mixture beta regression model in Verkuilen and Smithson (2012, Journal of Educational and Behavioral Statistics 37: 82–113) for variables with truncated supports at either the top or the bottom of the distribution. betamix accepts dependent variables defined in any range that are then transformed to the interval (0, 1) before estimation.
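The rescaling step mentioned at the end of the abstract can be sketched as follows. This is a minimal illustration that assumes a plain linear map from a known range onto the unit interval; the function name `rescale_to_unit` and its arguments are hypothetical, and the abstract does not say how betamix itself performs the transformation or treats observations at the boundaries.

```python
def rescale_to_unit(values, lower, upper):
    """Linearly map values from [lower, upper] onto [0, 1].

    Assumption: a simple (y - lower) / (upper - lower) rescaling.
    Observations equal to a bound map to exactly 0 or 1 and would
    need separate handling in a beta-based model.
    """
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    width = upper - lower
    scaled = []
    for v in values:
        if v < lower or v > upper:
            raise ValueError(f"{v} lies outside [{lower}, {upper}]")
        scaled.append((v - lower) / width)
    return scaled


if __name__ == "__main__":
    # Hypothetical scores bounded between -0.5 and 1.0.
    print(rescale_to_unit([-0.5, 0.25, 1.0], -0.5, 1.0))
```

Keeping the bounds as explicit arguments, rather than taking them from the sample minimum and maximum, avoids the mapping changing from one dataset to the next.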

2018 ◽  
Vol 19 (6) ◽  
pp. 617-633 ◽  
Author(s):  
Wagner H Bonat ◽  
Ricardo R Petterle ◽  
John Hinde ◽  
Clarice GB Demétrio

We propose a flexible class of regression models for continuous bounded data based on second-moment assumptions. The mean structure is modelled by means of a link function and a linear predictor, while the mean and variance relationship has the form [Formula: see text], where [Formula: see text], [Formula: see text] and [Formula: see text] are the mean, dispersion and power parameters respectively. The models are fitted by using an estimating function approach where the quasi-score and Pearson estimating functions are employed for the estimation of the regression and dispersion parameters respectively. The flexible quasi-beta regression model can automatically adapt to the underlying bounded data distribution by the estimation of the power parameter. Furthermore, the model can easily handle data with exact zeroes and ones in a unified way and has the Bernoulli mean and variance relationship as a limiting case. The computational implementation of the proposed model is fast, relying on a simple Newton scoring algorithm. Simulation studies using datasets generated from simplex and beta regression models show that the estimating function estimators are unbiased and consistent for the regression coefficients. We illustrate the flexibility of the quasi-beta regression model to deal with bounded data with two examples. We provide an R implementation and the datasets as supplementary materials.


2016 ◽  
Vol 2016 ◽  
pp. 1-10 ◽  
Author(s):  
Chipo Mufudza ◽  
Hamza Erol

Early heart disease control can be achieved through highly efficient disease prediction and diagnosis. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. The analysis and application of Poisson mixture regression models are addressed here under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model, as indicated by its lower Bayesian information criterion (BIC) value. Furthermore, a zero-inflated Poisson mixture regression model turned out to be the best model for heart disease prediction over all models, as it both clusters individuals into high- or low-risk categories and predicts the rate of heart disease componentwise given the available clusters. It is deduced that heart disease prediction can be done effectively by identifying the major risks componentwise using a Poisson mixture regression model.
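The model comparison described in this abstract rests on the Bayesian information criterion, which can be sketched as below. The formula is the standard one, k ln(n) minus twice the maximized log-likelihood; the function names and all numbers in the usage example are hypothetical and are not taken from the paper.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * log-likelihood."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def best_by_bic(candidates, n_obs):
    """Return the name of the lowest-BIC model and all scores.

    candidates maps a model name to (maximized log-likelihood, parameter count).
    """
    scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Made-up fits for illustration only.
    fits = {
        "standard mixture": (-120.0, 5),
        "concomitant mixture": (-110.0, 7),
    }
    winner, scores = best_by_bic(fits, n_obs=100)
    print(winner, scores)
```

A lower value is preferred, so a model with more parameters wins only when its gain in log-likelihood outweighs the ln(n) penalty per parameter.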


2017 ◽  
Vol 28 (3) ◽  
pp. 871-888 ◽  
Author(s):  
Abhik Ghosh

Data on rates, percentages, or proportions arise frequently in many applied disciplines such as medical biology, health care, and psychology. In this paper, we develop a robust inference procedure for the beta regression model, which is used to describe such response variables taking values in (0, 1) through some related explanatory variables. In relation to the beta regression model, the issue of robustness has been largely ignored in the literature so far. The existing maximum likelihood-based inference seriously lacks robustness against outliers in the data and generates drastically different (erroneous) inference in the presence of data contamination. Here, we develop the robust minimum density power divergence estimator and a class of robust Wald-type tests for the beta regression model along with several applications. We derive their asymptotic properties and describe their robustness theoretically through influence function analyses. The finite-sample performance of the proposed estimators and tests is examined through suitable simulation studies and real data applications in the context of health care and psychology. Although we primarily focus on beta regression models with a fixed dispersion parameter, some indications are also provided for extension to variable dispersion beta regression models, with an application.
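For orientation, the maximum likelihood baseline that the abstract contrasts with its robust estimator can be sketched as a log-likelihood function. This sketch assumes the common mean-precision parameterization of the beta density (shape parameters mu * phi and (1 - mu) * phi) with a logit link and a fixed precision phi; the abstract does not state which parameterization the paper uses, and the function names are hypothetical. It does not implement the minimum density power divergence estimator.

```python
import math


def beta_logpdf(y, mu, phi):
    """Log density of a beta variable with mean mu and precision phi.

    Assumption: shapes are a = mu * phi and b = (1 - mu) * phi.
    """
    a = mu * phi
    b = (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def beta_regression_loglik(coefs, phi, rows, responses):
    """Sum of beta log densities with a logit link for the mean."""
    total = 0.0
    for row, y in zip(rows, responses):
        eta = sum(c * x for c, x in zip(coefs, row))  # linear predictor
        mu = 1.0 / (1.0 + math.exp(-eta))             # inverse logit
        total += beta_logpdf(y, mu, phi)
    return total


if __name__ == "__main__":
    # Hypothetical data: an intercept column and one covariate.
    rows = [[1.0, 0.2], [1.0, -0.4], [1.0, 1.1]]
    responses = [0.55, 0.30, 0.80]
    print(beta_regression_loglik([0.1, 0.9], 10.0, rows, responses))
```

Because every observation enters this sum through the logarithm of y and of 1 - y, a single response very close to 0 or 1 can dominate the total, which gives some intuition for the sensitivity to outliers that the abstract describes.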


2011 ◽  
Vol 150 (1) ◽  
pp. 109-121 ◽  
Author(s):  
E. J. BELASCO ◽  
S. K. GHOSH

SUMMARY The present paper develops a mixture regression model that allows for distributional flexibility in modelling the likelihood of a semi-continuous outcome that takes the value zero with positive probability while being continuous on the positive half of the real line. A multivariate extension is also developed that builds on past multivariate models by systematically capturing the relationship between continuous and semi-continuous variables, while allowing for the semi-continuous variable to be characterized by a mixture model. The flexibility associated with this model provides potential applications in many production system studies. The empirical model is shown to provide a more accurate measure of mortality rates in cattle feedlots, both independently and within a system including other performance and health factors.


2019 ◽  
Author(s):  
Leili Tapak ◽  
Omid Hamidi ◽  
Majid Sadeghifar ◽  
Hassan Doosti ◽  
Ghobad Moradi

Abstract Objectives: Zero-inflated proportion or rate data nested in clusters due to the sampling structure can be found in many disciplines. Sometimes the rate response is not observed for some study units because of limitations such as failure to record data (false negatives), and zeros are observed instead of the actual value of the rate or proportion (low incidence). In this study, we proposed a multilevel zero-inflated censored Beta regression model that can address zero-inflated rate data with low incidence. Methods: We assumed that the random effects are independent and normally distributed. The performance of the proposed approach was evaluated by application to a three-level real data set and a simulation study. We applied the proposed model to analyze brucellosis diagnosis rate data and investigate the effects of climatic and geographical position. For comparison, we also applied the standard zero-inflated censored Beta (ZICBETA) regression model, which does not account for correlation. Results: The proposed model performed better than the zero-inflated censored Beta model based on the AIC criterion. Height (p-value <0.0001), temperature (p-value <0.0001) and precipitation (p-value = 0.0006) significantly affected brucellosis rates. In the ZICBETA model, however, precipitation was not statistically significant (p-value = 0.385). The simulation study also showed that the estimates obtained by the maximum likelihood approach performed reasonably in terms of mean squared error. Conclusions: The results showed that the proposed method can capture the correlations in the real data set and yields accurate parameter estimates.


Healthcare ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. 525
Author(s):  
Samer A Kharroubi

Background: Modeling of health-related quality of life data is often troublesome since its distribution is positively or negatively skewed, has spikes at zero or one, and is bounded and heteroscedastic. Objectives: In the present paper, we aim to investigate whether Bayesian beta regression is appropriate for analyzing the SF-6D health state utility scores and respondent characteristics. Methods: A sample of 126 Lebanese members from the American University of Beirut valued 49 health states defined by the SF-6D using the standard gamble technique. Three different models were fitted for SF-6D via Bayesian Markov chain Monte Carlo (MCMC) simulation methods. These comprised a beta regression, random effects and random effects with covariates. Results from applying the three Bayesian beta regression models were reported and compared, in terms of predictive ability, with previously used linear regression models, using mean prediction error (MPE), root mean squared error (RMSE) and deviance information criterion (DIC). Results: For the three different approaches, the beta regression model was found to perform better than the normal regression model under all criteria used. The beta regression with random effects model performs best, with MPE (0.084), RMSE (0.058) and DIC (−1621). Compared to the traditional linear regression model, the beta regression provided better predictions of observed values in the entire learning sample and in an out-of-sample validation. Conclusions: Beta regression provides a flexible approach to modeling health state values. It also accounted for the boundedness and heteroscedasticity of the SF-6D index scores. Further research is encouraged.
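Two of the predictive criteria named in this abstract can be sketched as follows. RMSE follows its usual definition. The abstract does not define MPE, so the sketch assumes it is the mean of the signed differences between observed and predicted values; that definition, the function names, and the example numbers are all assumptions for illustration and are not taken from the paper.

```python
import math


def mean_prediction_error(observed, predicted):
    """Mean of signed errors (observed - predicted).

    Assumption: the abstract does not define MPE; this is one common reading.
    """
    errors = [o - p for o, p in zip(observed, predicted)]
    return sum(errors) / len(errors)


def root_mean_squared_error(observed, predicted):
    """Square root of the mean squared difference between observed and predicted."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical health state values and model predictions.
    observed = [0.5, 0.7, 0.9]
    predicted = [0.4, 0.7, 1.0]
    print(mean_prediction_error(observed, predicted))
    print(root_mean_squared_error(observed, predicted))
```

Signed errors can cancel, so a mean error near zero indicates little systematic bias rather than accurate individual predictions; RMSE captures the typical size of the errors.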

