Multilevel Zero-inflated Censored Beta Regression Modeling for Proportions and Rate Data with Extra-zeros

2019 ◽  
Author(s):  
Leili Tapak ◽  
Omid Hamidi ◽  
Majid Sadeghifar ◽  
Hassan Doosti ◽  
Ghobad Moradi

Abstract. Objectives: Zero-inflated proportion or rate data nested in clusters due to the sampling structure arise in many disciplines. Sometimes the rate response is not observed for some study units because of limitations such as failures in recording data (false negatives), and zeros are recorded instead of the actual value of the rate or proportion (low incidence). In this study, we propose a multilevel zero-inflated censored Beta regression model that can address zero-inflated rate data with low incidence. Methods: We assumed that the random effects are independent and normally distributed. The performance of the proposed approach was evaluated by application to a three-level real data set and by a simulation study. We applied the proposed model to brucellosis diagnosis rate data to investigate the effects of climatic factors and geographical position. For comparison, we also applied the standard zero-inflated censored Beta regression (ZICBETA) model, which does not account for correlation. Results: The proposed model performed better than the ZICBETA model according to the AIC. Height (p-value < 0.0001), temperature (p-value < 0.0001) and precipitation (p-value = 0.0006) significantly affected brucellosis rates, whereas precipitation was not statistically significant in the ZICBETA model (p-value = 0.385). The simulation study also showed that the estimates obtained by the maximum likelihood approach were reasonable in terms of mean square error. Conclusions: The proposed method can capture the correlations in the real data set and yields accurate parameter estimates.
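As a rough illustration of the building block behind such models, the sketch below evaluates the log-density of a plain zero-inflated beta variable and an AIC value. It is not the authors' model: it omits the censoring mechanism and the random effects, and the mean-precision parameterization, the function names `zib_logpdf` and `aic`, and the sample rates are all assumptions made for this example.

```python
import math


def zib_logpdf(y, pi0, mu, phi):
    """Log-density of a zero-inflated beta variable on [0, 1).

    pi0 is the probability of an exact zero, mu the mean of the beta part,
    and phi its precision (shape parameters mu*phi and (1 - mu)*phi).
    This parameterization is an assumption, not taken from the abstract.
    """
    if y == 0.0:
        return math.log(pi0)
    if not 0.0 < y < 1.0:
        return float("-inf")
    a, b = mu * phi, (1.0 - mu) * phi
    # log of the beta function B(a, b)
    log_beta_fn = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_beta_pdf = ((a - 1.0) * math.log(y)
                    + (b - 1.0) * math.log(1.0 - y)
                    - log_beta_fn)
    return math.log(1.0 - pi0) + log_beta_pdf


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2*logL (smaller is better)."""
    return 2.0 * n_params - 2.0 * loglik


# Made-up rates, for illustration only (not data from the study).
rates = [0.0, 0.0, 0.12, 0.30, 0.05, 0.0, 0.22]
loglik = sum(zib_logpdf(y, pi0=0.4, mu=0.2, phi=8.0) for y in rates)
print(aic(loglik, n_params=3))
```

Two candidate models fitted to the same data would be compared by computing `aic` for each and preferring the smaller value, which is the kind of comparison the abstract reports.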

2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Ricardo R. Petterle ◽  
Wagner H. Bonat ◽  
Cassius T. Scarpin ◽  
Thaísa Jonasson ◽  
Victória Z. C. Borba

Abstract. We propose a multivariate regression model to deal with multiple continuous bounded data. The proposed model is based on second-moment assumptions only. We adopted the quasi-score and Pearson estimating functions for estimation of the regression and dispersion parameters, respectively. Thus, the proposed approach does not require a multivariate probability distribution for the response variable vector. The multivariate quasi-beta regression model can easily handle multiple continuous bounded outcomes, taking into account the correlation between the response variables. Furthermore, the model allows us to analyze continuous bounded data on the interval [0, 1], including zeros and/or ones. Simulation studies were conducted to investigate the behavior of the NORmal To Anything (NORTA) algorithm and to check the properties of the estimating function estimators for multiple correlated response variables generated from marginal beta distributions. The model was motivated by a data set on body fat percentage measured at five regions of the body, which constitute the response variables. We analyze each response variable separately and compare the results with the fit of the proposed multivariate model. The multivariate quasi-beta regression model provides a better fit than its univariate counterparts and also allows us to measure the correlation between response variables. Finally, we adapted diagnostic tools to the proposed model. In the supplementary material, we provide the data set and R code.
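The NORTA idea mentioned above can be sketched in a few lines: draw correlated standard normals, map them to uniforms with the normal CDF, then push the uniforms through the inverse CDF of each target marginal. The example below is a simplified two-variable version and not the authors' simulation code. It assumes Beta(a, 1) marginals because their quantile function has the closed form u**(1/a), and it takes the latent normal correlation `rho` as given rather than solving for the value that yields a target output correlation. The names `norta_beta_pair` and `pearson` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def norta_beta_pair(n, rho, a1, a2, seed=0):
    """Draw n pairs with Beta(a1, 1) and Beta(a2, 1) marginals (assumed shapes)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        # Step 1: correlated standard normals (2x2 Cholesky factor by hand).
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Step 2: probability integral transform to uniforms.
        u1, u2 = std_normal.cdf(z1), std_normal.cdf(z2)
        # Step 3: inverse marginal CDF; Beta(a, 1) has CDF y**a.
        pairs.append((u1 ** (1.0 / a1), u2 ** (1.0 / a2)))
    return pairs


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


sample = norta_beta_pair(n=2000, rho=0.7, a1=2.0, a2=3.0)
y1 = [p[0] for p in sample]
y2 = [p[1] for p in sample]
print(pearson(y1, y2))
```

The correlation of the transformed variables generally differs from `rho`, which is why a full NORTA implementation includes a correlation-matching step that this sketch leaves out.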


2017 ◽  
Vol 28 (3) ◽  
pp. 871-888 ◽  
Author(s):  
Abhik Ghosh

Data on rates, percentages, or proportions arise frequently in many applied disciplines such as medical biology, health care, and psychology. In this paper, we develop a robust inference procedure for the beta regression model, which is used to describe such response variables taking values in (0, 1) through some related explanatory variables. In relation to the beta regression model, the issue of robustness has been largely ignored in the literature so far. The existing maximum likelihood-based inference seriously lacks robustness against outliers in the data and generates drastically different (erroneous) inference in the presence of data contamination. Here, we develop the robust minimum density power divergence estimator and a class of robust Wald-type tests for the beta regression model, along with several applications. We derive their asymptotic properties and describe their robustness theoretically through influence function analyses. Finite sample performances of the proposed estimators and tests are examined through suitable simulation studies and real data applications in the context of health care and psychology. Although we primarily focus on beta regression models with a fixed dispersion parameter, some indications are also provided for extension to variable dispersion beta regression models, with an application.


Mathematics ◽  
2020 ◽  
Vol 8 (8) ◽  
pp. 1259 ◽  
Author(s):  
Henry Velasco ◽  
Henry Laniado ◽  
Mauricio Toro ◽  
Víctor Leiva ◽  
Yuhlong Lio

Both cell-wise and case-wise outliers may appear in a real data set at the same time. Few methods have been developed in order to deal with both types of outliers when formulating a regression model. In this work, a robust estimator is proposed based on a three-step method named 3S-regression, which uses the comedian as a highly robust scatter estimate. An intensive simulation study is conducted in order to evaluate the performance of the proposed comedian 3S-regression estimator in the presence of cell-wise and case-wise outliers. In addition, a comparison of this estimator with recently developed robust methods is carried out. The proposed method is also extended to the model with continuous and dummy covariates. Finally, a real data set is analyzed for illustration in order to show potential applications.
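For readers unfamiliar with the comedian, it is a median-based analogue of the covariance: the median of the products of deviations from the coordinate-wise medians. The sketch below shows only that pairwise quantity, not the three-step 3S-regression procedure. The function name `comedian` and the toy data are assumptions made for this example.

```python
from statistics import median


def comedian(xs, ys):
    """Comedian of two samples: med((x - med x) * (y - med y))."""
    mx, my = median(xs), median(ys)
    return median((x - mx) * (y - my) for x, y in zip(xs, ys))


# Toy data with one gross outlier in x (100); values are made up.
x = [1, 2, 3, 4, 100]
y = [2, 4, 6, 8, 10]
print(comedian(x, y))  # 2
```

The outlying value 100 changes only one of the five products, so the median of the products is unaffected, whereas the ordinary sample covariance would be dominated by that single point.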


2010 ◽  
Vol 44-47 ◽  
pp. 3647-3651
Author(s):  
Hsu Chan Yao ◽  
Hsiang Chuan Liu ◽  
Yu Du Jheng

In this paper, a Monte Carlo simulation study with 5-fold cross-validation MSE is applied to both simulated experimental data and a real data set to compare the performance of a multiple linear regression model, a ridge regression model, and the Choquet integral regression model with respect to two well-known fuzzy measures, the P-measure and the λ-measure, and two new fuzzy measures proposed in the authors' previous work, the L-measure and the extensional L-measure. Both sets of results show that the Choquet integral regression model with respect to the extensional L-measure performs best.
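The evaluation criterion used here, 5-fold cross-validation MSE, can be sketched as follows. The example compares a least-squares line with a mean-only baseline on simulated data. It does not implement ridge or Choquet integral regression, and the helper names `fit_line`, `fit_mean` and `kfold_mse`, as well as the simulated data, are assumptions for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares with a single predictor; returns a predictor function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_mean(xs, ys):
    """Baseline that always predicts the training mean of y."""
    my = sum(ys) / len(ys)
    return lambda x: my


def kfold_mse(xs, ys, fit, k=5, seed=0):
    """Mean squared prediction error over k held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    sq_err = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        sq_err += sum((ys[i] - model(xs[i])) ** 2 for i in fold)
    return sq_err / len(xs)


# Simulated data: y = 2 + 3x + Gaussian noise (made up for this sketch).
rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2.0 + 3.0 * x + rng.gauss(0.0, 0.5) for x in xs]
mse_line = kfold_mse(xs, ys, fit_line)
mse_mean = kfold_mse(xs, ys, fit_mean)
print(mse_line, mse_mean)
```

Because both models are scored on the same folds (same `seed`), the two MSE values are directly comparable, and the model with the smaller value is preferred.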


Author(s):  
Olga Mikhaylovna Tikhonova ◽  
Alexander Fedorovich Rezchikov ◽  
Vladimir Andreevich Ivashchenko ◽  
Vadim Alekseevich Kushnikov

The paper presents a system for predicting the accreditation indicators of technical universities based on J. Forrester's system dynamics approach. A directed graph was built from an analysis of the cause-and-effect relationships between the selected variables of the system (the accreditation indicators of the university). The complex of mathematical models developed to control the quality of engineering training in Russian higher educational institutions is based on this graph. The article presents an algorithm for constructing a model, using one of the simulated variables as an example. The model is a system of non-linear differential equations, and the characteristics of the educational process are determined from the solution of this system. The proposed algorithm for calculating these indicators is based on the system dynamics model and a regression model. The mathematical model is constructed on the basis of the system dynamics model and is then tested for agreement with real data using the regression model. The regression model is built on the statistical data accumulated during the university's period of operation. The proposed approach is aimed at solving complex problems of managing the educational process in universities. The structure of the proposed model mirrors the structure of the cause-and-effect relationships in the system and enables the person responsible for quality control to assess the performance of the system quickly and adequately.


2019 ◽  
Vol XVI (2) ◽  
pp. 1-11
Author(s):  
Farrukh Jamal ◽  
Hesham Mohammed Reyad ◽  
Soha Othman Ahmed ◽  
Muhammad Akbar Ali Shah ◽  
Emrah Altun

A new three-parameter continuous model called the exponentiated half-logistic Lomax distribution is introduced in this paper. Basic mathematical properties of the proposed model are investigated, including raw and incomplete moments, skewness, kurtosis, generating functions, Rényi entropy, Lorenz, Bonferroni and Zenga curves, probability weighted moments, the stress-strength model, order statistics, and record statistics. The model parameters are estimated by maximum likelihood, and the behaviour of these estimates is examined in a simulation study. The applicability of the new model is illustrated by applying it to a real data set.


2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
K. S. Sultan ◽  
A. S. Al-Moisheer

We discuss the two-component mixture of the inverse Weibull and lognormal distributions (MIWLND) as a lifetime model. First, we discuss the properties of the proposed model including the reliability and hazard functions. Next, we discuss the estimation of model parameters by using the maximum likelihood method (MLEs). We also derive expressions for the elements of the Fisher information matrix. Next, we demonstrate the usefulness of the proposed model by fitting it to a real data set. Finally, we draw some concluding remarks.
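The reliability and hazard functions of a two-component mixture follow directly from the component distributions, as the sketch below illustrates. The parameterization is an assumption: the inverse Weibull CDF is taken as exp(-(t/scale)^(-shape)), the lognormal uses the usual log-scale mean `mu` and standard deviation `sigma`, and `p` is the weight of the inverse Weibull component. The paper may use a different parameterization, and all function names and parameter values here are hypothetical.

```python
import math
from statistics import NormalDist


def inv_weibull_cdf(t, shape, scale):
    """Assumed inverse Weibull CDF: exp(-(t/scale)^(-shape)), t > 0."""
    return math.exp(-((t / scale) ** (-shape)))


def inv_weibull_pdf(t, shape, scale):
    """Derivative of inv_weibull_cdf with respect to t."""
    return (shape / scale) * (t / scale) ** (-shape - 1.0) * inv_weibull_cdf(t, shape, scale)


def mixture_pdf(t, p, shape, scale, mu, sigma):
    """Density of the mixture p * InvWeibull + (1 - p) * Lognormal."""
    lognormal_pdf = NormalDist(mu, sigma).pdf(math.log(t)) / t
    return p * inv_weibull_pdf(t, shape, scale) + (1.0 - p) * lognormal_pdf


def mixture_reliability(t, p, shape, scale, mu, sigma):
    """Reliability R(t) = 1 - F(t) of the mixture."""
    lognormal_cdf = NormalDist(mu, sigma).cdf(math.log(t))
    return 1.0 - (p * inv_weibull_cdf(t, shape, scale) + (1.0 - p) * lognormal_cdf)


def mixture_hazard(t, p, shape, scale, mu, sigma):
    """Hazard h(t) = f(t) / R(t)."""
    return mixture_pdf(t, p, shape, scale, mu, sigma) / mixture_reliability(t, p, shape, scale, mu, sigma)


# Illustrative parameter values only; they are not estimates from the paper.
params = dict(p=0.4, shape=2.0, scale=1.5, mu=0.5, sigma=0.8)
for t in (0.5, 1.0, 2.0, 4.0):
    print(t, mixture_reliability(t, **params), mixture_hazard(t, **params))
```

Note that the mixture hazard is the ratio of the mixed density to the mixed reliability, not a weighted average of the component hazards.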


2018 ◽  
Vol 19 (6) ◽  
pp. 617-633 ◽  
Author(s):  
Wagner H Bonat ◽  
Ricardo R Petterle ◽  
John Hinde ◽  
Clarice GB Demétrio

We propose a flexible class of regression models for continuous bounded data based on second-moment assumptions. The mean structure is modelled by means of a link function and a linear predictor, while the mean and variance relationship has the form [Formula: see text], where [Formula: see text], [Formula: see text] and [Formula: see text] are the mean, dispersion and power parameters respectively. The models are fitted by using an estimating function approach where the quasi-score and Pearson estimating functions are employed for the estimation of the regression and dispersion parameters respectively. The flexible quasi-beta regression model can automatically adapt to the underlying bounded data distribution by the estimation of the power parameter. Furthermore, the model can easily handle data with exact zeroes and ones in a unified way and has the Bernoulli mean and variance relationship as a limiting case. The computational implementation of the proposed model is fast, relying on a simple Newton scoring algorithm. Simulation studies, using datasets generated from simplex and beta regression models show that the estimating function estimators are unbiased and consistent for the regression coefficients. We illustrate the flexibility of the quasi-beta regression model to deal with bounded data with two examples. We provide an R implementation and the datasets as supplementary materials.


2018 ◽  
Vol 28 (9) ◽  
pp. 2868-2875
Author(s):  
Zhongxue Chen ◽  
Qingzhong Liu ◽  
Kai Wang

Several gene- or set-based association tests have been proposed recently in the literature. Powerful statistical approaches are still highly desirable in this area. In this paper we propose a novel statistical association test, which uses information of the burden component and its complement from the genotypes. This new test statistic has a simple null distribution, which is a special and simplified variance-gamma distribution, and its p-value can be easily calculated. Through a comprehensive simulation study, we show that the new test can control type I error rate and has superior detecting power compared with some popular existing methods. We also apply the new approach to a real data set; the results demonstrate that this test is promising.

