Robust Three-Step Regression Based on Comedian and Its Performance in Cell-Wise and Case-Wise Outliers

Mathematics ◽  
2020 ◽  
Vol 8 (8) ◽  
pp. 1259 ◽  
Author(s):  
Henry Velasco ◽  
Henry Laniado ◽  
Mauricio Toro ◽  
Víctor Leiva ◽  
Yuhlong Lio

Both cell-wise and case-wise outliers may appear in a real data set at the same time. Few methods have been developed in order to deal with both types of outliers when formulating a regression model. In this work, a robust estimator is proposed based on a three-step method named 3S-regression, which uses the comedian as a highly robust scatter estimate. An intensive simulation study is conducted in order to evaluate the performance of the proposed comedian 3S-regression estimator in the presence of cell-wise and case-wise outliers. In addition, a comparison of this estimator with recently developed robust methods is carried out. The proposed method is also extended to the model with continuous and dummy covariates. Finally, a real data set is analyzed for illustration in order to show potential applications.
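For readers unfamiliar with the comedian, the sketch below computes the pairwise comedian matrix, using the standard definition COM(X, Y) = med((X - med X)(Y - med Y)). It illustrates the scatter estimate only and is not the authors' 3S-regression procedure; the function name `comedian_matrix` and the toy data are assumptions made for illustration.

```python
import numpy as np

def comedian_matrix(X):
    """Pairwise comedian: COM(X_j, X_k) = med((X_j - med X_j) * (X_k - med X_k))."""
    X = np.asarray(X, dtype=float)
    centered = X - np.median(X, axis=0)  # center each column at its median
    p = X.shape[1]
    com = np.empty((p, p))
    for j in range(p):
        for k in range(j, p):
            # median of the element-wise products; symmetric in j and k
            com[j, k] = com[k, j] = np.median(centered[:, j] * centered[:, k])
    return com

# Toy data (hypothetical): y = 2x except for one grossly contaminated cell in the last row.
X_demo = np.array([[1, 2], [2, 4], [3, 6], [4, 8], [5, 10], [6, 12], [7, -100]], dtype=float)
print(comedian_matrix(X_demo))      # off-diagonal entry stays positive
print(np.cov(X_demo, rowvar=False)) # classical covariance is dominated by the outlying cell
```

Because every entry is a median, a single contaminated cell does not flip the sign of the estimated association in this example, whereas the classical covariance does change sign.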

2019 ◽  
Author(s):  
Leili Tapak ◽  
Omid Hamidi ◽  
Majid Sadeghifar ◽  
Hassan Doosti ◽  
Ghobad Moradi

Abstract Objectives: Zero-inflated proportion or rate data nested in clusters due to the sampling structure can be found in many disciplines. Sometimes the rate response is not observed for some study units because of limitations such as failure in recording data (false negatives), and zeros are observed instead of the actual value of the rate or proportion (low incidence). In this study, we proposed a multilevel zero-inflated censored Beta regression model that can address zero-inflated rate data with low incidence. Methods: We assumed that the random effects are independent and normally distributed. The performance of the proposed approach was evaluated by application to a three-level real data set and by a simulation study. We applied the proposed model to analyze brucellosis diagnosis rate data and investigate the effects of climatic and geographical position. For comparison, we also applied the standard zero-inflated censored Beta regression model, which does not account for correlation. Results: The proposed model performed better than the zero-inflated censored Beta model based on the AIC. Height (p-value <0.0001), temperature (p-value <0.0001) and precipitation (p-value = 0.0006) significantly affected brucellosis rates, whereas precipitation was not statistically significant in the ZICBETA model (p-value = 0.385). The simulation study also showed that the estimates obtained by the maximum likelihood approach were reasonable in terms of mean square error. Conclusions: The results showed that the proposed method can capture the correlations in the real data set and yields accurate parameter estimates.


2010 ◽  
Vol 44-47 ◽  
pp. 3647-3651
Author(s):  
Hsu Chan Yao ◽  
Hsiang Chuan Liu ◽  
Yu Du Jheng

In this paper, a Monte Carlo simulation study with 5-fold cross-validation MSE is used, on both simulated experimental data and a real data set, to compare the performance of a multiple linear regression model, a ridge regression model, and the Choquet integral regression model with respect to two well-known fuzzy measures, the P-measure and λ-measure, and two fuzzy measures proposed in the authors' previous work, the L-measure and extensional L-measure. Both sets of results show that the Choquet integral regression model with respect to the extensional L-measure has the best performance.
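The comparison criterion can be illustrated with a minimal sketch of 5-fold cross-validation MSE for the two conventional baselines, multiple linear regression and ridge regression. The Choquet integral models and the fuzzy measures are not reproduced here; the function names, the penalty value and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Ridge fit with an unpenalized intercept; lam = 0 reduces to ordinary least squares."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return y_mean - x_mean @ beta, beta

def cv_mse(X, y, lam, k=5, seed=0):
    """Mean squared prediction error over k held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    sq_errors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        intercept, beta = fit_ridge(X[train_idx], y[train_idx], lam)
        pred = intercept + X[test_idx] @ beta
        sq_errors.append((y[test_idx] - pred) ** 2)
    return float(np.mean(np.concatenate(sq_errors)))

# Synthetic data (hypothetical): three predictors, Gaussian noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 3))
y = 1.0 + X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.5, size=60)
print("OLS   5-fold CV MSE:", cv_mse(X, y, lam=0.0))
print("Ridge 5-fold CV MSE:", cv_mse(X, y, lam=1.0))
```

Using the same fold assignment (same `seed`) for every model keeps the comparison paired, so differences in MSE reflect the models rather than the split.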


Mathematics ◽  
2020 ◽  
Vol 8 (10) ◽  
pp. 1786 ◽  
Author(s):  
A. M. Abd El-Raheem ◽  
M. H. Abu-Moussa ◽  
Marwa M. Mohie El-Din ◽  
E. H. Hafez

In this article, a progressive-stress accelerated life test (ALT) based on progressive type-II censoring is studied. The cumulative exposure model is used when the lifetime of test units follows a Pareto-IV distribution. Different estimates, such as the maximum likelihood estimates (MLEs) and Bayes estimates (BEs), of the model parameters are discussed. Bayesian estimates are derived using the Tierney and Kadane (TK) approximation method and the importance sampling method. The asymptotic and bootstrap confidence intervals (CIs) of the parameters are constructed. A real data set is analyzed to illustrate the methods proposed in this paper. Two types of progressive-stress tests, the simple ramp-stress test and the multiple ramp-stress test, are compared through a simulation study. Finally, some interesting conclusions are drawn.
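As background on the censoring scheme, the sketch below simulates a progressive type-II censored sample with the widely used uniform-transformation algorithm of Balakrishnan and Sandhu. It is not taken from the article: the removal scheme, the function name and the unit-exponential lifetime (used in place of the Pareto-IV model under progressive stress) are assumptions for illustration only.

```python
import numpy as np

def progressive_type2_sample(removals, quantile, rng):
    """Simulate m ordered failure times under progressive type-II censoring.

    removals[i] is the number of surviving units withdrawn at the (i+1)-th failure;
    quantile is the inverse CDF of the lifetime distribution.
    """
    R = np.asarray(removals)
    m = len(R)
    w = rng.uniform(size=m)
    # V_i = W_i ** (1 / (i + R_m + R_{m-1} + ... + R_{m-i+1})), i = 1..m
    tail_sums = np.cumsum(R[::-1])
    v = w ** (1.0 / (np.arange(1, m + 1) + tail_sums))
    # U_i = 1 - V_m * V_{m-1} * ... * V_{m-i+1}: censored sample from Uniform(0, 1)
    u = 1.0 - np.cumprod(v[::-1])
    return quantile(u)

# Hypothetical scheme: n = 10 units on test, m = 5 observed failures, 5 units withdrawn.
scheme = [2, 0, 0, 1, 2]
exp_quantile = lambda u: -np.log1p(-u)  # unit exponential, an assumed stand-in lifetime model
times = progressive_type2_sample(scheme, exp_quantile, np.random.default_rng(0))
print(times)  # increasing failure times
```

Any lifetime model with a computable quantile function can be substituted for `exp_quantile`; the censoring logic itself does not depend on the distribution.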


2017 ◽  
Vol 27 (11) ◽  
pp. 3207-3223 ◽  
Author(s):  
Thiago G Ramires ◽  
Gauss M Cordeiro ◽  
Michael W Kattan ◽  
Niel Hens ◽  
Edwin MM Ortega

Cure fraction models are useful to model lifetime data with long-term survivors. We propose a flexible four-parameter cure rate survival model called the log-sinh Cauchy promotion time model for predicting breast carcinoma survival in women who underwent mastectomy. The model can estimate simultaneously the effects of the explanatory variables on the timing acceleration/deceleration of a given event, the surviving fraction, the heterogeneity, and the possible existence of bimodality in the data. In order to examine the performance of the proposed model, simulations are presented to verify the robust aspects of this flexible class against outlying and influential observations. Furthermore, we determine some diagnostic measures and the one-step approximations of the estimates in the case-deletion model. The new model was implemented in the generalized additive model for location, scale and shape package of the R software, which is presented throughout the paper by way of a brief tutorial on its use. The potential of the new regression model to accurately predict breast carcinoma mortality is illustrated using a real data set.


2010 ◽  
Vol 44-47 ◽  
pp. 3579-3583 ◽  
Author(s):  
Hsiang Chuan Liu ◽  
Wei Sung Chen ◽  
Chin Chun Chen ◽  
Yu Du Jheng ◽  
Der Bang Wu

In this paper, a generalized multivalent fuzzy measure of the extensional L-measure, called the high order extensional L-measure, is proposed. It is proved that if the order index equals one, this new measure is just the extensional L-measure, and that the larger the order index, the more sensitive the measure is. Using a real data set with 5-fold cross-validation MSE, the performance of the Choquet integral regression model based on this new measure is compared with that of the model based on four other measures (the P-measure and λ-measure, and the authors' L-measure and extensional L-measure) and with two traditional regression models, the multiple regression model and the ridge regression model. The results show that the Choquet integral regression model based on this new measure has the best performance.


Author(s):  
Asifa Mubeen ◽  
Nasir Jamal ◽  
Muhammad Hanif ◽  
Usman Shahzad

The main objective of the present study was to develop a new ridge regression estimator and fit the ridge regression model to the peanut production data of Pakistan. The data cover the peanut production and growth rate of Pakistan. The properties of the proposed estimator are discussed, and its mean square error is compared with that of some existing ridge regression estimators. The real data set of peanut production is used to assess the performance of the proposed and existing estimators. Numerical results on the real data set show that the proposed ridge regression estimator provides the best results compared with the reviewed ones.


Author(s):  
Hani M. Samawi ◽  
Eman M. Tawalbeh

The performance of a regression estimator based on the double ranked set sample (DRSS) scheme, introduced by Al-Saleh and Al-Kadiri (2000), is investigated when the mean of the auxiliary variable X is unknown. Our primary analysis and simulation indicate that using the DRSS regression estimator to estimate the population mean substantially increases relative efficiency compared with the regression estimator based on simple random sampling (SRS) or on ranked set sampling (RSS) (Yu and Lam, 1997). Moreover, the regression estimator using DRSS is also more efficient than the naïve estimators of the population mean using SRS, RSS (when the correlation coefficient is at least 0.4) and DRSS (when the correlation coefficient is high, at least 0.91). The theory is illustrated using a real data set of trees.
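To make the sampling scheme concrete, the sketch below draws a basic single-stage ranked set sample in which units are ranked by the auxiliary variable. It does not implement the double ranked set sample or the regression estimator studied in the paper; the function name, set size and synthetic population are assumptions for illustration.

```python
import numpy as np

def ranked_set_sample(x, y, m, cycles, rng):
    """Basic RSS: in each cycle, for each rank i = 1..m, draw m units at random,
    order them by the auxiliary value x, and keep only the i-th smallest."""
    chosen = []
    for _ in range(cycles):
        for i in range(m):
            candidates = rng.choice(len(x), size=m, replace=False)
            ordered = candidates[np.argsort(x[candidates])]
            chosen.append(ordered[i])
    chosen = np.array(chosen)
    return x[chosen], y[chosen]

# Synthetic population (hypothetical): y is positively correlated with the auxiliary x.
rng = np.random.default_rng(7)
x_pop = rng.normal(loc=10.0, scale=2.0, size=5000)
y_pop = 3.0 + 1.5 * x_pop + rng.normal(scale=1.0, size=5000)
x_rss, y_rss = ranked_set_sample(x_pop, y_pop, m=4, cycles=25, rng=rng)
print(len(y_rss), y_rss.mean(), y_pop.mean())
```

Only `m * cycles` units are measured on y, although `m * m * cycles` units are ranked on x, which is where the efficiency gain of ranked set designs comes from when ranking is cheap.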


2021 ◽  
Vol 50 (5) ◽  
pp. 77-100
Author(s):  
Aidi Khaoula ◽  
Sanku Dey ◽  
Devendra Kumar ◽  
Seddik-Ameur N

In this paper, we contribute to the distribution theory literature by introducing a new bounded distribution on the interval (0, 1), called the unit generalized inverse Weibull distribution (UGIWD), obtained by the transformation method. The proposed distribution exhibits increasing and bathtub-shaped hazard rate functions. We derive some basic statistical properties of the new distribution. Based on a complete sample, the model parameters are estimated by the methods of maximum likelihood, least squares, weighted least squares, percentiles, maximum product of spacings and Cramér-von Mises, and the methods are compared using a Monte Carlo simulation study. In addition, bootstrap confidence intervals for the model parameters based on the aforementioned estimation methods are obtained. We illustrate the performance of the proposed distribution by means of one real data set, which shows that the new distribution is more appropriate than the unit Birnbaum-Saunders, unit gamma, unit Weibull, Kumaraswamy and unit Burr III distributions. Further, we construct chi-squared goodness-of-fit tests for the UGIWD using right-censored data, based on the Nikulin-Rao-Robson (NRR) statistic and its modification. The test criterion used is the modified chi-squared statistic Y^2, developed by Bagdonavičius and Nikulin (2011) for some parametric models when data are censored. The performance of the proposed test is shown by an intensive simulation study and an application to a real data set.


2021 ◽  
Vol 50 (3) ◽  
pp. 859-867
Author(s):  
HABSHAH MIDI ◽  
SHELAN SAIED ISMAEEL ◽  
JAYANTHI ARASAN ◽  
MOHAMMED A MOHAMMED

It is now evident that some robust methods, such as the MM-estimator, do not have a bounded influence function, which means that their estimates are still affected by outliers in the X direction, or high leverage points (HLPs), even though they have high efficiency and a high breakdown point (BDP). The generalized M (GM) estimator, such as the GM6 estimator, was put forward with the main aim of bounding the influence of HLPs by means of a weight function. The limitation of GM6 is that it gives lower weight to both bad leverage points (BLPs) and good leverage points (GLPs), which decreases its efficiency when more GLPs are present in a data set. Moreover, GM6 takes longer to compute. In this paper, we develop a new version of the GM-estimator based on a simple and fast algorithm. The attractive feature of this method is that it downweights only BLPs and vertical outliers (VOs), which increases its efficiency. The merit of our proposed GM estimator is studied through a simulation study and the well-known aircraft data set.


2018 ◽  
Vol 3 (1) ◽  
pp. 5-9 ◽  
Author(s):  
Roghaye Farhadi Hassankiadeh ◽  
Anoshirvan Kazemnejad ◽  
Mohammad Gholami Fesharaki ◽  
Siamak Kargar Jahromi ◽  
◽  
...  
