Robust regression with asymmetric loss functions

2021 ◽  
pp. 096228022110120
Author(s):  
Liya Fu ◽  
You-Gan Wang

In robust regression, it is usually assumed that the distribution of the error term is symmetric or the data are symmetrically contaminated by outliers. However, this assumption is usually not satisfied in practical problems, and thus if the traditional robust methods, such as Tukey’s biweight and Huber’s method, are used to estimate the regression parameters, the efficiency of the parameter estimation can be lost. In this paper, we construct an asymmetric Tukey’s biweight loss function with two tuning parameters and propose a data-driven method to find the most appropriate tuning parameters. Furthermore, we provide an adaptive algorithm to obtain robust and efficient parameter estimates. Our extensive simulation studies suggest that the proposed method performs better than the symmetric methods when error terms follow an asymmetric distribution or are asymmetrically contaminated. Finally, a cardiovascular risk factors dataset is analyzed to illustrate the proposed method.

Author(s):  
Umran Munire Kahraman ◽  
Neslihan Iyit

In this study, the performances of LAD regression, M-regression, and Q25 and Q75 quantile regression models, as robust alternatives to the classical LS method, are compared when the normality assumption on the error terms is violated and an outlier is present. Using these alternative regression methods, the stock prices of the 12 commercial banks and 1 participation bank listed in the Istanbul Stock Exchange (BIST) bank index between 2012 and 2016 are investigated in terms of equity size and equity profitability. M-regression proves to be the most suitable robust regression model, with the smallest mean squared error (MSE) and small standard errors for the parameter estimates of equity size and equity profitability. The smaller the standard errors of the parameter estimates, the narrower the resulting confidence intervals in M-regression. Accuracy, measured as the closeness of the parameter estimates to the true parameter values, is also higher in M-regression.
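LAD and quantile regression differ only in the loss they minimize. The sketch below shows the standard check (pinball) loss for a quantile level `tau`, with `tau = 0.5` giving LAD up to a constant factor, applied to the simplest case of fitting a constant. The helper `best_constant` and the toy data are illustrative assumptions, not taken from the study.

```python
def pinball_loss(residual, tau):
    """Check loss: positive residuals cost tau, negative ones cost 1 - tau."""
    if residual >= 0:
        return tau * residual
    return (tau - 1.0) * residual


def best_constant(y, tau):
    """Data value that minimizes the total pinball loss over the sample."""
    return min(y, key=lambda q: sum(pinball_loss(v - q, tau) for v in y))


# Toy sample with one gross outlier (hypothetical values).
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
q25 = best_constant(sample, 0.25)
lad = best_constant(sample, 0.50)
q75 = best_constant(sample, 0.75)
```

On this sample the three fits are 2, 3 and 4, whereas the least-squares fit (the mean, 22) is dragged toward the outlier.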


2020 ◽  
Vol 18 (1) ◽  
Author(s):  
Shivaji Shripati Desai ◽  
D. N. Kashid

Support vector machine (SVM) is used for estimation of regression parameters to modify the sum of cross products (Sp). It works well for some nonnormal error distributions. The performance of existing robust methods and the modified Sp is evaluated through simulated and real data. The results show the performance of the modified Sp is good.


1999 ◽  
Vol 15 (2) ◽  
pp. 91-98 ◽  
Author(s):  
Lutz F. Hornke

Summary: Item parameters for several hundred items were estimated based on empirical data from several thousand subjects. The logistic one-parameter (1PL) and two-parameter (2PL) model estimates were evaluated. However, model fit showed that only a subset of items complied sufficiently, so the items that remained were assembled into well-fitting item banks. In several simulation studies, 5000 simulated responses were generated in accordance with a computerized adaptive test procedure along with person parameters. A general reliability of .80 or a standard error of measurement of .44 was used as a stopping rule to end CAT testing. We also recorded how often each item was used by all simulees. Person-parameter estimates based on CAT correlated higher than .90 with the simulated true values. For all 1PL-fitting item banks, most simulees used more than 20 but fewer than 30 items to reach the pre-set level of measurement error. However, testing based on item banks that complied with the 2PL revealed that, on average, only 10 items were sufficient to end testing at the same measurement error level. Both clearly demonstrate the precision and economy of computerized adaptive testing. Empirical evaluations from everyday use will show whether these trends hold up in practice. If so, CAT will become possible and reasonable with some 150 well-calibrated 2PL items.
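The stopping rule can be illustrated with a small idealized calculation. Assuming ability is scaled to unit variance, a reliability of .80 corresponds to a standard error of sqrt(1 - .80), about .447, which is close to the quoted .44. Under the 1PL model each item contributes information p(1 - p), and the standard error is one over the square root of the accumulated information. The function names and the perfectly targeted item bank below are assumptions for illustration, not the author's simulation design.

```python
import math


def sem_from_reliability(reliability):
    """Standard error of measurement, assuming unit ability variance."""
    return math.sqrt(1.0 - reliability)


def rasch_information(theta, difficulty):
    """Fisher information of one 1PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))
    return p * (1.0 - p)


def items_needed(theta, difficulties, sem_target=0.44):
    """Number of items administered before the standard error reaches the target."""
    total_information = 0.0
    for count, difficulty in enumerate(difficulties, start=1):
        total_information += rasch_information(theta, difficulty)
        if 1.0 / math.sqrt(total_information) <= sem_target:
            return count
    return None  # the bank ran out before the target was reached


# Best case: every item is targeted exactly at the simulee's ability.
best_case = items_needed(0.0, [0.0] * 40)
```

Even in this best case a 1PL test needs 21 items, which agrees with the range of 20 to 30 items reported above. 2PL items with discrimination above 1 carry more information each, which is one way to read the shorter 2PL tests.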


2021 ◽  
Vol 3 (6) ◽  
Author(s):  
Ogbonnaya Anicho ◽  
Philip B. Charlesworth ◽  
Gurvinder S. Baicher ◽  
Atulya K. Nagar

This work analyses the performance of Reinforcement Learning (RL) versus Swarm Intelligence (SI) for coordinating multiple unmanned High Altitude Platform Stations (HAPS) for communications area coverage. It builds upon previous work which looked at various elements of both algorithms. The main aim of this paper is to address the continuous state-space challenge within this work by using partitioning to manage the high dimensionality problem. This enabled a comparison of the classical cases of both RL and SI, establishing a baseline for future comparisons of improved versions. From previous work, SI was observed to perform better across various key performance indicators. After tuning parameters and empirically choosing a suitable partitioning ratio for the RL state space, the SI algorithm still maintained superior coordination capability, achieving higher mean overall user coverage (about 20% better than the RL algorithm) in addition to faster convergence rates. Though the RL technique showed better average peak user coverage, the unpredictable coverage dip was a key weakness, making SI a more suitable algorithm within the context of this work.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Aydin Shishegaran ◽  
Behnam Karami ◽  
Elham Safari Danalou ◽  
Hesam Varaee ◽  
Timon Rabczuk

The resistance of steel plate shear walls (SPSW) under explosive loads is evaluated using nonlinear FE analysis and surrogate methods. This study uses the conventional weapons effect program (CONWEP) model for the explosive load and the Johnson-Cook model for the steel plate. Based on the Taguchi method, 25 samples out of 100 samples are selected for a parametric study where we predict the damaged zones and the maximum deflection of SPSWs under explosive loads. Then, this study uses multiple linear regression (MLR), multiple Ln equation regression (MLnER), gene expression programming (GEP), an adaptive network-based fuzzy inference system (ANFIS) and an ensemble model to predict the maximum deflection of SPSWs. Several statistical parameters and error terms are used to evaluate the accuracy of the different surrogate models. The results show that the cross-section in the y-direction and the plate thickness have the most significant effects on the maximum deflection of SPSWs. The results also show that the maximum deflection is related to the scaled distance, i.e. for a value of 0.383. The ensemble model performs better than all other models for predicting the maximum deflection of SPSWs under explosive loads.


2019 ◽  
Vol 8 (1) ◽  
pp. 24-34
Author(s):  
Eka Destiyani ◽  
Rita Rahmawati ◽  
Suparti Suparti

Ordinary Least Squares (OLS) is one of the most commonly used methods for estimating linear regression parameters. If multicollinearity exists among the predictor variables, especially together with outliers, then regression analysis with OLS is no longer appropriate. One method that can be used to address multicollinearity and outlier problems is Ridge Robust-MM Regression, a modification of the Ridge Regression method based on the MM-estimator of Robust Regression. The case study in this research is AKB (Angka Kematian Bayi, the infant mortality rate) in Central Java in 2017, modeled on population density, the percentage of households practicing clean and healthy living, the number of low-birth-weight babies, the number of babies who are exclusively breastfed, the number of babies who receive one neonatal visit, and the number of babies who receive health services. The OLS estimation results show that multicollinearity and outliers are both present. Applying ridge robust-MM regression to the case study shows that it can improve parameter estimation. Based on a t test at the 5% significance level, most of the predictor variables have a significant effect on AKB. The influence of the predictor variables on AKB is 47.68% and the MSE is 0.01538. Keywords: Ordinary Least Squares (OLS), Multicollinearity, Outliers, Ridge Regression, Robust Regression, AKB.


2013 ◽  
Vol 4 (2) ◽  
Author(s):  
Yan-Xia Lin ◽  
Phillip Wise

This paper considers the scenario in which all data entries in a confidentialised unit record file are masked by multiplicative noise, regardless of whether the unit records are sensitive and regardless of whether the masked variables are dependent or independent variables in the underlying regression analysis. A technique is introduced to estimate, from the masked data, the parameters of a regression model originally fitted to the unmasked data. Several simulation studies and a real-life data application are presented.


2014 ◽  
Vol 71 (1) ◽  
Author(s):  
Bello Abdulkadir Rasheed ◽  
Robiah Adnan ◽  
Seyed Ehsan Saffari ◽  
Kafi Dano Pati

In a linear regression model, the ordinary least squares (OLS) method is considered the best method to estimate the regression parameters if the assumptions are met. However, if the data do not satisfy the underlying assumptions, the results will be misleading. The assumption of constant variance in least squares regression can be violated by the presence of outliers and heteroscedasticity in the data. This assumption of constant variance (homoscedasticity) is very important in linear regression, since under it the least squares estimators enjoy the property of minimum variance. Therefore, a robust regression method is required to handle the problem of outliers in the data. This research uses weighted least squares (WLS) techniques to estimate the regression coefficients when the assumption of constant error variance is violated. WLS estimation is the same as carrying out OLS on transformed variables. However, WLS can easily be affected by outliers. To remedy this, we suggest a robust technique for estimating the regression parameters in the presence of heteroscedasticity and outliers. We apply robust M-estimation using iteratively reweighted least squares (IRWLS) with the Huber and Tukey bisquare functions, together with the resistant least trimmed squares estimator, to estimate the model parameters for state-wide crime in the United States in 1993. The results indicate that the estimators obtained from the M-estimation techniques and the least trimmed squares method are more effective than those obtained from OLS.
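The IRWLS idea can be sketched for a straight-line fit as follows. This is a minimal illustration with Huber weights only, assuming the conventional tuning constant 1.345 and a MAD-based scale estimate. It is not the authors' procedure, and the function names and toy data are hypothetical.

```python
import statistics


def huber_weight(u, k=1.345):
    """Huber weight: 1 for small scaled residuals, k/|u| beyond the cutoff."""
    a = abs(u)
    return 1.0 if a <= k else k / a


def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b


def huber_irls(x, y, max_iter=50, tol=1e-10):
    """Iteratively reweighted least squares, starting from the OLS fit."""
    a, b = wls_line(x, y, [1.0] * len(x))
    for _ in range(max_iter):
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        med = statistics.median(residuals)
        # Robust scale: median absolute deviation, rescaled for normal errors.
        scale = statistics.median([abs(r - med) for r in residuals]) / 0.6745
        if scale == 0.0:
            break
        weights = [huber_weight(r / scale) for r in residuals]
        new_a, new_b = wls_line(x, y, weights)
        converged = abs(new_a - a) + abs(new_b - b) < tol
        a, b = new_a, new_b
        if converged:
            break
    return a, b


# Toy data: y = 1 + 2x plus small noise, with one gross outlier at x = 4.
x = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.0, 0.2]
y = [1.0 + 2.0 * xi + e for xi, e in zip(x, noise)]
y[4] += 30.0
ols = wls_line(x, y, [1.0] * len(x))
robust = huber_irls(x, y)
```

Replacing `huber_weight` with a bisquare weight would give the Tukey variant mentioned in the abstract. A WLS correction for heteroscedasticity could be combined with this by multiplying the robustness weights by the inverse-variance weights.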


Entropy ◽  
2020 ◽  
Vol 22 (4) ◽  
pp. 399 ◽  
Author(s):  
Marco Riani ◽  
Anthony C. Atkinson ◽  
Aldo Corbellini ◽  
Domenico Perrotta

Minimum density power divergence estimation provides a general framework for robust statistics, depending on a parameter α, which determines the robustness properties of the method. The usual estimation method is numerical minimization of the power divergence. The paper considers the special case of linear regression. We developed an alternative estimation procedure using the methods of S-estimation. The rho function so obtained is proportional to one minus a suitably scaled normal density raised to the power α. We used the theory of S-estimation to determine the asymptotic efficiency and breakdown point for this new form of S-estimation. Two sets of comparisons were made. In one, S power divergence is compared with other S-estimators using four distinct rho functions. Plots of efficiency against breakdown point show that the properties of S power divergence are close to those of Tukey’s biweight. The second set of comparisons is between S power divergence estimation and numerical minimization. Monitoring these two procedures in terms of breakdown point shows that the numerical minimization yields a procedure with larger robust residuals and a lower empirical breakdown point, thus providing an estimate of α leading to more efficient parameter estimates.
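The description of the rho function can be written out directly. Scaling the standard normal density by its value at zero gives exp(-u²/2), and raising that to the power α gives exp(-αu²/2). The sketch below normalizes the rho function so that it tends to 1, which is an assumed choice of the proportionality constant. The function name is hypothetical.

```python
import math


def power_divergence_rho(u, alpha):
    """One minus the scaled normal density raised to the power alpha.

    The scaled density is phi(u) / phi(0) = exp(-u**2 / 2), so the rho
    function is 0 at u = 0 and approaches 1 as |u| grows.
    """
    return 1.0 - math.exp(-alpha * u * u / 2.0)
```

Like Tukey's biweight, this rho function is bounded, so very large residuals contribute almost the same amount regardless of size. A larger α makes the function level off sooner.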


1984 ◽  
Vol 21 (3) ◽  
pp. 268-277 ◽  
Author(s):  
Vijay Mahajan ◽  
Subhash Sharma ◽  
Yoram Wind

In marketing models, the presence of aberrant response values or outliers in data can distort the parameter estimates or regression coefficients obtained by means of ordinary least squares. The authors demonstrate the potential usefulness of the robust regression analysis in treating influential response values in marketing data.

