Robust Weighted Least Squares Estimation of Regression Parameter in the Presence of Outliers and Heteroscedastic Errors

2014 ◽  
Vol 71 (1) ◽  
Author(s):  
Bello Abdulkadir Rasheed ◽  
Robiah Adnan ◽  
Seyed Ehsan Saffari ◽  
Kafi Dano Pati

In a linear regression model, the ordinary least squares (OLS) method is considered the best method to estimate the regression parameters if the assumptions are met. However, if the data do not satisfy the underlying assumptions, the results will be misleading. The violation of the assumption of constant variance in least squares regression is caused by the presence of outliers and heteroscedasticity in the data. This assumption of constant variance (homoscedasticity) is very important in linear regression, since it is under this assumption that the least squares estimators enjoy the property of minimum variance. Therefore, a robust regression method is required to handle the problem of outliers in the data. This research uses weighted least squares (WLS) techniques to estimate the regression coefficients when the assumption of constant error variance is violated. Estimation by WLS is the same as carrying out OLS on transformed variables. However, WLS can easily be affected by outliers. To remedy this, we suggest a robust technique for the estimation of regression parameters in the presence of heteroscedasticity and outliers. Here we apply robust M-estimation using iteratively reweighted least squares (IRWLS) with the Huber and Tukey bisquare functions, together with the resistant least trimmed squares regression estimator, to estimate the model parameters for state-wide crime data of the United States in 1993. The outcomes of the study indicate that the estimators obtained from the M-estimation techniques and the least trimmed squares method are more effective than those obtained from OLS.
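A minimal numpy sketch of the IRWLS scheme for Huber M-estimation described in this abstract (illustrative only, not the authors' code; the simulated data, the standard tuning constant c = 1.345, and the MAD residual scale are assumptions of this sketch):

```python
import numpy as np

def huber_weights(u, c=1.345):
    """Huber weight function: 1 inside the threshold, c/|u| outside."""
    a = np.abs(u)
    return np.where(a <= c, 1.0, c / a)

def irwls_huber(X, y, c=1.345, tol=1e-8, max_iter=100):
    """M-estimation via iteratively reweighted least squares (IRWLS)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS starting values
    for _ in range(max_iter):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745  # MAD scale
        w = huber_weights(r / s, c)
        sw = np.sqrt(w)
        # Weighted LS step = OLS on rows scaled by sqrt(w)
        beta_new = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# Demo: line y = 1 + 2x with five gross outliers
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 2.0 * x + 0.05 * rng.normal(size=50)
y[:5] += 10.0  # contaminate five observations
X = np.column_stack([np.ones_like(x), x])

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_rob = irwls_huber(X, y)
```

The robust slope stays near the true value 2 while the OLS slope is dragged by the contaminated points.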

2009 ◽  
Vol 2009 ◽  
pp. 1-8 ◽  
Author(s):  
Janet Myhre ◽  
Daniel R. Jeske ◽  
Michael Rennie ◽  
Yingtao Bi

A heteroscedastic linear regression model is developed from plausible assumptions that describe the time evolution of performance metrics for equipment. The inherited motivation for the related weighted least squares analysis of the model is an essential and attractive selling point to engineers with interest in equipment surveillance methodologies. A simple test for the significance of the heteroscedasticity suggested by a data set is derived and a simulation study is used to evaluate the power of the test and compare it with several other applicable tests that were designed under different contexts. Tolerance intervals within the context of the model are derived, thus generalizing well-known tolerance intervals for ordinary least squares regression. Use of the model and its associated analyses is illustrated with an aerospace application where hundreds of electronic components are continuously monitored by an automated system that flags components that are suspected of unusual degradation patterns.
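The abstract derives its own significance test for heteroscedasticity; as a generic stand-in, a Breusch-Pagan/Koenker-style LM test can be sketched in a few lines (the simulated data and error model below are assumptions, not the paper's):

```python
import numpy as np
from scipy import stats

def breusch_pagan(X, y):
    """Koenker-variant LM test: regress squared OLS residuals on the
    design matrix; n * R^2 is asymptotically chi-squared with p-1 df."""
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e2 = (y - X @ beta) ** 2
    g = np.linalg.lstsq(X, e2, rcond=None)[0]
    fitted = X @ g
    r2 = 1.0 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    lm = n * r2
    return lm, stats.chi2.sf(lm, p - 1)

rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 200)
X = np.column_stack([np.ones_like(x), x])
y_het = 3.0 + 0.5 * x + rng.normal(scale=0.3 * x)  # error sd grows with x
lm, pval = breusch_pagan(X, y_het)
```

With error variance proportional to x squared, the test rejects homoscedasticity decisively.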


2019 ◽  
Vol 8 (1) ◽  
pp. 24-34
Author(s):  
Eka Destiyani ◽  
Rita Rahmawati ◽  
Suparti Suparti

Ordinary Least Squares (OLS) is one of the most commonly used methods to estimate linear regression parameters. If multicollinearity exists among the predictor variables, especially coupled with outliers, then regression analysis with OLS is no longer appropriate. One method that can be used to solve the multicollinearity and outlier problems is ridge robust-MM regression, a modification of the ridge regression method based on the MM-estimator of robust regression. The case study in this research is the infant mortality rate (AKB) in Central Java in 2017, influenced by population density, the percentage of households practicing a clean and healthy lifestyle, the number of low-birth-weight babies born, the number of babies who are exclusively breastfed, the number of babies receiving at least one neonatal visit, and the number of babies who receive health services. The results of estimation using OLS show violations of the multicollinearity assumption as well as the presence of outliers. Applying ridge robust-MM regression to the case study shows that ridge robust regression can improve parameter estimation. Based on the t-test at the 5% significance level, most of the predictor variables have a significant effect on AKB. The predictor variables explain 47.68% of the variation in AKB, and the MSE value is 0.01538.
Keywords: Ordinary Least Squares (OLS), Multicollinearity, Outliers, Ridge Regression, Robust Regression, AKB.
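The ridge component of the method can be sketched in closed form (this shows only the ridge step, not the robust MM-estimation; the nearly collinear data and the penalty lam = 1.0 are assumptions of the sketch):

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Two nearly collinear predictors, true coefficients 1.0 and 1.0
rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = x1 + 1e-3 * rng.normal(size=100)  # almost an exact copy of x1
X = np.column_stack([x1, x2])
y = 1.0 * x1 + 1.0 * x2 + rng.normal(scale=0.1, size=100)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_ridge = ridge(X, y, lam=1.0)
```

Under near-collinearity the OLS coefficients can explode in opposite directions; the ridge penalty shrinks them back while preserving their sum, which is the well-identified quantity here.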


1986 ◽  
Vol 16 (2) ◽  
pp. 249-255 ◽  
Author(s):  
Edwin J. Green ◽  
William E. Strawderman

A Stein-rule estimator, which shrinks least squares estimates of regression parameters toward their weighted average, was employed to estimate the coefficient in the constant form factor volume equation for 18 species simultaneously. The Stein-rule procedure was applied to ordinary least squares estimates and weighted least squares estimates. Simulation tests on independent validation data sets revealed that the Stein-rule estimates were biased, but predicted better than the corresponding least squares estimates. The Stein-rule procedures also yielded lower estimated mean square errors for the volume equation coefficient than the corresponding least squares procedure. Different methods of withdrawing sample data from the total sample available for each species revealed that the superiority of Stein-rule procedures over least squares decreased as the sample size increased and that the Stein-rule procedures were robust to unequal sample sizes, at least on the scale studied here.
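The shrink-toward-the-average idea can be illustrated with a James-Stein-type positive-part rule (a sketch only: the 18 per-species values, the common true coefficient 0.42, and the sampling variance are invented for illustration, and the rule shown shrinks toward the unweighted mean rather than the weighted average used in the study):

```python
import numpy as np

def stein_shrink(raw, sigma2):
    """James-Stein-type shrinkage of k independent estimates
    (each with sampling variance sigma2) toward their common mean."""
    k = raw.size
    center = raw.mean()
    S = np.sum((raw - center) ** 2)
    factor = max(0.0, 1.0 - (k - 3) * sigma2 / S)  # positive-part rule
    return center + factor * (raw - center)

# 18 per-species least squares estimates of the form-factor
# coefficient; in this sketch the true value is 0.42 for every species.
true = 0.42
raw = true + np.array([0.05, -0.03, 0.08, -0.06, 0.02, -0.04, 0.07, -0.01,
                       0.03, -0.07, 0.01, -0.02, 0.06, -0.05, 0.04, -0.08,
                       0.00, 0.02])
shrunk = stein_shrink(raw, sigma2=0.002)
```

When the species-level coefficients are in fact similar, pulling the individual estimates toward their pooled mean trades a little bias for a large variance reduction, mirroring the lower estimated mean square errors reported in the abstract.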


Author(s):  
Warha, Abdulhamid Audu ◽  
Yusuf Abbakar Muhammad ◽  
Akeyede, Imam

Linear regression measures the relationship between two or more variables, known as the dependent and independent variables. The classical least squares method for estimating regression models consists of minimising the sum of the squared residuals. Among the assumptions of the ordinary least squares (OLS) method is that there are no correlations (multicollinearity) between the independent variables. Violation of this assumption arises often in regression analysis and can lead to inefficiency of the least squares method. This study therefore determined the more efficient estimator between Least Absolute Deviation (LAD) and Weighted Least Squares (WLS) in multiple linear regression models at different levels of multicollinearity in the explanatory variables. Simulation experiments were conducted using the R statistical software to investigate the performance of the two estimators under violation of the assumption of no multicollinearity. Their performances were compared at different sample sizes. Finite-sample criteria, namely mean absolute error, absolute bias, and mean squared error, were used for comparing the methods. The best estimator was selected based on the minimum value of these criteria at a specified level of multicollinearity and sample size. The results showed that LAD was the best at different levels of multicollinearity and is recommended as an alternative to OLS under this condition. The performance of the two estimators decreased as the level of multicollinearity increased.
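The "levels of multicollinearity" the study varies are commonly quantified by variance inflation factors; a sketch of the VIF computation in Python (the study itself used R; the simulated predictors below are assumptions of this illustration):

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X:
    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on the remaining columns (with an intercept)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        b = np.linalg.lstsq(others, X[:, j], rcond=None)[0]
        resid = X[:, j] - others @ b
        r2 = 1.0 - resid @ resid / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(3)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)                      # independent of x1
x3 = 0.99 * x1 + 0.05 * rng.normal(size=200)   # strongly collinear with x1
vifs = vif(np.column_stack([x1, x2, x3]))
```

The common rule of thumb flags VIF values above 10 as problematic multicollinearity.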


2019 ◽  
Vol 3 (3) ◽  
pp. 22
Author(s):  
Ilir Palla

This article covers the OLS method, the WLS method, and bootstrap methods for estimating the coefficients of linear regression and their standard deviations. If the regression errors have constant variance and are independent and normally distributed, we can use the least squares method, which supports accurate inference under these assumptions. If the errors are heteroscedastic, meaning that their variance depends on the explanatory variable, or have different weights, we cannot use the least squares method, because it can no longer be relied on for accurate results. If we know the weight for each error, we can use the weighted least squares method. In this article we have also described bootstrap methods to evaluate regression parameters; the bootstrap methods improve quantile estimation. We simulated errors with non-constant variances in a linear regression using the R program and compared the results. Using this software we obtained confidence intervals, estimated coefficients, plots, and results for each case.
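The article's simulations were done in R; the residual-bootstrap idea it describes can be sketched in Python as follows (the data, the number of resamples, and the percentile interval are assumptions of this sketch):

```python
import numpy as np

def residual_bootstrap(X, y, n_boot=2000, seed=4):
    """Percentile bootstrap for regression coefficients: resample
    centered OLS residuals, rebuild y*, and refit each time."""
    rng = np.random.default_rng(seed)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    resid = resid - resid.mean()
    boots = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        y_star = X @ beta + rng.choice(resid, size=resid.size, replace=True)
        boots[b] = np.linalg.lstsq(X, y_star, rcond=None)[0]
    lo, hi = np.percentile(boots, [2.5, 97.5], axis=0)
    return beta, lo, hi

rng = np.random.default_rng(5)
x = np.linspace(0.0, 1.0, 60)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.2, size=60)
beta, lo, hi = residual_bootstrap(X, y)
```

For heteroscedastic errors, a pairs (case-resampling) bootstrap would be the more appropriate variant, since resampling residuals assumes they are exchangeable.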


2019 ◽  
Vol 8 (1) ◽  
pp. 81-92
Author(s):  
Dhea Kurnia Mubyarjati ◽  
Abdul Hoyyi ◽  
Hasbi Yasin

Multiple linear regression can be estimated using Ordinary Least Squares (OLS). Some classic assumptions must be fulfilled, namely normality, homoskedasticity, non-multicollinearity, and non-autocorrelation. However, violations of these assumptions can occur due to outliers, so the estimator obtained is biased and inefficient. In statistics, robust regression is one method that can be used to deal with outliers. Robust regression has several estimators, one of which, the scale estimator (S-estimator), is used in this research. The case study for this research is fish production per district/city in Central Java in 2015-2016, which is influenced by the number of fishermen, number of vessels, number of trips, number of fishing units, and number of fishing households/companies. Estimation with Ordinary Least Squares violates the assumptions of normality, non-autocorrelation, and homoskedasticity; this occurs because there are outliers. Based on the t-test at the 5% significance level, it can be concluded that several predictor variables, namely the number of fishermen, the number of vessels, the number of trips, and the number of fishing units, have a significant effect on fish production. The predictor variables explain 88.006% of the variation in fish production, and the MSE value is 7109.519. A Matlab GUI program for S-estimator robust regression was built to make the calculations easier for users.
Keywords: Ordinary Least Squares (OLS), Outliers, Robust Regression, Fish Production, GUI Matlab.
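A bare-bones sketch of an S-estimator in Python (the research used a Matlab GUI; this sketch searches random elemental subsets and keeps the fit with the smallest Tukey-bisquare M-scale, omitting the local refinement step that practical implementations add, and the simulated data are assumptions):

```python
import numpy as np

C = 1.547                      # Tukey constant for 50% breakdown
B = 0.5 * C**2 / 6             # consistency target: E[rho] at the normal

def rho(u):
    """Tukey bisquare rho, capped at C^2/6."""
    v = np.minimum(np.abs(u) / C, 1.0)
    return (C**2 / 6) * (1 - (1 - v**2) ** 3)

def m_scale(r, tol=1e-6, max_iter=100):
    """Solve mean(rho(r/s)) = B for s by fixed-point iteration."""
    s = np.median(np.abs(r)) / 0.6745 + 1e-12
    for _ in range(max_iter):
        s_new = s * np.sqrt(np.mean(rho(r / s)) / B)
        if abs(s_new - s) < tol * s:
            return s_new
        s = s_new
    return s

def s_estimator(X, y, n_sub=200, seed=6):
    """Basic S-estimator: among exact fits to random p-point subsets,
    keep the coefficients yielding the smallest M-scale of residuals."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best, best_s = None, np.inf
    for _ in range(n_sub):
        idx = rng.choice(n, size=p, replace=False)
        try:
            beta = np.linalg.solve(X[idx], y[idx])
        except np.linalg.LinAlgError:
            continue
        s = m_scale(y - X @ beta)
        if s < best_s:
            best, best_s = beta, s
    return best

rng = np.random.default_rng(7)
x = np.linspace(0.0, 1.0, 60)
y = 1.0 + 2.0 * x + 0.05 * rng.normal(size=60)
y[:8] -= 5.0                   # eight gross outliers
X = np.column_stack([np.ones_like(x), x])
beta_s = s_estimator(X, y)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
```

Minimizing a robust M-scale rather than the residual sum of squares is what gives the S-estimator its high breakdown point.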


2015 ◽  
Vol 31 (1) ◽  
pp. 61-75 ◽  
Author(s):  
Jianzhu Li ◽  
Richard Valliant

Abstract An extensive set of diagnostics for linear regression models has been developed to handle nonsurvey data. The models and the sampling plans used for finite populations often entail stratification, clustering, and survey weights, which renders many of the standard diagnostics inappropriate. In this article we adapt some influence diagnostics that have been formulated for ordinary or weighted least squares for use with stratified, clustered survey data. The statistics considered here include DFBETAS, DFFITS, and Cook's D. The differences in the performance of ordinary least squares and survey-weighted diagnostics are compared using complex survey data where the values of weights, response variables, and covariates vary substantially.
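The ordinary (unweighted) versions of the three diagnostics named in the abstract can be computed from the hat matrix without refitting; this sketch shows those standard formulas only, not the survey-weighted adaptations that are the article's contribution (the simulated data are assumed):

```python
import numpy as np

def influence_diagnostics(X, y):
    """Cook's D, DFFITS, and DFBETAS for OLS, using the
    leave-one-out identities (no refitting loop needed)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    h = np.diag(X @ XtX_inv @ X.T)             # leverages
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    s2 = e @ e / (n - p)                       # full-sample MSE
    s2_i = ((n - p) * s2 - e**2 / (1 - h)) / (n - p - 1)  # LOO MSE
    t = e / np.sqrt(s2_i * (1 - h))            # externally studentized
    cooks = e**2 * h / (p * s2 * (1 - h) ** 2)
    dffits = t * np.sqrt(h / (1 - h))
    # DFBETAS: standardized change in each coefficient when obs. i is deleted
    c = XtX_inv @ X.T                          # shape (p, n)
    dfbetas = (c * (e / (1 - h))) / np.sqrt(s2_i * np.diag(XtX_inv)[:, None])
    return cooks, dffits, dfbetas

rng = np.random.default_rng(8)
x = np.linspace(0.0, 1.0, 40)
y = 1.0 + 2.0 * x + 0.1 * rng.normal(size=40)
y[39] += 3.0                                   # one influential outlier
X = np.column_stack([np.ones_like(x), x])
cooks, dffits, dfbetas = influence_diagnostics(X, y)
```

Common screening thresholds are Cook's D above 4/n and |DFFITS| above 2*sqrt(p/n); the contaminated observation dominates both here.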


2021 ◽  
Vol 6 (11) ◽  
pp. 11850-11878
Author(s):  
SidAhmed Benchiha ◽  

Amer Ibrahim Al-Omari ◽  
Naif Alotaibi ◽  
Mansour Shrahili ◽  
...  

Recently, a new lifetime distribution known as the generalized Quasi Lindley distribution (GQLD) was suggested. In this paper, we modify the GQLD and suggest a two-parameter lifetime distribution called the weighted generalized Quasi Lindley distribution (WGQLD). The main mathematical properties of the WGQLD, including the moments, coefficient of variation, coefficient of skewness, coefficient of kurtosis, stochastic ordering, median deviation, harmonic mean, and reliability functions, are derived. The model parameters are estimated using the ordinary least squares, weighted least squares, maximum likelihood, maximum product of spacings, Anderson-Darling, and Cramer-von Mises methods. The performances of the proposed estimators are compared based on numerical calculations for various values of the distribution parameters and sample sizes in terms of the mean squared error (MSE) and estimated values (Es). To demonstrate the applicability of the new model, four applications to real data sets, consisting of COVID-19 infected cases in Algeria and Saudi Arabia, carbon fibers, and rainfall, are analyzed for illustration. It turns out that the WGQLD is empirically better than the other competing distributions considered in this study.
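The ordinary-least-squares method of parameter estimation mentioned in the abstract fits the model CDF to plotting positions of the ordered sample; a sketch of that idea, using the exponential distribution as a stand-in since the WGQLD density is not reproduced here (the sample and the search bounds are assumptions):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ols_cdf_estimate(x):
    """OLS estimation of a distribution parameter: minimize
    sum_i (F(x_(i); theta) - i/(n+1))^2 over theta.
    Exponential CDF F(x) = 1 - exp(-lam*x) stands in for the WGQLD."""
    xs = np.sort(x)
    n = xs.size
    pp = np.arange(1, n + 1) / (n + 1.0)       # plotting positions
    def loss(lam):
        return np.sum((1.0 - np.exp(-lam * xs) - pp) ** 2)
    res = minimize_scalar(loss, bounds=(1e-6, 50.0), method="bounded")
    return res.x

rng = np.random.default_rng(9)
sample = rng.exponential(scale=0.5, size=300)  # true rate lam = 2
lam_hat = ols_cdf_estimate(sample)
```

The weighted least squares variant simply reweights each squared term by the inverse variance of the i-th order statistic of the uniform distribution.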


2015 ◽  
Vol 76 (13) ◽  
Author(s):  
Khoo Li Peng ◽  
Robiah Adnan ◽  
Maizah Hura Ahmad

In this study, the Leverage-Based Near Neighbour-Robust Weighted Least Squares (LBNN-RWLS) method is proposed in order to estimate the standard error accurately in the presence of heteroscedastic errors and outliers in multiple linear regression. The data sets used in this study are simulated through Monte Carlo simulation. The data sets contain heteroscedastic errors and different percentages of outliers with different sample sizes. The study discovered that LBNN-RWLS is able to produce smaller standard errors than Ordinary Least Squares (OLS), Least Trimmed Squares (LTS), and Weighted Least Squares (WLS). This shows that LBNN-RWLS can estimate the standard error accurately even when heteroscedastic errors and outliers are present in the data sets.
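The WLS baseline that LBNN-RWLS is compared against can be sketched via the transformed-variables identity, including its coefficient standard errors (this is a generic WLS sketch, not the proposed method; the simulated heteroscedastic data and the weights 1/x^2 are assumptions):

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares via the transformed-variables identity:
    scale each row by sqrt(w), run OLS, and return the coefficient
    estimates together with their estimated standard errors."""
    sw = np.sqrt(w)
    Xw, yw = sw[:, None] * X, sw * y
    beta = np.linalg.lstsq(Xw, yw, rcond=None)[0]
    n, p = X.shape
    resid = yw - Xw @ beta
    s2 = resid @ resid / (n - p)
    cov = s2 * np.linalg.inv(Xw.T @ Xw)
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(10)
x = np.linspace(1.0, 10.0, 150)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 0.8 * x + rng.normal(scale=0.2 * x)    # error sd proportional to x
beta_wls, se_wls = wls(X, y, w=1.0 / x**2)       # correctly specified weights
beta_ols, se_ols = wls(X, y, w=np.ones_like(x))  # OLS as the unit-weight case
```

With correctly specified weights, the WLS slope standard error comes out smaller than the OLS one, which is the baseline behaviour the study's comparisons build on.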

