A New Efficient Redescending M-Estimator for Robust Fitting of Linear Regression Models in the Presence of Outliers

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Dost Muhammad Khan ◽  
Muhammad Ali ◽  
Zubair Ahmad ◽  
Sadaf Manzoor ◽  
Sundus Hussain

Robust regression is an important iterative procedure that seeks to analyze data sets contaminated with outliers and unusual observations and to reduce their impact on the regression coefficients. Robust estimation methods have been introduced to deal with the problem of outliers and to provide efficient and stable estimates in their presence. Various robust estimators have been developed in the literature to restrict the unbounded influence of outliers or leverage points on the model estimates. Here, a new redescending M-estimator is proposed using a novel objective function, with the prime focus on obtaining highly robust and efficient estimates. The results show that, for normal and clean data, the proposed estimator is almost as efficient as the ordinary least squares method, yet it becomes highly resistant to outliers when used on contaminated data sets. A simulation study is carried out to assess the performance of the proposed redescending M-estimator over different data generation scenarios, including normal, t, and double exponential distributions with different levels of outlier contamination, and the results are compared with existing M-estimators, e.g., Huber, Tukey biweight, Hampel, and Andrews sine. The performance of the proposed estimator was also checked on real-life data, where it gave promising results compared with the existing estimators.
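
As a rough illustration of how a redescending M-estimator is usually computed, the sketch below runs iteratively reweighted least squares on a straight-line model. The abstract does not give the proposed objective function, so the sketch uses the well-known Tukey biweight instead. The tuning constant 4.685, the scale estimate based on the median absolute residual, and the data are all assumptions made for the example.

```python
from statistics import median


def tukey_weight(u, c=4.685):
    """IRLS weight of the Tukey biweight: exactly zero once |u| >= c."""
    if abs(u) >= c:
        return 0.0
    return (1.0 - (u / c) ** 2) ** 2


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def irls_biweight(x, y, c=4.685, max_iter=50, tol=1e-10):
    """Iteratively reweighted least squares, started from the OLS fit."""
    a, b = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(max_iter):
        res = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, normalized for Gaussian errors.
        scale = median([abs(r) for r in res]) / 0.6745
        if scale == 0.0:
            break  # at least half of the points are already fitted exactly
        w = [tukey_weight(r / scale, c) for r in res]
        a_new, b_new = weighted_line_fit(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


# Hypothetical data: y = 2 + 3x plus small noise, with one gross outlier at x = 5.
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
x = list(range(10))
y = [2 + 3 * xi + e for xi, e in zip(x, noise)]
y[5] += 30.0

ols = weighted_line_fit(x, y, [1.0] * len(x))   # pulled toward the outlier
robust = irls_biweight(x, y)                    # outlier ends up with weight 0
```

Because the biweight assigns zero weight to large standardized residuals, the outlier drops out of the fit entirely, while a monotone function such as Huber's would only down-weight it.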

Author(s):  
Aamir Raza ◽  
Muhammad Noor-ul-Amin

Estimating the population mean with the ordinary least squares method is not meaningful when the data contain outliers. In the current study, we propose efficient estimators of the population mean using robust regression in two-phase sampling. An extensive simulation study is conducted to examine the efficiency of the proposed estimators in terms of mean square error (MSE), and a real-life example is given to demonstrate their performance. The theoretical example and simulation studies show that the suggested estimators are more efficient than the other estimators considered in the presence of outliers.


2013 ◽  
Vol 278-280 ◽  
pp. 1323-1326
Author(s):  
Yan Hua Yu ◽  
Li Xia Song ◽  
Kun Lun Zhang

Fuzzy linear regression has been extensively studied since its inception, marked by the work of Tanaka et al. in 1982. As one of the main estimation methods, the fuzzy least squares approach is appealing because it corresponds, to some extent, to well-known statistical regression analysis. In this article, a restricted least squares method is proposed to fit fuzzy linear models with crisp inputs and symmetric fuzzy output. The paper puts forward a fuzzy linear regression model based on structured elements, with precise input data and fuzzy output data; gives a method for determining the regression coefficients and the fuzzy degree function using the least squares method; and studies the degree of fit between the observed and predicted values.


2016 ◽  
Vol 5 (6) ◽  
pp. 10
Author(s):  
Serpil Kilic Depren ◽  
Özer Depren

The Generalized Maximum Entropy (GME) approach is one of the alternative estimation methods for regression analysis. The GME approach is superior to other classical approaches in terms of parameter estimation accuracy when some or none of the assumptions of classical approaches are violated. However, determining the bounds of the parameter support vectors remains an open problem when researchers have no prior information about the parameters. If the support vectors cannot be determined correctly, the parameter estimates will not be obtained correctly. There are some theoretical studies of GME for different data sets in the literature, but fewer studies on how to determine the parameter support vectors. To obtain robust parameter estimates in GME, we introduce a new iterative procedure for determining the bounds of the parameter support vectors for multilevel data sets. In this study, the new iterative procedure was applied to a multilevel random intercept model and tested on both a simulation study and real-life data. The classical and new GME estimates were compared with Generalized Least Squares estimates in terms of Root Mean Square Error (RMSE). As a result, the estimates of the new approach provided lower RMSE values than the classical methods.


2014 ◽  
Vol 800-801 ◽  
pp. 208-213
Author(s):  
Hui Ping Zhang ◽  
Yi Nan Lai ◽  
Chong Xun Wang ◽  
Xu Du

The turning properties of difficult-to-cut materials used in aeronautics are often associated with the machining accuracy and surface quality of aerospace structural parts. This study presents the influence of cutting velocity, feed rate, and back cutting depth on cutting force and cutting temperature during dry turning of ultra-high-strength 300M steel. Linear regression models of cutting force and cutting temperature are constructed using the least squares method, and the regression coefficients of these models are verified by significance tests. The temperature distribution and chip in turning are also obtained by finite element analysis.


Author(s):  
Hanan Haj Ahmad ◽  
Ehab Almetwally

A new generalization of the generalized Pareto distribution is obtained using the Marshall-Olkin generator (1997). The new distribution, MOGP, is more flexible and can be used to model non-monotonic failure rate functions. MOGP includes six sub-models: generalized Pareto, exponential, uniform, Pareto type I, Marshall-Olkin Pareto, and Marshall-Olkin exponential distributions. We consider different procedures for estimating the model parameters, namely maximum likelihood, maximum product spacing, least squares, weighted least squares, and the Bayesian method. The Bayesian method is considered under the quadratic loss function and the Linex loss function. A simulation analysis using the MCMC technique is performed to compare the proposed point estimation methods. The usefulness of MOGP is illustrated by means of a real data set, which shows that this generalization provides a better fit than the Pareto, GP, and MOP distributions.
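
The Marshall-Olkin construction mentioned above can be sketched in a few lines: it maps a baseline survival function S(x) to alpha*S(x) / (1 - (1 - alpha)*S(x)) with a tilt parameter alpha > 0. The generalized Pareto parameterization below (scale `sigma`, shape `xi`, support x >= 0) is an assumption for illustration and may differ from the one used in the paper.

```python
def gp_survival(x, sigma=1.0, xi=0.5):
    """Survival function of a generalized Pareto baseline (x >= 0, xi > 0)."""
    return (1.0 + xi * x / sigma) ** (-1.0 / xi)


def mo_survival(baseline_sf, alpha):
    """Marshall-Olkin transform of a baseline survival function."""
    def sf(x):
        s = baseline_sf(x)
        return alpha * s / (1.0 - (1.0 - alpha) * s)
    return sf


# Hypothetical parameter values; alpha = 1 returns the baseline unchanged.
mogp = mo_survival(gp_survival, alpha=2.0)
plain_gp = mo_survival(gp_survival, alpha=1.0)
```

Setting alpha to 1 recovers the generalized Pareto sub-model, which is one way to see why the family nests the distributions listed in the abstract.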


Author(s):  
Anam Javaid ◽  
Mohd. Tahir Ismail ◽  
M.K.M. Ali

The Internet of Things (IoT) consists of networks of physical devices such as sensors, home appliances, electronics, and software. It enables us to collect and exchange data in several fields. After data are collected from the IoT, variable selection becomes a major problem because real-life data sets involve many variables. The current study focuses on the problem of model selection, including interaction terms, in large data analysis. The data set used in this study is taken from a solar drier, with moisture ratio removal (%) as the dependent variable and ambient temperature, chamber temperature, collector temperature, chamber relative humidity, ambient relative humidity, and solar radiation as independent variables. LASSO with Huber M, LASSO with Hampel M, and LASSO with Bisquare M are proposed in this study. The proposed techniques are compared with ridge regression and OLS (ordinary least squares) after a multicollinearity test and a coefficient test. MAPE (mean absolute percentage error) is calculated for the efficient selected model used to forecast. As a result, the model using LASSO with Bisquare M provides the minimum MAPE value and is the most efficient model. Thus, the resulting model with the selected variables can be used to predict moisture ratio removal (%) to determine seaweed drying behavior.


2008 ◽  
Vol 22 (09n11) ◽  
pp. 1558-1563
Author(s):  
Q. SHEN ◽  
J. XU ◽  
C. X. CHEN ◽  
Z. H. YE ◽  
R. WANG

Obtaining more exact parameters of slip surfaces is crucial in the analysis of landslide stability, and four study methods are available: in-situ tests, laboratory tests, back analysis, and theoretical analysis. In general, the former two are the basic methods, whereas the latter two are secondary. A set of test values of shear stress and normal stress can be obtained through in-situ and laboratory tests, and the slip surface parameters are then usually calculated by fitting the test values with the least squares method. However, the effect of anomalous values increases markedly because the sum of squared residuals is used in the calculation. To reduce the effect of anomalous values, a new method based on robust regression analysis and particle swarm optimization is adopted in this paper to calculate the mechanical parameters of landslides. In the new method, the sum of absolute residuals replaces the sum of squared residuals, so that squaring the anomalous values is avoided. Compared with the least squares method, the new method effectively reduces the effect of abnormal values, and the results are more credible. Furthermore, an engineering example is cited to show the validity of the method.
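
The difference between the two criteria can be seen on a small example. The sketch below assumes a Mohr-Coulomb relation, shear stress = c + normal stress * tan(phi), which the abstract does not state explicitly, and uses made-up test values. Instead of particle swarm optimization it finds the least absolute deviations (LAD) line by exhaustive search, relying on the fact that some optimal LAD line passes through two data points; this is practical only for small data sets.

```python
import math
from itertools import combinations


def least_squares_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def lad_line(x, y):
    """Least absolute deviations line, checking every line through two points."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, not a valid candidate
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        cost = sum(abs(yk - (intercept + slope * xk)) for xk, yk in zip(x, y))
        if best is None or cost < best[0]:
            best = (cost, intercept, slope)
    return best[1], best[2]


# Hypothetical shear test values in kPa: tau = 20 + 0.5 * sigma,
# except for one anomalous reading at sigma = 300.
sigma = [100.0, 200.0, 300.0, 400.0, 500.0]
tau = [70.0, 120.0, 230.0, 220.0, 270.0]

c_ls, tan_phi_ls = least_squares_line(sigma, tau)    # cohesion is inflated
c_lad, tan_phi_lad = lad_line(sigma, tau)            # anomalous point ignored
phi_lad_deg = math.degrees(math.atan(tan_phi_lad))   # friction angle in degrees
```

On these values the least squares cohesion is shifted by the single anomalous reading, while the LAD fit stays on the line through the four consistent points.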


2017 ◽  
Vol 65 (2) ◽  
pp. 103-105
Author(s):  
Md Shariful Islam ◽  
Mir Shariful Islam ◽  
AFM Khodadad Khan ◽  
Md Zavid Iqbal Bangalee

Logistic dynamics are frequently encountered in real-life problems, especially in population dynamics. Data that appear to follow a logistic model may be interpolated by standard methods of numerical analysis. In this paper, we discuss a method to fit a curve to such data using the intrinsic analytic properties of the data, by means of the least squares method and graphic tools in the Mathematica environment. Dhaka Univ. J. Sci. 65(2): 103-105, 2017 (July)
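
One possible way to fit such data with least squares is sketched below; it is not necessarily the method of the paper, and it is written in Python rather than Mathematica. It uses the fact that for a logistic curve sampled at equal spacing h, the reciprocals satisfy 1/P(t+h) = s * 1/P(t) + (1 - s)/K with s = exp(-r*h), so a straight-line fit of consecutive reciprocals yields the growth rate r and the carrying capacity K. The function names and data are hypothetical.

```python
import math


def fit_logistic(p, h=1.0):
    """Estimate (r, K) from equally spaced samples of a logistic curve."""
    u = [1.0 / value for value in p[:-1]]   # 1/P(t)
    v = [1.0 / value for value in p[1:]]    # 1/P(t + h)
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    suv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    slope = suv / suu                # equals exp(-r * h)
    intercept = mv - slope * mu      # equals (1 - slope) / K
    return -math.log(slope) / h, (1.0 - slope) / intercept


def logistic(t, r, K, p0):
    """Closed-form logistic solution with P(0) = p0."""
    return K / (1.0 + (K / p0 - 1.0) * math.exp(-r * t))


# Hypothetical noise-free data: r = 0.5, K = 1000, P(0) = 10, sampled at t = 0..20.
data = [logistic(t, 0.5, 1000.0, 10.0) for t in range(21)]
r_hat, K_hat = fit_logistic(data)
```

With noisy data the reciprocal transform amplifies errors at small population values, so the estimates are better treated as starting values for a nonlinear fit.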


2022 ◽  
Vol 18 (2) ◽  
pp. 251-260
Author(s):  
Malecita Nur Atala Singgih ◽  
Achmad Fauzan

Crime incidents in Indonesia in 2019 totaled 269,324 cases, according to survey-based criminal data sourced from the National Socio-Economic Survey and the Village Potential Data Collection produced by the Central Statistics Agency. The high crime rate is caused by several factors, including poverty and population density. The factors that most influence criminal acts in Indonesia can be determined with Regression Analysis. One very commonly used method of Regression Analysis is the Least Squares Method. However, Regression Analysis can be used only if its assumption tests are met, and if outliers are found, the assumptions are not satisfied. The outlier problem can be overcome by using a robust estimation method. This study aims to determine the best estimation method among Maximum Likelihood Type (M) estimation, Scale (S) estimation, and Method of Moment (MM) estimation in Robust Regression. The best Robust Regression estimate is the one with the smallest Residual Standard Error (RSE) and the largest Adjusted R-square. The analysis of the case study of criminal acts in Indonesia in 2019 showed that the best estimate was the S estimate, with an RSE of 4226 and an Adjusted R-square of 0.98.
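
The two selection criteria can be computed from observed and fitted values as sketched below. The formulas are the conventional ones for a model with an intercept and a given number of predictors; robust regression software may instead report a robust scale estimate as its RSE, so this is an assumption about how the values were obtained. The numbers are made up.

```python
import math


def rse_and_adjusted_r2(y, fitted, n_predictors):
    """Residual standard error and adjusted R-square for a fitted model."""
    n = len(y)
    dof = n - n_predictors - 1          # residual degrees of freedom
    mean_y = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    rse = math.sqrt(rss / dof)
    adj_r2 = 1.0 - (rss / dof) / (tss / (n - 1))
    return rse, adj_r2


def best_method(results):
    """Pick the method with the smallest RSE, breaking ties by larger adjusted R-square."""
    return min(results, key=lambda name: (results[name][0], -results[name][1]))


# Hypothetical observed values and fitted values from two estimation methods.
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
results = {
    "method_a": rse_and_adjusted_r2(y_obs, [1.1, 1.9, 3.2, 3.9, 4.9], 1),
    "method_b": rse_and_adjusted_r2(y_obs, [1.4, 1.7, 3.5, 3.6, 5.3], 1),
}
winner = best_method(results)
```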


2020 ◽  
Vol 2020 (1) ◽  
Author(s):  
M. Radha ◽  
S. Balamuralitharan

This paper deals with a general SEIR model for the coronavirus disease 2019 (COVID-19) that includes the effect of a time delay. We obtain stability theorems for the disease-free equilibrium and give conditions on the COVID-19 transmission dynamics for the disease-present and disease-absent equilibria. The Hopf bifurcation parameter τ captures the effect of the time delay, and we demonstrate that local asymptotic stability holds for the disease-present equilibrium. The cases in which the reproduction number is less than or greater than one are treated briefly; this number is effective in controlling the COVID-19 outbreak and gives insight into the patterns of the flare-up. We have included eight parameters, and the least squares method allows us to estimate the initial values for the Indian COVID-19 pandemic from real-life data. It is one of the models of India's current pandemic that have been studied so far. This COVID-19 SEIR model can be applied, with or without delay, to the current pandemic in any country after estimating parameter values from its data. The sensitivity of seven parameters has also been explored. The paper also examines the impact of the immune response time delay and the importance of determining essential parameters, such as the transmission rate, using sensitivity index analysis. A numerical experiment is carried out to illustrate the theoretical results.

