Applicability of an outlier detection technique based on penalized estimation in the context of linear regression

2020, Vol. 33(3), pp. 707-736
Author(s): Wooyoul Na, Yeong Jin Jo, Hyun Sook Lee

Author(s): A. A. M. Nurunnabi, A. H. M. Rahmatullah Imon, A. B. M. Shawkat Ali, Mohammed Nasser

Regression analysis is one of the most important branches of multivariate statistical techniques. It is widely used in almost every field of research and application involving multifactor data, where it helps to investigate and fit an unknown model that quantifies relations among observed variables. Nowadays the same tasks are increasingly performed with neural networks, support vector machines, evolutionary algorithms, and related methods. To date, least squares (LS) remains the most popular parameter estimation technique among practitioners, mainly because of its computational simplicity and its underlying optimal properties. It is well known, however, that the method of least squares is a non-resistant fitting procedure: even a single outlier can spoil the whole estimation. Data contamination by outliers is a practical problem that cannot be avoided, so it is important to be able to detect such observations. The authors are concerned with the effect outliers have on parameter estimates and on inferences about models and their suitability. In this chapter the authors give a short discussion of the best-known and most efficient outlier detection techniques, with numerical demonstrations in linear regression. The chapter will help readers who are interested in exploring and investigating an effective mathematical model. The goal is to keep the monograph self-contained while maintaining its general accessibility.
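To make the non-resistance of least squares concrete, the following minimal sketch (not taken from the chapter; the simulated line y = 2 + 3x, the random seed, and the use of scikit-learn's HuberRegressor as a robust comparator are illustrative assumptions) shows how a single corrupted response shifts the LS estimates, while a robust M-estimator stays close to the true coefficients.

```python
# Toy illustration: how one outlier can spoil a least-squares fit.
import numpy as np
from sklearn.linear_model import HuberRegressor

rng = np.random.default_rng(0)

# Generate clean data from the line y = 2 + 3x plus small noise.
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=x.size)

X = np.column_stack([np.ones_like(x), x])      # design matrix [1, x]
beta_clean, *_ = np.linalg.lstsq(X, y, rcond=None)

# Corrupt exactly one response value.
y_bad = y.copy()
y_bad[-1] = 200.0                              # a single gross outlier
beta_bad, *_ = np.linalg.lstsq(X, y_bad, rcond=None)

print("LS fit on clean data   :", np.round(beta_clean, 2))  # close to [2, 3]
print("LS fit with one outlier:", np.round(beta_bad, 2))    # noticeably shifted

# For contrast, a robust M-estimator (Huber loss) is far less affected.
huber = HuberRegressor().fit(x.reshape(-1, 1), y_bad)
print("Huber fit with outlier :", round(huber.intercept_, 2), round(huber.coef_[0], 2))
```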


2005, Vol. 08(04), pp. 433-449
Author(s): FERNANDO A. QUINTANA, PILAR L. IGLESIAS, HELENO BOLFARINE

The problem of outlier and change-point identification has received considerable attention in traditional linear regression models from both classical and Bayesian standpoints. In contrast, for regression models with measurement errors, also known as error-in-variables models, the corresponding literature is scarce and largely focused on classical solutions for the normal case. The main objective of this paper is to propose clustering algorithms for outlier detection and change-point identification in scale mixtures of error-in-variables models. We propose an approach based on product partition models (PPMs), which allows one to study clustering for the models under consideration; this includes the change-point problem and outlier detection as special cases. The outlier identification problem is approached by adapting the algorithms developed by Quintana and Iglesias [32] for simple linear regression models. A special algorithm is developed for the change-point problem, which can be applied in a more general setup. The methods are illustrated with two applications: (i) outlier identification in a problem involving the relationship between two methods for measuring serum kanamycin in blood samples from babies, and (ii) change-point identification in the relationship between the monthly dollar volume of sales on the Boston Stock Exchange and the combined monthly dollar volumes for the New York and American Stock Exchanges.
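As a rough illustration of the change-point idea only (a frequentist least-squares toy on made-up data; it does not implement the product partition model, the measurement-error structure, or the clustering algorithms of the paper), one can scan candidate breakpoints, fit separate regressions on each side, and pick the split with the smallest combined residual sum of squares:

```python
# Toy change-point scan for simple linear regression (illustrative only).
import numpy as np

rng = np.random.default_rng(1)

# Simulated series whose slope changes at t = 60 (made-up data).
t = np.arange(100, dtype=float)
y = np.where(t < 60, 1.0 + 0.5 * t, 31.0 + 2.0 * (t - 60)) + rng.normal(scale=1.0, size=t.size)

def segment_rss(tt, yy):
    """Residual sum of squares of a straight-line LS fit to one segment."""
    X = np.column_stack([np.ones_like(tt), tt])
    _, rss, *_ = np.linalg.lstsq(X, yy, rcond=None)
    return rss[0] if rss.size else 0.0

# Scan candidate change points, leaving a few observations on each side.
candidates = list(range(5, len(t) - 5))
total_rss = [segment_rss(t[:k], y[:k]) + segment_rss(t[k:], y[k:]) for k in candidates]
best_k = candidates[int(np.argmin(total_rss))]

print("Estimated change-point index:", best_k)   # expected to be near 60
```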

