Detection of Influential Observations in Spatial Regression Model Based on Outliers and Bad Leverage Classification

Symmetry ◽  
2021 ◽  
Vol 13 (11) ◽  
pp. 2030
Author(s):  
Ali Mohammed Baba ◽  
Habshah Midi ◽  
Mohd Bakri Adam ◽  
Nur Haizum Abd Rahman

Influential observations (IOs), which are outliers in the x direction, y direction or both, remain a problem in classical regression model fitting. Spatial regression models have a peculiar kind of outlier because outliers in such models are local in nature. Spatial regression models are also not free from the effect of influential observations. Researchers have adapted some classical regression techniques to spatial models and obtained satisfactory results. However, masking and/or swamping remains a stumbling block for such methods. In this article, we obtain a measure of spatial Studentized prediction residuals that incorporates spatial information on the dependent variable and the residuals. We propose a robust spatial diagnostic plot to classify observations into regular observations, vertical outliers, and good and bad leverage points, using a classification based on spatial Studentized prediction residuals and spatial diagnostic potentials, which we refer to as ISRs-Posi and ESRs-Posi. Observations that fall into the vertical outlier and bad leverage point categories are referred to as IOs. Representations of some classical regression diagnostic measures in general spatial models are presented. The commonly used diagnostic measure in spatial diagnostics, the Cook’s distance, is compared to some robust methods, Hi2 (using robust and non-robust measures), and our proposed ISRs-Posi and ESRs-Posi plots. Results of our simulation study and applications to real data showed that the Cook’s distance, non-robust Hsi12 and robust Hsi22 were not very successful in detecting IOs. The Hsi12 suffered from the masking effect, and the robust Hsi22 suffered from swamping in general spatial models. Interestingly, the results showed that the proposed plot, followed by the plot, was very successful in classifying observations into the correct groups, hence correctly detecting the real IOs.
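The four-way classification described in the abstract can be illustrated with its classical, non-spatial analogue. The sketch below is not the authors' ISRs-Posi or ESRs-Posi measure: it uses ordinary least squares, externally Studentized residuals and hat-matrix leverages, and the function name `classify_observations` and both cutoffs (2.5 for residuals, 2p/n for leverage) are assumptions chosen for illustration.

```python
import numpy as np

def classify_observations(X, y, resid_cut=2.5):
    """Classify points using classical (non-spatial) OLS diagnostics.

    Illustrative only: the cutoffs are assumed rules of thumb, not the
    robust spatial thresholds proposed in the paper.
    """
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])        # design matrix with intercept
    p = Xd.shape[1]

    # Leverages h_ii are the squared row norms of Q from a thin QR of Xd.
    Q, _ = np.linalg.qr(Xd)
    h = np.sum(Q**2, axis=1)

    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    e = y - Xd @ beta
    s2 = e @ e / (n - p)

    r = e / np.sqrt(s2 * (1.0 - h))              # internally Studentized residuals
    t = r * np.sqrt((n - p - 1) / (n - p - r**2))  # externally Studentized residuals
    cook = r**2 * h / (p * (1.0 - h))            # Cook's distance, for comparison

    high_lev = h > 2.0 * p / n                   # assumed leverage cutoff
    outlying = np.abs(t) > resid_cut             # assumed residual cutoff

    labels = np.where(high_lev & outlying, "bad leverage",
             np.where(outlying, "vertical outlier",
             np.where(high_lev, "good leverage", "regular")))
    return labels, t, h, cook
```

Because these classical quantities are not robust, several outliers in the same data set can inflate the residual variance and hide one another, which is the masking effect the abstract attributes to the non-robust measures.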

Author(s):  
Ali Mohammed Baba ◽  
Habshah Midi ◽  
Mohd Bakri Adam ◽  
Nur Haizum Bint Abd Rahman

Influential observations, which are outliers in the x direction, y direction or both, remain a problem in classical regression model fitting. Spatial regression models, whose outliers have a peculiar, local nature, are not free from the effect of such influential observations. Researchers have adapted some classical regression techniques to spatial models and obtained satisfactory results. However, masking and/or swamping remains a stumbling block for such methods. We obtain the spatial representation of the classical regression diagnostic measures in the general spatial model. The commonly used diagnostic measure in spatial diagnostics, the Cook's distance, is compared to some robust methods, Hi2 (using robust and non-robust measures), and to a classification based on generalized residuals and diagnostic generalized potentials, ISRs-Posi and ESRs-Posi, built with the help of the obtained spatial prediction residuals and the spatial leverage term. Results of simulation and applications to real data show the advantage of ISRs-Posi and ESRs-Posi, which classify outliers, over the Cook's distance and non-robust Hsi12, which suffer from masking, and robust Hsi22, which suffers from swamping in the general spatial model.


Author(s):  
Zisis Mallios

Hedonic pricing is an indirect valuation method that applies to heterogeneous goods and investigates the relationship between the prices of tradable goods and their attributes. It can be used to measure the value of irrigation water by estimating a model that describes the relation between the market value of land parcels and their characteristics. Because many of the land parcels included in a hedonic pricing model are spatial in nature, conventional regression analysis fails to incorporate all the available information. Spatial regression models can achieve more efficient estimates because they are designed to deal with the spatial dependence of the data. In this paper, the authors present the results of an application of the hedonic pricing method to irrigation water valuation, obtained using a software tool developed for the ArcGIS environment. This tool incorporates, in the GIS application, the estimation of two different spatial regression models, the spatial lag model and the spatial error model. It also offers different specifications of the spatial weights matrix, giving the researcher the opportunity to examine how the specification affects the overall performance of the model.
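To make the role of the spatial weights matrix concrete, here is a minimal sketch of one common specification, a row-standardized k-nearest-neighbour matrix W, together with the spatial lag Wy that enters the spatial lag model. It is not the ArcGIS tool described above: the function names `knn_weights` and `spatial_lag`, the coordinate array and the choice of k are all assumptions for illustration.

```python
import numpy as np

def knn_weights(coords, k=4):
    """Row-standardized k-nearest-neighbour spatial weights matrix.

    coords: (n, 2) array of parcel centroids (hypothetical input).
    """
    n = coords.shape[0]
    # Pairwise Euclidean distances between all parcels.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                  # a parcel is not its own neighbour
    nearest = np.argsort(d, axis=1)[:, :k]       # indices of the k closest parcels
    W = np.zeros((n, n))
    W[np.arange(n)[:, None], nearest] = 1.0
    return W / W.sum(axis=1, keepdims=True)      # each row sums to 1

def spatial_lag(W, y):
    """Weighted average of the neighbours' values for each observation."""
    return W @ y
```

Changing `k`, or swapping the neighbour rule for a distance band or contiguity rule, gives a different W, which is the kind of specification choice the abstract says the researcher can vary to see how model performance responds.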


2020 ◽  
Vol 9 (10) ◽  
pp. 577
Author(s):  
Daisuke Murakami ◽  
Mami Kajita ◽  
Seiji Kajita

A rapid growth in spatial open datasets has led to a huge demand for regression approaches accommodating spatial and non-spatial effects in big data. Regression model selection is particularly important to stably estimate flexible regression models. However, conventional methods can be slow for large samples. Hence, we develop a fast and practical model-selection approach for spatial regression models, focusing on the selection of coefficient types that include constant, spatially varying, and non-spatially varying coefficients. A pre-processing approach, which replaces data matrices with small inner products through dimension reduction, dramatically accelerates the computation speed of model selection. Numerical experiments show that our approach selects a model accurately and computationally efficiently, highlighting the importance of model selection in the spatial regression context. Then, the present approach is applied to open data to investigate local factors affecting crime in Japan. The results suggest that our approach is useful not only for selecting factors influencing crime risk but also for predicting crime events. This scalable model selection will be key to appropriately specifying flexible and large-scale spatial regression models in the era of big data. The developed model selection approach was implemented in the R package spmoran.
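The idea of replacing data matrices with small inner products can be sketched for the simplest case, subset selection in an ordinary linear model. This is an assumption-laden illustration, not the paper's coefficient-type selection and not the spmoran API: the function names, the use of BIC and the exhaustive search are all choices made here. The point is that X'X, X'y and y'y are computed once from the n-row data, after which every candidate model is scored from p-by-p blocks only.

```python
import numpy as np
from itertools import combinations

def precompute(X, y):
    """One pass over the n-row data; everything afterwards is p-sized."""
    return X.T @ X, X.T @ y, float(y @ y), X.shape[0]

def bic_from_products(XtX, Xty, yty, n, cols):
    """BIC of the least-squares fit on columns `cols`, from inner products only."""
    idx = np.array(cols)
    A = XtX[np.ix_(idx, idx)]                    # sub-block of X'X
    b = Xty[idx]                                 # sub-vector of X'y
    beta = np.linalg.solve(A, b)
    rss = yty - b @ beta                         # RSS = y'y - b' beta at the LS solution
    return n * np.log(rss / n) + len(cols) * np.log(n)

def select_model(X, y):
    """Exhaustive search over non-empty column subsets, minimizing BIC."""
    XtX, Xty, yty, n = precompute(X, y)
    p = X.shape[1]
    candidates = [c for r in range(1, p + 1)
                  for c in combinations(range(p), r)]
    return min(candidates,
               key=lambda c: bic_from_products(XtX, Xty, yty, n, c))
```

After `precompute`, the cost of scoring a candidate no longer depends on the sample size n, which is why this kind of pre-processing helps with large samples. The exhaustive search is only practical for a handful of columns.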

