mgwr: A Python implementation of multiscale geographically weighted regression for investigating process spatial heterogeneity and scale

Author(s):  
Taylor Oshan ◽  
Ziqi Li ◽  
Wei Kang ◽  
Levi John Wolf ◽  
Alexander Stewart Fotheringham

Geographically weighted regression (GWR) is a spatial statistical technique that recognizes that traditional 'global' regression models may be limited when spatial processes vary with spatial context. GWR captures process spatial heterogeneity via an operationalization of Tobler's first law of geography: "everything is related to everything else, but near things are more related than distant things" (1970). An ensemble of local linear models is calibrated at any number of locations by 'borrowing' nearby data. The result is a surface of location-specific parameter estimates for each relationship in the model that may vary spatially, as well as a single bandwidth parameter that provides intuition about the geographic scale of the processes. A recent extension to this framework allows each relationship to vary according to a distinct spatial scale parameter, and is therefore known as multiscale (M)GWR. This paper introduces mgwr, a Python-based implementation for efficiently calibrating a variety of (M)GWR models and a selection of associated diagnostics. It reviews some core concepts, introduces the primary software functionality, and demonstrates suggested usage on several example datasets.
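
The idea of calibrating local linear models by 'borrowing' nearby data can be illustrated with a minimal pure-Python sketch. This is not the mgwr API: the function names, the Gaussian kernel, the fixed bandwidth, and the toy data are all assumptions made for illustration, and the model is restricted to one covariate plus an intercept so that the weighted normal equations can be solved in closed form.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: nearby observations receive weights close to 1."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(coords, x, y, site, bandwidth):
    """Weighted least squares for y = b0 + b1 * x, centred on `site`."""
    # Accumulate the weighted sums that make up the 2x2 normal equations.
    sw = swx = swy = swxx = swxy = 0.0
    for (u, v), xi, yi in zip(coords, x, y):
        w = gaussian_weight(math.hypot(u - site[0], v - site[1]), bandwidth)
        sw += w
        swx += w * xi
        swy += w * yi
        swxx += w * xi * xi
        swxy += w * xi * yi
    det = sw * swxx - swx * swx
    slope = (sw * swxy - swx * swy) / det
    intercept = (swy - slope * swx) / sw
    return intercept, slope

def gwr(coords, x, y, bandwidth):
    """Fit one local model per observation location."""
    return [local_fit(coords, x, y, site, bandwidth) for site in coords]

# Hypothetical toy data: 30 sites on a line whose true slope drifts
# from 1.0 at the first site to 3.9 at the last.
coords = [(float(i), 0.0) for i in range(30)]
x = [float(i % 3 + 1) for i in range(30)]
y = [(1.0 + 0.1 * i) * xi for i, xi in enumerate(x)]

estimates = gwr(coords, x, y, bandwidth=3.0)
print(estimates[0][1], estimates[-1][1])  # local slopes at the two ends
```

The list of (intercept, slope) pairs is the one-dimensional analogue of the parameter surface described above: the local slope rises from one end of the line to the other, whereas a single global regression would report only one slope.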

2019 ◽  
Vol 8 (6) ◽  
pp. 269 ◽  
Author(s):  
Taylor Oshan ◽  
Ziqi Li ◽  
Wei Kang ◽  
Levi Wolf ◽  
A. Fotheringham

Geographically weighted regression (GWR) is a spatial statistical technique that recognizes that traditional ‘global’ regression models may be limited when spatial processes vary with spatial context. GWR captures process spatial heterogeneity by allowing effects to vary over space. To do this, GWR calibrates an ensemble of local linear models at any number of locations using ‘borrowed’ nearby data. This provides a surface of location-specific parameter estimates for each relationship in the model that is allowed to vary spatially, as well as a single bandwidth parameter that provides intuition about the geographic scale of the processes. A recent extension to this framework allows each relationship to vary according to a distinct spatial scale parameter, and is therefore known as multiscale (M)GWR. This paper introduces mgwr, a Python-based implementation of MGWR that explicitly focuses on the multiscale analysis of spatial heterogeneity. It provides novel functionality for inference and exploratory analysis of local spatial processes, new diagnostics unique to multi-scale local models, and drastic improvements to efficiency in estimation routines. We provide two case studies using mgwr, in addition to reviewing core concepts of local models. We present this in a literate programming style, providing an overview of the primary software functionality and demonstrations of suggested usage alongside the discussion of primary concepts and demonstration of the improvements made in mgwr.


Energies ◽  
2021 ◽  
Vol 14 (11) ◽  
pp. 3232
Author(s):  
Feili Wei ◽  
Shuang Li ◽  
Ze Liang ◽  
Aiqiong Huang ◽  
Zheng Wang ◽  
...  

Deteriorating air quality is one of the most important environmental factors posing significant health risks to urban dwellers. Therefore, an exploration of the factors influencing air pollution and the formulation of targeted policies to address this issue are critically needed. Although many studies have used semi-parametric geographically weighted regression and geographically weighted regression to study the spatial heterogeneity characteristics of influencing factors of PM2.5 concentration change, due to the fixed bandwidth of these methods and other reasons, those studies still lack the ability to describe and explain cross-scale dynamics. The multi-scale geographically weighted regression (MGWR) method allows different variables to have different bandwidths, which can produce more realistic and useful spatial process models. By applying the MGWR method, this study investigated the spatial heterogeneity and spatial scales of impact of factors influencing PM2.5 concentrations in major Chinese cities during the period 2005–2015. This study showed the following: (1) Factors influencing changes in PM2.5 concentrations, such as technology, foreign investment levels, wind speed, precipitation, and Normalized Difference Vegetation Index (NDVI), evidenced significant spatial heterogeneity. Of these factors, precipitation, NDVI, and wind speed had small-scale regional effects, whose bandwidth ratios are all less than 20%, while foreign investment levels and technologies had medium-scale regional effects, whose bandwidth levels are 23% and 32%, respectively. Population, urbanization rates, and industrial structure demonstrated weak spatial heterogeneity, and the scale of their influence was predominantly global. (2) Overall, the change of NDVI was the most influential factor, which can explain 15.3% of the PM2.5 concentration change. Therefore, an enhanced protection of urban surface vegetation would be of universal significance. 
In some typical areas, dominant factors influencing pollution were evidently heterogeneous. Change in wind speed is a major factor that can explain 51.6% of the change in PM2.5 concentration in cities in the Central Plains, and change in foreign investment levels is the dominant influencing factor in cities in the Yunnan-Guizhou Plateau and the Sichuan Basin, explaining 30.6% and 44.2% of the PM2.5 concentration change, respectively. In cities located within the lower reaches of the Yangtze River, NDVI is a key factor, reducing PM2.5 concentrations by 9.7%. Those results can facilitate the development of region-specific measures and tailored urban policies to reduce PM2.5 pollution levels in different regions such as Northeast China and the Sichuan Basin.


2019 ◽  
Vol 33 (1) ◽  
pp. 155-175 ◽  
Author(s):  
Li ◽  
Fotheringham ◽  
Li ◽  
Oshan

Geographically Weighted Regression (GWR) is a widely used tool for exploring spatial heterogeneity of processes over geographic space. GWR computes location-specific parameter estimates, which makes its calibration process computationally intensive. The maximum number of data points that can be handled by current open-source GWR software is approximately 15,000 observations on a standard desktop. In the era of big data, this places a severe limitation on the use of GWR. To overcome this limitation, we propose a highly scalable, open-source FastGWR implementation based on Python and the Message Passing Interface (MPI) that scales to the order of millions of observations. FastGWR optimizes memory usage along with parallelization to boost performance significantly. To illustrate the performance of FastGWR, a hedonic house price model is calibrated on approximately 1.3 million single-family residential properties from a Zillow dataset for the city of Los Angeles, which is the first effort to apply GWR to a dataset of this size. The results show that FastGWR scales linearly as the number of cores within the High-Performance Computing (HPC) environment increases. It also outperforms currently available open-source GWR software packages on a standard desktop, running up to thousands of times faster.


2019 ◽  
Vol 8 (1) ◽  
pp. 27
Author(s):  
MOCH. ANJAS A ◽  
I KOMANG GDE SUKARSA ◽  
I PUTU EKA NILA KENCANA

Geographically weighted regression (GWR) is a method for analysing data that exhibit spatial heterogeneity. One problem in which spatial heterogeneity must be considered is pneumonia, a disease that causes death among infants and toddlers. East Java is one of the provinces with the largest number of pneumonia cases. The purpose of this research is to model pneumonia in East Java using the GWR method. The results identify the dominant and statistically significant factors for pneumonia in East Java: households practicing PHBS and measles immunization.


2020 ◽  
Vol 9 (6) ◽  
pp. 380
Author(s):  
Radosław Cellmer ◽  
Aneta Cichulska ◽  
Mirosław Bełej

The main part of the study demonstrates that models which take spatial heterogeneity into account (Geographically Weighted Regression and Mixed Geographically Weighted Regression) reflect housing market relationships better than conventional regression models. The spatial heterogeneity of housing market determinants results in spatial diversity of market activity, as well as of real estate prices and values. The main aim of the study was to analyse the effect of socio-demographic and environmental factors on average housing property prices and on the number of transactions in a spatial approach. In previous research conducted on a national scale, all variables were usually treated in the same way, i.e., as either global or local variables. The research also attempted to answer the question of which of the variables adopted for analysis have a local impact on prices and market activity, and which are global. The study was conducted in Poland and used data from the year 2018 on 380 counties (Local Administrative Units). The study showed that the determinants of both average prices and housing market activity exhibit spatial autocorrelation with high–high and low–low cluster groups. Owing to these models, it was possible to draw specific conclusions on local determinants of flat prices and market activity in Poland. The study findings confirm that these models are an extremely effective tool for spatial data analysis.


2017 ◽  
Vol 21 (1) ◽  
pp. 165
Author(s):  
Jitendra Parajuli ◽  
Kingsley Haynes

Purpose: This paper examines the spatial heterogeneity associated with broadband Internet and new firm formation in a number of U.S. states.
Methodology/Approach: Both ordinary least-squares regression and Geographically Weighted Regression are used for estimation.
Findings: The global coefficient estimates of ordinary least-squares regression account for the marginal change in a phenomenon, but such a global measure cannot reveal locally varying dynamics. Using Geographically Weighted Regression, it was found that at the aggregate and economic sector levels, the association between single-unit firm births and the provision of broadband Internet varies across counties in Florida and Ohio.
Originality/Value of paper: There are numerous studies on broadband Internet in the U.S., but this is the first that explicitly examines broadband provision and new firm formation by taking into account spatial heterogeneity across counties.


2019 ◽  
Author(s):  
Ziqi Li ◽  
Alexander Stewart Fotheringham ◽  
Taylor M. Oshan ◽  
Levi John Wolf

Bandwidth, a key parameter in geographically weighted regression models, is closely related to the spatial scale at which the underlying spatially heterogeneous processes being examined take place. Generally, a single optimal bandwidth (geographically weighted regression) or a set of covariate-specific optimal bandwidths (multiscale geographically weighted regression) is chosen based on some criterion such as the Akaike Information Criterion (AIC), and then parameter estimation and inference are conditional on the choice of this bandwidth. In this paper, we find that bandwidth selection is subject to uncertainty in both single-scale and multi-scale geographically weighted regression models and demonstrate that this uncertainty can be measured and accounted for. Based on simulation studies and an empirical example of obesity rates in Phoenix, we show that bandwidth uncertainties can be quantitatively measured by Akaike weights, and confidence intervals for bandwidths can be obtained. Understanding bandwidth uncertainty offers important insights about the scales over which different processes operate, especially when comparing covariate-specific bandwidths. Additionally, unconditional parameter estimates that account for bandwidth selection uncertainty can be computed based on Akaike weights.
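
The role of Akaike weights can be sketched in a few lines of Python. The weight formula is the standard one (each candidate's weight is proportional to exp(-Δ/2), where Δ is its criterion value minus the best value). Everything else here is an assumption for illustration: the candidate bandwidths, their AICc scores, and the local slope estimates are invented, and the interval rule (accumulate candidates in order of decreasing weight until 95% is reached) is one plausible reading of how a bandwidth confidence interval could be formed, not necessarily the paper's exact procedure.

```python
import math

def akaike_weights(criterion_values):
    """Convert AIC/AICc scores into normalized Akaike weights."""
    best = min(criterion_values)
    relative = [math.exp(-0.5 * (value - best)) for value in criterion_values]
    total = sum(relative)
    return [r / total for r in relative]

def bandwidth_interval(bandwidths, weights, level=0.95):
    """Range spanned by the highest-weight candidates covering `level`."""
    ranked = sorted(zip(weights, bandwidths), reverse=True)
    kept, cumulative = [], 0.0
    for weight, bandwidth in ranked:
        kept.append(bandwidth)
        cumulative += weight
        if cumulative >= level:
            break
    return min(kept), max(kept)

def model_average(estimates, weights):
    """Weight-averaged estimate across all candidate bandwidths."""
    return sum(w * b for w, b in zip(weights, estimates))

# Hypothetical candidates: bandwidths, AICc scores, one local slope each.
bandwidths = [50, 100, 150, 200]
aicc = [212.0, 204.0, 203.0, 210.0]
slopes = [0.8, 1.0, 1.1, 1.3]

weights = akaike_weights(aicc)
print(bandwidth_interval(bandwidths, weights))  # (100, 150)
print(model_average(slopes, weights))
```

In this toy case the two best-scoring bandwidths carry about 97% of the total weight, so the interval spans 100 to 150, and the averaged slope is no longer conditional on a single selected bandwidth.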


2018 ◽  
Author(s):  
Hanchen Yu ◽  
Stewart Fotheringham ◽  
Ziqi Li ◽  
Taylor Oshan ◽  
Wei Kang ◽  
...  

A recent paper (Fotheringham et al. 2017) expands the well-known Geographically Weighted Regression (GWR) framework significantly by allowing the bandwidth or smoothing factor in GWR to be derived separately for each covariate in the model – a framework referred to as Multiscale GWR (MGWR). However, one limitation of the MGWR framework is that, until now, no inference about the local parameter estimates was possible. Formally, the so-called “hat matrix,” which projects the observed response vector into the predicted response vector, was available in GWR but not in MGWR. This paper addresses this limitation by reframing GWR as a Generalized Additive Model (GAM), extending this framework to MGWR and then deriving standard errors for the local parameters in MGWR. In addition, we also demonstrate how the effective number of parameters (ENP) can be obtained for the overall fit of an MGWR model and for each of the covariates within the model. This statistic is essential for comparing model fit between MGWR, GWR, and traditional global models, as well as adjusting for multiple hypothesis tests. We demonstrate these advances to the MGWR framework with both a simulated data set and a real-world data set.
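
The quantities named in this abstract can be written compactly. The GWR expressions below are the standard ones. The additive MGWR decomposition into covariate-specific matrices R_j is given as a sketch of the GAM-style framing described above, and the symbol names (S, W_i, R_j, f_j) are notational assumptions rather than notation taken from the abstract.

```latex
% GWR: the hat matrix S maps observed responses to fitted values,
% and its i-th row is built from the local weight matrix W_i.
\hat{y} = S\,y, \qquad
S_{i\cdot} = x_i^{\top}\left(X^{\top} W_i X\right)^{-1} X^{\top} W_i, \qquad
\mathrm{ENP} = \operatorname{tr}(S)

% MGWR framed as an additive model: each covariate j contributes a
% component \hat{f}_j with its own projection matrix R_j.
\hat{y} = \sum_{j} \hat{f}_j, \qquad
\hat{f}_j = R_j\,y, \qquad
S = \sum_{j} R_j, \qquad
\mathrm{ENP}_j = \operatorname{tr}(R_j)
```

Under this framing the covariate-specific values ENP_j sum to the overall ENP, which is what allows model fit to be compared across MGWR, GWR, and global models.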

