Spatially-encouraged spectral clustering: a technique for blending map typologies and regionalization

2018 ◽  
Author(s):  
Levi John Wolf

Clustering is a central concern in geographic data science and reflects a large, ongoing domain of research. In applied problems, it is often challenging to balance the two notions of coherence in spatial clustering problems: that of "feature" coherence, where detected clusters are internally homogeneous, and "spatial" coherence, where detected clusters can be interpreted to represent a geographical place. While recent work has aimed to relax this tension, progress in spectral clustering methods, developed for machine learning and image segmentation, provides a useful framework for doing so. This paper shows how spatial and feature coherence can be balanced using kernel combination in spectral clustering. This ensures the preservation of geographical constraints (like contiguity or compactness) while also providing the ability to relax these constraints linearly. Further, some kinds of kernel combination methods have significantly different behavior and meaning from another commonly used method to balance objectives: convex combination. Altogether, spatially-encouraged spectral clustering is proposed as a novel spatial analysis method that bridges regionalization and spatial clustering.
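The contrast between kernel combination and convex combination can be illustrated with a small sketch. It is not the paper's implementation: the toy contiguity matrix, the feature values, the RBF kernel with `gamma=1.0`, the elementwise product as the form of kernel combination, and the helper names `rbf_kernel`, `hadamard`, and `convex` are all assumptions made for illustration.

```python
import math

def rbf_kernel(features, gamma=1.0):
    """Gaussian (RBF) affinity between every pair of feature vectors."""
    n = len(features)
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2
                                  for a, b in zip(features[i], features[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]

def hadamard(A, B):
    """Elementwise product of two equally sized square matrices."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def convex(A, B, delta):
    """Convex combination delta * A + (1 - delta) * B."""
    return [
        [delta * a + (1.0 - delta) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(A, B)
    ]

# Hypothetical toy data: four areal units arranged in a chain 0 - 1 - 2 - 3.
# The spatial kernel is a binary contiguity matrix (1 = neighbours or self).
contiguity = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
# Units 0 and 3 have identical attributes but are not contiguous.
features = [[0.0], [0.1], [5.0], [0.0]]

feature_affinity = rbf_kernel(features, gamma=1.0)
product_affinity = hadamard(contiguity, feature_affinity)
mixed_affinity = convex(contiguity, feature_affinity, delta=0.5)

print(feature_affinity[0][3], product_affinity[0][3], mixed_affinity[0][3])
```

In this toy case the product sets the affinity between the non-contiguous units 0 and 3 to zero even though their attributes match, whereas the convex combination leaves a positive affinity between them. Under these assumptions, that is one way the two approaches can differ in behavior.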

Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 596
Author(s):  
Krishna Kumar Sharma ◽  
Ayan Seal ◽  
Enrique Herrera-Viedma ◽  
Ondrej Krejcar

Calculating and monitoring customer churn metrics is important for companies to retain customers and earn more profit in business. In this study, a churn prediction framework is developed using modified spectral clustering (SC). The similarity measure plays a crucial role in clustering when predicting churn accurately from industrial data. The linear Euclidean distance in the traditional SC is replaced by the non-linear S-distance (Sd). The Sd is derived from the concept of S-divergence (SD). Several characteristics of Sd are discussed in this work. Experiments are conducted to validate the proposed clustering algorithm on four synthetic databases, eight UCI databases, two industrial databases and one telecommunications database related to customer churn. Three existing clustering algorithms (k-means, density-based spatial clustering of applications with noise, and conventional SC) are also implemented on the above-mentioned 15 databases. The empirical outcomes show that the proposed clustering algorithm outperforms the three existing clustering algorithms in terms of its Jaccard index, f-score, recall, precision and accuracy. Finally, we also test the significance of the clustering results using Wilcoxon’s signed-rank test, Wilcoxon’s rank-sum test, and sign tests. The comparative study shows that the outcomes of the proposed algorithm are interesting, especially in the case of clusters of arbitrary shape.


2021 ◽  
Vol 10 (3) ◽  
pp. 161
Author(s):  
Hao-xuan Chen ◽  
Fei Tao ◽  
Pei-long Ma ◽  
Li-na Gao ◽  
Tong Zhou

Spatial analysis is an important means of mining floating car trajectory information, and clustering and density analysis are among its common methods. The choice of method affects the accuracy and time efficiency of the analysis results, so clarifying the principles and characteristics of each method is a prerequisite for problem solving. This study takes four representative spatial analysis methods as examples: KMeans, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Clustering by Fast Search and Find of Density Peaks (CFSFDP), and Kernel Density Estimation (KDE). It applies them to the spatiotemporal hotspot mining problem for taxi trajectories. Quantitative analysis and experimental verification show that the DBSCAN and KDE algorithms have strong hotspot discovery capabilities, but the shape of the heat regions produced by DBSCAN is relatively more robust. DBSCAN and CFSFDP can achieve high spatial accuracy in calculating the entrance and exit position of a Point of Interest (POI). KDE and DBSCAN are more suitable for the classification of heat index. When the dataset scale is similar, KMeans has the highest operating efficiency, while CFSFDP and KDE are inferior. This paper resolves to a certain extent the lack of scientific basis for selecting spatial analysis methods in current research. The conclusions drawn in this paper can provide technical support and act as a reference for the selection of methods to solve the taxi trajectory mining problem.


2020 ◽  
Vol 77 (8) ◽  
pp. 1409-1420
Author(s):  
Robyn E. Forrest ◽  
Ian J. Stewart ◽  
Cole C. Monnahan ◽  
Katherine H. Bannar-Martin ◽  
Lisa C. Lacko

The British Columbia longline fishery for Pacific halibut (Hippoglossus stenolepis) has experienced important recent management changes, including the introduction of comprehensive electronic catch monitoring on all vessels; an integrated transferable quota system; a reduction in Pacific halibut quotas; and, beginning in 2016, sharp decreases in quota for yelloweye rockfish (Sebastes ruberrimus, an incidentally caught species). We describe this fishery before integration, after integration, and after the yelloweye rockfish quota reduction using spatial clustering methods to define discrete fishing opportunities. We calculate the relative utilization of these fishing opportunities and their overlap with areas with high encounter rates of yelloweye rockfish during each of the three periods. The spatial footprint (area fished) increased before integration, then decreased after integration. Each period showed shifts in utilization among four large fishing areas. Immediately after the reductions in yelloweye rockfish quota, fishing opportunities with high encounter rates of yelloweye rockfish had significantly lower utilization than areas with low encounter rates, implying rapid avoidance behaviour.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yonghua Tang ◽  
Qiang Fan ◽  
Peng Liu

The traditional teaching model cannot adapt to the teaching needs of the era of smart teaching. Based on this, this paper combines data mining technology to carry out teaching reforms, constructs a computer-aided system based on data mining, and constructs teaching system functions based on actual conditions. The constructed system can carry out multisubject teaching. Moreover, this paper uses a data mining system to mine teaching resources and uses spectral clustering methods to integrate multiple teaching resources to improve the practicability of data mining algorithms. In addition, this paper combines digital technology to deal with teaching resources. Finally, after building the system, this paper designs experiments to verify the performance of the system. From the research results, it can be seen that the system constructed in this paper has certain teaching and practical effects, and it can be applied to a larger teaching scope in subsequent research.


Data clustering is an active topic of research as it has applications in various fields such as biology, management, statistics, pattern recognition, etc. Spectral Clustering (SC) has gained popularity in recent times due to its ability to handle complex data and ease of implementation. A crucial step in spectral clustering is the construction of the affinity matrix, which is based on a pairwise similarity measure. The varied characteristics of datasets affect the performance of a spectral clustering technique. In this paper, we have proposed an affinity measure based on Topological Node Features (TNFs) viz., Clustering Coefficient (CC) and Summation index (SI) to define the notion of density and local structure. It has been shown that these features improve the performance of SC in clustering the data. The experiments were conducted on synthetic datasets, UCI datasets, and the MNIST handwritten datasets. The results show that the proposed affinity metric outperforms several recent spectral clustering methods in terms of accuracy.
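To make the affinity-matrix step concrete, here is a minimal sketch of the conventional baseline. It is not the TNF-based measure proposed in the abstract. It builds a Gaussian affinity matrix from pairwise squared Euclidean distances and then the symmetric normalized graph Laplacian. The toy points, the bandwidth `sigma=1.0`, and the helper names `gaussian_affinity` and `normalized_laplacian` are assumptions made for illustration.

```python
import math

def gaussian_affinity(points, sigma=1.0):
    """Symmetric affinity matrix W with W[i][j] = exp(-||xi - xj||^2 / (2 sigma^2)).

    The diagonal is left at zero, a common convention for spectral clustering.
    """
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W

def normalized_laplacian(W):
    """Symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}."""
    n = len(W)
    degree = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    return [
        [
            (1.0 if i == j else 0.0) - inv_sqrt[i] * W[i][j] * inv_sqrt[j]
            for j in range(n)
        ]
        for i in range(n)
    ]

# Hypothetical toy data: two well-separated pairs of points in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
W = gaussian_affinity(points, sigma=1.0)
L = normalized_laplacian(W)

print(W[0][1], W[0][2])
```

A spectral clustering pipeline would then typically embed the data using the eigenvectors of `L` that belong to its smallest eigenvalues and run k-means on that embedding. Changing the similarity measure, as the abstract proposes, only changes how `W` is built.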


Circulation ◽  
2013 ◽  
Vol 127 (suppl_12) ◽  
Author(s):  
Kosuke Tamura ◽  
Robin C Puett ◽  
Jaime E Hart ◽  
Heather A Starnes ◽  
Francine Laden ◽  
...  

Introduction: Spatial clustering methods have been applied to cancer for over a decade. These methods have been used in studies on physical activity (PA) and obesity. One recent study examined differences in built environment attributes inside and outside PA clusters. We tested two hypotheses: 1) PA and obesity would spatially cluster in older women; and 2) built environment attributes typically related to higher walkability would be found in high PA clusters, while attributes related to lower walkability would appear in high obesity clusters. Methods: We used data from 22,589 Nurses’ Health Study participants (mean age = 69.9 ± 6.8y) in California, Massachusetts, and Pennsylvania. Two outcomes were examined: meeting PA guidelines via self-reported walking (≥ 500 MET-min/week) and obesity (BMI ≥ 30.0). Objective built environment variables were created: population and intersection density, diversity of facilities, and facility density. We used a spatial scan statistic to detect clusters (i.e., areas with high or low rates) of the two outcomes. Built environment attributes were compared inside and outside clusters. Results: Six spatial clusters of PA were found in California and Massachusetts. Two obesity clusters were found in Pennsylvania. Overall there were significant differences (p<0.05) in population and intersection density, and diversity and density of facilities inside and outside clusters. In some cases, built environment attributes related to higher walkability appeared in high PA clusters, while in other PA clusters we did not find this pattern. Differences in built environment attributes inside and outside obesity clusters showed inconsistent patterns. Conclusion: Although PA and obesity clusters emerged, the comparison of built environment attributes inside and outside clusters revealed a complex picture not fully consistent with existing literature. Further examination of PA and obesity clusters in older adults should include other built environment factors that may be related to these outcomes.


2015 ◽  
Vol 76 (1) ◽  
Author(s):  
Ang Jun Chin ◽  
Andri Mirzal ◽  
Habibollah Haron

Gene expression profiling is well known for its broad applications and achievements in disease discovery and analysis, especially in cancer research. Spectral clustering is robust to irrelevant features, which makes it appropriate for gene expression analysis. However, previous works offer only limited performance comparisons with other clustering methods, and only a few microarray data sets were analyzed in each study. In this study, we demonstrate the use of spectral clustering in identifying cancer types or subtypes from microarray gene expression profiling. Spectral clustering was applied to eleven microarray data sets and its clustering performance was compared with the results in the literature. Overall, spectral clustering slightly outperformed the corresponding results in the literature. It also offered more stable clustering performance, as indicated by its smaller standard deviation. Moreover, spectral clustering outperformed the corresponding methods in the literature on six of the eleven data sets. It can therefore be stated that spectral clustering is a promising method for identifying cancer types or subtypes from microarray gene expression data sets.


GeoJournal ◽  
2020 ◽  
Author(s):  
Lília Aparecida Marques da Silva ◽  
José Ueleres Braga ◽  
João Pereira da Silva ◽  
Maria do Socorro Pires e Cruz ◽  
André Luiz Sá de Oliveira ◽  
...  
