Based on GIS Spatial Clustering Algorithm Research

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.971-973.1565 ◽

2014 ◽

Vol 971-973 ◽

pp. 1565-1568

Author(s):

Zhi Yong Wang

Keyword(s):

Objective Function ◽

Spatial Data ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Direct Access ◽

Cluster Center ◽

Clustering Methods ◽

Concept Clustering ◽

Spatial Data Management ◽

Calculated Distance

Given the particular nature of spatial clustering and the limitations of current clustering methods, this paper starts from the concept of the clustering objective function and, with GIS spatial data management and spatial analysis as technical support, explores how to calculate the directly reachable distance and the indirectly reachable cost between spatial samples. K samples are randomly selected as cluster centers, and each sample is assigned to a cluster according to its reachable distance to each center. The sum of the costs for the samples to reach their cluster centers serves as the clustering objective function. A genetic algorithm is then introduced, yielding a GIS-based spatial clustering algorithm. Finally, the algorithm is tested on examples.
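The assignment step and the objective function described in this abstract can be illustrated with a minimal sketch. It assumes plain Euclidean distance as a stand-in for the GIS reach cost, and the function name `assign_and_cost` and the sample coordinates are hypothetical. The genetic algorithm search mentioned in the abstract is not shown.

```python
import math
import random

def assign_and_cost(samples, centers):
    """Assign each sample to its cheapest center; return (labels, total cost)."""
    labels, total = [], 0.0
    for s in samples:
        # Assumption: Euclidean distance stands in for the GIS reach cost.
        costs = [math.dist(s, c) for c in centers]
        best = min(range(len(centers)), key=costs.__getitem__)
        labels.append(best)
        total += costs[best]
    return labels, total

# Hypothetical sample coordinates.
samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
rng = random.Random(0)
centers = rng.sample(samples, 2)  # K = 2 samples chosen at random as centers
labels, cost = assign_and_cost(samples, centers)
```

A search procedure such as a genetic algorithm would then try different sets of centers and keep the set with the lowest total cost.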

A method for efficient clustering of spatial data in network space

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202806 ◽

2021 ◽

pp. 1-18

Author(s):

Trang T.D. Nguyen ◽

Loan T.T. Nguyen ◽

Anh Nguyen ◽

Unil Yun ◽

Bay Vo

Keyword(s):

Spatial Data ◽

Euclidean Distance ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Spatial Data Mining ◽

Spatial Data Analysis ◽

Clustering Methods ◽

Dbscan Algorithm ◽

Density Based Clustering ◽

Network Space

Spatial clustering is one of the main techniques for spatial data mining and spatial data analysis. However, existing spatial clustering methods primarily focus on points distributed in planar space with the Euclidean distance measurement. Recently, NS-DBSCAN has been developed to perform clustering of spatial point events in Network Space based on a well-known clustering algorithm, named Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The NS-DBSCAN algorithm has efficiently solved the problem of clustering network constrained spatial points. When compared to the NC_DT (Network-Constraint Delaunay Triangulation) clustering algorithm, the NS-DBSCAN algorithm efficiently solves the problem of clustering network constrained spatial points by visualizing the intrinsic clustering structure of spatial data by constructing density ordering charts. However, the main drawback of this algorithm is that, when the data are processed, objects that are not categorized into any cluster cannot be removed, which wastes time, particularly when the dataset is large. To make this algorithm more efficient, we recommend removing edges that are longer than the threshold and eliminating low-density points from the density ordering table when forming clusters, and we also take other effective techniques into consideration. In this paper, we develop a theorem to determine the maximum length of an edge in a road segment. Based on this theorem, an algorithm is proposed to greatly improve the performance of the density-based clustering algorithm in network space (NS-DBSCAN). Experiments using our proposed algorithm, carried out in collaboration with Ho Chi Minh City, Vietnam, yield the same results but show an advantage over NS-DBSCAN in execution time.
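The idea of discarding edges longer than the threshold can be illustrated with a small sketch. It is not the paper's theorem or algorithm: it only shows, under the assumption of non-negative edge lengths, that an edge longer than eps can never lie on a path of total length at most eps, so pruning it leaves the eps-neighborhood in network distance unchanged. The function names and the toy road network are hypothetical.

```python
import heapq

def prune_long_edges(edges, eps):
    """Keep only edges whose length does not exceed eps."""
    return [(u, v, w) for u, v, w in edges if w <= eps]

def network_neighbors(edges, source, eps):
    """Nodes whose shortest-path (network) distance from source is <= eps."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            # Bounded Dijkstra: never expand beyond distance eps.
            if nd <= eps and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return {n for n in dist if n != source}

# Hypothetical undirected road network: (node, node, length).
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("c", "d", 5.0), ("a", "d", 9.0)]
pruned = prune_long_edges(edges, 2.5)
```

Here `network_neighbors(edges, "a", 2.5)` and `network_neighbors(pruned, "a", 2.5)` return the same set, while the pruned graph has fewer edges to traverse.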

Tree-ART2 Learning Model for Spatial Clustering in Second Dimension

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.543-547.1934 ◽

2014 ◽

Vol 543-547 ◽

pp. 1934-1938

Author(s):

Ming Xiao

Keyword(s):

Network Model ◽

Spatial Data ◽

Data Clustering ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Adaptive Resonance Theory ◽

Spatial Distance ◽

Resonance Theory ◽

Adaptive Resonance ◽

Vector Module

When used to cluster two-dimensional spatial data, Adaptive Resonance Theory not only suffers from pattern drift and the loss of vector module information, but also adapts poorly to irregularly distributed spatial data. A Tree-ART2 network model is proposed to address these problems. It retains the memory of old patterns, maintaining the spatial distance constraint by learning and adjusting the LTM pattern and the amplitude information of the vector. Meanwhile, introducing a tree structure into the model reduces the subjective dependence on the vigilance parameter and decreases the occurrence of pattern mixing. Comparative experiments show that the TART2 network has higher plasticity and adaptability.

DBSCANI: Noise-Resistant Method for Missing Value Imputation

Journal of Intelligent Systems ◽

10.1515/jisys-2014-0172 ◽

2016 ◽

Vol 25 (3) ◽

pp. 431-440 ◽

Cited By ~ 1

Author(s):

Archana Purwar ◽

Sandeep Kumar Singh

Keyword(s):

Spatial Data ◽

Missing Values ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Data Sets ◽

Quality Of Data ◽

Data Set ◽

Dbscan Clustering ◽

Density Based Clustering

The quality of data is an important concern in data mining. The validity of mining algorithms is reduced if the data are not of good quality. The quality of data can be assessed in terms of missing values (MV) as well as noise present in the data set. Various imputation techniques have been studied in MV research, but little attention has been given to noise in earlier work. Moreover, to the best of our knowledge, no one has used density-based spatial clustering of applications with noise (DBSCAN) for MV imputation. This paper proposes a novel technique, density-based imputation (DBSCANI), built on density-based clustering to deal with incomplete values in the presence of noise. The density-based clustering algorithm proposed by Kriegel groups objects according to their density in spatial databases. The high-density regions are known as clusters, and the low-density regions correspond to the noise objects in the data set. Extensive experiments have been performed on the Iris data set from the life science domain and Jain's (2D) data set from the shape data sets. The performance of the proposed method is evaluated using root mean square error (RMSE) and compared with existing K-means imputation (KMI). Results show that our method is more noise resistant than KMI on the data sets used in this study.
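The distinction between high-density clusters and low-density noise that this abstract relies on can be seen in a minimal DBSCAN sketch. This is a plain textbook version written for illustration, not the DBSCANI method itself, and the imputation step is not shown. The parameter names `eps` and `min_pts` and the sample points are assumptions.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    n = len(points)
    # Each neighborhood includes the point itself (distance 0).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise, may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is a core point, keep expanding
    return labels

# Hypothetical data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

Points labeled -1 are the noise objects; an imputation scheme built on this clustering could then restrict itself to the clustered points.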

A FAST IMPLEMENTATION OF THE ISODATA CLUSTERING ALGORITHM

International Journal of Computational Geometry & Applications ◽

10.1142/s0218195907002252 ◽

2007 ◽

Vol 17 (01) ◽

pp. 71-103 ◽

Cited By ~ 93

Author(s):

NARGESS MEMARSADEGHI ◽

DAVID M. MOUNT ◽

NATHAN S. NETANYAHU ◽

JACQUELINE LE MOIGNE

Keyword(s):

Clustering Algorithm ◽

Empirical Studies ◽

Synthetic Data ◽

Large Data ◽

Large Data Sets ◽

Cluster Center ◽

Data Sets ◽

Clustering Methods ◽

Sensing Applications ◽

Remote Sensing Applications

Clustering is central to many image processing and remote sensing applications. ISODATA is one of the most popular and widely used clustering methods in geoscience applications, but it can run slowly, particularly with large data sets. We present a more efficient approach to ISODATA clustering, which achieves better running times by storing the points in a kd-tree and through a modification of the way in which the algorithm estimates the dispersion of each cluster. We also present an approximate version of the algorithm which allows the user to further improve the running time, at the expense of lower fidelity in computing the nearest cluster center to each point. We provide both theoretical and empirical justification that our modified approach produces clusterings that are very similar to those produced by the standard ISODATA approach. We also provide empirical studies on both synthetic data and remotely sensed Landsat and MODIS images that show that our approach has significantly lower running times.

Study on Fuzzy Clustering Algorithm of Spatial Data Mining

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.416-417.1244 ◽

2013 ◽

Vol 416-417 ◽

pp. 1244-1250

Author(s):

Ting Ting Zhao

Keyword(s):

Data Mining ◽

Fuzzy Clustering ◽

Spatial Data ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Rapid Development ◽

Spatial Database ◽

Spatial Data Mining ◽

Data Set ◽

Fuzzy Similarity

With the rapid development of spatial information acquisition technology, the variety and size of spatial databases increase continuously. How to extract valuable information from complicated spatial data has become an urgent issue. Spatial data mining provides a new approach to solving this problem. The paper introduces fuzzy clustering into the field of spatial data clustering, studies how fuzzy set theory can be applied to spatial data mining, and proposes a spatial clustering algorithm based on the fuzzy similarity matrix, the fuzzy similarity clustering algorithm. The algorithm not only overcomes the inability of fuzzy clustering to process large data sets, but also provides a similarity measure between objects.
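The abstract does not spell out the algorithm, so the following is only a sketch of the classic textbook procedure for clustering from a fuzzy similarity matrix: compute the max-min transitive closure, then cut it at a threshold lambda. It assumes the input matrix is reflexive and symmetric, and all names and values are hypothetical. This naive version is cubic per composition and does not reflect whatever the paper does to handle large data sets.

```python
def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]

def transitive_closure(r):
    """Square a reflexive, symmetric relation until it stops changing."""
    while True:
        r2 = max_min_compose(r, r)
        if r2 == r:
            return r
        r = r2

def lambda_cut_clusters(r, lam):
    """Group objects whose closure similarity is at least lam."""
    n = len(r)
    labels = [-1] * n
    next_label = 0
    for i in range(n):
        if labels[i] == -1:
            for j in range(n):
                if r[i][j] >= lam:
                    labels[j] = next_label
            next_label += 1
    return labels

# Hypothetical similarity matrix for three objects.
similarity = [[1.0, 0.8, 0.2],
              [0.8, 1.0, 0.4],
              [0.2, 0.4, 1.0]]
closure = transitive_closure(similarity)
```

Lowering lambda merges clusters: with this matrix, a cut at 0.5 separates the third object, while a cut at 0.4 puts all three objects in one cluster.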

An Efficient Grid Cell Based Spatial Clustering Algorithm for Spatial Data Mining

The KIPS Transactions PartD ◽

10.3745/kipstd.2003.10d.4.567 ◽

2003 ◽

Vol 10D (4) ◽

pp. 567-576

Keyword(s):

Data Mining ◽

Spatial Data ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Grid Cell ◽

Spatial Data Mining

Spatial Clustering Analysis of K-Means Algorithm in the Classification of Bank Card Customers

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.687-691.1274 ◽

2014 ◽

Vol 687-691 ◽

pp. 1274-1277

Author(s):

Kang Lv

Keyword(s):

Data Mining ◽

Financial Services ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Spatial Relationship ◽

Data Bank ◽

Customer Relationship ◽

Current Status ◽

Clustering Methods ◽

Bank Card

The K-means algorithm is a simple and efficient clustering algorithm for data mining. Given the current status of bank card customer relationship management, this paper designs a bank customer classification system based on data mining technology and the K-means clustering algorithm. Data mining techniques can extract implicit knowledge and spatial relationships from vast amounts of bank card customer data, and the model automatically classifies the data objects representing bank customer feature sets into clusters of similar objects, thereby classifying bank card customers within the banking system. This paper analyzes and summarizes existing spatial clustering methods and, using combined bank card customer data, applies K-means analysis to study the characteristics of different customer groups according to the volatility of their funds, so that appropriate financial services can be provided.
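For reference, the standard K-means iteration (Lloyd's algorithm) that such a classification system would build on can be sketched as follows. The two-feature customer vectors and the starting centers are hypothetical and do not come from the paper.

```python
import math

def kmeans(points, centers, iterations=100):
    """Lloyd's algorithm: alternate assignment and center update."""
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers

# Hypothetical customer features, e.g. (fund volatility, transaction frequency).
customers = [(1.0, 1.0), (1.0, 3.0), (9.0, 9.0), (9.0, 11.0)]
labels, centers = kmeans(customers, [(0.0, 0.0), (10.0, 10.0)])
```

Each resulting cluster can then be inspected as a customer group with similar behavior.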

An Adaptive Sweep-circle Spatial Clustering Algorithm Based on Gestalt

10.20944/preprints201708.0040.v1 ◽

2017 ◽

Author(s):

Qingming Zhan ◽

Shuguang Deng ◽

Zhihua Zheng

Keyword(s):

Spatial Data ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Data Streaming ◽

Spatial Clusters ◽

Streaming Technology ◽

Threshold Setting

An adaptive spatial clustering (ASC) algorithm is proposed that employs sweep-circle techniques and a dynamic threshold setting based on Gestalt theory to detect spatial clusters. The proposed algorithm can automatically discover clusters in one pass, rather than through the modification of the initial model (for example, a minimal spanning tree, Delaunay triangulation, or Voronoi diagram). It can quickly identify arbitrarily shaped clusters while adapting efficiently to non-homogeneous density characteristics of spatial data, without the need for a priori knowledge or parameters. The proposed algorithm is also well suited to data streaming technology, in which large spatial data sets with dynamic characteristics flow in continuously.

Spatial machine learning: new opportunities for regional science

The Annals of Regional Science ◽

10.1007/s00168-021-01101-x ◽

2021 ◽

Author(s):

Katarzyna Kopczewska

Keyword(s):

Machine Learning ◽

Spatial Econometrics ◽

Spatial Data ◽

Quantitative Methods ◽

Cross Validation ◽

Spatial Clustering ◽

Regional Science ◽

Fine Tuning ◽

Clustering Methods ◽

Spatial Data Integration

This paper is a methodological guide to using machine learning in the spatial context. It provides an overview of the existing spatial toolbox proposed in the literature: unsupervised learning, which deals with clustering of spatial data, and supervised learning, which displaces classical spatial econometrics. It shows the potential of using this developing methodology, as well as its pitfalls. It catalogues and comments on the usage of spatial clustering methods (for locations and values, both separately and jointly) for mapping, bootstrapping, cross-validation, GWR modelling and density indicators. It provides details of spatial machine learning models, which are combined with spatial data integration, modelling, model fine-tuning and predictions to deal with spatial autocorrelation and big data. The paper delineates “already available” and “forthcoming” methods and gives inspiration for transplanting modern quantitative methods from other thematic areas to research in regional science.

Clustering by Detecting Density Peaks and Assigning Points by Similarity-First Search Based on Weighted K-Nearest Neighbors Graph

Complexity ◽

10.1155/2020/1731075 ◽

2020 ◽

Vol 2020 ◽

pp. 1-17

Author(s):

Qi Diao ◽

Yaping Dai ◽

Qichao An ◽

Weixing Li ◽

Xiaoxue Feng ◽

...

Keyword(s):

Clustering Algorithm ◽

Spatial Clustering ◽

Local Density ◽

Search Algorithm ◽

Real Data ◽

Nearest Neighbors ◽

Adjusted Rand Index ◽

Clustering Methods ◽

K Nearest Neighbors ◽

Density Peaks

This paper presents an improved clustering algorithm for categorizing data with arbitrary shapes. Most of the conventional clustering approaches work only with round-shaped clusters. This task can be accomplished by clustering by fast search and find of density peaks (DPC), but in some cases it is limited by its density peaks and allocation strategy. To overcome these limitations, two improvements are proposed in this paper. To describe the clustering center more comprehensively, the definitions of local density and relative distance are fused with multiple distances, including K-nearest neighbors (KNN) and shared-nearest neighbors (SNN). A similarity-first search algorithm is designed to search the most matching cluster centers for noncenter points in a weighted KNN graph. Extensive comparison with several existing DPC methods, e.g., traditional DPC algorithm, density-based spatial clustering of applications with noise (DBSCAN), affinity propagation (AP), FKNN-DPC, and K-means methods, has been carried out. Experiments based on synthetic data and real data show that the proposed clustering algorithm can outperform DPC, DBSCAN, AP, and K-means in terms of the clustering accuracy (ACC), the adjusted mutual information (AMI), and the adjusted Rand index (ARI).
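The two quantities that baseline DPC computes for every point, local density and relative distance, can be sketched as below. This is the traditional cutoff-kernel formulation, not the KNN/SNN-fused definitions this paper proposes, and the cutoff `d_c` and the sample points are assumptions.

```python
import math

def density_peaks(points, d_c):
    """Return (rho, delta): local density and relative distance per point."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the cutoff distance d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        # delta: distance to the nearest point of strictly higher density.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with the highest density gets its largest distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Hypothetical data: a small dense group and one distant point.
points = [(0, 0), (0, 1), (1, 0), (10, 10)]
rho, delta = density_peaks(points, d_c=1.2)
```

Points with both large `rho` and large `delta` are candidate cluster centers; here that is the point at the origin.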
