Spatial Data Mining: Latest Research Papers

An Optimized K-means with Density and Distance-Based Clustering Algorithm for Multidimensional Spatial Databases

International Journal of Computer Network and Information Security ◽

10.5815/ijcnis.2021.06.06 ◽

2021 ◽

Vol 13 (6) ◽

pp. 70-82

Author(s):

K Laskhmaiah ◽

S Murali Krishna ◽

B Eswara Reddy

Keyword(s):

Data Mining ◽

Clustering Algorithm ◽

Spatial Databases ◽

Spatial Database ◽

Spatial Data Mining ◽

Rand Index ◽

Spatial Distance ◽

Experimental Result ◽

Adjusted Rand Index ◽

Second Phase

Spatial data mining extracts useful information and knowledge from massive and complex spatial databases. Efficient clustering algorithms for spatial databases are used in this area of research to handle that complexity, and in many applications clustering methods are used to discover the geographic areas that contain spatial points. Many approaches have been designed for the spatial clustering problem with spatial attributes, but they do not consider nonoverlapping constraints, and most existing data mining algorithms suffer in high dimensions. To solve this problem, the proposed system designs a multidimensional optimization clustering method, named Non Overlapping Constraint based Optimized K-Means with Density and Distance-based Clustering (NOC-OKMDDC), which quickly finds clusters with diverse shapes and densities in spatial databases. The proposed method consists of three main phases. In the first phase, the attributes of the multidimensional dataset are reduced using weighted Convolutional Neural Networks (Weighted CNN). In the second phase, Optimized K-Means with Density and Distance-based Clustering (OKMDD) uses a partition-based algorithm (K-means) to cluster the dataset into several relatively small spherical or ball-shaped sub clusters. The optimal sub cluster count is determined with the help of the Adaptive Adjustment Factor based Glowworm Swarm Optimization algorithm (AAFGSO). The proposed system then designs an Enhanced Penalized Spatial Distance (EPSD) measure to satisfy the non-overlapping condition; to obtain the EPSD, the spatial distance between two points is adjusted according to the spatial attribute values. In the third phase, the proposed system merges the sub clusters using density-based clustering with a relative distance scheme.
Experimental results show that the proposed system achieves better performance than the existing system in terms of adjusted Rand index, Rand index, Mirkin's index and Hubert's index.
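
The Rand index and adjusted Rand index named above compare two labelings of the same points by counting pairs of points. The following is a minimal sketch of the standard textbook definitions, not the authors' evaluation code; the function names are hypothetical and the inputs are assumed to be equal-length label lists with at least two points.

```python
from collections import Counter
from itertools import combinations
from math import comb


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance, via the contingency table."""
    n = len(labels_a)
    # Pairs that fall in the same cell of the contingency table.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

Both scores depend only on which points are grouped together, so renaming the cluster labels does not change them.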

Spatial data mining of public transport incidents reported in social media

10.1145/3486629.3490696 ◽

2021 ◽

Author(s):

Kamil Raczycki ◽

Marcin Szymański ◽

Yahor Yeliseyenka ◽

Piotr Szymański ◽

Tomasz Kajdanowicz

Keyword(s):

Data Mining ◽

Social Media ◽

Spatial Data ◽

Public Transport ◽

Spatial Data Mining

The Spatial Patterns of Service Facilities Based on Internet Big Data: A Case Study on Chengdu

Mathematical Problems in Engineering ◽

10.1155/2021/9283185 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Hao Li ◽

Jianshu Duan ◽

Yidan Wu ◽

Sizhuo Gao ◽

Ting Li

Keyword(s):

Spatial Distribution ◽

Spatial Autocorrelation ◽

Spatial Patterns ◽

Spatial Data ◽

Spatial Clustering ◽

Spatial Data Mining ◽

Sustainable Urban Development ◽

Practical Significance ◽

Ring Road ◽

Multiple Clusters

In the context of the mid-to-late stage of China’s urbanization, promoting sustainable urban development and giving full play to urban potential have become a social focus, which is of enormous practical significance for the study of urban spatial patterns. Based on Internet data such as a map’s Points of Interest (POIs), this paper studies the spatial distribution pattern and clustering characteristics of POIs of four categories of service facilities in Chengdu, Sichuan Province (catering; shopping; transportation; and scientific, educational, and cultural services) by means of spatial data mining technologies such as spatial autocorrelation analysis and DBSCAN clustering. Global spatial autocorrelation is used to study the correlation between an index of a certain element and itself (univariate) or another index of an adjacent element (bivariate); partial spatial autocorrelation is used to identify characteristics of spatial clustering or spatial anomaly distribution of geographical elements. DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is able to detect clusters of any shape without prior knowledge. The final step is to carry out quantitative analysis and reveal the distribution characteristics and coupling effects of spatial patterns. 
According to the results, (1) the spatial distribution of POIs of all service facilities is significantly polarized, as they are concentrated in the old city, and the trend of suburbanization is indistinctive, showing three characteristics, namely, central driving, traffic accessibility, and dependence on population activity; (2) the spatial distribution of POIs of the four categories of service facilities is featured by the pattern of “one center, multiple clusters,” where “one center” mainly covers the area within the first ring road and partial region between the first ring road and the third ring road, while “multiple clusters” are mainly distributed in the well-developed areas in the second circle of Chengdu, such as Wenjiang District and Shuangliu District; and (3) there is a significant correlation between any two categories of POIs. Highly mixed multifunctional areas are mainly distributed in the urban center, while service industry is less aggregated in urban fringe areas, and most of them are single-functional or dual-functional regions.
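
As an illustration of the DBSCAN step mentioned in this abstract, here is a minimal brute-force sketch of the general algorithm in plain Python. It is not the study's implementation: the function name, the `eps` and `min_pts` parameters, and the planar Euclidean distance are assumptions chosen for illustration.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force eps-neighborhood (includes the point itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, keep expanding
        cluster += 1
    return labels
```

The number of clusters is not an input; it follows from the density parameters, which is the sense in which no prior knowledge of the cluster count is needed.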

A Truly Spatial Random Forests Algorithm for Geoscience Data Analysis and Modelling

Mathematical Geosciences ◽

10.1007/s11004-021-09946-w ◽

2021 ◽

Author(s):

Hassan Talebi ◽

Luk J. M. Peeters ◽

Alex Otto ◽

Raimon Tolosana-Delgado

Keyword(s):

Spatial Patterns ◽

Random Forests ◽

Spatial Data ◽

Missing Values ◽

Feature Space ◽

Spatial Data Mining ◽

Superior Performance ◽

Spectral Information ◽

North West ◽

Spatial Dependencies

Spatial data mining helps to find hidden but potentially informative patterns from large and high-dimensional geoscience data. Non-spatial learners generally look at the observations based on their relationships in the feature space, which means that they cannot consider spatial relationships between regionalised variables. This study introduces a novel spatial random forests technique based on higher-order spatial statistics for analysis and modelling of spatial data. Unlike the classical random forests algorithm that uses pixelwise spectral information as predictors, the proposed spatial random forests algorithm uses the local spatial-spectral information (i.e., vectorised spatial patterns) to learn intrinsic heterogeneity, spatial dependencies, and complex spatial patterns. Algorithms for supervised (i.e., regression and classification) and unsupervised (i.e., dimension reduction and clustering) learning are presented. Approaches to deal with big data, multi-resolution data, and missing values are discussed. The superior performance and usefulness of the proposed algorithm over the classical random forests method are illustrated via synthetic and real cases, where the remotely sensed geophysical covariates in North West Minerals Province of Queensland, Australia, are used as input spatial data for geology mapping, geochemical prediction, and process discovery analysis.
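
To make the idea of "vectorised spatial patterns" concrete, the sketch below flattens the square window around each interior grid cell into one feature vector, in contrast to using the single pixel value. The abstract does not specify the window shape or how the paper builds its predictors, so the function name, the `radius` parameter and the choice to skip border cells are all assumptions.

```python
def vectorised_patches(grid, radius=1):
    """Map each interior cell (row, col) to its flattened local window.

    grid is a list of equal-length rows of numbers. Border cells whose
    window would leave the grid are skipped in this sketch.
    """
    rows, cols = len(grid), len(grid[0])
    features = {}
    for r in range(radius, rows - radius):
        for c in range(radius, cols - radius):
            features[(r, c)] = [
                grid[rr][cc]
                for rr in range(r - radius, r + radius + 1)
                for cc in range(c - radius, c + radius + 1)
            ]
    return features
```

Each resulting vector could then be passed to any tabular learner in place of the pixelwise value, so that the learner sees the neighborhood pattern as well as the cell itself.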

A method for efficient clustering of spatial data in network space

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202806 ◽

2021 ◽

pp. 1-18

Author(s):

Trang T.D. Nguyen ◽

Loan T.T. Nguyen ◽

Anh Nguyen ◽

Unil Yun ◽

Bay Vo

Keyword(s):

Spatial Data ◽

Euclidean Distance ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Spatial Data Mining ◽

Spatial Data Analysis ◽

Clustering Methods ◽

Dbscan Algorithm ◽

Density Based Clustering ◽

Network Space

Spatial clustering is one of the main techniques for spatial data mining and spatial data analysis. However, existing spatial clustering methods primarily focus on points distributed in planar space with the Euclidean distance measurement. Recently, NS-DBSCAN has been developed to perform clustering of spatial point events in Network Space based on a well-known clustering algorithm, named Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Compared to the NC_DT (Network-Constraint Delaunay Triangulation) clustering algorithm, the NS-DBSCAN algorithm efficiently solves the problem of clustering network-constrained spatial points by constructing density ordering charts that visualize the intrinsic clustering structure of spatial data. However, the main drawback of this algorithm is that, when the data are processed, objects that are not specifically categorized into types of clusters cannot be removed, which is undeniably a waste of time, particularly when the dataset is large. To make this algorithm work more efficiently, we recommend removing edges that are longer than the threshold and eliminating low-density points from the density ordering table when forming clusters, and we also take other effective techniques into consideration. In this paper, we develop a theorem to determine the maximum length of an edge in a road segment. Based on this theorem, an algorithm is proposed to greatly improve the performance of the density-based clustering algorithm in network space (NS-DBSCAN). Experiments with our proposed algorithm, carried out in collaboration with Ho Chi Minh City, Vietnam, yield the same results as NS-DBSCAN but show an advantage in execution time.
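
The difference between planar and network space can be sketched as a neighborhood query that measures distance along road edges instead of in a straight line. The code below is a generic bounded Dijkstra search, not the NS-DBSCAN algorithm or the paper's theorem; the adjacency-list format and the names `network_neighbors` and `eps` are assumptions.

```python
import heapq


def network_neighbors(graph, source, eps):
    """Return {node: shortest network distance} for nodes within eps.

    graph maps each node to a list of (neighbor, edge_length) pairs.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, length in graph.get(u, []):
            nd = d + length
            # Paths longer than eps are never expanded, so any edge
            # longer than eps is ignored automatically.
            if nd <= eps and nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return best
```

Because the search stops at `eps`, an edge longer than the threshold can never contribute to a neighborhood, which gives some intuition for why pruning long edges up front can save time without changing the clusters.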

Maximal Instance Algorithm for Fast Mining of Spatial Co-Location Patterns

Remote Sensing ◽

10.3390/rs13050960 ◽

2021 ◽

Vol 13 (5) ◽

pp. 960

Author(s):

Guoqing Zhou ◽

Qi Li ◽

Guangming Deng

Keyword(s):

Spatial Data ◽

Spatial Databases ◽

Large Fraction ◽

Synthetic Data ◽

Computation Time ◽

Real Data ◽

Spatial Data Mining ◽

Fast Method ◽

Data Set ◽

Location Patterns

The explosive growth of spatial data and the widespread use of spatial databases emphasize the need for spatial data mining. The subsets of features frequently located together in a geographic space are called spatial co-location patterns. It is difficult to discover co-location patterns because of the huge amount of data brought by the instances of spatial features. A large fraction of the computation time is devoted to generating row instances and candidate co-location patterns. This paper makes three main contributions to mining co-location patterns. First, the definition of maximal instances is given and a row instance (RI)-tree is constructed to find maximal instances from a spatial data set. Second, a fast method for generating all row instances and candidate co-locations is proposed and the feasibility of this method is proved. Third, a maximal instance algorithm with no join operations for mining co-location patterns is proposed. Finally, experimental evaluations using synthetic data sets and a real data set show that the maximal instance algorithm is feasible and has better performance.
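
For readers unfamiliar with row instances, the sketch below enumerates the simplest case by brute force: pairs of instances of two different features that lie within a distance threshold. It is an illustration of the concept only, not the RI-tree or the maximal instance algorithm described in the abstract; the tuple layout and the name `size2_row_instances` are assumptions.

```python
from itertools import combinations
from math import dist


def size2_row_instances(instances, max_dist):
    """Group neighboring instance pairs by their (sorted) feature pair.

    instances is a list of (feature, instance_id, (x, y)) tuples.
    Returns {(feature_a, feature_b): [(id_a, id_b), ...]}.
    """
    result = {}
    for (f1, i1, p1), (f2, i2, p2) in combinations(instances, 2):
        if f1 == f2 or dist(p1, p2) > max_dist:
            continue  # same feature, or not spatial neighbors
        if f1 < f2:
            key, row = (f1, f2), (i1, i2)
        else:
            key, row = (f2, f1), (i2, i1)
        result.setdefault(key, []).append(row)
    return result
```

This pairwise scan costs quadratic time in the number of instances, which illustrates why generating row instances dominates the computation on large data sets.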

Spatial Data Mining Methods Databases and Statistics Point of Views

10.46532/978-81-950008-7-6_010 ◽

2021 ◽

pp. 103-109

Author(s):

Regin R ◽

Suman Rajest S ◽

Bhopendra Singh

Keyword(s):

Data Mining ◽

Information Systems ◽

Geographic Information Systems ◽

Spatial Data ◽

Data Use ◽

Spatial Data Mining ◽

Geographic Information ◽

The Other ◽

Geographical Study ◽

Mining Methods

This article reviews the approaches used in data mining to perform a geographical study of regional datasets coupled with Geographic Information Systems (GIS). First, we look at the data mining functions applied to such data and illustrate their precision compared with classic data use. We then explain the research conducted in this sector and point out that two separate methods exist: one is focused on spatial database learning, while the other is based on spatial statistics. Finally, we address the key distinctions between these two methods and their similar features.

Development of Multilayer-Based Map Matching to Enhance Performance in Large Truck Fleet Dispatching

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10020079 ◽

2021 ◽

Vol 10 (2) ◽

pp. 79

Author(s):

Ching-Yun Mu ◽

Tien-Yin Chou ◽

Thanh Van Hoang ◽

Pin Kung ◽

Yao-Min Fang ◽

...

Keyword(s):

Data Structures ◽

Spatial Data ◽

Spatial Information ◽

Data Retrieval ◽

Spatial Data Mining ◽

Map Matching ◽

Matching Algorithm ◽

Table Method ◽

Look Up Table ◽

The Look

Spatial information technology has been widely used for vehicles in general and for fleet management. Many studies have focused on improving vehicle positioning accuracy, although few studies have focused on efficiency improvements for managing large truck fleets in the context of the current complex network of roads. Therefore, this paper proposes a multilayer-based map matching algorithm with different spatial data structures to deal rapidly with large amounts of coordinate data. Using the dimension reduction technique, the geodesic coordinates can be transformed into plane coordinates. This study provides multiple layer grouping combinations to deal with complex road networks. We integrated these techniques and employed a puncture method to process the geometric computation with spatial data-mining approaches. We constructed a spatial division index and combined this with the puncture method, which improves the efficiency of the system and can enhance data retrieval efficiency for large truck fleet dispatching. This paper also used a multilayer-based map matching algorithm with raster data structures. Comparing the results revealed that the look-up table method offers the best outcome. The proposed multilayer-based map matching algorithm using the look-up table method is suited to obtaining competitive performance in identifying efficiency improvements for large truck fleet dispatching.
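
One way to picture a spatial division index combined with a look-up table is a uniform grid that maps each cell to the road segments crossing its neighborhood, so that a GPS fix is compared with a few candidates only. The sketch below is such a grid under stated assumptions: plane coordinates, segments registered by bounding box, and hypothetical names (`cell_of`, `build_lookup`, `candidates`). It does not reproduce the paper's multilayer design or its puncture method.

```python
from collections import defaultdict
from math import floor


def cell_of(x, y, size):
    """Grid cell (column, row) containing the plane coordinate (x, y)."""
    return (floor(x / size), floor(y / size))


def build_lookup(segments, size):
    """Map each grid cell to the ids of segments whose bounding box covers it.

    segments maps a segment id to ((x1, y1), (x2, y2)).
    """
    table = defaultdict(set)
    for seg_id, ((x1, y1), (x2, y2)) in segments.items():
        cx1, cy1 = cell_of(min(x1, x2), min(y1, y2), size)
        cx2, cy2 = cell_of(max(x1, x2), max(y1, y2), size)
        for cx in range(cx1, cx2 + 1):
            for cy in range(cy1, cy2 + 1):
                table[(cx, cy)].add(seg_id)
    return table


def candidates(table, x, y, size):
    """Segments worth testing for a position fix: one dictionary access."""
    return table.get(cell_of(x, y, size), set())
```

The table is built once, and each query is then a constant-time lookup followed by exact geometry tests on the returned candidates only.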

Unsupervised spatial data mining for the development of future scenarios: a Covid-19 application

10.36253/978-88-5518-461-8.33 ◽

2021 ◽

pp. 173-178

Author(s):

Yuri Calleo ◽

Simone Di Zio

Keyword(s):

Data Mining ◽

Spatial Data ◽

Latent Dirichlet Allocation ◽

Spatial Information ◽

Influence Factor ◽

Influence Factors ◽

Spatial Data Mining ◽

Key Factors ◽

Scenario Development ◽

Data Mining Technique

In the context of Futures Studies, the scenario development process makes it possible to form assumptions about what the futures could be, in order to support better decisions today. In the initial stages of scenario building (the Framing and Scanning phases), the process requires much time and effort to scan data and information (reading documents, reviewing literature and consulting experts) in order to understand more about the object of the foresight study. The daily use of social networks causes an exponential increase in data; for this reason, we address the problem of speeding up and optimizing the Scanning phase by applying a new combined method based on the analysis of tweets, using unsupervised classification models, text mining and spatial data mining techniques. To obtain a qualitative overview, we applied the bag-of-words model and a Sentiment Analysis with the Afinn and Vader algorithms. Then, to extrapolate the influence factors and the relevant key factors (Kayser and Blind, 2017; 2020), Latent Dirichlet Allocation (LDA) was used (Tong and Zhang, 2016). Furthermore, to acquire spatial information as well, we used a spatial data mining technique to extract georeferenced data, from which a geographic analysis of the data could be obtained. To showcase our method, we provide an example using Covid-19 tweets (Uhl and Schiebel, 2017), from which 5 topics and 6 key factors were extracted. Finally, for each influence factor, a cartogram was created from the relative frequencies in order to show the spatial distribution of the users discussing each particular topic. The results fully answer the research objectives, and the model used could be a new approach that offers benefits in the scenario development process.
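
Two of the simpler steps in this pipeline, the bag-of-words model and the relative frequencies behind each cartogram, can be sketched in a few lines. The tokenisation rule, the function names and the per-region counts used as input are assumptions for illustration; the sentiment and LDA steps are not covered here.

```python
import re
from collections import Counter


def bag_of_words(texts):
    """Count lower-cased word tokens across all texts, ignoring word order."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    return counts


def relative_frequencies(counts_by_region):
    """Turn per-region user counts for one topic into shares that sum to 1."""
    total = sum(counts_by_region.values())
    if total == 0:
        return {region: 0.0 for region in counts_by_region}
    return {region: n / total for region, n in counts_by_region.items()}
```

The shares returned by `relative_frequencies` are the kind of quantity that could scale each region's area in a cartogram for a given topic.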

Prediction of Nth friends using spatial data mining in social networks

International Journal of Advanced Intelligence Paradigms ◽

10.1504/ijaip.2021.116368 ◽

2021 ◽

Vol 19 (3/4) ◽

pp. 410

Author(s):

D. Gandhimathi ◽

A. John Sanjeev Kumar

Keyword(s):

Data Mining ◽

Social Networks ◽

Spatial Data ◽

Spatial Data Mining
