spatial data mining
Recently Published Documents


TOTAL DOCUMENTS: 378 (FIVE YEARS 56)

H-INDEX: 20 (FIVE YEARS 2)

Author(s):  
K Laskhmaiah ◽  
S Murali Krishna ◽  
B Eswara Reddy

Spatial data mining extracts useful information and knowledge from massive and complex spatial databases. Efficient clustering algorithms for spatial databases are used in this area of research to handle that complexity, and in many applications clustering methods are used to discover geographic areas that contain spatial points. Many approaches to the spatial clustering problem with spatial attributes have been designed, but they do not consider nonoverlapping constraints, and most existing data mining algorithms suffer in high dimensions. To solve this problem, the proposed system designs a multidimensional optimization clustering method with a nonoverlapping constraint, named Non Overlapping Constraint based Optimized K-Means with Density and Distance-based Clustering (NOC-OKMDDC), which quickly finds clusters with diverse shapes and densities in spatial databases. The proposed method consists of three main phases. In the first phase, a weighted convolutional neural network (Weighted CNN) reduces the attributes of the multidimensional dataset. In the second phase, Optimized K-Means with Density and Distance-based Clustering (OKMDD) uses a partition-based algorithm (K-means) to cluster the dataset into several relatively small spherical or ball-shaped sub clusters. The optimal sub cluster count is determined with the Adaptive Adjustment Factor based Glowworm Swarm Optimization algorithm (AAFGSO). The proposed system then uses an Enhanced Penalized Spatial Distance (EPSD) measure to satisfy the non-overlapping condition; to obtain the EPSD, the spatial distance between two points is adjusted according to their spatial attribute values. In the third phase, the proposed system merges sub clusters using density-based clustering with a relative distance scheme.
Experimental results show that the proposed system achieves better performance than the existing system in terms of the adjusted Rand index, Rand index, Mirkin's index, and Hubert's index.
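The four evaluation indices named in this abstract are all pair-counting measures, so they can be illustrated with one small routine. The following is a minimal sketch, not the authors' evaluation code: the function name `pair_counting_indices` is hypothetical, the Mirkin index is given in its unnormalised form (twice the number of disagreeing pairs), and the Hubert index is taken as (agreements - disagreements) / total pairs.

```python
from collections import Counter
from math import comb


def pair_counting_indices(labels_a, labels_b):
    """Compare two labelings of the same points with pair-counting indices."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two points to form a pair")
    total_pairs = comb(n, 2)

    # Contingency counts: points carrying label i in A and label j in B.
    joint = Counter(zip(labels_a, labels_b))
    sizes_a = Counter(labels_a)
    sizes_b = Counter(labels_b)

    same_both = sum(comb(c, 2) for c in joint.values())  # together in A and in B
    same_a = sum(comb(c, 2) for c in sizes_a.values())   # together in A
    same_b = sum(comb(c, 2) for c in sizes_b.values())   # together in B
    disagreements = same_a + same_b - 2 * same_both
    agreements = total_pairs - disagreements

    expected = same_a * same_b / total_pairs  # chance-level value of same_both
    max_index = (same_a + same_b) / 2
    # When both labelings are trivial the ratio is 0/0; return 1.0 by convention.
    if max_index == expected:
        adjusted = 1.0
    else:
        adjusted = (same_both - expected) / (max_index - expected)

    return {
        "rand": agreements / total_pairs,
        "adjusted_rand": adjusted,
        "mirkin": 2 * disagreements,
        "hubert": (agreements - disagreements) / total_pairs,
    }
```

Higher Rand, adjusted Rand, and Hubert values indicate closer agreement with the reference labeling, while a lower Mirkin value does.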


2021 ◽  
Author(s):  
Kamil Raczycki ◽  
Marcin Szymański ◽  
Yahor Yeliseyenka ◽  
Piotr Szymański ◽  
Tomasz Kajdanowicz

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Hao Li ◽  
Jianshu Duan ◽  
Yidan Wu ◽  
Sizhuo Gao ◽  
Ting Li

In the context of the mid-to-late stage of China’s urbanization, promoting sustainable urban development and giving full play to urban potential have become a social focus, which makes the study of urban spatial patterns of enormous practical significance. Based on Internet data such as map Points of Interest (POIs), this paper studies the spatial distribution pattern and clustering characteristics of POIs for four categories of service facilities in Chengdu, Sichuan Province (catering; shopping; transportation; and scientific, educational, and cultural services) by means of spatial data mining technologies such as spatial autocorrelation analysis and DBSCAN clustering. Global spatial autocorrelation is used to study the correlation between an index of a certain element and itself (univariate) or another index of an adjacent element (bivariate); local spatial autocorrelation is used to identify characteristics of spatial clustering or spatial anomaly distribution of geographical elements. DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is able to detect clusters of any shape without prior knowledge of the number of clusters. The final step is to carry out quantitative analysis and reveal the distribution characteristics and coupling effects of spatial patterns.
According to the results, (1) the spatial distribution of POIs of all service facilities is significantly polarized: they are concentrated in the old city, the trend of suburbanization is indistinct, and the distribution shows three characteristics, namely, central driving, traffic accessibility, and dependence on population activity; (2) the spatial distribution of POIs of the four categories of service facilities follows a pattern of “one center, multiple clusters,” where “one center” mainly covers the area within the first ring road and part of the region between the first ring road and the third ring road, while “multiple clusters” are mainly distributed in the well-developed areas in the second circle of Chengdu, such as Wenjiang District and Shuangliu District; and (3) there is a significant correlation between any two categories of POIs. Highly mixed multifunctional areas are mainly distributed in the urban center, while the service industry is less aggregated in urban fringe areas, most of which are single-functional or dual-functional regions.
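For readers unfamiliar with DBSCAN, the density-based clustering step mentioned in this abstract can be sketched as follows. This is a brute-force illustration of the textbook algorithm on plane coordinates, not the study's implementation; the names `dbscan`, `eps`, and `min_pts` are illustrative, and the neighbour count includes the point itself.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        # Brute-force eps-neighbourhood; includes point i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels
```

The number of clusters is never supplied; it emerges from `eps` and `min_pts`, which is why the method suits POI data whose cluster count is unknown in advance.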


Author(s):  
Hassan Talebi ◽  
Luk J. M. Peeters ◽  
Alex Otto ◽  
Raimon Tolosana-Delgado

Spatial data mining helps to find hidden but potentially informative patterns from large and high-dimensional geoscience data. Non-spatial learners generally look at the observations based on their relationships in the feature space, which means that they cannot consider spatial relationships between regionalised variables. This study introduces a novel spatial random forests technique based on higher-order spatial statistics for analysis and modelling of spatial data. Unlike the classical random forests algorithm that uses pixelwise spectral information as predictors, the proposed spatial random forests algorithm uses the local spatial-spectral information (i.e., vectorised spatial patterns) to learn intrinsic heterogeneity, spatial dependencies, and complex spatial patterns. Algorithms for supervised (i.e., regression and classification) and unsupervised (i.e., dimension reduction and clustering) learning are presented. Approaches to deal with big data, multi-resolution data, and missing values are discussed. The superior performance and usefulness of the proposed algorithm over the classical random forests method are illustrated via synthetic and real cases, where the remotely sensed geophysical covariates in North West Minerals Province of Queensland, Australia, are used as input spatial data for geology mapping, geochemical prediction, and process discovery analysis.
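The contrast between pixelwise predictors and vectorised spatial patterns can be made concrete with a short sketch. It assumes a single-band raster stored as a list of rows and a square window; the function name `patch_features` and the `radius` parameter are hypothetical, and the paper's higher-order spatial statistics are not reproduced here.

```python
def patch_features(grid, radius=1):
    """Flatten the (2*radius+1) x (2*radius+1) window around each interior cell.

    Returns {(row, col): feature_vector}. Edge cells without a full window
    are skipped in this sketch.
    """
    rows, cols = len(grid), len(grid[0])
    features = {}
    for r in range(radius, rows - radius):
        for c in range(radius, cols - radius):
            features[(r, c)] = [
                grid[r + dr][c + dc]
                for dr in range(-radius, radius + 1)
                for dc in range(-radius, radius + 1)
            ]
    return features
```

A pixelwise learner would see only `grid[r][c]`; here each observation carries its whole neighbourhood, so a tree-based learner trained on these vectors can split on spatial context.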


2021 ◽  
pp. 1-18
Author(s):  
Trang T.D. Nguyen ◽  
Loan T.T. Nguyen ◽  
Anh Nguyen ◽  
Unil Yun ◽  
Bay Vo

Spatial clustering is one of the main techniques for spatial data mining and spatial data analysis. However, existing spatial clustering methods primarily focus on points distributed in planar space with the Euclidean distance measurement. Recently, NS-DBSCAN has been developed to cluster spatial point events in network space, based on the well-known Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. Compared with the NC_DT (Network-Constraint Delaunay Triangulation) clustering algorithm, NS-DBSCAN efficiently solves the problem of clustering network-constrained spatial points and visualizes the intrinsic clustering structure of spatial data by constructing density ordering charts. However, its main drawback is that, while the data are processed, objects that are not assigned to any cluster cannot be removed, which wastes time, particularly when the dataset is large. To make the algorithm more efficient, we recommend removing edges that are longer than the threshold and eliminating low-density points from the density ordering table when forming clusters, along with other effective techniques. In this paper, we develop a theorem to determine the maximum length of an edge in a road segment. Based on this theorem, an algorithm is proposed to greatly improve the performance of the density-based clustering algorithm in network space (NS-DBSCAN). Experiments with our proposed algorithm, carried out in collaboration with Ho Chi Minh City, Vietnam, yield the same results as NS-DBSCAN but show an advantage in execution time.
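To illustrate what clustering in network space changes, the sketch below computes an eps-neighbourhood by shortest-path distance along road edges instead of Euclidean distance, and skips any edge longer than the threshold. It is an assumption-laden illustration, not NS-DBSCAN or the proposed algorithm: points are treated as graph nodes, the adjacency-list format and the name `network_neighbours` are hypothetical, and the pruning rule shown is the simple observation that an edge longer than `eps` cannot lie on a path of length at most `eps`, not the paper's theorem.

```python
import heapq


def network_neighbours(graph, source, eps):
    """Return {node: network distance} for nodes within eps of source.

    graph maps each node to a list of (neighbour, edge_length) pairs with
    non-negative lengths.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, length in graph.get(u, ()):
            if length > eps:
                continue  # too long to be part of any path within eps
            nd = d + length
            if nd <= eps and nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return best
```

Because the search stops expanding once the accumulated distance exceeds `eps`, only the local part of the road network is visited for each query.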


2021 ◽  
Vol 13 (5) ◽  
pp. 960
Author(s):  
Guoqing Zhou ◽  
Qi Li ◽  
Guangming Deng

The explosive growth of spatial data and the widespread use of spatial databases emphasize the need for spatial data mining. The subsets of features frequently located together in a geographic space are called spatial co-location patterns. It is difficult to discover co-location patterns because of the huge amount of data brought by the instances of spatial features. A large fraction of the computation time is devoted to generating row instances and candidate co-location patterns. This paper makes three main contributions to mining co-location patterns. First, the definition of maximal instances is given and a row instance (RI)-tree is constructed to find maximal instances from a spatial data set. Second, a fast method for generating all row instances and candidate co-locations is proposed and the feasibility of this method is proved. Third, a maximal instance algorithm with no join operations for mining co-location patterns is proposed. Finally, experimental evaluations using synthetic data sets and a real data set show that the maximal instance algorithm is feasible and has better performance.
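As background on row instances, the sketch below enumerates the row instances of a size-2 candidate pattern by brute force and scores it with the participation index, a prevalence measure commonly used in the co-location literature. The abstract does not state which measure the paper uses, so this is an assumption; the function name and input format are hypothetical, and the RI-tree and maximal instance algorithm are not reproduced.

```python
from math import dist


def participation_index(instances, feat_a, feat_b, radius):
    """Score the candidate co-location {feat_a, feat_b}.

    instances is a list of (feature, (x, y)) tuples. Returns the row
    instances as index pairs and the participation index.
    """
    a_pts = [(i, p) for i, (f, p) in enumerate(instances) if f == feat_a]
    b_pts = [(j, q) for j, (f, q) in enumerate(instances) if f == feat_b]
    # Row instances: one instance of each feature within the neighbour radius.
    rows = [(i, j) for i, p in a_pts for j, q in b_pts if dist(p, q) <= radius]
    if not a_pts or not b_pts:
        return rows, 0.0
    # Participation ratio: share of a feature's instances in any row instance.
    ratio_a = len({i for i, _ in rows}) / len(a_pts)
    ratio_b = len({j for _, j in rows}) / len(b_pts)
    return rows, min(ratio_a, ratio_b)
```

Even for two features the pairing cost grows with the product of the instance counts, which gives a sense of why row instance generation dominates computation time.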


2021 ◽  
pp. 103-109
Author(s):  
Regin R ◽  
Suman Rajest S ◽  
Bhopendra Singh

This article reviews the data mining approaches used for the geographical study of regional datasets coupled with Geographic Information Systems (GIS). First, we look at the data mining functions applied to such data and illustrate their precision compared with their use on classic data. We then explain the research conducted in this sector and point out that two separate methods exist: one is focused on spatial database learning, while the other is based on spatial statistics. Finally, we address the key distinctions between these two methods and their similar features.


2021 ◽  
Vol 10 (2) ◽  
pp. 79
Author(s):  
Ching-Yun Mu ◽  
Tien-Yin Chou ◽  
Thanh Van Hoang ◽  
Pin Kung ◽  
Yao-Min Fang ◽  
...  

Spatial information technology has been widely used for vehicles in general and for fleet management. Many studies have focused on improving vehicle positioning accuracy, although few studies have focused on efficiency improvements for managing large truck fleets in the context of the current complex network of roads. Therefore, this paper proposes a multilayer-based map matching algorithm with different spatial data structures to deal rapidly with large amounts of coordinate data. Using the dimension reduction technique, the geodesic coordinates can be transformed into plane coordinates. This study provides multiple layer grouping combinations to deal with complex road networks. We integrated these techniques and employed a puncture method to process the geometric computation with spatial data-mining approaches. We constructed a spatial division index and combined this with the puncture method, which improves the efficiency of the system and can enhance data retrieval efficiency for large truck fleet dispatching. This paper also used a multilayer-based map matching algorithm with raster data structures. Comparing the results revealed that the look-up table method offers the best outcome. The proposed multilayer-based map matching algorithm using the look-up table method is suited to obtaining competitive performance in identifying efficiency improvements for large truck fleet dispatching.
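The idea of a spatial division index used as a look-up table can be sketched with a uniform grid: road segments are registered in every cell their bounding box touches, and a vehicle position is matched against only the segments in its own and adjacent cells. This is a simplified illustration under stated assumptions (plane coordinates, straight segments, hypothetical names `build_grid_index` and `match_point`), not the paper's multilayer algorithm or puncture method.

```python
from collections import defaultdict
from math import floor, hypot


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def build_grid_index(segments, cell):
    """segments: {road_id: ((x1, y1), (x2, y2))}. Returns {(gx, gy): road ids}."""
    index = defaultdict(set)
    for road_id, ((x1, y1), (x2, y2)) in segments.items():
        for gx in range(floor(min(x1, x2) / cell), floor(max(x1, x2) / cell) + 1):
            for gy in range(floor(min(y1, y2) / cell), floor(max(y1, y2) / cell) + 1):
                index[(gx, gy)].add(road_id)
    return index


def match_point(p, segments, index, cell):
    """Return the nearest road id among segments in the 3 x 3 cells around p."""
    gx, gy = floor(p[0] / cell), floor(p[1] / cell)
    candidates = set()
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            candidates |= index.get((gx + dx, gy + dy), set())
    if not candidates:
        return None  # no road registered near this position
    return min(candidates, key=lambda r: point_segment_distance(p, *segments[r]))
```

The index is built once and each match then inspects a handful of candidates rather than the whole road network, which is the efficiency argument behind look-up structures for large fleets.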


2021 ◽  
pp. 173-178
Author(s):  
Yuri Calleo ◽  
Simone Di Zio

In the context of Futures Studies, the scenario development process makes it possible to form assumptions about what the futures can be in order to support better decisions today. In the initial stages of scenario building (the Framing and Scanning phases), the process requires much time and effort to scan data and information (reading documents, literature review, and consultation of experts) to understand more about the object of the foresight study. The daily use of social networks causes an exponential increase in data; for this reason, we address the problem of speeding up and optimizing the Scanning phase by applying a new combined method based on the analysis of tweets with unsupervised classification models, text mining, and spatial data mining techniques. To obtain a qualitative overview, we applied the bag-of-words model and a Sentiment Analysis with the Afinn and Vader algorithms. Then, to extrapolate the influence factors and the relevant key factors (Kayser and Blind, 2017; 2020), Latent Dirichlet Allocation (LDA) was used (Tong and Zhang, 2016). Furthermore, to acquire spatial information, we used a spatial data mining technique to extract georeferenced data, from which a geographic analysis of the data was obtained. To showcase our method, we provide an example using Covid-19 tweets (Uhl and Schiebel, 2017), from which 5 topics and 6 key factors were extracted. Finally, for each influence factor, a cartogram was created from the relative frequencies to show the spatial distribution of the users discussing each particular topic. The results fully answer the research objectives, and the model could be a new approach that offers benefits in the scenario development process.


Author(s):  
Elena Brekotkina ◽  
Ramil Gilyazov ◽  
Sergey Pavlov ◽  
Vladislav Trubin ◽  
Olga Khristodulo

Complex distributed systems are currently managed using information systems designed to collect and manage a large amount of heterogeneous information about an organization or institution. The technical component of such an information system is itself a complex distributed system, and information about the location and mutual arrangement of its individual components, as well as the location of the associated infrastructure and main users, is essential for its high-quality functioning. This paper proposes a method of spatial data mining for formalizing the process of creating specialized databases for the implementation (creation) of individual subsystems as part of complex distributed information systems. The results of applying this method to the construction of one such subsystem, namely the subsystem of information support for the management of the technical component of the information system, are presented using the example of the geoinformation system of the Ufa State Aviation Technical University. The proposed approach to intelligent analysis and representation of spatial data, which consists in a formalized description of various thematic data applied to solving specific problems of information management support in complex distributed systems, can be applied to the construction of complex information systems in significantly distributed organizational and infrastructure settings.

