On the MDBSCAN Algorithm in a Spatial Data Mining Context

Data Mining ◽

10.4018/978-1-4666-2455-9.ch021 ◽

2013 ◽

pp. 435-444

Author(s):

Gabriella Schoier

Keyword(s):

Spatial Data ◽

Spatial Clustering ◽

Clustering Algorithms ◽

Synthetic Data ◽

Spatial Data Mining ◽

Data Sets ◽

Data Set ◽

Spatial Variables ◽

Spatial Objects ◽

Spatial Data Sets

Rapid growth in the availability of, and access to, spatially referenced information in a variety of areas has created a need for better analysis techniques to understand the phenomena involved. In particular, spatial clustering algorithms, which group similar spatial objects into classes, can be used to identify areas sharing common characteristics. The aim of this chapter is to present a density-based algorithm (MDBSCAN) for discovering clusters of units in large spatial data sets. The algorithm is a modification of the DBSCAN algorithm (see Ester et al., 1996). The modifications consist of considering both spatial and non-spatial variables and of using a Lagrange-Chebyshev metric instead of the usual Euclidean one. The applications concern a synthetic data set and a data set of satellite images.
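As an illustration of the idea in this abstract, the following is a minimal sketch of plain DBSCAN run with a Chebyshev (maximum-coordinate) distance over records that mix spatial coordinates and one non-spatial attribute. It is not the chapter's MDBSCAN: the record layout `(x, y, attribute)`, the parameter values, and the function names `chebyshev` and `dbscan` are assumptions made for this example.

```python
from collections import deque

def chebyshev(p, q):
    """Chebyshev (L-infinity) distance: the largest per-coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))

def dbscan(points, eps, min_pts, dist=chebyshev):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Brute-force region query; the point itself is included.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels

# Hypothetical records: (x, y, non-spatial attribute).
points = [
    (0, 0, 5.0), (0, 1, 5.2), (1, 0, 5.1), (1, 1, 5.3),      # dense group A
    (10, 10, 1.0), (10, 11, 1.1), (11, 10, 0.9), (11, 11, 1.2),  # dense group B
    (5, 5, 3.0),        # spatially isolated
    (0.5, 0.5, 9.0),    # spatially inside A, but with a very different attribute
]
labels = dbscan(points, eps=1.0, min_pts=3)
print(labels)
```

In this sketch the last record lies spatially inside group A but is labelled noise, because its non-spatial attribute alone puts it further than `eps` from every other record under the Chebyshev distance. In practice the variables would need to be put on comparable scales before a single `eps` is meaningful.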


DBSCANI: Noise-Resistant Method for Missing Value Imputation

Journal of Intelligent Systems ◽

10.1515/jisys-2014-0172 ◽

2016 ◽

Vol 25 (3) ◽

pp. 431-440 ◽

Cited By ~ 1

Author(s):

Archana Purwar ◽

Sandeep Kumar Singh

Keyword(s):

Spatial Data ◽

Missing Values ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Data Sets ◽

Quality Of Data ◽

Data Set ◽

Dbscan Clustering ◽

Density Based Clustering

Abstract: Ensuring data quality is an important task in data mining, because the validity of mining algorithms is reduced when the data is of poor quality. Data quality can be assessed in terms of missing values (MV) as well as noise present in the data set. Various imputation techniques have been studied for MV, but earlier work has paid little attention to noise. Moreover, to the best of the authors' knowledge, no one has used density-based spatial clustering of applications with noise (DBSCAN) for MV imputation. This paper proposes a novel technique, density-based imputation (DBSCANI), built on density-based clustering to deal with incomplete values in the presence of noise. The density-based clustering algorithm proposed by Kriegel groups objects according to their density in spatial databases. High-density regions are known as clusters, and low-density regions correspond to the noise objects in the data set. Extensive experiments were performed on the Iris data set from the life science domain and Jain’s (2D) data set from the shape data sets. The performance of the proposed method is evaluated using root mean square error (RMSE) and compared with existing K-means imputation (KMI). Results show that the method is more noise resistant than KMI on the data sets under study.
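The abstract does not say how DBSCANI fills a value once clusters are known, so the following is only a sketch under a stated assumption: each missing entry is replaced by the mean of that column over rows in the same cluster, and rows labelled as noise (`-1`) are never used as donors. The cluster labels are taken as given, and the names `impute_by_cluster_mean` and `rmse`, as well as the sample data, are hypothetical.

```python
import math

def impute_by_cluster_mean(rows, labels):
    """Fill None entries with the column mean over rows of the same cluster.

    Assumption for this sketch: rows labelled -1 (noise) are not used as
    donors, and a noise row's own missing entries are left as None.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for c, value in enumerate(row):
            if value is not None:
                continue
            donors = [r[c] for r, lab in zip(rows, labels)
                      if lab == labels[i] and lab != -1 and r[c] is not None]
            if donors:
                filled[i][c] = sum(donors) / len(donors)
    return filled

def rmse(true_values, imputed_values):
    """Root mean square error between held-out true values and imputed ones."""
    pairs = list(zip(true_values, imputed_values))
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))

# Hypothetical data: two clusters and one noisy row; None marks a missing value.
rows = [(1.0, 2.0), (1.2, None), (1.1, 2.2), (9.0, 9.5), (9.2, None), (50.0, 0.0)]
labels = [0, 0, 0, 1, 1, -1]   # assumed output of a density-based clustering step

filled = impute_by_cluster_mean(rows, labels)
error = rmse([2.1, 9.4], [filled[1][1], filled[4][1]])  # 2.1 and 9.4 are made-up ground truth
print(filled[1], filled[4], round(error, 4))
```

Excluding noise rows as donors is what keeps the outlying row `(50.0, 0.0)` from pulling the imputed values, which is the intuition behind the noise resistance the abstract reports.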


Introducing Elasticity for Spatial Knowledge Management

Database Technologies ◽

10.4018/978-1-60566-058-5.ch160 ◽

2009 ◽

pp. 2685-2705

Author(s):

David A. Gadish

Keyword(s):

Knowledge Management ◽

Spatial Data ◽

Internal Validity ◽

Spatial Knowledge ◽

Data Sets ◽

Multiple Objects ◽

Spatial Objects ◽

Spatial Consistency ◽

Spatial Data Sets

The internal validity of a spatial database can be discovered using the data contained within one or more databases. Spatial consistency includes topological consistency, or the conformance to topological rules. Discovery of inconsistencies in spatial data is an important step for improvement of spatial data quality as part of the knowledge management initiative. An approach for detecting topo-semantic inconsistencies in spatial data is presented. Inconsistencies between pairs of neighboring spatial objects are discovered by comparing relations between spatial objects to rules. A property of spatial objects, called elasticity, has been defined to measure the contribution of each of the objects to inconsistent behavior. Grouping of multiple objects, which are inconsistent with one another, based on their elasticity is proposed. The ability to discover groups of neighboring objects that are inconsistent with one another can serve as the basis of an effort to understand and increase the quality of spatial data sets. Elasticity should therefore be incorporated into knowledge management systems that handle spatial data.


Introducing the Elasticity of Spatial Data

Business Information Systems ◽

10.4018/978-1-61520-969-9.ch051 ◽

2010 ◽

pp. 831-848

Author(s):

David A. Gadish

Keyword(s):

Data Quality ◽

Spatial Data ◽

Data Sets ◽

Data Warehouses ◽

Multiple Objects ◽

Quality Of Results ◽

Spatial Objects ◽

Spatial Consistency ◽

Spatial Data Sets

The quality of vector spatial data can be assessed using the data contained within one or more data warehouses. Spatial consistency includes topological consistency, or conformance to topological rules (Hadzilacos & Tryfona, 1992; Rodríguez, 2005). Detection of inconsistencies in vector spatial data is an important step in improving spatial data quality (Redman, 1992; Veregin, 1991). An approach for detecting topo-semantic inconsistencies in vector spatial data is presented. Inconsistencies between pairs of neighboring vector spatial objects are detected by comparing relations between spatial objects to rules (Klein, 2007). A property of spatial objects, called elasticity, is defined to measure the contribution of each object to inconsistent behavior. A grouping of multiple objects that are inconsistent with one another, based on their elasticity, is proposed. The ability to detect groups of neighboring objects that are inconsistent with one another can later serve as the basis of an effort to increase the quality of spatial data sets stored in data warehouses, as well as the quality of results of data-mining processes.


Study on Fuzzy Clustering Algorithm of Spatial Data Mining

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.416-417.1244 ◽

2013 ◽

Vol 416-417 ◽

pp. 1244-1250

Author(s):

Ting Ting Zhao

Keyword(s):

Data Mining ◽

Fuzzy Clustering ◽

Spatial Data ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Rapid Development ◽

Spatial Database ◽

Spatial Data Mining ◽

Data Set ◽

Fuzzy Similarity

With the rapid development of spatial information acquisition technology, the variety and size of spatial databases are increasing continuously. How to extract valuable information from complicated spatial data has become an urgent issue, and spatial data mining offers a new way of approaching it. The paper introduces fuzzy clustering into the field of spatial data clustering, studies how fuzzy set theory can be applied to spatial data mining, and proposes a spatial clustering algorithm based on the fuzzy similarity matrix: the fuzzy similarity clustering algorithm. The algorithm not only overcomes the inability of fuzzy clustering to process large data sets, but also provides a similarity measure between objects.


MINING CO-LOCATION PATTERNS WITH CLUSTERING ITEMS FROM SPATIAL DATA SETS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-3-2505-2018 ◽

2018 ◽

Vol XLII-3 ◽

pp. 2505-2509

Author(s):

G. Zhou ◽

Q. Li ◽

G. Deng ◽

T. Yue ◽

X. Zhou

Keyword(s):

Data Mining ◽

Spatial Data ◽

Spatial Databases ◽

Spatial Data Mining ◽

Data Sets ◽

Location Pattern ◽

Spatial Feature ◽

Geographic Space ◽

Location Patterns ◽

Spatial Data Sets

The explosive growth of spatial data and the widespread use of spatial databases emphasize the need for spatial data mining. Co-location pattern discovery is an important branch of spatial data mining. Spatial co-locations represent subsets of features that are frequently located together in geographic space. However, the appearance of a spatial feature C is often determined not by a single spatial feature A or B but by the two together: where A and B appear together, C often appears. This co-location pattern differs from the traditional one. This paper therefore presents a new concept called clustering items, and refers to such patterns as co-location patterns with clustering items. Because the traditional algorithm cannot mine this kind of pattern, we introduce the related concepts in detail and propose a novel algorithm, which extends the join-based approach proposed by Huang. Finally, we evaluate the performance of this algorithm.


Introducing the Elasticity of Spatial Data

Strategic Advancements in Utilizing Data Mining and Warehousing Technologies ◽

10.4018/978-1-60566-717-1.ch011 ◽

2011 ◽

pp. 198-215

Author(s):

David A. Gadish

Keyword(s):

Data Quality ◽

Spatial Data ◽

Data Sets ◽

Data Warehouses ◽

Multiple Objects ◽

Quality Of Results ◽

Spatial Objects ◽

Spatial Consistency ◽

Spatial Data Sets

The quality of vector spatial data can be assessed using the data contained within one or more data warehouses. Spatial consistency includes topological consistency, or conformance to topological rules (Hadzilacos & Tryfona, 1992; Rodríguez, 2005). Detection of inconsistencies in vector spatial data is an important step in improving spatial data quality (Redman, 1992; Veregin, 1991). An approach for detecting topo-semantic inconsistencies in vector spatial data is presented. Inconsistencies between pairs of neighboring vector spatial objects are detected by comparing relations between spatial objects to rules (Klein, 2007). A property of spatial objects, called elasticity, is defined to measure the contribution of each object to inconsistent behavior. A grouping of multiple objects that are inconsistent with one another, based on their elasticity, is proposed. The ability to detect groups of neighboring objects that are inconsistent with one another can later serve as the basis of an effort to increase the quality of spatial data sets stored in data warehouses, as well as the quality of results of data-mining processes.


CHANGES AFFECTING GENERALIZATION OF LAND COVER FEATURES IN A SMALLER SCALE

Geodesy and Cartography ◽

10.3846/20296991.2012.728045 ◽

2012 ◽

Vol 38 (3) ◽

pp. 98-105 ◽

Cited By ~ 2

Author(s):

Lina Papšienė ◽

Kęstutis Papšys

Keyword(s):

Land Cover ◽

Spatial Data ◽

Data Sets ◽

Data Set ◽

The Creation ◽

Spatial Data Sets

Reference spatial data sets represent the least changing natural and anthropogenic features of terrain. As a rule, such data are stored at different scales and are most frequently updated sequentially, starting with the spatial data set of a larger scale (usually the base scale) and then updating the data at smaller scales. The generalization of features at a larger scale is one of the major processes employed in the creation and update of spatial data at a smaller scale. To carry out this work effectively, it is recommended to use automatic procedures and to apply generalization only when changes in features are significant, i.e. when they affect the update of features at a smaller scale. The article discusses the relation between changes in polygon features (which identify land cover territories in a base spatial data set) and different generalization processes, as well as the evaluation of the significance of likely changes.


Delaunay Triangulation-Based Spatial Clustering Technique for Enhanced Adjacent Boundary Detection and Segmentation of LiDAR 3D Point Clouds

Sensors ◽

10.3390/s19183926 ◽

2019 ◽

Vol 19 (18) ◽

pp. 3926 ◽

Cited By ~ 2

Author(s):

Jongwon Kim ◽

Jeongho Cho

Keyword(s):

Delaunay Triangulation ◽

Spatial Data ◽

Spatial Clustering ◽

Synthetic Data ◽

Point Clouds ◽

Spatial Data Mining ◽

New Approach ◽

Clustering Technique ◽

3D Point Clouds ◽

Multiple Clusters

In complex spatial data, different clusters can lie very close to one another, and the density of each cluster can be arbitrary and uneven. In addition, the data may include background noise that does not belong to any cluster, or chain noise that connects multiple clusters. This makes it difficult to separate clusters that touch adjacent clusters, so a new approach is required to handle the nonlinear shapes, irregular densities, and touching problems of adjacent clusters that are common in complex spatial data clustering, as well as to improve robustness against various types of noise in spatial clusters. Accordingly, we propose an efficient graph-based spatial clustering technique that employs Delaunay triangulation and the mechanism of DBSCAN (density-based spatial clustering of applications with noise). In a performance evaluation using simulated synthetic data as well as real 3D point clouds, the proposed method achieved better clustering and separability of neighboring clusters than other clustering techniques, and it is expected to be of practical use in the field of spatial data mining.


Introducing Elasticity for Spatial Knowledge Management

Ubiquitous Developments in Knowledge Management ◽

10.4018/978-1-60566-954-0.ch018 ◽

2010 ◽

pp. 282-299

Author(s):

David A. Gadish

Keyword(s):

Knowledge Management ◽

Spatial Data ◽

Internal Validity ◽

Spatial Knowledge ◽

Data Sets ◽

Multiple Objects ◽

Spatial Objects ◽

Spatial Consistency ◽

Spatial Data Sets

The internal validity of a spatial database can be discovered using the data contained within one or more databases. Spatial consistency includes topological consistency, or the conformance to topological rules. Discovery of inconsistencies in spatial data is an important step for improvement of spatial data quality as part of the knowledge management initiative. An approach for detecting topo-semantic inconsistencies in spatial data is presented. Inconsistencies between pairs of neighboring spatial objects are discovered by comparing relations between spatial objects to rules. A property of spatial objects, called elasticity, has been defined to measure the contribution of each of the objects to inconsistent behavior. Grouping of multiple objects, which are inconsistent with one another, based on their elasticity is proposed. The ability to discover groups of neighboring objects that are inconsistent with one another can serve as the basis of an effort to understand and increase the quality of spatial data sets. Elasticity should therefore be incorporated into knowledge management systems that handle spatial data.
