On the MDBSCAN Algorithm in a Spatial Data Mining Context

Author(s):  
Gabriella Schoier

The rapid growth in the availability of, and access to, spatially referenced information in a variety of areas has created the need for better analysis techniques to understand the various phenomena. In particular, spatial clustering algorithms, which group similar spatial objects into classes, can be used to identify areas sharing common characteristics. The aim of this chapter is to present MDBSCAN, a density-based algorithm for the discovery of clusters of units in large spatial data sets. This algorithm is a modification of the DBSCAN algorithm (see Ester (1996)). The modifications are the joint consideration of spatial and non-spatial variables and the use of a Lagrange-Chebychev metric instead of the usual Euclidean one. The applications concern a synthetic data set and a data set of satellite images.
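
As a rough illustration of the idea in this abstract, the following minimal sketch runs a plain DBSCAN over points that carry two spatial coordinates and one non-spatial attribute, using the Chebyshev (maximum coordinate difference) distance. It is not the chapter's MDBSCAN: the function names, the sample data, the brute-force neighbour search, and the assumption that all variables are already on comparable scales are illustrative choices.

```python
from collections import deque

NOISE = -1  # label used for points that belong to no cluster


def chebyshev(p, q):
    """Chebyshev (L-infinity) distance: the largest coordinate-wise gap."""
    return max(abs(a - b) for a, b in zip(p, q))


def dbscan(points, eps, min_pts, dist=chebyshev):
    """Plain DBSCAN; returns one label per point (NOISE for outliers)."""
    labels = [None] * len(points)

    def neighbours(i):
        # Brute-force region query, O(n) per call; fine for a small sketch.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels


# Hypothetical data: (x, y, attribute), with the attribute assumed pre-scaled.
points = [
    (0, 0, 0.1), (0, 1, 0.2), (1, 0, 0.1), (1, 1, 0.2),
    (10, 10, 0.9), (10, 11, 0.8), (11, 10, 0.9), (11, 11, 0.8),
    (5, 5, 0.5),
]
labels = dbscan(points, eps=1.0, min_pts=3)
print(labels)
```

Because the distance is taken over all three components, two units that are spatially adjacent but differ strongly in the non-spatial attribute are not treated as neighbours, which is the effect of combining both kinds of variables in one metric.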

Data Mining ◽  
2013 ◽  
pp. 435-444

2016 ◽  
Vol 25 (3) ◽  
pp. 431-440 ◽  
Author(s):  
Archana Purwar ◽  
Sandeep Kumar Singh

Abstract: Data quality is an important concern in data mining. The validity of mining algorithms is reduced if the data are not of good quality. Data quality can be assessed in terms of missing values (MV) as well as noise present in the data set. Various imputation techniques have been studied for MV, but little attention has been paid to noise in earlier work. Moreover, to the best of our knowledge, no one has used density-based spatial clustering of applications with noise (DBSCAN) for MV imputation. This paper proposes a novel technique, density-based imputation (DBSCANI), built on density-based clustering to deal with incomplete values in the presence of noise. The density-based clustering algorithm proposed by Kriegel groups objects according to their density in spatial databases. The high-density regions are known as clusters, and the low-density regions correspond to the noise objects in the data set. Extensive experiments have been performed on the Iris data set from the life science domain and Jain’s (2D) data set from the shape data sets. The performance of the proposed method is evaluated using root mean square error (RMSE) and compared with existing K-means imputation (KMI). Results show that our method is more noise resistant than KMI on the data sets under study.
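
The abstract does not say how DBSCANI fills a missing value once clusters are known, so the sketch below makes an explicit assumption: a missing entry is replaced by the mean of that column over the other members of the same cluster, and rows labelled as noise are neither used as donors nor imputed. The helper names and the toy rows are hypothetical; the RMSE function is the standard definition.

```python
import math

NOISE = -1  # cluster label assumed to mark noise rows


def impute_by_cluster(rows, labels):
    """Fill None entries with the column mean over the row's own cluster.

    Assumption for this sketch: noise rows are skipped entirely, and an entry
    stays None when its cluster has no observed value in that column.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for c, value in enumerate(row):
            if value is not None or labels[i] == NOISE:
                continue
            donors = [r[c] for r, lab in zip(rows, labels)
                      if lab == labels[i] and r[c] is not None]
            if donors:
                filled[i][c] = sum(donors) / len(donors)
    return filled


def rmse(true_values, imputed_values):
    """Root mean square error between true and imputed values."""
    pairs = list(zip(true_values, imputed_values))
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))


# Toy example: the second row is missing its second attribute (true value 2.0).
rows = [[1.0, 2.0], [1.2, None], [0.8, 2.2], [9.0, 9.0]]
labels = [0, 0, 0, NOISE]  # the last row is treated as noise
filled = impute_by_cluster(rows, labels)
error = rmse([2.0], [filled[1][1]])
```

Keeping the noise row out of the donor set is what would make such a scheme less sensitive to outliers than averaging over every row assigned to a K-means cluster.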


2009 ◽  
pp. 2685-2705
Author(s):  
David A. Gadish

The internal validity of a spatial database can be discovered using the data contained within one or more databases. Spatial consistency includes topological consistency, or the conformance to topological rules. Discovery of inconsistencies in spatial data is an important step for improvement of spatial data quality as part of the knowledge management initiative. An approach for detecting topo-semantic inconsistencies in spatial data is presented. Inconsistencies between pairs of neighboring spatial objects are discovered by comparing relations between spatial objects to rules. A property of spatial objects, called elasticity, has been defined to measure the contribution of each of the objects to inconsistent behavior. Grouping of multiple objects, which are inconsistent with one another, based on their elasticity is proposed. The ability to discover groups of neighboring objects that are inconsistent with one another can serve as the basis of an effort to understand and increase the quality of spatial data sets. Elasticity should therefore be incorporated into knowledge management systems that handle spatial data.


2010 ◽  
pp. 831-848
Author(s):  
David A. Gadish

The quality of vector spatial data can be assessed using the data contained within one or more data warehouses. Spatial consistency includes topological consistency, or the conformance to topological rules (Hadzilacos & Tryfona, 1992; Rodríguez, 2005). Detection of inconsistencies in vector spatial data is an important step for improvement of spatial data quality (Redman, 1992; Veregin, 1991). An approach for detecting topo-semantic inconsistencies in vector spatial data is presented. Inconsistencies between pairs of neighboring vector spatial objects are detected by comparing relations between spatial objects to rules (Klein, 2007). A property of spatial objects, called elasticity, has been defined to measure the contribution of each of the objects to inconsistent behavior. Grouping of multiple objects that are inconsistent with one another, based on their elasticity, is proposed. The ability to detect groups of neighboring objects that are inconsistent with one another can later serve as the basis of an effort to increase the quality of spatial data sets stored in data warehouses, as well as the quality of the results of data-mining processes.


2013 ◽  
Vol 416-417 ◽  
pp. 1244-1250
Author(s):  
Ting Ting Zhao

With the rapid development of spatial information acquisition technology, the variety and size of spatial databases increase continuously. How to extract valuable information from complicated spatial data has become an urgent issue, and spatial data mining provides a new approach to this problem. The paper introduces fuzzy clustering into the field of spatial data clustering, studies how fuzzy set theory can be applied to spatial data mining, and proposes a spatial clustering algorithm based on a fuzzy similarity matrix, the fuzzy similarity clustering algorithm. The algorithm not only overcomes the inability of fuzzy clustering to process large data sets, but also provides a similarity measurement between objects.
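
The abstract gives no details of the proposed algorithm, so the following is only a sketch of the classical technique it appears to build on: form a fuzzy similarity matrix, take its max-min transitive closure, and cut the closure at a threshold to obtain clusters. The similarity measure (one minus normalised Manhattan distance), the function names, and the sample points are assumptions, and this cubic-time version does not address the large-data-set issue the paper targets.

```python
def similarity_matrix(points):
    """Fuzzy similarity in [0, 1]: 1 - (Manhattan distance / largest distance)."""
    n = len(points)
    d = [[sum(abs(a - b) for a, b in zip(points[i], points[j]))
          for j in range(n)] for i in range(n)]
    d_max = max(max(row) for row in d) or 1.0  # avoid dividing by zero
    return [[1.0 - d[i][j] / d_max for j in range(n)] for i in range(n)]


def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]


def transitive_closure(r):
    """Square the relation under max-min composition until it stops changing."""
    while True:
        r2 = max_min_compose(r, r)
        if r2 == r:
            return r
        r = r2


def lambda_cut_clusters(closure, lam):
    """Group objects whose closure similarity is at least lam."""
    labels = [None] * len(closure)
    next_label = 0
    for i in range(len(closure)):
        if labels[i] is None:
            for j in range(len(closure)):
                if closure[i][j] >= lam:
                    labels[j] = next_label
            next_label += 1
    return labels


# Hypothetical 2D points forming two well-separated pairs.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
closure = transitive_closure(similarity_matrix(points))
tight = lambda_cut_clusters(closure, 0.5)
loose = lambda_cut_clusters(closure, 0.1)
```

Lowering the threshold merges clusters, so the same closure matrix yields a whole hierarchy of partitions, and its entries double as the similarity measurement between objects that the abstract mentions.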


Author(s):  
G. Zhou ◽  
Q. Li ◽  
G. Deng ◽  
T. Yue ◽  
X. Zhou

The explosive growth of spatial data and the widespread use of spatial databases emphasize the need for spatial data mining. Co-location pattern discovery is an important branch of spatial data mining. Spatial co-locations represent subsets of features that are frequently located together in geographic space. However, the appearance of a spatial feature C is often determined not by a single spatial feature A or B but by the two features together; that is, where A and B appear together, C often appears. This co-location pattern differs from the traditional co-location pattern. This paper therefore presents a new concept called clustering terms, and the resulting pattern is called a co-location pattern with clustering items. Because the traditional algorithm cannot mine this pattern, we introduce the related concepts in detail and propose a novel algorithm that extends the join-based approach proposed by Huang. Finally, we evaluate the performance of this algorithm.
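
To make the "where A and B appear together, C often appears" idea concrete, here is a small sketch of one possible measure: among all neighbouring (A, B) instance pairs, the fraction that also have an instance of C within the same distance of both. This measure, the function names, the radius, and the coordinates are all hypothetical; the paper's actual definitions and its extension of the join-based approach are not described in the abstract.

```python
import math
from itertools import product


def near(p, q, radius):
    """Neighbour relation: Euclidean distance no larger than radius."""
    return math.dist(p, q) <= radius


def conditional_colocation(instances, radius):
    """Fraction of neighbouring (A, B) pairs with a C instance near both."""
    pairs = [(a, b) for a, b in product(instances["A"], instances["B"])
             if near(a, b, radius)]
    if not pairs:
        return 0.0
    hits = sum(
        1 for a, b in pairs
        if any(near(a, c, radius) and near(b, c, radius)
               for c in instances["C"])
    )
    return hits / len(pairs)


# Hypothetical instances of three spatial features.
instances = {
    "A": [(0, 0), (10, 10), (40, 40)],
    "B": [(0, 1), (10, 11), (40, 41)],
    "C": [(1, 0), (10, 10.5), (20, 21)],
}
score = conditional_colocation(instances, radius=1.5)
```

In this toy data three (A, B) pairs are neighbours and two of them have a C instance nearby, so the score is two thirds; a pattern miner would compare such a score against a user-chosen threshold.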


2012 ◽  
Vol 38 (3) ◽  
pp. 98-105 ◽  
Author(s):  
Lina Papšienė ◽  
Kęstutis Papšys

Reference spatial data sets represent the least changing natural and anthropogenic features of terrain. As a rule, such data are stored at different scales and are most frequently updated consecutively, starting with the spatial data set of a larger scale (usually the base scale) and then updating the data at smaller scales. The generalization of features at a larger scale is one of the major processes employed in the creation and update of spatial data at a smaller scale. To carry out this work effectively, it is recommended to use automatic procedures and to apply generalization only when changes in features are significant, i.e. when they affect the update of features at a smaller scale. The article discusses the relation between changes in polygon features (which identify land cover territories in a base spatial data set) and different generalization processes, as well as the evaluation of the significance of likely changes.


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 3926 ◽  
Author(s):  
Jongwon Kim ◽  
Jeongho Cho

In complex spatial data, different clusters can be closely adjacent, and the density of each cluster can be arbitrary and uneven. In addition, the data may include background noise that does not belong to any cluster, or chain noise that connects multiple clusters. This makes it difficult to separate clusters that touch adjacent clusters, so a new approach is required to handle the nonlinear shapes, irregular densities, and touching problems of adjacent clusters that are common in complex spatial data clustering, as well as to improve robustness against various types of noise in spatial clusters. Accordingly, we propose an efficient graph-based spatial clustering technique that employs Delaunay triangulation and the mechanism of DBSCAN (density-based spatial clustering of applications with noise). In the performance evaluation using simulated synthetic data as well as real 3D point clouds, the proposed method maintained better clustering and separability of neighboring clusters compared to other clustering techniques, and is expected to be of practical use in the field of spatial data mining.

