Introducing the Elasticity of Spatial Data

Author(s):  
David A. Gadish

The quality of vector spatial data can be assessed using the data contained within one or more data warehouses. Spatial consistency includes topological consistency, that is, conformance to topological rules (Hadzilacos & Tryfona, 1992; Rodríguez, 2005). Detection of inconsistencies in vector spatial data is an important step toward improving spatial data quality (Redman, 1992; Veregin, 1991). An approach for detecting topo-semantic inconsistencies in vector spatial data is presented. Inconsistencies between pairs of neighboring vector spatial objects are detected by comparing the relations between spatial objects to rules (Klein, 2007). A property of spatial objects, called elasticity, is defined to measure the contribution of each object to inconsistent behavior. Grouping multiple objects that are inconsistent with one another, based on their elasticity, is proposed. The ability to detect groups of neighboring objects that are inconsistent with one another can later serve as the basis of an effort to increase the quality of spatial data sets stored in data warehouses, as well as the quality of the results of data-mining processes.
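
A minimal sketch of the pairwise rule-checking idea is given below. It is not the chapter's implementation: the rule set, the object classes, and the elasticity proxy (here simply the number of rule violations an object takes part in) are illustrative assumptions, and Shapely is used only as a convenient geometry library.

```python
# Illustrative sketch only: pairs of neighboring objects are compared against
# topological rules, and a simple proxy for "elasticity" counts how often each
# object takes part in a violation (the chapter defines elasticity formally).
from collections import Counter
from shapely.geometry import Polygon

# Hypothetical rule set: allowed topological relations per pair of semantic classes.
RULES = {
    ("parcel", "parcel"): {"disjoint", "touches"},   # parcels may share a border, not interiors
    ("building", "road"): {"disjoint", "touches"},
}

def relation(a, b):
    """Coarse topological relation between two geometries."""
    if a.touches(b):
        return "touches"
    if a.intersects(b):
        return "overlaps"          # any interior intersection counts as an overlap here
    return "disjoint"

def elasticity_scores(objects):
    """objects: list of (id, semantic_class, geometry). Returns violation counts per id."""
    scores = Counter()
    for i, (id_a, cls_a, g_a) in enumerate(objects):
        for id_b, cls_b, g_b in objects[i + 1:]:
            allowed = RULES.get((cls_a, cls_b)) or RULES.get((cls_b, cls_a))
            if allowed and relation(g_a, g_b) not in allowed:
                scores[id_a] += 1   # both objects contribute to the inconsistency
                scores[id_b] += 1
    return scores

# Example: two parcels whose interiors overlap violate the parcel/parcel rule.
parcels = [
    ("p1", "parcel", Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])),
    ("p2", "parcel", Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])),
]
print(elasticity_scores(parcels))   # Counter({'p1': 1, 'p2': 1})
```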

2010 ◽  
pp. 831-848
Author(s):  
David A. Gadish

The quality of vector spatial data can be assessed using the data contained within one or more data warehouses. Spatial consistency includes topological consistency, that is, conformance to topological rules (Hadzilacos & Tryfona, 1992; Rodríguez, 2005). Detection of inconsistencies in vector spatial data is an important step toward improving spatial data quality (Redman, 1992; Veregin, 1991). An approach for detecting topo-semantic inconsistencies in vector spatial data is presented. Inconsistencies between pairs of neighboring vector spatial objects are detected by comparing the relations between spatial objects to rules (Klein, 2007). A property of spatial objects, called elasticity, is defined to measure the contribution of each object to inconsistent behavior. Grouping multiple objects that are inconsistent with one another, based on their elasticity, is proposed. The ability to detect groups of neighboring objects that are inconsistent with one another can later serve as the basis of an effort to increase the quality of spatial data sets stored in data warehouses, as well as the quality of the results of data-mining processes.


2009 ◽  
pp. 2685-2705
Author(s):  
David A. Gadish

The internal validity of a spatial database can be assessed using the data contained within one or more databases. Spatial consistency includes topological consistency, that is, conformance to topological rules. Discovery of inconsistencies in spatial data is an important step toward improving spatial data quality as part of a knowledge management initiative. An approach for detecting topo-semantic inconsistencies in spatial data is presented. Inconsistencies between pairs of neighboring spatial objects are discovered by comparing the relations between spatial objects to rules. A property of spatial objects, called elasticity, is defined to measure the contribution of each object to inconsistent behavior. Grouping multiple objects that are inconsistent with one another, based on their elasticity, is proposed. The ability to discover groups of neighboring objects that are inconsistent with one another can serve as the basis of an effort to understand and increase the quality of spatial data sets. Elasticity should therefore be incorporated into knowledge management systems that handle spatial data.


Author(s):  
David A. Gadish

The internal validity of a spatial database can be assessed using the data contained within one or more databases. Spatial consistency includes topological consistency, that is, conformance to topological rules. Discovery of inconsistencies in spatial data is an important step toward improving spatial data quality as part of a knowledge management initiative. An approach for detecting topo-semantic inconsistencies in spatial data is presented. Inconsistencies between pairs of neighboring spatial objects are discovered by comparing the relations between spatial objects to rules. A property of spatial objects, called elasticity, is defined to measure the contribution of each object to inconsistent behavior. Grouping multiple objects that are inconsistent with one another, based on their elasticity, is proposed. The ability to discover groups of neighboring objects that are inconsistent with one another can serve as the basis of an effort to understand and increase the quality of spatial data sets. Elasticity should therefore be incorporated into knowledge management systems that handle spatial data.


Author(s):  
Gabriella Schoier

The rapid development in the availability of, and access to, spatially referenced information in a variety of areas has created the need for better analysis techniques to understand the various phenomena. In particular, spatial clustering algorithms, which group similar spatial objects into classes, can be used to identify areas sharing common characteristics. The aim of this chapter is to present a density-based algorithm, MDBSCAN, for the discovery of clusters of units in large spatial data sets. The algorithm is a modification of the DBSCAN algorithm (Ester, 1996). The modifications concern the handling of both spatial and non-spatial variables and the use of a Lagrange-Chebyshev metric in place of the usual Euclidean one. The applications concern a synthetic data set and a data set of satellite images.
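
The sketch below illustrates only the core idea, not the MDBSCAN implementation itself: standard DBSCAN from scikit-learn is run over combined spatial and non-spatial attributes with a Chebyshev (maximum-coordinate) distance in place of the Euclidean one. The data, scaling, and parameter values are assumptions made for the example.

```python
# Sketch of the core idea: DBSCAN over combined spatial + non-spatial attributes
# with a Chebyshev distance instead of the usual Euclidean one.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: x, y coordinates plus one non-spatial attribute (e.g. reflectance).
xy = np.vstack([rng.normal((0, 0), 0.2, (50, 2)), rng.normal((3, 3), 0.2, (50, 2))])
attr = np.concatenate([rng.normal(10, 1, 50), rng.normal(40, 1, 50)])[:, None]

# Scale so spatial and non-spatial variables contribute comparably, then stack them.
features = StandardScaler().fit_transform(np.hstack([xy, attr]))

labels = DBSCAN(eps=0.5, min_samples=5, metric="chebyshev").fit_predict(features)
print(np.unique(labels))   # cluster ids; -1 marks noise points
```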


Data Mining ◽  
2013 ◽  
pp. 435-444
Author(s):  
Gabriella Schoier

The rapid development in the availability of, and access to, spatially referenced information in a variety of areas has created the need for better analysis techniques to understand the various phenomena. In particular, spatial clustering algorithms, which group similar spatial objects into classes, can be used to identify areas sharing common characteristics. The aim of this chapter is to present a density-based algorithm, MDBSCAN, for the discovery of clusters of units in large spatial data sets. The algorithm is a modification of the DBSCAN algorithm (Ester, 1996). The modifications concern the handling of both spatial and non-spatial variables and the use of a Lagrange-Chebyshev metric in place of the usual Euclidean one. The applications concern a synthetic data set and a data set of satellite images.


2017 ◽  
Author(s):  
Marek Ślusarski ◽  

The quality of data collected in official spatial databases is crucial for strategic decision making as well as for planning and design work. Awareness of the level of quality of these data is also important for individual users of official spatial data. The author presents methods and models for describing and evaluating the quality of spatial data collected in public registers. The analysis covers data describing space at the highest level of detail, collected in three databases: the land and buildings registry (EGiB), the geodetic registry of the land infrastructure network (GESUT), and the database of topographic objects (BDOT500). The research addressed selected aspects of spatial data quality, namely: assessment of the accuracy of data collected in official spatial databases; determination of the uncertainty of the area of registry parcels; analysis of the risk of damage to underground infrastructure networks caused by poor spatial data quality; construction of a quality model for data collected in official databases; and visualization of uncertainty in spatial data. The evaluation of the accuracy of data collected in official, large-scale spatial databases was based on a representative sample. The test sample was a set of coordinate deviations with three variables, dX, dY, and dl: the deviations of the X and Y coordinates and the length of the offset vector of each test point relative to its position considered faultless. The agreement of the empirical accuracy distributions with theoretical distributions of random variables was investigated, and the accuracy of the spatial data was also assessed using methods resistant to outliers. To determine the accuracy of spatial data collected in public registers, the author's own solution, a resistant method of relative frequency, was applied. Weight functions were proposed that modify, to varying degrees, the lengths dl of the offset vectors of the test points relative to their positions considered faultless. Regarding the uncertainty of estimating the area of registry parcels, the impact of errors in geodetic network points (reference points and points of higher-class networks) was determined, as was the effect of correlation between the coordinates of the same point on the accuracy of the computed parcel area. The scope of corrections to parcel areas in the EGiB database was determined on the basis of re-measurements performed with techniques of equivalent accuracy. Another research topic presented in the paper is the analysis of the risk of damage to underground infrastructure networks caused by low-quality spatial data. Three main factors influencing this risk were identified: incompleteness of spatial data sets and insufficient accuracy in determining the horizontal and vertical positions of underground infrastructure. A quantitative and qualitative method for estimating project risk was developed, and the author's risk estimation technique, based on fuzzy logic, was proposed. Risk maps (2D and 3D) for damage to underground infrastructure networks were developed as large-scale thematic maps presenting the design risk in qualitative and quantitative form.
The data quality model is a set of rules used to describe the quality of these data sets. The proposed model defines a standardized approach for assessing and reporting the quality of the EGiB, GESUT, and BDOT500 spatial databases. Quantitative and qualitative rules (automatic, office, and field) for controlling data sets were defined. The minimum sample size and the admissible number of nonconformities in random samples were determined. The data quality elements were described using the following descriptors: range, measure, result, and type and unit of value. Data quality studies were performed according to users' needs. The values of the impact weights were determined using the analytic hierarchy process (AHP). The harmonization of the conceptual models of the EGiB, GESUT, and BDOT500 databases with the BDOT10k database was also analysed. It was found that downloading and supplying information from the analysed registers in the BDOT10k creation and update processes is limited. Cartographic visualization techniques are an effective way of providing users of spatial data sets with information about data uncertainty. Based on the author's experience and research on examining the quality of official spatial databases, a set of methods for visualizing the uncertainty of the EGiB, GESUT, and BDOT500 databases was defined. This set includes visualization techniques designed to present three types of uncertainty: positional, attribute-value, and temporal. Positional uncertainty was represented (for areal, linear, and point objects) using several (three to five) visual variables. Attribute-value and temporal uncertainty, describing for example the completeness or timeliness of data sets, are presented by means of three graphical variables. The research problems presented in the paper are of both cognitive and practical importance. They demonstrate that the quality of spatial data collected in public registers can be evaluated effectively and may become an important element of an expert system.
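
As an illustration of the deviation variables described above, the sketch below computes the offset-vector lengths dl = sqrt(dX^2 + dY^2) for a hypothetical test sample and contrasts a conventional RMSE with a simple outlier-resistant scale estimate. The author's resistant method of relative frequency and the proposed weight functions are not reproduced here.

```python
# Illustrative sketch only: offset-vector lengths for a hypothetical test sample,
# with a classical accuracy measure versus a simple outlier-resistant one.
import numpy as np

rng = np.random.default_rng(1)
dX = rng.normal(0.0, 0.05, 500)            # deviations in X [m], hypothetical sample
dY = rng.normal(0.0, 0.05, 500)            # deviations in Y [m]
dX[:5] += 1.0                              # a few gross errors contaminate the sample
dl = np.hypot(dX, dY)                      # length of the point offset vector

rmse = np.sqrt(np.mean(dl ** 2))           # classical estimate, inflated by the outliers
robust = 1.4826 * np.median(np.abs(dl - np.median(dl)))  # MAD-based scale estimate
print(f"RMSE = {rmse:.3f} m, robust scale = {robust:.3f} m")
```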


2020 ◽  
Vol 12 (1) ◽  
pp. 580-597
Author(s):  
Mohamad Hamzeh ◽  
Farid Karimipour

An inevitable aspect of modern petroleum exploration is the simultaneous consideration of large, complex, and disparate spatial data sets. In this context, the present article proposes the optimized fuzzy ELECTRE (OFE) approach, which combines the artificial bee colony (ABC) optimization algorithm, fuzzy logic, and an outranking method to assess petroleum potential at the petroleum system level in a spatial framework, using experts' knowledge and the information available in the discovered petroleum accumulations simultaneously. It uses the characteristics of the essential elements of a petroleum system as key criteria. To demonstrate the approach, a case study was conducted on the Red River petroleum system of the Williston Basin. After the required preprocessing steps, eight spatial data sets associated with the criteria were integrated using the OFE to produce a map that delineates the areas with the highest petroleum potential and the lowest risk for further exploratory investigations. Success and prediction rate curves were used to measure the performance of the model. Both success and prediction accuracies lie in the range of 80–90%, indicating excellent model performance. Considering the five-class petroleum potential, the proposed approach outperforms the spatial models used in previous studies. In addition, a comparison of the FE and OFE results indicated that optimizing the weights with the ABC algorithm improved accuracy by approximately 15%, yielding a relatively higher success rate and lower risk in petroleum exploration.
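
A greatly simplified, crisp ELECTRE-style concordance comparison is sketched below to show the outranking idea behind the OFE. The article's actual approach additionally uses fuzzy criterion values, a full outranking procedure, and weights optimized by the artificial bee colony algorithm; the cells, criteria, and weights here are purely hypothetical.

```python
# Simplified, crisp ELECTRE-style concordance over hypothetical grid cells;
# not the OFE implementation from the article.
import numpy as np

rng = np.random.default_rng(2)
cells = rng.random((6, 4))                 # 6 hypothetical grid cells x 4 criteria (higher = better)
weights = np.array([0.4, 0.3, 0.2, 0.1])   # criterion weights (the OFE would optimize these)

n = len(cells)
concordance = np.zeros((n, n))
for a in range(n):
    for b in range(n):
        if a != b:
            # share of total weight on which cell a is at least as good as cell b
            concordance[a, b] = weights[cells[a] >= cells[b]].sum()

# Rank cells by how strongly they outrank the others on average.
score = concordance.sum(axis=1) / (n - 1)
print(np.argsort(score)[::-1])             # cell indices from highest to lowest potential
```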

