An Ontology-driven Cyberinfrastructure for Intelligent Spatiotemporal Question Answering and Open Knowledge Discovery

2019, Vol 8 (11), pp. 496
Author(s): Li, Song, Tian

The proliferation of geospatial data from diverse sources, such as Earth observation satellites, social media, and unmanned aerial vehicles (UAVs), has created a pressing demand for cross-platform data integration, interoperation, and intelligent data analysis. To address this big data challenge, this paper reports our research in developing a rule-based, semantic-enabled service chain model to support intelligent question answering by leveraging the abundant data and processing resources available online. Four key techniques were developed to achieve this goal: (1) a spatial and temporal reasoner resolves the spatial and temporal information in a given scientific question and enables place-name disambiguation with support from a gazetteer; (2) a spatial operation ontology categorizes important spatial analysis operations, data types, and data themes for use in automated chain generation; (3) a language-independent chaining rule defines the template for input, spatial operation, and output, as well as rules for embedding multiple spatial operations to solve a complex problem; and (4) a recursive algorithm generates executable workflow metadata according to the chaining rules. We implement this service chain model in a cyberinfrastructure for online and reproducible spatial analysis and question answering. Moving the problem-solving environment from a desktop onto a geospatial cyberinfrastructure (GeoCI) offers better support for collaborative spatial decision-making and ensures science replicability. We expect this work to contribute significantly to the advancement of reproducible spatial data science and to building the next-generation open knowledge network.
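The recursive chaining idea described in technique (4) can be sketched in a few lines. The operation names, data types, and rule table below are hypothetical illustrations, not the paper's actual ontology: each rule maps an operation to its required input types and the type it produces, and the algorithm recurses whenever a required input is not directly available.

```python
# Minimal sketch (not the paper's implementation) of recursive service-chain
# generation: if a required data type is not at hand, find an operation whose
# output produces it and recurse on that operation's inputs, yielding a
# nested workflow description.

# Hypothetical chaining rules: operation -> (required input types, output type)
RULES = {
    "clip":    ({"raster", "boundary"}, "clipped_raster"),
    "slope":   ({"clipped_raster"},     "slope_raster"),
    "reclass": ({"slope_raster"},       "suitability"),
}

def build_chain(target, available, rules=RULES):
    """Recursively build a workflow that produces `target` from `available` data."""
    if target in available:
        return target                      # base case: data already at hand
    for op, (inputs, output) in rules.items():
        if output == target:
            # recurse on every required input, embedding sub-chains
            return {op: [build_chain(i, available, rules) for i in sorted(inputs)]}
    raise ValueError(f"no rule produces {target!r}")

workflow = build_chain("suitability", available={"raster", "boundary"})
# workflow nests clip inside slope inside reclass
```

A real implementation would emit executable workflow metadata (e.g., service endpoints and parameter bindings) rather than a nested dictionary, but the recursion over chaining rules has this shape.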

Author(s): Stephen Matthews, Rachel Bacon, R. L’Heureux Lewis-McCoy, Ellis Logan

Recent years have seen rapidly growing interest in adding a spatial perspective, especially in the social and health sciences, and in part this growth has been driven by the ready availability of georeferenced or geospatial data and the tools to analyze them: geographic information science (GIS), spatial analysis, and spatial statistics. Indeed, research on racial/ethnic segregation and other forms of social stratification, as well as research on human health and behavior problems such as obesity, mental health, risk-taking behaviors, and crime, depends on the collection and analysis of individual- and contextual-level (geographic area) data across a wide range of spatial and temporal scales. Given these considerations, researchers are continuously developing new ways to harness and analyze geo-referenced data. Indeed, a prerequisite for spatial analysis is the availability of information on locations (i.e., places) and the attributes of those locations (e.g., poverty rates, educational attainment, religious participation, or disease prevalence). This Oxford Bibliographies article has two main parts. First, following a general overview of spatial concepts and spatial thinking in sociology, we introduce the field of spatial analysis, focusing on easily available textbooks (introductory, handbooks, and advanced), journals, data, and online instructional resources. The second half of this article provides an explicit focus on spatial approaches within specific areas of sociological inquiry, including crime, demography, education, health, inequality, and religion. This section is not meant to be exhaustive but rather to indicate how some concepts, measures, data, and methods have been used by sociologists, criminologists, and demographers in their research. Throughout all sections we have attempted to introduce classic articles as well as contemporary studies.
Spatial analysis is a general term for an array of statistical techniques that use locational information to better understand the pattern of observed attribute values and the processes that generated that pattern. The best-known early example of spatial analysis is John Snow’s 1854 cholera map of London, but the origins of spatial analysis can be traced back to France during the 1820s and 1830s and the period of statistique morale, specifically the work of Guerry, d’Angeville, Dupin, and Quetelet. The foundation of current spatial statistical practice was built on methodological developments in statistics and ecology during the 1950s and in quantitative geography during the 1960s and 1970s, and the field has been greatly enhanced by improvements in computer and information technologies for the collection, visualization, and analysis of geographic or geospatial data. In the early 21st century, four main methodological approaches to spatial analysis can be identified in the literature: exploratory spatial data analysis (ESDA), spatial statistics, spatial econometrics, and geostatistics. The range of spatial-analytical methods available to researchers is wide and growing, in part a function of the different types of analytical units and data types used in formal spatial analysis: point data (e.g., crime events, disease cases), line data (e.g., networks, routes), spatially continuous or field data (e.g., accessibility surfaces), and area or lattice data (e.g., unemployment and mortality rates). Applications of geospatial data and/or spatial analysis are increasingly found in sociological research, especially in studies of spatial inequality, residential segregation, demography, education, religion, neighborhoods and health, and criminology.
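One workhorse of the spatial statistics and ESDA approaches mentioned above is Moran's I, a measure of spatial autocorrelation for area/lattice data. A minimal from-scratch sketch follows; real analyses would use a library such as PySAL, and the four-region example here is purely illustrative:

```python
# Moran's I from scratch: I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2,
# where z are mean-deviations and W is a binary spatial-weights matrix.

def morans_i(values, weights):
    """Compute Moran's I for attribute values and a binary weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                 # deviations from the mean
    s0 = sum(sum(row) for row in weights)          # total weight
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den

# Four regions on a line; adjacent regions are neighbours (rook contiguity).
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
clustered = morans_i([1, 1, 0, 0], W)    # like values adjoin: positive I
alternating = morans_i([1, 0, 1, 0], W)  # unlike values adjoin: negative I
```

Positive I indicates spatial clustering (high values near high, low near low), negative I indicates a checkerboard-like pattern, and values near the expectation of -1/(n-1) indicate spatial randomness.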


2020, Vol 9 (5), pp. 334
Author(s): Timofey E. Samsonov

Combining misaligned spatial data from different sources complicates spatial analysis and map creation. Conflation is a process that solves the misalignment problem through spatial adjustment or attribute transfer between similar features in two datasets. Even though combining a digital elevation model (DEM) with vector hydrographic lines is common practice in spatial analysis and mapping, no method for automated conflation between these spatial data types has been developed so far. The problem of DEM and hydrography misalignment arises not only in map compilation, but also in the production of generalized datasets. There is a lack of automated solutions which can ensure that the drainage network represented in the surface of a generalized DEM is spatially adjusted to independently generalized vector hydrography. We propose a new method that performs conflation of a DEM with linear hydrographic data and is embeddable into the DEM generalization process. Given a set of reference hydrographic lines, our method automatically recognizes the most similar paths on the DEM surface, called counterpart streams. The elevation data extracted from the DEM are then rubbersheeted locally using the links between counterpart streams and reference lines, and the conflated DEM is reconstructed from the rubbersheeted elevation data. The algorithm developed for extraction of counterpart streams ensures that the resulting set of lines comprises a network similar to the network of ordered reference lines. We also show how our approach can be seamlessly integrated into a TIN-based structural DEM generalization process with spatial adjustment to pre-generalized hydrographic lines as an additional requirement. The combination of the GEBCO_2019 DEM and the Natural Earth 10M vector dataset is used to illustrate the effectiveness of DEM conflation in both map compilation and map generalization workflows. The resulting maps are geographically correct and aesthetically more pleasing than a straightforward combination of misaligned DEM and hydrographic lines without conflation.
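The local rubbersheeting step can be illustrated with a toy version (this is not the paper's algorithm, and the inverse-distance weighting scheme is our simplifying assumption): displacement "links" run from counterpart-stream vertices to matching reference-line vertices, and nearby points are shifted by a distance-weighted blend of the link vectors.

```python
# Toy rubbersheeting: each point is displaced by an inverse-distance-weighted
# average of the link displacement vectors; a point sitting exactly on a link
# source follows that link exactly.
import math

def rubbersheet(points, links, power=2.0):
    """Shift (x, y) points by IDW-interpolated displacements from `links`.

    links: list of ((sx, sy), (tx, ty)) pairs, source vertex -> target vertex.
    """
    out = []
    for x, y in points:
        wsum, dx, dy = 0.0, 0.0, 0.0
        snapped = False
        for (sx, sy), (tx, ty) in links:
            d = math.hypot(x - sx, y - sy)
            if d == 0.0:                   # point coincides with a link source
                out.append((tx, ty))
                snapped = True
                break
            w = d ** -power                # closer links dominate
            wsum += w
            dx += w * (tx - sx)
            dy += w * (ty - sy)
        if not snapped:
            out.append((x + dx / wsum, y + dy / wsum))
    return out

links = [((0.0, 0.0), (1.0, 0.0)),     # a vertex displaced to its reference line
         ((10.0, 0.0), (10.0, 0.0))]   # an anchor that does not move
moved = rubbersheet([(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)], links)
```

In the actual method the rubbersheeted values are elevations extracted from the DEM, and the conflated DEM is rebuilt from them; the displacement interpolation above only shows the general mechanism.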


2020, Vol 8
Author(s): Devasis Bassu, Peter W. Jones, Linda Ness, David Shallcross

Abstract. In this paper we present a theoretical foundation for representing a data set as a measure within a very large, hierarchically parametrized family of positive measures, whose parameters can be computed explicitly (rather than estimated by optimization), and we illustrate its applicability to a wide range of data types. The preprocessing step then consists of representing data sets as simple measures. The theoretical foundation consists of a dyadic product formula representation lemma and a visualization theorem. We also define an additive multiscale noise model that can be used to sample from dyadic measures, and a more general multiplicative multiscale noise model that can be used to perturb continuous functions, Borel measures, and dyadic measures. The first two results are based on theorems in [15, 3, 1]. The representation uses the very simple concept of a dyadic tree and hence is widely applicable, easily understood, and easily computed. Since the data sample is represented as a measure, subsequent analysis can exploit statistical and measure-theoretic concepts and theories. Because the representation is built on a dyadic tree defined on the universe of a data set, and its parameters are simply and explicitly computable as well as easily interpretable and visualizable, we hope this approach will be broadly useful to mathematicians, statisticians, and computer scientists who are intrigued by or involved in data science, including its mathematical foundations.
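The explicit computability of the parameters can be seen in a small sketch. The notation below is ours, not the paper's: data on [0, 1) is summarized by, for each dyadic interval, the fraction of sample mass falling in its left half, and these parameters are obtained by direct counting rather than optimization.

```python
# Sketch of the dyadic-tree representation: for each dyadic interval
# [k/2^l, (k+1)/2^l) that contains sample mass, record the fraction of that
# mass in the interval's left half. These fractions determine a measure.

def dyadic_parameters(samples, depth):
    """Return {(level, k): left-half mass fraction} for occupied dyadic intervals."""
    params = {}
    for level in range(depth):
        width = 1.0 / (1 << level)
        for k in range(1 << level):
            lo = k * width
            inside = [s for s in samples if lo <= s < lo + width]
            if inside:
                left = sum(1 for s in inside if s < lo + width / 2)
                params[(level, k)] = left / len(inside)
    return params

# Mass concentrated in [0, 0.25): every recorded split sends everything left.
params = dyadic_parameters([0.0, 0.1, 0.2], depth=2)
```

Each parameter is a simple ratio over a node of the dyadic tree, which is why the representation is, as the abstract says, easily computed, interpreted, and visualized.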


2019
Author(s): Levi John Wolf, Sergio J. Rey, Taylor M. Oshan

Open science practices are a large and healthy part of computational geography and the burgeoning field of spatial data science. In many forms, open geospatial cyberinfrastructure adheres to a varying and informal set of practices and codes that enable levels of collaboration that would otherwise be impossible. Pathbreaking work in the geographical sciences has explicitly brought these concepts into focus for our current model of open science in geography. In practice, however, these blend together into a somewhat ill-advised but easy-to-use working definition of open science: you know open science when you see it (on GitHub). However, open science lags far behind the needs revealed by this level of collaboration. In this paper, we describe the concerns of open geographic data science in terms of replicability and open science. We discuss practical techniques that engender community-building in open science communities, and examine the impacts these kinds of social changes have on the technological architecture of scientific infrastructure.


Author(s): Gregory Vogel

In this article I present a theoretical framework for understanding Caddoan mounds in the central Arkansas River drainage and the implications they may hold for the social structure and environmental adaptations of the people who made them. The power and efficiency of Geographic Information Systems (GIS) modeling now allow for large-scale, computationally intensive spatial analysis that was simply not possible before. Questions of landscape organization or spatial relationships that previously would have taken months or even years to answer can now be solved in a matter of minutes with GIS and related technologies, given the appropriate datasets. Quite importantly, though, such analyses must first be placed in context and theory if they are to be meaningful additions to our understanding of the past. While it is conventional to refer to “GIS analysis” (and I use the term in this article), it is important to keep in mind that data manipulations alone are not analysis. GIS, along with statistical software and related computer technologies, is a tool of spatial analysis just as shovels and trowels are tools of excavation. Such tools can organize and reveal information if they are employed carefully, but the tools themselves have no agency and cannot interpret anything on their own. The terms “GIS analysis” or “GIS interpretation” are therefore somewhat misnomers, just as “trowel analysis” or “trowel interpretation” would be. It is not the GIS, or any component of it, that does the analysis or interpretation; it simply manipulates spatial data. We interpret these manipulations based upon theoretical background, previous research, and the questions we wish to answer.


2020, Vol 1, pp. 1-23
Author(s): Majid Hojati, Colin Robertson

Abstract. With new forms of digital spatial data driving new applications for monitoring and understanding environmental change, there are growing demands on traditional GIS tools for spatial data storage, management, and processing. Discrete Global Grid Systems (DGGS) are methods of tessellating the globe into multiresolution grids that represent a global spatial fabric capable of storing heterogeneous spatial data and of improving performance in data access, retrieval, and analysis. While DGGS-based GIS may hold potential for next-generation big-data GIS platforms, few studies have tried to implement them as a framework for operational spatial analysis. Cellular automata (CA) are a classic dynamic modelling framework that has been used with the traditional raster data model for various kinds of environmental modelling, such as wildfire and urban-expansion modelling. The main objectives of this paper are to (i) investigate the possibility of using DGGS for running dynamic spatial analysis, (ii) evaluate CA as a generic data model for modelling dynamic phenomena within a DGGS data model, and (iii) evaluate an in-database approach to CA modelling. To do so, a case study in wildfire spread modelling is developed. Results demonstrate that using a DGGS data model not only provides the ability to integrate different data sources, but also provides a framework for doing spatial analysis without geometry-based operations. This results in a simplified architecture and a common spatial fabric that supports the development of a wide array of spatial algorithms. While considerable work remains to be done, CA modelling within a DGGS-based GIS is a robust and flexible modelling framework for big-data GIS analysis in an environmental monitoring context.
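The CA update rule at the heart of such a wildfire model can be sketched on an ordinary square grid. The paper runs CA over DGGS cells instead (multiresolution, typically hexagonal, and evaluated in-database), but the synchronous update has the same shape; the three-state fire model below is a common textbook simplification, not the paper's exact model:

```python
# Minimal cellular-automaton fire spread: a burning cell burns out and
# ignites its unburnt rook neighbours in one synchronous step.
UNBURNT, BURNING, BURNT = 0, 1, 2

def step(grid):
    """Apply one synchronous CA update to a 2D list-of-lists grid."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]                 # next state, updated in parallel
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == BURNING:
                new[r][c] = BURNT
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc] == UNBURNT:
                        new[rr][cc] = BURNING
    return new

grid = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
grid = step(grid)   # fire spreads to the four neighbours of the centre cell
```

On a DGGS the neighbour lookup would come from the grid system's cell-adjacency function rather than row/column offsets, which is what lets the same rule run over a global, multiresolution fabric.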


2020, Vol 19 (4), pp. 13-20
Author(s): A. A. Korneenkov, S. V. Ryazantsev, I. V. Fanta, E. E. Vyazemskaya, ...

Identifying the risk factors, features, and patterns of the emergence and spread of diseases in space requires a large array of diverse data and a substantial mathematical and statistical apparatus. The distribution of diseases in space is studied using spatial analysis tools, which are becoming widely used as information systems are introduced and data relevant to public health accumulate. For most tasks involving spatial data (data and events that have geographic coordinates), various geographic information systems are used. Sensorineural hearing loss (SNHL) was chosen as the disease for spatial analysis, based on patients treated at the Saint Petersburg Research Institute of Ear, Throat, Nose and Speech during one year of the study. The main tasks of the spatial analysis of data on the incidence of SNHL leading to hospitalization were: visualization of the point pattern formed by the geographic coordinates of the places of residence of inpatients with SNHL; assessment of the properties of the spatial process that generates this point pattern (estimating the intensity of the process and its regularities) using various statistical indicators; and testing the hypothesis of complete spatial randomness of this process and the influence of individual factors on it. R code accompanies all calculations in the article; the calculations can be reproduced easily, and the text can serve as step-by-step instructions for doing so.
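The randomness test described above is commonly done with quadrat counts. The article itself works in R (e.g., with point-pattern packages); the Python sketch below is only illustrative and the toy coordinates are assumptions: under complete spatial randomness, counts per quadrat should be near-uniform, and the chi-square statistic measures the departure.

```python
# Quadrat-count check for complete spatial randomness (CSR): divide the unit
# square into nx * ny quadrats, count points in each, and compute the
# chi-square statistic against a uniform expectation.

def quadrat_chi2(points, nx, ny):
    """Chi-square statistic for point counts over an nx-by-ny grid on [0,1)^2."""
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        counts[int(y * ny)][int(x * nx)] += 1
    expected = len(points) / (nx * ny)        # uniform expectation under CSR
    return sum((c - expected) ** 2 / expected
               for row in counts for c in row)

# All points crowded into one quadrat vs. one point per quadrat.
clustered = quadrat_chi2([(0.1, 0.1), (0.2, 0.1), (0.1, 0.2), (0.2, 0.2)], 2, 2)
even = quadrat_chi2([(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)], 2, 2)
```

A large statistic (compared to the chi-square distribution with nx*ny - 1 degrees of freedom) rejects CSR, suggesting clustering or regularity in, for example, patients' places of residence.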


GeoJournal, 2016, Vol 81 (6), pp. 965-968
Author(s): Shaowen Wang
