Spatial Statistics

Author(s):  
Christopher K. Wikle

The climate system consists of interactions between physical, biological, chemical, and human processes across a wide range of spatial and temporal scales. Characterizing the behavior of components of this system is crucial for scientists and decision makers. There is substantial uncertainty associated with observations of this system as well as our understanding of various system components and their interaction. Thus, inference and prediction in climate science should accommodate uncertainty in order to facilitate the decision-making process. Statistical science is designed to provide the tools to perform inference and prediction in the presence of uncertainty. In particular, the field of spatial statistics considers inference and prediction for uncertain processes that exhibit dependence in space and/or time. Traditionally, this is done descriptively through the characterization of the first two moments of the process, one expressing the mean structure and one accounting for dependence through covariability.

Historically, there are three primary areas of methodological development in spatial statistics: geostatistics, which considers processes that vary continuously over space; areal or lattice processes, which consider processes defined on a countable discrete domain (e.g., political units); and spatial point patterns (or point processes), which consider the locations of events in space to be a random process. All of these methods have been used in the climate sciences, but the most prominent has been the geostatistical methodology. This methodology was discovered simultaneously in geology and in meteorology; it provides a way to perform optimal prediction (interpolation) in space and can facilitate parameter inference for spatial data. These methods rely strongly on Gaussian process theory, which is increasingly of interest in machine learning.
These methods are common in the spatial statistics literature, but much development is still being done in the area to accommodate more complex processes and “big data” applications. Newer approaches are based on restricting models to neighbor-based representations or reformulating the random spatial process in terms of a basis expansion. There are many computational and flexibility advantages to these approaches, depending on the specific implementation. Complexity is also increasingly being accommodated through the use of the hierarchical modeling paradigm, which provides a probabilistically consistent way to decompose the data, process, and parameters corresponding to the spatial or spatio-temporal process.

Perhaps the biggest challenge in modern applications of spatial and spatio-temporal statistics is to develop methods that are flexible, account for the complex dependencies between and across processes, account for uncertainty in all aspects of the problem, and remain computationally tractable. These are daunting challenges, yet it is a very active area of research, and new solutions are constantly being developed. New methods are also being rapidly developed in the machine learning community, and these methods are increasingly applicable to dependent processes. The interaction and cross-fertilization between the machine learning and spatial statistics communities is growing, which will likely lead to a new generation of spatial statistical methods that are applicable to climate science.
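As a concrete illustration of the geostatistical methodology described above, the following sketch implements simple kriging under a zero-mean Gaussian process with an exponential covariance function. All locations, observations, and parameter values (sill, range, nugget) are synthetic and purely illustrative.

```python
import numpy as np

def exp_cov(d, sigma2=1.0, phi=0.5):
    # Exponential covariance: C(d) = sigma^2 * exp(-d / phi)
    return sigma2 * np.exp(-d / phi)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(20, 2))   # observed 2-D locations
z = rng.standard_normal(20)           # synthetic zero-mean observations
x0 = np.array([0.5, 0.5])             # prediction location

# Pairwise distances among data, and data-to-prediction distances
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
d0 = np.linalg.norm(X - x0, axis=1)

C = exp_cov(D) + 1e-6 * np.eye(20)    # small nugget for numerical stability
c0 = exp_cov(d0)

# Simple-kriging predictor: z_hat = c0^T C^{-1} z
w = np.linalg.solve(C, c0)
z_hat = w @ z
# Kriging variance: sigma^2 - c0^T C^{-1} c0
krig_var = exp_cov(0.0) - c0 @ w
```

The kriging variance quantifies the prediction uncertainty at `x0`; it shrinks toward zero as observations accumulate near the prediction location.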

2022
Author(s):  
Md Mahbub Alam
Luis Torgo
Albert Bifet

Due to the surge in spatio-temporal data volume, the popularity of location-based services and applications, and the importance of knowledge extracted from spatio-temporal data for solving a wide range of real-world problems, a plethora of research and development work has been done in the area of spatial and spatio-temporal data analytics in the past decade. The main goal of existing work has been to develop algorithms and technologies to capture, store, manage, analyze, and visualize spatial or spatio-temporal data. Researchers have contributed either by adding spatio-temporal support to existing systems, by developing a new system from scratch, or by implementing algorithms for processing spatio-temporal data. The existing ecosystem of spatial and spatio-temporal data analytics systems can be categorized into three groups: (1) spatial databases (SQL and NoSQL), (2) big spatial data processing infrastructures, and (3) programming languages and GIS software. Since existing surveys mostly investigated infrastructures for processing big spatial data, this survey has explored the whole ecosystem of spatial and spatio-temporal analytics. This survey also portrays the importance and future of spatial and spatio-temporal data analytics.


Author(s):  
Stephen Matthews
Rachel Bacon
R. L’Heureux Lewis-McCoy
Ellis Logan

Recent years have seen a rapid growth in interest in the addition of a spatial perspective, especially in the social and health sciences, and in part this growth has been driven by the ready availability of georeferenced or geospatial data, and the tools to analyze them: geographic information science (GIS), spatial analysis, and spatial statistics. Indeed, research on race/ethnic segregation and other forms of social stratification as well as research on human health and behavior problems, such as obesity, mental health, risk-taking behaviors, and crime, depend on the collection and analysis of individual- and contextual-level (geographic area) data across a wide range of spatial and temporal scales. Given all of these considerations, researchers are continuously developing new ways to harness and analyze geo-referenced data. Indeed, a prerequisite for spatial analysis is the availability of information on locations (i.e., places) and the attributes of those locations (e.g., poverty rates, educational attainment, religious participation, or disease prevalence). This Oxford Bibliographies article has two main parts. First, following a general overview of spatial concepts and spatial thinking in sociology, we introduce the field of spatial analysis focusing on easily available textbooks (introductory, handbooks, and advanced), journals, data, and online instructional resources. The second half of this article provides an explicit focus on spatial approaches within specific areas of sociological inquiry, including crime, demography, education, health, inequality, and religion. This section is not meant to be exhaustive but rather to indicate how some concepts, measures, data, and methods have been used by sociologists, criminologists, and demographers during their research. Throughout all sections we have attempted to introduce classic articles as well as contemporary studies. 
Spatial analysis is a general term describing an array of statistical techniques that utilize locational information to better understand the pattern of observed attribute values and the processes that generated the observed pattern. The best-known early example of spatial analysis is John Snow’s 1854 cholera map of London, but the origins of spatial analysis can be traced back to France during the 1820s and 1830s and the period of morale statistique, specifically the work of Guerry, d’Angeville, Dupin, and Quetelet. The foundation of current spatial statistical analysis practice was built on methodological developments in statistics and ecology during the 1950s and in quantitative geography during the 1960s and 1970s, and the field has been greatly enhanced by improvements in computer and information technologies relevant to the collection, visualization, and analysis of geographic or geospatial data. In the early 21st century, four main methodological approaches to spatial analysis can be identified in the literature: exploratory spatial data analysis (ESDA), spatial statistics, spatial econometrics, and geostatistics. The diversity of spatial-analytical methods available to researchers is wide and growing, which is also a function of the different types of analytical units and data types used in formal spatial analysis: point data (e.g., crime events, disease cases), line data (e.g., networks, routes), spatially continuous or field data (e.g., accessibility surfaces), and area or lattice data (e.g., unemployment and mortality rates). Applications of geospatial data and/or spatial analysis are increasingly found in sociological research, especially in studies of spatial inequality, residential segregation, demography, education, religion, neighborhoods and health, and criminology.
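As a minimal example of the ESDA methods mentioned above, the following sketch computes Moran's I, a classic statistic of spatial autocorrelation for area (lattice) data. The four-region path adjacency and attribute values are invented for illustration: similar neighboring values give a positive statistic, an alternating pattern a negative one.

```python
import numpy as np

def morans_i(y, W):
    # Moran's I: (n / sum(W)) * (y_c^T W y_c) / (y_c^T y_c),
    # where y_c are the mean-centered attribute values.
    y_c = y - y.mean()
    n = len(y)
    return (n / W.sum()) * (y_c @ W @ y_c) / (y_c @ y_c)

# Binary adjacency for four regions arranged in a line: 0-1-2-3
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

clustered = np.array([10.0, 10.0, 1.0, 1.0])   # neighbors are similar
dispersed = np.array([10.0, 1.0, 10.0, 1.0])   # neighbors alternate

i_clustered = morans_i(clustered, W)   # positive: spatial clustering
i_dispersed = morans_i(dispersed, W)   # negative: spatial dispersion
```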


2020
Author(s):  
Mark Naylor
Kirsty Bayliss
Finn Lindgren
Francesco Serafini
Ian Main

Many earthquake forecasting approaches have developed bespoke codes to model and forecast the spatio-temporal evolution of seismicity. At the same time, the statistics community has been working on a range of point-process modelling codes. For example, motivated by ecological applications, inlabru models spatio-temporal point processes as a log-Gaussian Cox process and is implemented in R. Here we present an initial implementation of inlabru to model seismicity. This fully Bayesian approach is computationally efficient because it uses a nested Laplace approximation: posteriors are assumed to be Gaussian, so their means and standard deviations can be estimated deterministically rather than constructed through sampling. Further, building on existing R packages for handling spatial data, it can construct covariate maps from diverse data types, such as fault maps, in an intuitive and simple manner.

Here we present an initial application to the California earthquake catalogue to determine the relative performance of different data sets for describing the spatio-temporal evolution of seismicity.
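As a rough illustration of the model class used here (not the inlabru implementation itself, which is in R), the following sketch simulates a log-Gaussian Cox process on a one-dimensional grid: a Gaussian process supplies the log-intensity, and event counts in each cell are conditionally Poisson. The covariance parameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Regular 1-D grid of cell centres over [0, 10]
n_cells = 100
s = np.linspace(0, 10, n_cells)
cell_width = s[1] - s[0]

# Gaussian process for the log-intensity, exponential covariance
D = np.abs(s[:, None] - s[None, :])
K = 0.5 * np.exp(-D / 2.0) + 1e-8 * np.eye(n_cells)
log_lambda = 1.0 + np.linalg.cholesky(K) @ rng.standard_normal(n_cells)

# Cox process: given the latent intensity, cell counts are Poisson
counts = rng.poisson(np.exp(log_lambda) * cell_width)
```

Fitting such a model (e.g., recovering the latent `log_lambda` from `counts`) is exactly the task that the nested Laplace approximation makes deterministic rather than sampling-based.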


Author(s):  
Noel Cressie
Matthew Sainsbury-Dale
Andrew Zammit-Mangion

Spatial statistics is concerned with the analysis of data that have spatial locations associated with them, and those locations are used to model statistical dependence between the data. The spatial data are treated as a single realization from a probability model that encodes the dependence through both fixed effects and random effects, where randomness is manifest in the underlying spatial process and in the noisy, incomplete measurement process. The focus of this review article is on the use of basis functions to provide an extremely flexible and computationally efficient way to model spatial processes that are possibly highly nonstationary. Several examples of basis-function models are provided to illustrate how they are used in Gaussian, non-Gaussian, multivariate, and spatio-temporal settings, with applications in geophysics. Our aim is to emphasize the versatility of these spatial-statistical models and to demonstrate that they are now center-stage in a number of application domains. The review concludes with a discussion and illustration of software currently available to fit spatial-basis-function models and implement spatial-statistical prediction. Expected final online publication date for the Annual Review of Statistics and Its Application, Volume 9 is March 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
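A minimal sketch of the basis-function idea reviewed here: a spatial process is represented as a linear combination of local bisquare basis functions, and the coefficients are estimated by ridge-regularized least squares. The one-dimensional data, basis centres, and radius below are all synthetic and illustrative.

```python
import numpy as np

def bisquare(s, centres, radius):
    # Local bisquare basis: (1 - (d/r)^2)^2 for d < r, else 0
    d = np.abs(s[:, None] - centres[None, :])
    b = (1 - (d / radius) ** 2) ** 2
    b[d >= radius] = 0.0
    return b

rng = np.random.default_rng(2)
s = np.linspace(0, 1, 200)                      # observation locations
y = np.sin(2 * np.pi * s) + 0.1 * rng.standard_normal(200)

centres = np.linspace(0, 1, 10)                 # basis-function centres
Phi = bisquare(s, centres, radius=0.25)         # 200 x 10 design matrix

# Ridge-regularized least squares for the basis coefficients
alpha = np.linalg.solve(Phi.T @ Phi + 1e-6 * np.eye(10), Phi.T @ y)
y_hat = Phi @ alpha
rmse = np.sqrt(np.mean((y - y_hat) ** 2))
```

Because `Phi` has only 10 columns regardless of how many observations there are, computations scale with the (small) number of basis functions, which is the computational advantage the review emphasizes.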


2018
Author(s):  
Sherif Tawfik
Olexandr Isayev
Catherine Stampfl
Joseph Shapter
David Winkler
...  

Materials constructed from different van der Waals two-dimensional (2D) heterostructures offer a wide range of benefits, but these systems have been little studied because of their experimental and computational complexity, and because of the very large number of possible combinations of 2D building blocks. The simulation of the interface between two different 2D materials is computationally challenging due to the lattice mismatch problem, which sometimes necessitates the creation of very large simulation cells for performing density-functional theory (DFT) calculations. Here we use a combination of DFT, linear regression, and machine learning techniques to rapidly determine the interlayer distance between two different 2D materials stacked in a bilayer heterostructure, as well as the band gap of the bilayer. Our work provides an excellent proof of concept by quickly and accurately predicting a structural property (the interlayer distance) and an electronic property (the band gap) for a large number of hybrid 2D materials. This work paves the way for rapid computational screening of the vast parameter space of van der Waals heterostructures to identify new hybrid materials with useful and interesting properties.
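A hedged sketch of the regression step described above. The actual descriptors and DFT data are not given here, so the features (monolayer lattice constants and band gaps) and the target interlayer distances below are entirely synthetic; the code only illustrates fitting a linear model to per-bilayer descriptors with ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical per-bilayer descriptors: lattice constants (Angstrom)
# and band gaps (eV) of the two constituent monolayers -- all synthetic.
X = rng.uniform([3.0, 3.0, 0.0, 0.0], [4.5, 4.5, 2.0, 2.0], size=(50, 4))

# Synthetic "DFT" interlayer distances: linear trend plus noise
true_w = np.array([0.4, 0.4, 0.1, 0.1])
d_dft = 3.0 + X @ true_w + 0.05 * rng.standard_normal(50)

# Ordinary least squares with an intercept column
A = np.column_stack([np.ones(50), X])
coef, *_ = np.linalg.lstsq(A, d_dft, rcond=None)
d_pred = A @ coef
rmse = np.sqrt(np.mean((d_dft - d_pred) ** 2))
```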


2020
Author(s):  
Sina Faizollahzadeh Ardabili
Amir Mosavi
Pedram Ghamisi
Filip Ferdinand
Annamaria R. Varkonyi-Koczy
...  

Several outbreak prediction models for COVID-19 are being used by officials around the world to make informed decisions and enforce relevant control measures. Among the standard models for COVID-19 global pandemic prediction, simple epidemiological and statistical models have received more attention from authorities, and they are popular in the media. Due to a high level of uncertainty and a lack of essential data, standard models have shown low accuracy for long-term prediction. Although the literature includes several attempts to address this issue, the essential generalization and robustness abilities of existing models need to be improved. This paper presents a comparative analysis of machine learning and soft computing models to predict the COVID-19 outbreak as an alternative to SIR and SEIR models. Among a wide range of machine learning models investigated, two showed promising results: the multi-layered perceptron (MLP) and the adaptive network-based fuzzy inference system (ANFIS). Based on the results reported here, and due to the highly complex nature of the COVID-19 outbreak and the variation in its behavior from nation to nation, this study suggests machine learning as an effective tool to model the outbreak. This paper provides an initial benchmarking to demonstrate the potential of machine learning for future research. The paper further suggests that real novelty in outbreak prediction can be realized through integrating machine learning and SEIR models.
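For reference, the SIR model against which the machine learning approaches are compared can be sketched as a simple forward-Euler integration of its differential equations; the parameter values below are illustrative, not fitted to COVID-19 data.

```python
import numpy as np

def sir(beta, gamma, s0, i0, days, dt=0.1):
    # Forward-Euler integration of the SIR equations
    # (s, i, r are fractions of the total population).
    s, i = s0, i0
    traj = [(s, i, 1.0 - s - i)]
    for _ in range(int(days / dt)):
        ds = -beta * s * i           # susceptibles infected
        di = beta * s * i - gamma * i  # new infections minus recoveries
        s += ds * dt
        i += di * dt
        traj.append((s, i, 1.0 - s - i))
    return np.array(traj)

# Illustrative parameters: basic reproduction number R0 = beta/gamma = 3
traj = sir(beta=0.3, gamma=0.1, s0=0.99, i0=0.01, days=200)
peak_infected = traj[:, 1].max()
```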


2021
Vol 15
Author(s):  
Alhassan Alkuhlani
Walaa Gad
Mohamed Roushdy
Abdel-Badeeh M. Salem

Background: Glycosylation is one of the most common post-translational modifications (PTMs) in organism cells. It plays important roles in several biological processes, including cell-cell interaction, protein folding, antigen recognition, and immune response. In addition, glycosylation is associated with many human diseases such as cancer, diabetes, and coronaviruses. The experimental techniques for identifying glycosylation sites are time-consuming, labor-intensive, and expensive. Therefore, computational intelligence techniques are becoming very important for glycosylation site prediction. Objective: This paper is a theoretical discussion of the technical aspects of applying biotechnological approaches (e.g., artificial intelligence and machine learning) to digital bioinformatics research and intelligent biocomputing. Computational intelligence techniques have shown efficient results for predicting N-linked, O-linked, and C-linked glycosylation sites. In the last two decades, many studies have been conducted on glycosylation site prediction using these techniques. In this paper, we analyze and compare a wide range of intelligent techniques from these studies across multiple aspects. The current challenges and difficulties facing software developers and knowledge engineers in predicting glycosylation sites are also included. Method: The comparison between these different studies is introduced using many criteria, such as databases, feature extraction and selection, machine learning classification methods, evaluation measures, and performance results. Results and conclusions: Many challenges and problems are presented. Consequently, more efforts are needed to obtain more accurate prediction models for the three basic types of glycosylation sites.


2019
Vol 942 (12)
pp. 41-49
Author(s):  
A.M. Portnov

Using unified principles for the formation and maintenance of a register/cadaster containing spatial data on landscape objects is proposed as the informational and technological basis for updating public topographic maps and modernizing the state cartographic system. The problems of the informational relevance of the unified electronic cartographic basis, and the capacity for its renovation using public cadaster map data, are considered. The need to modernize the system of classification and coding of cartographic information, and to use unified standards for the coordinate description of register objects to ensure their topological consistency, verification, and updating, is emphasized. Implementing such solutions is determined by economic expediency as well as by the necessity of providing a variety of real thematic data to a wide range of consumers in the fields of urban planning and territorial development, and of completing the tasks of the governmental program “Digital Economy of the Russian Federation”.


Symmetry
2021
Vol 13 (4)
pp. 598
Author(s):  
Massimiliano Pau
Bruno Leban
Michela Deidda
Federica Putzolu
Micaela Porta
...  

The majority of people with Multiple Sclerosis (pwMS) report lower limb motor dysfunctions, which may relevantly affect postural control, gait, and a wide range of activities of daily living. While it is quite common to observe a different impact of the disease on the two limbs (i.e., one of them is more affected), less clear are the effects of such asymmetry on gait performance. The present retrospective cross-sectional study aimed to characterize the magnitude of interlimb asymmetry in pwMS, particularly as regards joint kinematics, using parameters derived from angle-angle diagrams. To this end, we analyzed the gait patterns of 101 pwMS (55 women, 46 men, mean age 46.3, average Expanded Disability Status Scale (EDSS) score 3.5, range 1–6.5) and 81 age- and sex-matched unaffected individuals who underwent 3D computerized gait analysis carried out using an eight-camera motion capture system. Spatio-temporal parameters and kinematics in the sagittal plane at the hip, knee, and ankle joints were considered for the analysis. The angular trends of the left and right sides were processed to build synchronized angle-angle diagrams (cyclograms) for each joint, and symmetry was assessed by computing several geometrical features such as area, orientation, and Trend Symmetry. Based on cyclogram orientation and Trend Symmetry, the results show that pwMS exhibit significantly greater asymmetry in all three joints than unaffected individuals. In particular, orientation values were as follows: 5.1 for pwMS vs. 1.6 for unaffected individuals at the hip joint, 7.0 vs. 1.5 at the knee, and 6.4 vs. 3.0 at the ankle (p < 0.001 in all cases), while for Trend Symmetry we obtained 1.7 for pwMS vs. 0.3 for unaffected individuals at the hip, 4.2 vs. 0.5 at the knee, and 8.5 vs. 1.5 at the ankle (p < 0.001 in all cases). Moreover, the same parameters were sensitive enough to discriminate individuals of different disability levels.
With few exceptions, all the calculated symmetry parameters were found to be significantly correlated with the main spatio-temporal parameters of gait and with the EDSS score. In particular, large correlations were detected between Trend Symmetry and gait speed (with rho values in the range of –0.58 to –0.63 depending on the considered joint, p < 0.001) and between Trend Symmetry and EDSS score (rho = 0.62 to 0.69, p < 0.001). Such results suggest not only that MS is associated with markedly greater interlimb asymmetry during gait but also that such asymmetry worsens as the disease progresses and has a relevant impact on gait performance.
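A minimal sketch of how a cyclogram orientation measure of the kind used here can be computed: the left-right angle-angle point cloud is reduced to its principal axis, whose inclination equals 45 degrees for perfectly symmetric limbs and deviates from 45 degrees with increasing asymmetry. The knee-flexion curve and the 0.7 scaling of the weaker limb are invented for illustration; this is not the authors' exact implementation.

```python
import numpy as np

def cyclogram_orientation(theta_left, theta_right):
    # Inclination (degrees, in [0, 180)) of the principal axis of the
    # left-right angle-angle cloud; identical limbs give exactly 45.
    pts = np.column_stack([theta_left, theta_right])
    pts = pts - pts.mean(axis=0)
    cov = pts.T @ pts / len(pts)
    vals, vecs = np.linalg.eigh(cov)
    v = vecs[:, np.argmax(vals)]          # principal eigenvector
    return np.degrees(np.arctan2(v[1], v[0])) % 180

t = np.linspace(0, 2 * np.pi, 200)
knee = 30 + 25 * np.sin(t)                # idealized knee flexion curve (deg)

sym = cyclogram_orientation(knee, knee)           # identical limbs -> 45
asym = cyclogram_orientation(knee, 0.7 * knee)    # weaker right limb -> < 45
```

The reported orientation values (e.g., 7.0 vs. 1.5 at the knee) correspond to deviations of this inclination from the 45-degree symmetry line.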

