Modelling Seismicity in California as a Spatio-Temporal Point Process Using inlabru: Insights for Earthquake Forecasting

Author(s):  
Mark Naylor ◽  
Kirsty Bayliss ◽  
Finn Lindgren ◽  
Francesco Serafini ◽  
Ian Main

<p>Many earthquake forecasting approaches have developed bespoke codes to model and forecast the spatio-temporal evolution of seismicity. At the same time, the statistics community has been working on a range of point-process modelling codes. For example, motivated by ecological applications, inlabru models spatio-temporal point processes as log-Gaussian Cox processes and is implemented in R. Here we present an initial implementation of inlabru to model seismicity. This fully Bayesian approach is computationally efficient because it uses an integrated nested Laplace approximation: posteriors are assumed to be Gaussian, so their means and standard deviations can be estimated deterministically rather than constructed through sampling. Further, building on existing R packages for handling spatial data, it can construct covariate maps from diverse data types, such as fault maps, in an intuitive and simple manner.</p><p>Here we present an initial application to the California earthquake catalogue to determine the relative performance of different data sets for describing the spatio-temporal evolution of seismicity.</p>
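The log-Gaussian Cox process at the heart of this approach can be sketched in a few lines: the log-intensity is a Gaussian random field, and event counts in grid cells are conditionally Poisson. A minimal NumPy simulation (not inlabru itself, which is an R package; the grid size, mean, and covariance parameters are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Grid over a unit square (n x n cells)
n = 20
xs, ys = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
coords = np.column_stack([xs.ravel(), ys.ravel()])

# Gaussian random field with an exponential covariance (illustrative parameters)
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
cov = 1.0 * np.exp(-dists / 0.2)
gauss = rng.multivariate_normal(mean=np.full(n * n, 2.0), cov=cov)

# Log-Gaussian Cox process: intensity is exp(GRF); counts are conditionally Poisson
intensity = np.exp(gauss)        # expected events per unit area in each cell
cell_area = 1.0 / (n * n)
counts = rng.poisson(intensity * cell_area)

print(counts.sum())  # total simulated events
```

Fitting inverts this construction: given observed counts, the posterior of the latent Gaussian field (and hence of the intensity) is approximated deterministically rather than sampled.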

2020 ◽  
Author(s):  
Annika Tjuka ◽  
Robert Forkel ◽  
Johann-Mattis List

Psychologists and linguists have collected a great diversity of data for word and concept properties. In psychology, many studies accumulate norms and ratings such as word frequencies or age-of-acquisition often for a large number of words. Linguistics, on the other hand, provides valuable insights into relations of word meanings. We present a collection of those data sets for norms, ratings, and relations that cover different languages: ‘NoRaRe.’ To enable a comparison between the diverse data types, we established workflows that facilitate the expansion of the database. A web application allows convenient access to the data (https://digling.org/norare/). Furthermore, a software API ensures consistent data curation by providing tests to validate the data sets. The NoRaRe collection is linked to the database curated by the Concepticon project (https://concepticon.clld.org) which offers a reference catalog of unified concept sets. The link between words in the data sets and the Concepticon concept sets makes a cross-linguistic comparison possible. In three case studies, we test the validity of our approach, the accuracy of our workflow, and the applicability of our database. The results indicate that the NoRaRe database can be applied for the study of word properties across multiple languages. The data can be used by psychologists and linguists to benefit from the knowledge rooted in both research disciplines.
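The cross-linguistic linking described here boils down to a join on shared concept identifiers. A toy sketch (the concept IDs, frequencies, and ratings below are invented for illustration and are not taken from NoRaRe or Concepticon):

```python
# Two toy "norm" tables keyed by a shared concept identifier
# (all IDs and values invented for illustration)
english_frequency = {"HAND": 412.0, "TREE": 151.0, "FIRE": 98.0}
german_aoa = {"HAND": 2.9, "TREE": 3.4, "WATER": 2.5}

# Linking via the shared concept set: keep concepts present in both data sets
shared = sorted(english_frequency.keys() & german_aoa.keys())
linked = [(c, english_frequency[c], german_aoa[c]) for c in shared]

for concept, freq, aoa in linked:
    print(concept, freq, aoa)
```

Because both tables reference the same concept sets rather than language-specific word forms, properties collected for different languages become directly comparable.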


2021 ◽  
Author(s):  
Valentin Buck ◽  
Flemming Stäbler ◽  
Everardo Gonzalez ◽  
Jens Greinert

<p>The study of the Earth’s systems depends on a large number of observations from heterogeneous sources, which are usually scattered across time and space and tightly intercorrelated. Understanding these systems depends on the ability to access diverse data types and contextualize them in a global setting suitable for their exploration. While the collection of environmental data has increased enormously over the last couple of decades, the development of the software solutions necessary to integrate observations across disciplines seems to be lagging behind. To deal with this issue, we developed the Digital Earth Viewer: a new program to access, combine, and display geospatial data from multiple sources over time.</p><p>Choosing a new approach, the software displays space in true 3D and treats time and time ranges as true dimensions. This allows users to navigate observations across spatio-temporal scales and to combine data sources with each other as well as with meta-properties such as quality flags. In this way, the Digital Earth Viewer supports the generation of insight from data and the identification of observational gaps across compartments.</p><p>Developed as a hybrid application, it may be used both in situ as a local installation to explore and contextualize new data, and in a hosted context to present curated data to a wider audience.</p><p>In this work, we present this software to the community, show its strengths and weaknesses, give insight into the development process, and discuss extending and adapting the software to custom use cases.</p>


Author(s):  
Gebeyehu Belay Gebremeskel ◽  
Chai Yi ◽  
Zhongshi He

Data Mining (DM) is a rapidly expanding field in many disciplines, and it is well suited to analyzing massive data of many types, including geospatial, image, and other forms of data sets. Such data are characterized by high volume, velocity, variety, variability, value, and other properties; they are collected and generated from various sources and are too complex and large for traditional tools to capture, store, and analyze. Spatial Data Mining (SDM) is, therefore, the process of searching for and discovering valuable information and knowledge in large volumes of spatial data, drawing basic principles from databases, machine learning, statistics, pattern recognition, and 'soft' computing. Using DM techniques enables more efficient use of the data warehouse. SDM is thus becoming an emerging research field in the geosciences because of the increasing amount of data, which leads to promising new applications. The integral SDM on which we focus in this chapter is inference on geospatial and GIS data.


Author(s):  
Christopher K. Wikle

The climate system consists of interactions between physical, biological, chemical, and human processes across a wide range of spatial and temporal scales. Characterizing the behavior of components of this system is crucial for scientists and decision makers. There is substantial uncertainty associated with observations of this system as well as with our understanding of various system components and their interaction. Thus, inference and prediction in climate science should accommodate uncertainty in order to facilitate the decision-making process. Statistical science is designed to provide the tools to perform inference and prediction in the presence of uncertainty. In particular, the field of spatial statistics considers inference and prediction for uncertain processes that exhibit dependence in space and/or time. Traditionally, this is done descriptively through the characterization of the first two moments of the process, one expressing the mean structure and one accounting for dependence through covariability.

Historically, there are three primary areas of methodological development in spatial statistics: geostatistics, which considers processes that vary continuously over space; areal or lattice processes, which are defined on a countable discrete domain (e.g., political units); and spatial point patterns (or point processes), in which the locations of events in space are themselves a random process. All of these methods have been used in the climate sciences, but the most prominent has been the geostatistical methodology. This methodology was discovered simultaneously in geology and in meteorology; it provides a way to do optimal prediction (interpolation) in space and can facilitate parameter inference for spatial data. These methods rely strongly on Gaussian process theory, which is increasingly of interest in machine learning.

These methods are common in the spatial statistics literature, but much development is still being done in the area to accommodate more complex processes and “big data” applications. Newer approaches are based on restricting models to neighbor-based representations or on reformulating the random spatial process in terms of a basis expansion. There are many computational and flexibility advantages to these approaches, depending on the specific implementation. Complexity is also increasingly being accommodated through the use of the hierarchical modeling paradigm, which provides a probabilistically consistent way to decompose the data, process, and parameters corresponding to the spatial or spatio-temporal process.

Perhaps the biggest challenge in modern applications of spatial and spatio-temporal statistics is to develop methods that are flexible yet can account for the complex dependencies between and across processes, account for uncertainty in all aspects of the problem, and still be computationally tractable. These are daunting challenges, yet it is a very active area of research, and new solutions are constantly being developed. New methods are also being rapidly developed in the machine learning community, and these methods are increasingly applicable to dependent processes. The interaction and cross-fertilization between the machine learning and spatial statistics communities is growing, which will likely lead to a new generation of spatial statistical methods that are applicable to climate science.
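The geostatistical "optimal prediction" referred to above is kriging, which follows directly from Gaussian process theory: the predictor at an unobserved site is a covariance-weighted combination of the observations. A minimal simple-kriging sketch, assuming a zero-mean process, an exponential covariance, and invented observations (all parameters illustrative):

```python
import numpy as np

def exp_cov(a, b, sill=1.0, range_par=1.0):
    """Exponential covariance between two sets of 2-D sites."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / range_par)

# Observed sites and (zero-mean) values -- invented for illustration
sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
z = np.array([1.2, -0.4, 0.7])

# Simple kriging at a new site: weights solve C * w = c0
new = np.array([[0.5, 0.5]])
C = exp_cov(sites, sites)
c0 = exp_cov(sites, new)[:, 0]
weights = np.linalg.solve(C, c0)

pred = weights @ z                             # kriging prediction
var = exp_cov(new, new)[0, 0] - c0 @ weights   # kriging variance
print(pred, var)
```

The same linear algebra underlies Gaussian process regression in machine learning; the "big data" approaches mentioned above exist largely because solving the dense system becomes infeasible for many observation sites.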


2020 ◽  
Vol 45 (4) ◽  
pp. 737-763 ◽  
Author(s):  
Anirban Laha ◽  
Parag Jain ◽  
Abhijit Mishra ◽  
Karthik Sankaranarayanan

We present a framework for generating natural language description from structured data such as tables; the problem comes under the category of data-to-text natural language generation (NLG). Modern data-to-text NLG systems typically use end-to-end statistical and neural architectures that learn from a limited amount of task-specific labeled data, and therefore exhibit limited scalability, domain-adaptability, and interpretability. Unlike these systems, ours is a modular, pipeline-based approach, and does not require task-specific parallel data. Rather, it relies on monolingual corpora and basic off-the-shelf NLP tools. This makes our system more scalable and easily adaptable to newer domains. Our system utilizes a three-staged pipeline that: (i) converts entries in the structured data to canonical form, (ii) generates simple sentences for each atomic entry in the canonicalized representation, and (iii) combines the sentences to produce a coherent, fluent, and adequate paragraph description through sentence compounding and co-reference replacement modules. Experiments on a benchmark mixed-domain data set curated for paragraph description from tables reveal the superiority of our system over existing data-to-text approaches. We also demonstrate the robustness of our system in accepting other popular data sets covering diverse data types such as knowledge graphs and key-value maps.
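The three-stage pipeline can be sketched on a toy key-value record: canonicalisation into triples, per-entry sentence generation, and a crude compounding/co-reference step. The record and templates below are invented for illustration and are not the authors' system:

```python
# Toy sketch of the three-stage pipeline on a key-value record
record = {"name": "Marie Curie", "born": "1867", "field": "physics"}

# (i) canonicalise entries into (subject, predicate, object) triples
triples = [("Marie Curie", pred, obj) for pred, obj in record.items()
           if pred != "name"]

# (ii) generate a simple sentence per atomic entry (hypothetical templates)
templates = {"born": "{s} was born in {o}.", "field": "{s} worked in {o}."}
sentences = [templates[p].format(s=s, o=o) for s, p, o in triples]

# (iii) combine: replace repeated subjects with a pronoun and join
combined = sentences[0]
for sent in sentences[1:]:
    combined += " " + sent.replace("Marie Curie", "She")

print(combined)
# "Marie Curie was born in 1867. She worked in physics."
```

Each stage is independently replaceable, which is what gives a pipeline like this its domain-adaptability relative to end-to-end neural systems.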


2014 ◽  
Vol 4 (1) ◽  
pp. 38-64
Author(s):  
Nikos Pelekis ◽  
Elias Frentzos ◽  
Nikos Giatrakos ◽  
Yannis Theodoridis

The composition of space and mobility in a unified data framework results in Moving Object Databases (MODs). MOD management systems support storage and query processing of non-static spatial objects and provide essential operations for higher-level analysis of movement data. The goal of this paper is to present the Hermes MOD engine, which supports the aforementioned functionality through appropriate data types and methods in Object-Relational DBMS (ORDBMS) environments. In particular, Hermes exploits the extensibility interface of ORDBMSs that already have extensions for static spatial data types and methods following the Open Geospatial Consortium (OGC) standard, and extends the ORDBMS by supporting time-varying geometries that change their position and/or extent in the space and time dimensions, either discretely or continuously. It further extends the data definition and manipulation language of the ORDBMS with spatio-temporal semantics and functionality.
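A continuously time-varying point geometry of the kind such engines support can be sketched as a trajectory type that interpolates position between timestamped fixes. This is a plain-Python illustration of the concept, not the Hermes API:

```python
from bisect import bisect_right

class MovingPoint:
    """A continuously moving point: linear interpolation between timestamped fixes."""

    def __init__(self, fixes):
        # fixes: list of (t, x, y) tuples; stored sorted by time t
        self.fixes = sorted(fixes)

    def position_at(self, t):
        """Return the (x, y) position at time t, clamped to the trajectory's span."""
        ts = [f[0] for f in self.fixes]
        if t <= ts[0]:
            return self.fixes[0][1:]
        if t >= ts[-1]:
            return self.fixes[-1][1:]
        i = bisect_right(ts, t)
        (t0, x0, y0), (t1, x1, y1) = self.fixes[i - 1], self.fixes[i]
        w = (t - t0) / (t1 - t0)
        return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))

mp = MovingPoint([(0, 0.0, 0.0), (10, 10.0, 0.0), (20, 10.0, 10.0)])
print(mp.position_at(5))   # (5.0, 0.0)
print(mp.position_at(15))  # (10.0, 5.0)
```

In an ORDBMS such a type would be registered through the extensibility interface, with `position_at`-style methods exposed to the query language so that spatio-temporal predicates can run inside the database.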


2019 ◽  
Author(s):  
Weichao Wu

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT REQUEST OF AUTHOR.] A point process is a random collection of points from a certain space, and point process models are widely used in areas dealing with spatial data. However, studies of point process theory in the past only focused on Euclidean spaces, and point processes on the complex plane have been rarely explored. In this thesis we introduce and study point processes on the complex plane. We present several important quantities of a complex point process (CPP) that investigate first and second order properties of the process. We further introduce the Poisson complex point process and model its intensity function using log-linear and mixture models in the corresponding 2-dimensional space. The methods are exemplified via applications to density approximation and time series analysis via the spectral density, as well as construction and estimation of covariance functions of Gaussian random fields.
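The simplest member of this family, a homogeneous Poisson process on a region of the complex plane, can be simulated directly with complex arithmetic: draw a Poisson number of points, then place them uniformly in the window. A minimal sketch with an illustrative intensity and window (not the thesis's models):

```python
import numpy as np

rng = np.random.default_rng(1)

# Homogeneous Poisson process on the unit-square window [0,1] x [0,1i]
intensity = 100.0                   # expected points per unit area (illustrative)
area = 1.0
n = rng.poisson(intensity * area)   # total number of points is Poisson

# Complex-valued point locations, uniform over the window
points = rng.uniform(0, 1, n) + 1j * rng.uniform(0, 1, n)

# First-order check: n / area is an unbiased estimate of the intensity
print(n / area, np.mean(points.real), np.mean(points.imag))
```

Inhomogeneous intensities (e.g., the log-linear and mixture models mentioned above) are handled the same way in principle, with the uniform placement replaced by placement proportional to the intensity surface.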


Author(s):  
Jason Soria ◽  
Ying Chen ◽  
Amanda Stathopoulos

Shared mobility-on-demand services are expanding rapidly in cities around the world. As a prominent example, app-based ridesourcing is becoming an integral part of many urban transportation ecosystems. Despite this centrality, the limited public availability of detailed temporal and spatial data on ridesourcing trips has limited research on how new services interact with traditional mobility options and how they affect travel in cities. Improved data-sharing agreements are opening unprecedented opportunities for research in this area. This study examined emerging patterns of mobility using recently released City of Chicago public ridesourcing data. The detailed spatio-temporal ridesourcing data were matched with weather, transit, and taxi data to gain a deeper understanding of ridesourcing’s role in Chicago’s mobility system. The goal was to investigate systematic variations in ridehailing patronage. K-prototypes, an extension of the K-means algorithm, was utilized to detect user segments owing to its ability to accept mixed variable data types; its output is a classification of the data into several clusters called prototypes. Six ridesourcing prototypes were identified and discussed based on significant differences in relation to adverse weather conditions, competition with alternative modes, location and timing of use, and tendency for ridesplitting. The paper discusses the implications of the identified clusters related to affordability, equity, and competition with transit.
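The K-prototypes idea (Euclidean cost on numeric attributes plus a weighted mismatch cost on categorical ones) can be sketched in a few dozen lines. This is a minimal illustration on invented toy trip records, not the authors' implementation or the Chicago data:

```python
import numpy as np

def k_prototypes(num, cat, k, gamma=1.0, iters=10, seed=0):
    """Minimal K-prototypes sketch: squared Euclidean cost on numeric columns
    plus a gamma-weighted mismatch count on categorical columns."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(num), k, replace=False)
    cent_num, cent_cat = num[idx].copy(), cat[idx].copy()
    for _ in range(iters):
        # assign each record to its nearest prototype under the mixed cost
        d_num = ((num[:, None, :] - cent_num[None, :, :]) ** 2).sum(-1)
        d_cat = (cat[:, None, :] != cent_cat[None, :, :]).sum(-1)
        labels = np.argmin(d_num + gamma * d_cat, axis=1)
        # update prototypes: mean for numeric columns, mode for categorical
        for j in range(k):
            members = labels == j
            if members.any():
                cent_num[j] = num[members].mean(0)
                for c in range(cat.shape[1]):
                    vals, counts = np.unique(cat[members, c], return_counts=True)
                    cent_cat[j, c] = vals[np.argmax(counts)]
    return labels

# Toy trips: (distance km, fare $) numeric; (weather,) categorical -- invented data
num = np.array([[1.0, 5.0], [1.2, 5.5], [9.0, 30.0], [9.5, 31.0]])
cat = np.array([["rain"], ["rain"], ["clear"], ["clear"]])
labels = k_prototypes(num, cat, k=2)
print(labels)
```

The weight `gamma` trades off the two cost terms; in practice it must be tuned to the scale of the numeric attributes.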


2011 ◽  
Vol 27 (1) ◽  
pp. 47 ◽  
Author(s):  
Viktor Beneš ◽  
Blažena Frcalová

We present a stochastic model of an experiment monitoring the spiking activity of a place cell in the hippocampus of an experimental animal moving in an arena. A doubly stochastic spatio-temporal point process is used to model and quantify overdispersion. The stochastic intensity is modelled by a Lévy-based random field, while the animal's path is simplified to a discrete random walk. In a simulation study, a previously suggested method is used first. Then it is shown that a solution of the filtering problem yields the desired inference on the random intensity. Two approaches are suggested, and the new one, based on a finite point process density, is applied. Using Markov chain Monte Carlo we obtain numerical results from the simulated model. The methodology is discussed.

