Anomaly behaviour detection based on the meta-Morisita index for large scale spatio-temporal data set

2018 ◽  
Vol 5 (1) ◽  
Author(s):  
Zhao Yang ◽  
Nathalie Japkowicz


2020 ◽  
Author(s):  
Mieke Kuschnerus ◽  
Roderik Lindenbergh ◽  
Sander Vos

Abstract. Sandy coasts are constantly changing environments governed by complex interacting processes. Permanent laser scanning is a promising technique for monitoring such coastal areas and supporting the analysis of geomorphological deformation processes. This novel technique delivers 3D representations of a part of the coast at hourly temporal and centimetre spatial resolution and makes it possible to observe small-scale changes in elevation over extended periods of time. These observations have the potential to improve understanding and modelling of coastal deformation processes. However, to be of use to coastal researchers and coastal management, an efficient way to find and extract deformation processes from the large spatio-temporal data set is needed. To enable automated data mining, we extract time series in elevation or range and use unsupervised learning algorithms to derive a partitioning of the observed area according to change patterns. We compare three well-known clustering algorithms, k-means, agglomerative clustering and DBSCAN, identify areas that undergo similar evolution during one month, and test whether the algorithms fulfil our criteria for a suitable clustering algorithm on our exemplary data set. The three clustering methods are applied to time series of 30 epochs (over one month) extracted from a data set of daily scans covering part of the coast at Kijkduin, the Netherlands. A small section of the beach, where a pile of sand was accumulated by a bulldozer, is used to evaluate the performance of the algorithms against a ground truth. The k-means algorithm and agglomerative clustering deliver similar clusters, and both allow the identification of a fixed number of dominant deformation processes in sandy coastal areas, such as sand accumulation by a bulldozer or erosion in the intertidal area. The DBSCAN algorithm finds clusters for only about 44 % of the area and turns out to be more suitable for the detection of outliers, caused for example by temporary objects on the beach. Our study provides a methodology to efficiently mine a spatio-temporal data set for predominant deformation patterns and the regions where they occur.
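A minimal sketch of the clustering comparison described above, written with scikit-learn; this is not the authors' code, and the simulated input, cluster count, and DBSCAN parameters are illustrative assumptions rather than the study's settings.

```python
# Comparing k-means, agglomerative clustering and DBSCAN on per-cell
# elevation time series (one row per scanned grid cell, one column per
# epoch; 30 daily epochs, as in the Kijkduin data set). Simulated data.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
elevation_series = rng.normal(size=(1000, 30)).cumsum(axis=1) * 0.01

X = StandardScaler().fit_transform(elevation_series)

# Partitioning into a fixed number of dominant change patterns.
kmeans_labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
agglo_labels = AgglomerativeClustering(n_clusters=5).fit_predict(X)

# Density-based clustering; points labelled -1 are outliers, e.g.
# temporary objects on the beach.
dbscan_labels = DBSCAN(eps=2.0, min_samples=10).fit_predict(X)

print("DBSCAN outlier fraction:", np.mean(dbscan_labels == -1))
```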


2019 ◽  
Vol 34 (9) ◽  
pp. 1369-1383 ◽  
Author(s):  
Dirk Diederen ◽  
Ye Liu

Abstract. With the ongoing development of distributed hydrological models, flood risk analysis calls for synthetic, gridded precipitation data sets. The availability of large, coherent, gridded re-analysis data sets, in combination with the increase in computational power, accommodates the development of new methodology to generate such synthetic data. We tracked moving precipitation fields and classified them using self-organising maps. For each class, we fitted a multivariate mixture model and generated a large set of synthetic, coherent descriptors, which we used to reconstruct moving synthetic precipitation fields. We introduced randomness into the original data set by replacing the observed precipitation fields with the synthetic ones. The output is a continuous, gridded, hourly precipitation data set of much longer duration, containing physically plausible and spatio-temporally coherent precipitation events. The proposed methodology implicitly provides an important improvement in the spatial coherence of precipitation extremes. We investigate the issue of unrealistic, sudden changes on the grid and demonstrate how a dynamic spatio-temporal generator can provide spatial smoothness in the probability distribution parameters and hence in the return level estimates.
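A hedged sketch of the classify-and-resample step described above, using the third-party minisom package for the self-organising map and scikit-learn for the per-class mixture model; the descriptor content, grid size, and all parameter values are illustrative assumptions, not the paper's settings.

```python
# Classify precipitation-event descriptors with a SOM, then fit a
# multivariate Gaussian mixture per class and sample synthetic descriptors.
import numpy as np
from minisom import MiniSom
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# descriptors: one row per tracked precipitation event, e.g. duration,
# mean intensity, area, advection speed and direction (simulated here).
descriptors = rng.normal(size=(500, 5))

som = MiniSom(3, 3, input_len=descriptors.shape[1], sigma=1.0, learning_rate=0.5)
som.train_random(descriptors, num_iteration=2000)

# Map each event to its best-matching SOM node, i.e. its class.
nodes = np.array([som.winner(d) for d in descriptors])
class_ids = nodes[:, 0] * 3 + nodes[:, 1]

# Fit a mixture model per class and draw synthetic descriptors from it.
synthetic = []
for c in np.unique(class_ids):
    members = descriptors[class_ids == c]
    if len(members) < 10:   # skip sparsely populated classes
        continue
    gm = GaussianMixture(n_components=2, random_state=0).fit(members)
    samples, _ = gm.sample(10 * len(members))
    synthetic.append(samples)
synthetic = np.vstack(synthetic)
```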


2008 ◽  
Vol 7 (3-4) ◽  
pp. 210-224 ◽  
Author(s):  
Aidan Slingsby ◽  
Jason Dykes ◽  
Jo Wood

We demonstrate and reflect upon the use of enhanced treemaps that incorporate spatial and temporal ordering for exploring a large multivariate spatio-temporal data set. The resulting data-dense views summarise and simultaneously present hundreds of space-, time-, and variable-constrained subsets of a large multivariate data set in a structure that facilitates their meaningful comparison and supports visual analysis. Interactive techniques allow localised patterns to be explored and subsets of interest selected and compared with the spatial aggregate. Spatial variation is considered through interactive raster maps and high-resolution local road maps. The techniques are developed in the context of 42.2 million records of vehicular activity in a 98 km² area of central London and informally evaluated through a design used in the exploratory visualisation of this data set. The main advantages of our technique are the means to simultaneously display hundreds of summaries of the data and to interactively browse hundreds of variable combinations with ordering and symbolism that are consistent and appropriate for space- and time-based variables. These capabilities are difficult to achieve in the case of spatio-temporal data with categorical attributes using existing geovisualisation methods. We acknowledge limitations in the treemap representation but enhance the cognitive plausibility of this popular layout through our two-dimensional ordering algorithm and interactions. Patterns that are expected (e.g. more traffic in central London), interesting (e.g. the spatial and temporal distribution of particular vehicle types) and anomalous (e.g. low speeds on particular road sections) are detected at various scales and locations using the approach. In many cases, anomalies identify biases that may have implications for future use of the data set for analyses and applications. Ordered treemaps appear to have potential as interactive interfaces for variable selection in spatio-temporal visualisation.
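The ordering idea can be illustrated with a tiny slice-and-dice treemap, in which child rectangles keep their input order (e.g. hour of day nested within spatial zones ordered west to east), unlike area-optimising squarified layouts. This is a generic sketch, not the authors' two-dimensional ordering algorithm, and the hierarchy and counts are invented.

```python
# Ordered (slice-and-dice) treemap: rectangles keep their input order,
# so space- and time-based orderings stay legible across the layout.
import matplotlib.pyplot as plt

def leaf_sum(node):
    if isinstance(node, (int, float)):
        return node
    return sum(leaf_sum(c) for c in node.values())

def slice_and_dice(node, x, y, w, h, depth, ax):
    """Recursively split a rectangle among ordered children,
    alternating vertical and horizontal cuts by depth."""
    if isinstance(node, (int, float)):  # leaf: draw its rectangle
        ax.add_patch(plt.Rectangle((x, y), w, h, fill=False))
        return
    total = sum(leaf_sum(c) for c in node.values())
    offset = 0.0
    for child in node.values():         # dict order = display order
        frac = leaf_sum(child) / total
        if depth % 2 == 0:              # vertical slices
            slice_and_dice(child, x + offset * w, y, frac * w, h, depth + 1, ax)
        else:                           # horizontal slices
            slice_and_dice(child, x, y + offset * h, w, frac * h, depth + 1, ax)
        offset += frac

# Invented example: vehicle counts per zone (ordered west -> east) and hour.
data = {"west":   {"06h": 40, "12h": 90,  "18h": 70},
        "centre": {"06h": 80, "12h": 150, "18h": 120},
        "east":   {"06h": 30, "12h": 60,  "18h": 50}}

fig, ax = plt.subplots()
slice_and_dice(data, 0, 0, 1, 1, 0, ax)
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.set_axis_off()
plt.show()
```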


2018 ◽  
Vol 1098 ◽  
pp. 012010 ◽  
Author(s):  
Qin Jiafeng ◽  
Zhou Chao ◽  
Lin Ying ◽  
Wang Huajia ◽  
Yang Feng

2019 ◽  
Vol 8 (2) ◽  
pp. 55 ◽  
Author(s):  
Laurens Bogaardt ◽  
Romulo Goncalves ◽  
Raul Zurita-Milla ◽  
Emma Izquierdo-Verdiguier

The Singular Value Decomposition (SVD) is a mathematical procedure with multiple applications in the geosciences. For instance, it is used in dimensionality reduction and as a support operator for various analytical tasks applicable to spatio-temporal data. Performing SVD analyses on large datasets, however, can be computationally costly, time-consuming, and sometimes practically infeasible. Techniques exist, though, to arrive at the same output, or at a close approximation, with far less effort. This article examines several such techniques in relation to the inherent scale of the structure within the data. When the values of a dataset vary slowly, e.g., in a spatial field of temperature over a country, there is autocorrelation and the field contains large-scale structure. Datasets do not need a high resolution to describe such fields, and their analysis can benefit from alternative SVD techniques based on rank deficiency, coarsening, or matrix factorization approaches. We use both simulated Gaussian Random Fields with various levels of autocorrelation and real-world geospatial datasets to illustrate our study while examining the accuracy of various SVD techniques. As the main result, this article provides researchers with a decision tree indicating which technique to use when, predicting the resulting level of accuracy based on the dataset's structure scale.
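A minimal sketch of the underlying idea: on a smooth, autocorrelated field, a cheap low-rank approximation comes close to the exact SVD. The field size, smoothing scale, and rank below are illustrative choices, and the randomized solver stands in for the approximation techniques the article compares.

```python
# Compare a full SVD with a randomized low-rank SVD on a simulated
# autocorrelated field (white noise smoothed with a Gaussian filter).
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.utils.extmath import randomized_svd

rng = np.random.default_rng(2)
# Large-scale structure: the larger sigma is, the smoother the field.
field = gaussian_filter(rng.normal(size=(500, 500)), sigma=20)

# Exact decomposition (costly for large matrices).
U, s, Vt = np.linalg.svd(field, full_matrices=False)

# Cheap approximation with a small number of components.
k = 10
Uk, sk, Vtk = randomized_svd(field, n_components=k, random_state=0)
approx = Uk @ np.diag(sk) @ Vtk

rel_err = np.linalg.norm(field - approx) / np.linalg.norm(field)
print(f"rank-{k} relative error: {rel_err:.2e}")
```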


2020 ◽  
Vol 38 (3) ◽  
pp. 739-765 ◽  
Author(s):  
Zhiqiang Liu ◽  
Xuanhua Shi ◽  
Ligang He ◽  
Dongxiao Yu ◽  
Hai Jin ◽  
...  

Author(s):  
P. Baumann ◽  
V. Merticariu ◽  
A. Dumitru ◽  
D. Misev

With the unprecedented availability of continuously updated measured and generated data, there is immense potential for new and timely insights; yet this value is not fully leveraged today. The quest is on for high-level service interfaces for dissecting datasets and rejoining them with other datasets, ultimately allowing users to ask "any question, anytime, on any size" and to "build their own product on the go".

With OGC Coverages, a concrete, interoperable data model has been established which unifies n-D spatio-temporal regular and irregular grids, point clouds, and meshes. The Web Coverage Service (WCS) suite provides versatile, streamlined coverage functionality ranging from simple access to flexible spatio-temporal analytics. The flexibility and scalability of the WCS suite have been demonstrated in practice through massive services run by large-scale data centers. We present the current status of the OGC Coverage data and service models, contrast them with related work, and describe a scalable implementation based on the rasdaman array engine.
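As an illustration, a WCS 2.0.1 GetCoverage request with spatio-temporal subsetting can be issued as a simple key-value-pair query. The KVP parameters below follow the OGC WCS 2.0.1 specification, but the endpoint URL and coverage name are hypothetical placeholders, not an actual service.

```python
# Sketch of a WCS 2.0.1 GetCoverage request trimming a coverage along
# two spatial axes and one temporal axis; endpoint and coverageId are
# hypothetical.
import requests

endpoint = "https://example.org/rasdaman/ows"   # placeholder server
params = {
    "service": "WCS",
    "version": "2.0.1",
    "request": "GetCoverage",
    "coverageId": "MeanSeaSurfaceTemp",         # hypothetical coverage
    "subset": [
        "Lat(40,50)",
        "Long(0,10)",
        'ansi("2015-01-01","2015-01-31")',      # temporal trim
    ],
    "format": "image/tiff",
}

response = requests.get(endpoint, params=params, timeout=60)
response.raise_for_status()
with open("subset.tiff", "wb") as f:
    f.write(response.content)
```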

