Statistical Analysis of Discrete-valued Time Series by Parsimonious High-order Markov Chains

2020 ◽  
Vol 49 (4) ◽  
pp. 76-88
Author(s):  
Yuriy Kharin

Problems of statistical analysis of discrete-valued time series are considered. Two approaches to constructing parsimonious (small-parametric) models for observed discrete data are proposed, both based on high-order Markov chains. Consistent statistical estimators for the parameters of the developed models and of some known models are constructed, together with statistical tests on the parameter values. Probabilistic properties of the constructed statistical inferences are given. The developed theory is also applied to the statistical analysis of spatio-temporal data. Theoretical results are illustrated by computer experiments on real statistical data.
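To illustrate why parsimonious models are needed, here is a minimal sketch of the plain frequency-based (maximum-likelihood) estimator for a fully general Markov chain of order s. It is not one of the parsimonious models proposed in the paper; the function names `fit_markov_chain` and `full_parameter_count` and the toy series are assumptions made for illustration. A fully general chain over N states has N^s (N - 1) free parameters, which grows exponentially in the order.

```python
from collections import Counter, defaultdict


def fit_markov_chain(series, order):
    """Frequency estimates of P(x_t | previous `order` values).

    This is the unrestricted estimator, shown only as a baseline;
    it is NOT a parsimonious model.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for t in range(order, len(series)):
        context = tuple(series[t - order:t])
        context_counts[context] += 1
        transition_counts[(context, series[t])] += 1

    probs = defaultdict(dict)
    for (context, symbol), count in transition_counts.items():
        probs[context][symbol] = count / context_counts[context]
    return dict(probs)


def full_parameter_count(n_states, order):
    """Free parameters of a fully general Markov chain of the given order."""
    return n_states ** order * (n_states - 1)


# Hypothetical binary series, used only to exercise the functions.
series = [0, 1, 0, 1, 0, 1, 1, 0, 1, 0]
model = fit_markov_chain(series, order=2)
```

For a binary chain, `full_parameter_count(2, 10)` already gives 1024 parameters, which suggests why a short series cannot support the unrestricted model at high orders.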

2014 ◽  
Vol 43 (3) ◽  
pp. 205-216 ◽  
Author(s):  
Yuriy Kharin ◽  
Mikhail Maltsau

The paper deals with the finite Markov chain of conditional order, which is a special case of a high-order Markov chain with a small number of parameters. Statistical estimators for the parameters and statistical tests for parametric hypotheses are constructed, and their properties are analyzed. Results of computer experiments on simulated and real data are presented.


2020 ◽  
Author(s):  
Mieke Kuschnerus ◽  
Roderik Lindenbergh ◽  
Sander Vos

Abstract. Sandy coasts are constantly changing environments governed by complex interacting processes. Permanent laser scanning is a promising technique to monitor such coastal areas and support analysis of geomorphological deformation processes. This novel technique delivers 3D representations of a part of the coast at hourly temporal and centimetre spatial resolution and makes it possible to observe small-scale changes in elevation over extended periods of time. These observations have the potential to improve understanding and modelling of coastal deformation processes. However, to be of use to coastal researchers and coastal management, an efficient way to find and extract deformation processes from the large spatio-temporal data set is needed. In order to allow data mining in an automated way, we extract time series in elevation or range and use unsupervised learning algorithms to derive a partitioning of the observed area according to change patterns. We compare three well-known clustering algorithms, k-means, agglomerative clustering and DBSCAN, and identify areas that undergo similar evolution during one month. We test whether they fulfil our criteria for a suitable clustering algorithm on our exemplary data set. The three clustering methods are applied to time series of 30 epochs (during one month) extracted from a data set of daily scans covering a part of the coast at Kijkduin, the Netherlands. A small section of the beach, where a pile of sand was accumulated by a bulldozer, is used to evaluate the performance of the algorithms against a ground truth. The k-means algorithm and agglomerative clustering deliver similar clusters, and both make it possible to identify a fixed number of dominant deformation processes in sandy coastal areas, such as sand accumulation by a bulldozer or erosion in the intertidal area.
The DBSCAN algorithm finds clusters for only about 44 % of the area and turns out to be more suitable for the detection of outliers, caused for example by temporary objects on the beach. Our study provides a methodology to efficiently mine a spatio-temporal data set for predominant deformation patterns and the associated regions where they occur.
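The idea of clustering per-location elevation time series can be sketched with a small k-means (Lloyd's algorithm) in plain Python. This is an illustrative sketch only, not the authors' pipeline: the farthest-point initialisation, the function names, and the toy elevation values below are all assumptions, and the real data set uses 30 epochs per location rather than 5.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length time series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(series_list, k, n_iter=50):
    """Cluster time series with Lloyd's algorithm.

    Initialisation (an assumption of this sketch): start from the first
    series, then repeatedly add the series farthest from all chosen centroids.
    """
    centroids = [list(series_list[0])]
    while len(centroids) < k:
        farthest = max(
            series_list,
            key=lambda s: min(squared_distance(s, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = [0] * len(series_list)
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every series.
        labels = [
            min(range(k), key=lambda c: squared_distance(s, centroids[c]))
            for s in series_list
        ]
        # Update step: each centroid becomes the epoch-wise mean of its members.
        for c in range(k):
            members = [s for s, lab in zip(series_list, labels) if lab == c]
            if members:
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centroids


# Hypothetical elevation changes (metres) at six locations over five epochs:
# stable sand, accumulation, and erosion, two locations each.
toy_series = [
    [0.0, 0.0, 0.01, 0.0, 0.0],
    [0.0, 0.2, 0.4, 0.6, 0.8],
    [0.0, -0.1, -0.3, -0.5, -0.6],
    [0.0, 0.01, 0.0, 0.0, 0.01],
    [0.0, 0.25, 0.45, 0.6, 0.85],
    [0.0, -0.15, -0.3, -0.45, -0.65],
]
labels, centroids = kmeans(toy_series, k=3)
```

Because k-means assigns every series to some cluster, it always partitions the whole area, whereas a density-based method such as DBSCAN can leave points unassigned, which is consistent with the partial coverage reported above.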


2017 ◽  
Vol 9 (11) ◽  
pp. 1125 ◽  
Author(s):  
Chunhua Liao ◽  
Jinfei Wang ◽  
Ian Pritchard ◽  
Jiangui Liu ◽  
Jiali Shang

2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Lianren Wu ◽  
Jinjie Li ◽  
Jiayin Qi

Abstract. In this paper, a quantitative temporal and spatial analysis of the dynamics of hot-topic popularity in a micro-blogging system is provided. Firstly, the popularity time series of 1167 hot topics were counted and calculated in Excel. Secondly, using MATLAB, the popularity time series were grouped into six clusters by the K-spectral centroid (K-SC) clustering algorithm. Thirdly, we analyzed the temporal and spatial patterns of the popularity dynamics of topics by statistical methods. The results show that the temporal popularity of micro-blogging topics decays rapidly, and that the distribution of popularity follows a power law. In addition, most micro-blogging topics are global topics. Our results can provide a literature reference for studying the influence of online hot topics and the evolution of public opinion.


Author(s):  
T. R. Kalugin ◽  
A. K. Kim ◽  
D. A. Petrusevich

In the paper, mathematical models describing the connection between two time series are studied. First, each series is investigated separately and an ARIMA(p, d, q) model is constructed for it. These models are based on the time series characteristics obtained during the analysis stage. The connection between the two time series is confirmed by means of cointegration tests. Then a mathematical model of the connection between the series is constructed: the ADL(p, q) model describes this dependence. It is shown that, for the time series under investigation, the orders p, q of the ADL(p, q) model are connected with the ARIMA(p, d, q) orders of the models describing each series separately. This step makes the set of investigated ADL(p, q) models much smaller. In previous papers it was also shown that the automatic ARIMA(p, d, q) fitting functions in popular packages limit the p, q orders of the time series process: q ≤ 5, p ≤ 5. A preference for the simplest models is also built into the structure of the Akaike (AIC) and Bayesian (BIC) information criteria. In the paper, the maximal values of the ADL(p, q) model orders are taken to be the orders of the corresponding ARIMA(p, d, q) series. In previous work it was shown that high-order ARIMA(p, d, q) models can fit the data better. In this paper, experiments on constructing ADL(p, q) models are presented. The wage index and money income index time series pair is studied, as is the pair formed by the gas, water and energy production and consumption index and the real agricultural production index. The data for the 2000–2018 period are taken from the dynamic series of macroeconomic statistics of the Russian Federation.
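For reference, one common textbook form of the ADL(p, q) model and of the two information criteria is written out below. The abstract does not state the exact specification used in the paper, so the symbols here (intercept \alpha, coefficients \phi_i and \beta_j, error \varepsilon_t, number of estimated parameters k, sample size n, maximised likelihood \hat{L}) are assumptions of this sketch.

```latex
% ADL(p, q): p lags of the dependent series y, current value and q lags of x
y_t = \alpha + \sum_{i=1}^{p} \phi_i \, y_{t-i}
             + \sum_{j=0}^{q} \beta_j \, x_{t-j} + \varepsilon_t

% Information criteria: both penalise the number of parameters k
\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

Since the penalty terms grow with k, both criteria favour models with fewer lags when the fit is comparable, and BIC penalises additional parameters more heavily than AIC once \ln n > 2.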


2020 ◽  
Vol 9 (4) ◽  
pp. 210
Author(s):  
Xiaojing Wu ◽  
Donghai Zheng

Unprecedented amounts of spatio-temporal data create an urgent need to explore the patterns they contain. Clustering analysis is useful in extracting patterns from big data by grouping similar data elements into clusters. Compared with one-way clustering and co-clustering methods, tri-clustering methods are more capable of exploring complex patterns. However, the explored patterns or clusters could differ with the temporal resolution of the input data. This study presents a tri-clustering-based method to explore the impacts of different temporal resolutions on spatio-temporal clusters identified in geo-referenced time series (GTS), one type of spatio-temporal data. Dutch daily temperature data at 28 stations over 20 years were used to illustrate this study. The temperature data at daily, monthly, and yearly resolutions were subjected to the Bregman cube average tri-clustering algorithm with I-divergence (BCAT_I) to detect spatio-temporal clusters, which were then compared in terms of patterns exhibited, compositions, and changed elements. Results confirm that temporal resolution affects the spatio-temporal clusters identified in the Dutch temperature data: the composition of most clusters varies when the temporal resolution of the input data in the GTS changes. Nevertheless, there is almost no change of elements in certain clusters (12 stations in the northeast of the country; years 1996, 2010) at all temporal resolutions, suggesting that they are the “true” clusters in the case study dataset.
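Two ingredients mentioned above can be sketched in a few lines: the I-divergence (generalised Kullback-Leibler divergence) between two positive values, and the coarsening of a series to a lower temporal resolution by block averaging. This is not the BCAT_I algorithm itself; the function names and the fixed block length are assumptions of this sketch, and real calendar months would need variable block lengths.

```python
import math


def i_divergence(x, y):
    """I-divergence between two positive numbers: x*log(x/y) - x + y.

    It is zero when x == y and positive otherwise; both inputs must be
    positive, so temperatures would need a positive scale.
    """
    return x * math.log(x / y) - x + y


def block_means(values, block):
    """Coarsen a series by averaging consecutive blocks of `block` values.

    A trailing partial block is dropped (a simplification of this sketch).
    """
    return [
        sum(values[i:i + block]) / block
        for i in range(0, len(values) - block + 1, block)
    ]


# Hypothetical daily values coarsened to 3-day means.
coarse = block_means([1, 2, 3, 4, 5, 6, 7], 3)
```

Averaging smooths out short-lived fluctuations, which is one plausible reason why cluster compositions can shift between daily, monthly, and yearly inputs.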

