Experimental Comparison of Some Classical Distance Measures for Time Series Data in Simulation Model Validation

Author(s):  
Xiaojun Yang ◽  
Zhongfu Xu ◽  
Haibo Ouyang ◽  
Xing Zhang
2012 ◽  
Vol 26 (2) ◽  
pp. 275-309 ◽  
Author(s):  
Xiaoyue Wang ◽  
Abdullah Mueen ◽  
Hui Ding ◽  
Goce Trajcevski ◽  
Peter Scheuermann ◽  
...  

Author(s):  
T. Warren Liao

In this chapter, we present genetic algorithm (GA) based methods developed for clustering univariate time series of equal or unequal length as an exploratory step of data mining. These methods essentially implement the k-medoids algorithm. Each chromosome encodes in binary the data objects serving as the k-medoids. To compare their performance, both fixed-parameter and adaptive GAs were used. We first employed the synthetic control chart data set to investigate the performance of three fitness functions, two distance measures, and other GA parameters such as population size, crossover rate, and mutation rate. Two more sets of time series, with or without a known number of clusters, were also tested: the cylinder-bell-funnel data and the novel battle simulation data. The clustering results are presented and discussed.
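As a rough illustration of the encoding described in this abstract, the sketch below treats a chromosome as a bit mask over the data objects, where a 1 marks an object as a medoid. The bit-mask layout, the function names, and the fitness (total distance of every series to its nearest medoid) are assumptions made for illustration; the abstract does not specify the chapter's actual encoding or its three fitness functions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def decode_medoids(chromosome):
    """Return the indices whose bit is 1 (assumed bit-mask encoding)."""
    return [i for i, bit in enumerate(chromosome) if bit == 1]


def total_distance(series, chromosome, dist=euclidean):
    """Hypothetical fitness: sum of each series' distance to its nearest medoid.

    Lower is better; a GA would minimize this value.
    """
    medoids = decode_medoids(chromosome)
    if not medoids:
        raise ValueError("chromosome must select at least one medoid")
    return sum(min(dist(s, series[m]) for m in medoids) for s in series)


# Four short series forming two obvious groups.
data = [[0, 0], [0, 1], [10, 10], [10, 11]]
good = total_distance(data, [1, 0, 1, 0])  # one medoid per group
poor = total_distance(data, [1, 1, 0, 0])  # both medoids in the same group
```

A chromosome that places one medoid in each group scores lower than one that places both medoids in the same group, which is the signal a GA would use for selection.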


2021 ◽  
Author(s):  
Lucas Cassiel Jacaruso

Abstract: Time series similarity measures are highly relevant in a wide range of emerging applications including training machine learning models, classification, and predictive modeling. Standard similarity measures for time series most often involve point-to-point distance measures including Euclidean distance and Dynamic Time Warping. Such similarity measures fundamentally require the fluctuation of values in the time series being compared to follow a corresponding order or cadence for similarity to be established. Other existing approaches use local statistical tests to detect structural changes in time series. This paper is spurred by the exploration of a broader definition of similarity, namely one that takes into account the sheer numerical resemblance between sets of statistical properties for time series segments irrespective of value labeling. Further, the presence of common pattern components between time series segments was examined even if they occur in a permuted order, which would not necessarily satisfy the criteria of more conventional point-to-point distance measures. The newly defined similarity measures were tested on time series data representing over 20 years of cooperation intent expressed in global media sentiment. Tests determined whether the newly defined similarity measures would accurately identify stronger resemblance, on average, for pairings of similar time series segments (exhibiting overall decline) than pairings of differing segments (exhibiting overall decline and overall rise). The ability to identify patterns other than the obvious overall rise or decline that can accurately relate samples is regarded as a first step towards assessing the value of the newly explored similarity measures for classification or prediction. Results were compared with those of Dynamic Time Warping on the same data for context. Surprisingly, the test for numerical resemblance between sets of statistical properties established stronger resemblance for pairings of decline years with greater statistical significance than Dynamic Time Warping on the particular data and sample size used.
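For readers unfamiliar with the two point-to-point measures named in this abstract, the following is a minimal sketch of Euclidean distance and of the textbook dynamic-programming form of Dynamic Time Warping. It is not the paper's code: the function names and the absolute-difference local cost are assumptions, and no warping-window constraint is applied.

```python
import math


def euclidean(a, b):
    """Point-to-point distance; requires equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Unconstrained DTW cost with absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a
    # with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a point
                cost[i][j - 1],      # b advances, a repeats a point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same bump, shifted by one time step.
x = [0, 0, 1, 2, 1, 0]
y = [0, 1, 2, 1, 0, 0]
d_euclid = euclidean(x, y)
d_dtw = dtw(x, y)
```

On this pair the Euclidean distance is nonzero because the bump is compared out of phase, while DTW stretches the time axis and aligns the shapes at zero cost. Both measures still depend on the order of values, which is the limitation the abstract contrasts with its order-independent statistics.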


Mathematics ◽  
2021 ◽  
Vol 9 (21) ◽  
pp. 2708
Author(s):  
Achilleas Anastasiou ◽  
Peter Hatzopoulos ◽  
Alex Karagrigoriou ◽  
George Mavridoglou

In this work, we focus on the development of new distance measure algorithms, namely, the Causality Within Groups (CAWG), the Generalized Causality Within Groups (GCAWG) and the Causality Between Groups (CABG), all of which are based on the well-known Granger causality. The proposed distances, together with the associated algorithms, are suitable for multivariate statistical data analysis, including unsupervised classification (clustering) of multivariate time series data, with emphasis on financial and economic data, where causal relationships are frequently present. To explore the appropriateness of the proposed methodology, we apply the proposed algorithms, for illustrative purposes, to hierarchical clustering for the classification of 19 EU countries based on seven variables related to health resources in healthcare systems.


2008 ◽  
pp. 942-962
Author(s):  
T. Warren Liao


Author(s):  
Marcus Erz ◽  
Jeremy Floyd Kielman ◽  
Bahar Selvi Uzun ◽  
Gabriele Stefanie Guehring

Abstract: As the digital transformation takes place, more and more data is being generated and collected. To generate meaningful information and knowledge, researchers use various data mining techniques. In addition to classification, clustering, and forecasting, outlier or anomaly detection is one of the most important research areas in time series analysis. In this paper, we present a method for detecting anomalies in multidimensional time series using a graph-based algorithm. We transform time series data to graphs prior to calculating the outlier, since the graph representation offers a wide range of graph-based methods for anomaly detection. Furthermore, the dynamics of the data are taken into consideration by implementing a window of a certain size that leads to multiple graphs in different time frames. We use feature extraction and aggregation to finally compare distance measures of two time-dependent graphs. The effectiveness of our algorithm is demonstrated on the Numenta Anomaly Benchmark with various anomaly types as well as the KPI-Anomaly-Detection data set of the 2018 AIOps competition.


2013 ◽  
Author(s):  
Stephen J. Tueller ◽  
Richard A. Van Dorn ◽  
Georgiy Bobashev ◽  
Barry Eggleston

Author(s):  
Rizki Rahma Kusumadewi ◽  
Wahyu Widayat

The exchange rate is one tool for measuring a country's economic conditions. Stable growth in a currency's value indicates that the country has relatively good or stable economic conditions. The purpose of this study is to analyze the factors that affect the exchange rate of the Indonesian Rupiah against the United States Dollar in the period 2000-2013. The data used in this study are secondary time series data, made up of exports, imports, inflation, the BI rate, Gross Domestic Product (GDP), and the money supply (M1) on a quarterly basis, from the first quarter of 2000 to the fourth quarter of 2013. The time series regression used an ARCH-GARCH model; the selected ARCH model indicates that the variables that significantly influence the exchange rate are exports, inflation, the central bank rate and the money supply (M1), whereas imports and GDP had no significant influence.


2016 ◽  
Vol 136 (3) ◽  
pp. 363-372
Author(s):  
Takaaki Nakamura ◽  
Makoto Imamura ◽  
Masashi Tatedoko ◽  
Norio Hirai
