Experimental Comparison of Some Classical Distance Measures for Time Series Data in Simulation Model Validation



Free congruence: an exploration of expanded similarity measures for time series data

10.21203/rs.3.rs-163245/v1 ◽ 2021

Author(s): Lucas Cassiel Jacaruso

Keyword(s): Time Series ◽ Dynamic Time Warping ◽ Time Series Data ◽ Similarity Measures ◽ Distance Measures ◽ Series Data ◽ Time Warping ◽ Point To Point ◽ Dynamic Time ◽ Point Distance

Time series similarity measures are highly relevant in a wide range of emerging applications, including training machine learning models, classification, and predictive modeling. Standard similarity measures for time series most often involve point-to-point distance measures, including Euclidean distance and Dynamic Time Warping. Such similarity measures fundamentally require the fluctuation of values in the time series being compared to follow a corresponding order or cadence for similarity to be established. Other existing approaches use local statistical tests to detect structural changes in time series. This paper explores a broader definition of similarity, namely one that takes into account the sheer numerical resemblance between sets of statistical properties for time series segments, irrespective of value labeling. Further, the presence of common pattern components between time series segments was examined even if they occur in a permuted order, which would not necessarily satisfy the criteria of more conventional point-to-point distance measures. The newly defined similarity measures were tested on time series data representing over 20 years of cooperation intent expressed in global media sentiment. Tests determined whether the newly defined similarity measures would accurately identify stronger resemblance, on average, for pairings of similar time series segments (exhibiting overall decline) than for pairings of differing segments (exhibiting overall decline and overall rise). The ability to identify patterns other than the obvious overall rise or decline that can accurately relate samples is regarded as a first step towards assessing the value of the newly explored similarity measures for classification or prediction. Results were compared with those of Dynamic Time Warping on the same data for context. Surprisingly, the test for numerical resemblance between sets of statistical properties established stronger resemblance for pairings of decline years with greater statistical significance than Dynamic Time Warping on the particular data and sample size used.
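
The point about point-to-point measures depending on order can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `euclidean` and `dtw`, the absolute-difference local cost, and the three toy series are all assumptions chosen for illustration. It shows that Dynamic Time Warping tolerates a time shift that Euclidean distance penalizes, while neither treats a permuted series with the same values as identical.

```python
import math

def euclidean(a, b):
    """Point-to-point Euclidean distance; requires equal-length series."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length series")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Textbook dynamic time warping with |x - y| as the local cost (an assumption)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# Hypothetical toy series, not data from the paper.
base = [0, 0, 1, 2, 1, 0, 0]
shifted = [0, 1, 2, 1, 0, 0, 0]   # same shape, one step earlier
permuted = [1, 0, 2, 0, 0, 1, 0]  # same values, different order

print(euclidean(base, shifted))  # 2.0: the shift is penalized
print(dtw(base, shifted))        # 0.0: warping absorbs the shift
print(dtw(base, permuted) > 0)   # True: permuted order is still penalized
```

The permuted case is the kind of pairing that an order-independent comparison of statistical properties, as explored in the abstract above, is meant to handle differently.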


Causality Distance Measures for Multivariate Time Series with Applications

Mathematics ◽ 10.3390/math9212708 ◽ 2021 ◽ Vol 9 (21) ◽ pp. 2708

Author(s): Achilleas Anastasiou ◽ Peter Hatzopoulos ◽ Alex Karagrigoriou ◽ George Mavridoglou

Keyword(s): Time Series ◽ Time Series Data ◽ Distance Measure ◽ Multivariate Time Series ◽ Health Resources ◽ Unsupervised Classification ◽ Distance Measures ◽ Series Data ◽ Multivariate Statistical

In this work, we focus on the development of new distance measure algorithms, namely, the Causality Within Groups (CAWG), the Generalized Causality Within Groups (GCAWG) and the Causality Between Groups (CABG), all of which are based on the well-known Granger causality. The proposed distances, together with the associated algorithms, are suitable for multivariate statistical data analysis, including unsupervised classification (clustering) of multivariate time series data, with emphasis on financial and economic data where causal relationships are frequently present. To explore the appropriateness of the proposed methodology, we apply, for illustrative purposes, the proposed algorithms to hierarchical clustering for the classification of 19 EU countries based on seven variables related to health resources in healthcare systems.


Exploratory Time Series Data Mining by Genetic Clustering

Data Warehousing and Mining ◽ 10.4018/978-1-59904-951-9.ch055 ◽ 2008 ◽ pp. 942-962

Author(s): T. Warren Liao

Keyword(s): Data Mining ◽ Time Series ◽ Time Series Data ◽ Distance Measures ◽ Series Data ◽ Synthetic Control ◽ Data Set ◽ Univariate Time Series ◽ Genetic Clustering ◽ Data Objects

In this chapter, we present genetic algorithm (GA) based methods developed for clustering univariate time series of equal or unequal length as an exploratory step of data mining. These methods essentially implement the k-medoids algorithm. Each chromosome encodes in binary the data objects serving as the k medoids. To compare their performance, both fixed-parameter and adaptive GAs were used. We first employed the synthetic control chart data set to investigate the performance of three fitness functions, two distance measures, and other GA parameters such as population size, crossover rate, and mutation rate. Two more sets of time series, with or without a known number of clusters, were also used in experiments: one is the cylinder-bell-funnel data and the other is the novel battle simulation data. The clustering results are presented and discussed.
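
As a rough illustration of the binary medoid encoding mentioned above, the following Python sketch assumes one possible reading: a chromosome is a bit string with one bit per data object, and a bit set to 1 marks that object as a medoid. The chapter may use a different encoding, and the helper names (`decode`, `total_deviation`, `assign`), the Manhattan distance, and the toy data are all hypothetical. The fitness shown (sum of distances to the nearest medoid) is only one plausible candidate, not necessarily one of the chapter's three fitness functions.

```python
def decode(chromosome):
    """Return indices of the objects flagged as medoids (bits equal to 1)."""
    return [i for i, bit in enumerate(chromosome) if bit == 1]

def total_deviation(chromosome, series, dist):
    """Sum of distances from each series to its nearest medoid (lower is better)."""
    medoids = decode(chromosome)
    if not medoids:
        raise ValueError("chromosome selects no medoids")
    return sum(min(dist(s, series[m]) for m in medoids) for s in series)

def assign(chromosome, series, dist):
    """Label each series with the index of its nearest medoid."""
    medoids = decode(chromosome)
    return [min(medoids, key=lambda m: dist(s, series[m])) for s in series]

def manhattan(a, b):
    """Placeholder distance for equal-length series; any distance can be passed in."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical toy data: two obvious groups of short series.
series = [[0, 0, 0], [0, 1, 0], [5, 5, 5], [5, 6, 5]]
good = [1, 0, 1, 0]  # medoids are objects 0 and 2, one per group
bad = [1, 1, 0, 0]   # both medoids fall in the first group

print(total_deviation(good, series, manhattan))  # 2
print(total_deviation(bad, series, manhattan))   # 29
print(assign(good, series, manhattan))           # [0, 0, 2, 2]
```

A GA would evolve a population of such bit strings, with selection favoring chromosomes of lower total deviation; for series of unequal length, the `dist` argument would need to be a measure that supports them.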


Anomaly detection in multidimensional time series— A graph-based approach

Journal of Physics: Complexity ◽ 10.1088/2632-072x/ac392c ◽ 2021

Author(s): Marcus Erz ◽ Jeremy Floyd Kielman ◽ Bahar Selvi Uzun ◽ Gabriele Stefanie Guehring

Keyword(s): Time Series ◽ Anomaly Detection ◽ Time Series Data ◽ Distance Measures ◽ Series Data ◽ Data Set ◽ Research Areas ◽ Multidimensional Time Series ◽ Wide Range ◽ Time Frames

As the digital transformation is taking place, more and more data is being generated and collected. To generate meaningful information and knowledge, researchers use various data mining techniques. In addition to classification, clustering, and forecasting, outlier or anomaly detection is one of the most important research areas in time series analysis. In this paper we present a method for detecting anomalies in multidimensional time series using a graph-based algorithm. We transform time series data to graphs prior to calculating the outlier, since this offers a wide range of graph-based methods for anomaly detection. Furthermore, the dynamics of the data are taken into consideration by implementing a window of a certain size that leads to multiple graphs in different time frames. We use feature extraction and aggregation to finally compare distance measures of two time-dependent graphs. The effectiveness of our algorithm is demonstrated on the Numenta Anomaly Benchmark with various anomaly types as well as the KPI-Anomaly-Detection data set of the 2018 AIOps competition.


Designing a Framework to Improve Time Series Data of Construction Projects: Application of a Simulation Model and Singular Spectrum Analysis

Algorithms ◽ 10.3390/a9030045 ◽ 2016 ◽ Vol 9 (3) ◽ pp. 45 ◽ Cited By ~ 1

Author(s): Zahra Hojjati Tavassoli ◽ Seyed Iranmanesh ◽ Ahmad Tavassoli Hojjati

Keyword(s): Time Series ◽ Simulation Model ◽ Spectrum Analysis ◽ Construction Projects ◽ Time Series Data ◽ Singular Spectrum Analysis ◽ Series Data ◽ Singular Spectrum


Graphical Exploratory Data Analysis for Categorical Longitudinal and Time Series Data

PsycEXTRA Dataset ◽ 10.1037/e634372013-001 ◽ 2013

Author(s): Stephen J. Tueller ◽ Richard A. Van Dorn ◽ Georgiy Bobashev ◽ Barry Eggleston

Keyword(s): Time Series ◽ Data Analysis ◽ Exploratory Data Analysis ◽ Time Series Data ◽ Series Data ◽ Exploratory Data


Faktor-Faktor Yang Mempengaruhi Nilai Tukar Dollar Amerika Serikat Terhadap Rupiah Tahun 2000–2013 [Factors Affecting the Exchange Rate of the United States Dollar Against the Rupiah, 2000–2013]

Jurnal Riset Manajemen Sekolah Tinggi Ilmu Ekonomi Widya Wiwaha Program Magister Manajemen ◽ 10.32477/jrm.v1i2.72 ◽ 2017 ◽ Vol 1 (2) ◽ pp. 177-191

Author(s): Rizki Rahma Kusumadewi ◽ Wahyu Widayat

Keyword(s): Time Series ◽ Exchange Rate ◽ Money Supply ◽ Time Series Data ◽ The United States ◽ Economic Conditions ◽ Series Data ◽ Arch Model ◽ United States Dollar ◽ The Exchange Rate

The exchange rate is one tool for measuring a country's economic conditions. Stable growth in a currency's value indicates that the country's economic conditions are relatively good or stable. This study aims to analyze the factors that affect the exchange rate of the Indonesian Rupiah against the United States Dollar in the period 2000-2013. The study uses secondary time series data consisting of exports, imports, inflation, the BI rate, gross domestic product (GDP), and the money supply (M1) on a quarterly basis, from the first quarter of 2000 to the fourth quarter of 2013. The time series regression used ARCH-GARCH modeling; the selected ARCH model indicates that the variables that significantly influence the exchange rate are exports, inflation, the central bank rate, and the money supply (M1), whereas imports and GDP showed no influence.


An Anomaly Detection Method with Exemplar Subsequence for Time Series Data

IEEJ Transactions on Electronics Information and Systems ◽ 10.1541/ieejeiss.136.363 ◽ 2016 ◽ Vol 136 (3) ◽ pp. 363-372

Author(s): Takaaki Nakamura ◽ Makoto Imamura ◽ Masashi Tatedoko ◽ Norio Hirai

Keyword(s): Time Series ◽ Anomaly Detection ◽ Time Series Data ◽ Detection Method ◽ Series Data

