Relational time series forecasting

Author(s):  
Ryan A. Rossi

Abstract: Networks encode dependencies between entities (people, computers, proteins) and allow us to study phenomena across social, technological, and biological domains. These networks naturally evolve over time through the addition, deletion, and changing of links, nodes, and attributes. Despite the importance of modeling these dynamics, existing work in relational machine learning has ignored relational time series data. Relational time series learning lies at the intersection of traditional time series analysis and statistical relational learning, and bridges the gap between these two fundamentally important problems. This paper formulates the relational time series learning problem and presents a general framework and taxonomy for representation discovery tasks on both nodes and links, including predicting their existence, label, and weight (importance), as well as systematically constructing features. We also reinterpret the prediction task, leading to the proposal of two important relational time series forecasting tasks: (i) relational time series classification (predicts a future class or label of an entity), and (ii) relational time series regression (predicts a future real-valued attribute or weight). Relational time series models are designed to leverage both relational and temporal dependencies to minimize forecasting error for both relational time series classification and regression. Finally, we discuss challenges and open problems that remain to be addressed.
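As a rough illustration of how a relational time series regression might combine temporal and relational dependencies, the sketch below builds two lagged features for a node (its own previous value and the mean previous value of its neighbors) and mixes them with fixed weights. The function names, the adjacency-dictionary layout, and the fixed weights are assumptions made for this example; they are not taken from the paper, where such weights would be learned from data.

```python
from statistics import mean

def relational_lag_features(neighbors, values, node, t):
    """Return (own value at t-1, mean neighbor value at t-1) for `node`."""
    own_lag = values[node][t - 1]
    nbrs = neighbors.get(node, [])
    # Assumption: a node without neighbors falls back to its own lag.
    nbr_lag = mean(values[n][t - 1] for n in nbrs) if nbrs else own_lag
    return own_lag, nbr_lag

def forecast(neighbors, values, node, t, w_own=0.5, w_nbr=0.5):
    """Weighted mix of the temporal and relational lag features.

    The weights are hypothetical placeholders, not fitted parameters.
    """
    own_lag, nbr_lag = relational_lag_features(neighbors, values, node, t)
    return w_own * own_lag + w_nbr * nbr_lag

# Hypothetical toy graph and attribute history (two time steps per node).
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
values = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]}

prediction = forecast(neighbors, values, "a", t=2)
```

Replacing the fixed weights with coefficients fitted by least squares would turn this into a simple learned baseline.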

Author(s):  
Pantelis Samartsidis ◽  
Natasha N. Martin ◽  
Victor De Gruttola ◽  
Frank De Vocht ◽  
Sharon Hutchinson ◽  
...  

Abstract
Objectives: The causal impact method (CIM) was recently introduced for evaluation of binary interventions using observational time-series data. The CIM is appealing for practical use as it can adjust for temporal trends and account for the potential of unobserved confounding. However, the method was initially developed for applications involving large datasets and hence its potential in small epidemiological studies is still unclear. Further, the effects that measurement error can have on the performance of the CIM have not been studied yet. The objective of this work is to investigate both of these open problems.
Methods: Motivated by an existing dataset of HCV surveillance in the UK, we perform simulation experiments to investigate the effect of several characteristics of the data on the performance of the CIM. Further, we quantify the effects of measurement error on the performance of the CIM and extend the method to deal with this problem.
Results: We identify multiple characteristics of the data that affect the ability of the CIM to detect an intervention effect including the length of time-series, the variability of the outcome and the degree of correlation between the outcome of the treated unit and the outcomes of controls. We show that measurement error can introduce biases in the estimated intervention effects and heavily reduce the power of the CIM. Using an extended CIM, some of these adverse effects can be mitigated.
Conclusions: The CIM can provide satisfactory power in public health interventions. The method may provide misleading results in the presence of measurement error.


Author(s):  
Elangovan Ramanujam ◽  
S. Padmavathi

Innovations in, and the applicability of, time series data mining techniques have significantly increased researchers' interest in the problem of time series classification. Several algorithms have been proposed for this purpose, categorized as shapelet, interval, motif, and whole series-based techniques. Among these, the bag-of-words technique, an extension of the text mining approach, performs well due to its simplicity and effectiveness. To extend the efficiency of the bag-of-words technique, this paper proposes a discriminative supervised weighting scheme to identify the characteristic and representative patterns of a class for efficient classification. The paper uses a modified weighted matrix that discriminates between representative and non-representative patterns, which enables interpretability in classification. Experiments have been carried out to compare the performance of the proposed technique with state-of-the-art techniques in terms of accuracy and statistical significance.


2020 ◽  
Vol 34 (10) ◽  
pp. 13720-13721
Author(s):  
Won Kyung Lee

Multivariate time-series forecasting has great potential in various domains. However, it is challenging to find the dependency structure among the time-series variables and the appropriate time lags for each variable, both of which change dynamically over time. In this study, I propose a partial correlation-based attention mechanism which overcomes the shortcomings of existing attention mechanisms based on pair-wise comparisons. Moreover, I propose data-driven series-wise multi-resolution convolutional layers to represent the input time-series data for domain-agnostic learning.


Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1908
Author(s):  
Chao Ma ◽  
Xiaochuan Shi ◽  
Wei Li ◽  
Weiping Zhu

In the past decade, time series data have been generated in various fields at a rapid pace, which offers a huge opportunity for mining valuable knowledge. As a typical task of time series mining, Time Series Classification (TSC) has attracted considerable attention from both researchers and domain experts due to its broad applications, ranging from human activity recognition to smart city governance. Specifically, there is an increasing requirement for performing classification tasks on diverse types of time series data in a timely manner without costly hand-crafted feature engineering. Therefore, in this paper, we propose a framework named Edge4TSC that allows time series to be processed in the edge environment, so that the classification results can be instantly returned to the end-users. Meanwhile, to get rid of the costly hand-crafted feature engineering process, deep learning techniques are applied for automatic feature extraction, which show competitive or even superior performance compared to state-of-the-art TSC solutions. However, because time series present complex patterns, even deep learning models are not capable of achieving satisfactory classification accuracy, which motivated us to explore new time series representation methods to help classifiers further improve the classification accuracy. In the proposed framework Edge4TSC, a new time series representation method based on building a binary distribution tree was designed to address the classification accuracy concern in TSC tasks. Comprehensive experiments on six challenging time series datasets in the edge environment validate the generalization ability and classification accuracy improvement of the proposed framework and yield a number of helpful insights.


Information ◽  
2020 ◽  
Vol 11 (6) ◽  
pp. 288
Author(s):  
Kuiyong Song ◽  
Nianbin Wang ◽  
Hongbin Wang

High-dimensional time series classification is a serious problem. A distance-based similarity measure is one of the methods for time series classification. This paper proposes a metric learning-based univariate time series classification method (ML-UTSC), which uses a Mahalanobis matrix obtained by metric learning to calculate the local distance between multivariate time series and combines Dynamic Time Warping (DTW) with nearest neighbor classification to achieve the final classification. In this method, the features of the univariate time series are represented as multivariate time series data consisting of a mean value, variance, and slope. Next, a three-dimensional Mahalanobis matrix is obtained by metric learning on the data. The time series is divided into segments of equal length to enable the Mahalanobis matrix to describe the features of the time series data more accurately. Compared with the most effective measurement method, the related experimental results show that our proposed algorithm has a lower classification error rate on most of the test datasets.
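The pipeline described in this abstract (equal-length segments summarized by mean, variance, and slope, a Mahalanobis local distance inside DTW, then nearest-neighbor classification) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the metric-learning step is omitted and an identity matrix stands in for the learned Mahalanobis matrix, the slope is approximated from segment endpoints, and all function names are hypothetical.

```python
import math

def segment_features(series, seg_len):
    """Split a univariate series into equal-length segments and describe
    each segment by (mean, variance, slope)."""
    feats = []
    for start in range(0, len(series) - seg_len + 1, seg_len):
        seg = series[start:start + seg_len]
        mu = sum(seg) / seg_len
        var = sum((x - mu) ** 2 for x in seg) / seg_len
        # Assumption: endpoint slope as a cheap stand-in for a fitted slope.
        slope = (seg[-1] - seg[0]) / (seg_len - 1) if seg_len > 1 else 0.0
        feats.append((mu, var, slope))
    return feats

def mahalanobis_sq(u, v, M):
    """Squared Mahalanobis distance (u - v)^T M (u - v)."""
    d = [a - b for a, b in zip(u, v)]
    return sum(d[i] * M[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))

def dtw(a, b, M):
    """DTW cost between two feature sequences with a Mahalanobis local cost."""
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = mahalanobis_sq(a[i - 1], b[j - 1], M)
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def classify_1nn(query, train, M, seg_len):
    """Return the label of the training series closest to `query` under DTW."""
    q = segment_features(query, seg_len)
    best = min(train, key=lambda item: dtw(q, segment_features(item[0], seg_len), M))
    return best[1]

# Placeholder for the learned 3x3 Mahalanobis matrix (identity = Euclidean).
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical toy training set.
train = [([0, 1, 2, 3, 4, 5, 6, 7], "rising"), ([7, 6, 5, 4, 3, 2, 1, 0], "falling")]
label = classify_1nn([1, 2, 3, 4, 5, 6, 7, 8], train, IDENTITY, seg_len=4)
```

With the identity matrix this reduces to DTW over squared Euclidean distances between segment features. A learned positive semi-definite matrix would reweight and correlate the three features.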


2016 ◽  
Vol 26 (09n10) ◽  
pp. 1361-1377 ◽  
Author(s):  
Daoyuan Li ◽  
Tegawende F. Bissyande ◽  
Jacques Klein ◽  
Yves Le Traon

Time series mining has become essential for extracting knowledge from the abundant data that flows out from many application domains. To overcome storage and processing challenges in time series mining, compression techniques are being used. In this paper, we investigate the loss/gain of performance of time series classification approaches when fed with lossy-compressed data. This extended empirical study is essential for reassuring practitioners, but also for providing more insights on how compression techniques can even be effective in smoothing and reducing noise in time series data. From a knowledge engineering perspective, we show that time series may be compressed by 90% using discrete wavelet transforms and still achieve remarkable classification accuracy, and that residual details left by popular wavelet compression techniques can sometimes even help to achieve higher classification accuracy than the raw time series data, as they better capture essential local features.
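To make the idea of lossy wavelet compression concrete, here is a minimal sketch using an orthonormal Haar transform: the series is decomposed, only the largest-magnitude coefficients are kept, and the series is reconstructed from that sparse set. Reading "compressed by 90%" as "keep roughly 10% of the coefficients" is an assumption of this example, as are the function names and the choice of the Haar wavelet. The abstract does not specify the wavelet family or thresholding rule.

```python
import math

def haar_forward(x):
    """Full orthonormal Haar decomposition; len(x) must be a power of two."""
    out = list(x)
    n = len(out)
    while n > 1:
        half = n // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:n] = approx + detail  # approximations first, details after
        n = half
    return out

def haar_inverse(c):
    """Invert haar_forward."""
    out = list(c)
    n = 1
    while n < len(out):
        approx, detail = out[:n], out[n:2 * n]
        merged = []
        for a, d in zip(approx, detail):
            merged += [(a + d) / math.sqrt(2), (a - d) / math.sqrt(2)]
        out[:2 * n] = merged
        n *= 2
    return out

def compress(x, keep_ratio):
    """Zero all but the largest-magnitude coefficients, then reconstruct."""
    coeffs = haar_forward(x)
    k = max(1, int(len(coeffs) * keep_ratio))
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    sparse = [c if i in keep else 0.0 for i, c in enumerate(coeffs)]
    return haar_inverse(sparse)

# Hypothetical series: a step signal of length 16.
series = [0.0] * 8 + [4.0] * 8
smoothed = compress(series, keep_ratio=0.1)
```

The reconstructed series can then be passed to any classifier in place of the raw data, which is the comparison the study describes.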


2021 ◽  
Vol 2115 (1) ◽  
pp. 012044
Author(s):  
R. Vaibhava Lakshmi ◽  
S. Radha

Abstract: The Auto-Regressive Integrated Moving Average (ARIMA) time series forecasting model is applied to time series data consisting of Adobe stock prices in order to forecast future prices for a period of one year. The ARIMA model is used due to its simple and flexible implementation for short-term predictions of future stock prices. In order to achieve stationarity, the time series data requires second-order differencing. The comparison and parameterization of the ARIMA model have been done using the auto-correlation plot, the partial auto-correlation plot, and the auto.arima() function provided in R (which automatically finds the best-fitting model based on the AIC and BIC values). ARIMA (0, 2, 1) (0, 0, 2) [12] is chosen as the best-fitting model, with a very low MAPE (Mean Absolute Percentage Error) of 3.854958%.
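Two of the quantities mentioned in this abstract are simple enough to spell out: second-order differencing (the d = 2 in the chosen model) and the MAPE used to score forecasts. The sketch below shows both in plain Python. The sample numbers are made up for illustration and are not the Adobe data, and the helper names are hypothetical.

```python
def difference(series, order=1):
    """Apply first differences `order` times: y[t] = x[t] - x[t-1]."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)

# Hypothetical series with a quadratic trend: one difference leaves a linear
# trend, a second difference leaves a constant (stationary) series.
prices = [1.0, 4.0, 9.0, 16.0, 25.0]
second_diff = difference(prices, order=2)

# Hypothetical actual vs. forecast values.
error_pct = mape([100.0, 200.0], [110.0, 190.0])
```

Each round of differencing shortens the series by one observation, so a model with d = 2 works on two fewer points than the raw data.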


2021 ◽  
Vol 7 ◽  
pp. e534
Author(s):  
Kristoko Dwi Hartomo ◽  
Yessica Nataliani

This paper proposes a new model for time series forecasting that combines forecasting with a clustering algorithm. It introduces a new scheme to improve the forecasting results by grouping the time series data using the k-means clustering algorithm, and it uses the clustering result to obtain the forecast. Because user-defined parameters usually affect the forecasting results, a learning-based procedure is proposed to estimate the parameters used for forecasting. These parameter values are computed simultaneously within the algorithm. Experimental comparison with other forecasting algorithms demonstrates good results for the proposed model: it has the smallest mean squared error, 13,007.91, and an average improvement rate of 19.83%.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Jitao Zhang ◽  
Weiming Shen ◽  
Liang Gao ◽  
Xinyu Li ◽  
Long Wen

Time series classification is a basic and important approach for time series data mining. Nowadays, more researchers pay attention to shape similarity methods, including Shapelet-based algorithms, because they can extract discriminative subsequences from time series. However, most Shapelet-based algorithms discover Shapelets by searching candidate subsequences in training datasets, which brings two drawbacks: high computational burden and poor generalization ability. To overcome these drawbacks, this paper proposes a novel algorithm named Shapelet Dictionary Learning with SVM-based Ensemble Classifier (SDL-SEC). SDL-SEC modifies the Shapelet algorithm in two aspects: the Shapelet discovery method and the classifier. First, Shapelet Dictionary Learning (SDL) is proposed as a novel Shapelet discovery method that generates Shapelets instead of searching for them. In this way, SDL has the advantages of lower computational cost and higher generalization ability. Then, an SVM-based Ensemble Classifier (SEC) is developed as a novel ensemble classifier and adapted to the SDL algorithm. Unlike the classic SVM, which needs precise parameter tuning and appropriate feature selection, SEC can avoid overfitting caused by a large number of features and parameters. Compared with the baselines on 45 datasets, the proposed SDL-SEC algorithm achieves competitive classification accuracy with lower computational cost.


2019 ◽  
Vol 10 (3) ◽  
pp. 915
Author(s):  
Ali Ebrahimi Ghahnavieh

Every player in the market has a growing need to know about even the smallest change in the market. Therefore, the ability to see what is ahead is a valuable advantage. The purpose of this research is to understand behavioral patterns and to find a new hybrid forecasting approach based on ARIMA-ANN for estimating styrene price. Time series analysis and forecasting is an essential tool that can be widely useful for finding the significant characteristics needed for making future decisions. In this study, ARIMA, ANN, and hybrid ARIMA-ANN models were applied to evaluate the previous behavior of time series data in order to make interpretations about the future behavior of styrene price. Experimental results with real data sets show that the combined model can be more suitable for improving forecasting accuracy than traditional time series forecasting methodologies. Only a small number of studies in the literature have addressed new methods for forecasting styrene price.

