scholarly journals Joint Modeling of Local and Global Temporal Dynamics for Multivariate Time Series Forecasting with Missing Values

2020 ◽  
Vol 34 (04) ◽  
pp. 5956-5963
Author(s):  
Xianfeng Tang ◽  
Huaxiu Yao ◽  
Yiwei Sun ◽  
Charu Aggarwal ◽  
Prasenjit Mitra ◽  
...  

Multivariate time series (MTS) forecasting is widely used in various domains, such as meteorology and traffic. Due to limitations on data collection, transmission, and storage, real-world MTS data usually contains missing values, making it infeasible to apply existing MTS forecasting models such as linear regression and recurrent neural networks. Though many efforts have been devoted to this problem, most of them solely rely on local dependencies for imputing missing values, which ignores global temporal dynamics. Local dependencies/patterns would become less useful when the missing ratio is high, or the data have consecutive missing values; while exploring global patterns can alleviate such problem. Thus, jointly modeling local and global temporal dynamics is very promising for MTS forecasting with missing values. However, work in this direction is rather limited. Therefore, we study a novel problem of MTS forecasting with missing values by jointly exploring local and global temporal dynamics. We propose a new framework øurs, which leverages memory network to explore global patterns given estimations from local perspectives. We further introduce adversarial training to enhance the modeling of global temporal distribution. Experimental results on real-world datasets show the effectiveness of øurs for MTS forecasting with missing values and its robustness under various missing ratios.

2020 ◽  
Vol 34 (01) ◽  
pp. 930-937
Author(s):  
Qingxiong Tan ◽  
Mang Ye ◽  
Baoyao Yang ◽  
Siqi Liu ◽  
Andy Jinhua Ma ◽  
...  

Due to the discrepancy of diseases and symptoms, patients usually visit hospitals irregularly and different physiological variables are examined at each visit, producing large amounts of irregular multivariate time series (IMTS) data with missing values and varying intervals. Existing methods process IMTS into regular data so that standard machine learning models can be employed. However, time intervals are usually determined by the status of patients, while missing values are caused by changes in symptoms. Therefore, we propose a novel end-to-end Dual-Attention Time-Aware Gated Recurrent Unit (DATA-GRU) for IMTS to predict the mortality risk of patients. In particular, DATA-GRU is able to: 1) preserve the informative varying intervals by introducing a time-aware structure to directly adjust the influence of the previous status in coordination with the elapsed time, and 2) tackle missing values by proposing a novel dual-attention structure to jointly consider data-quality and medical-knowledge. A novel unreliability-aware attention mechanism is designed to handle the diversity in the reliability of different data, while a new symptom-aware attention mechanism is proposed to extract medical reasons from original clinical records. Extensive experimental results on two real-world datasets demonstrate that DATA-GRU can significantly outperform state-of-the-art methods and provide meaningful clinical interpretation.


2019 ◽  
Vol 9 (15) ◽  
pp. 3041 ◽  
Author(s):  
Qianting Li ◽  
Yong Xu

Multivariate time series are often accompanied with missing values, especially in clinical time series, which usually contain more than 80% of missing data, and the missing rates between different variables vary widely. However, few studies address these missing rate differences and extract univariate missing patterns simultaneously before mixing them in the model training procedure. In this paper, we propose a novel recurrent neural network called variable sensitive GRU (VS-GRU), which utilizes the different missing rate of each variable as another input and learns the feature of different variables separately, reducing the harmful impact of variables with high missing rates. Experiments show that VS-GRU outperforms the state-of-the-art method in two real-world clinical datasets (MIMIC-III, PhysioNet).


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Alaa Sagheer ◽  
Mostafa Kotb

AbstractCurrently, most real-world time series datasets are multivariate and are rich in dynamical information of the underlying system. Such datasets are attracting much attention; therefore, the need for accurate modelling of such high-dimensional datasets is increasing. Recently, the deep architecture of the recurrent neural network (RNN) and its variant long short-term memory (LSTM) have been proven to be more accurate than traditional statistical methods in modelling time series data. Despite the reported advantages of the deep LSTM model, its performance in modelling multivariate time series (MTS) data has not been satisfactory, particularly when attempting to process highly non-linear and long-interval MTS datasets. The reason is that the supervised learning approach initializes the neurons randomly in such recurrent networks, disabling the neurons that ultimately must properly learn the latent features of the correlated variables included in the MTS dataset. In this paper, we propose a pre-trained LSTM-based stacked autoencoder (LSTM-SAE) approach in an unsupervised learning fashion to replace the random weight initialization strategy adopted in deep LSTM recurrent networks. For evaluation purposes, two different case studies that include real-world datasets are investigated, where the performance of the proposed approach compares favourably with the deep LSTM approach. In addition, the proposed approach outperforms several reference models investigating the same case studies. Overall, the experimental results clearly show that the unsupervised pre-training approach improves the performance of deep LSTM and leads to better and faster convergence than other models.


Author(s):  
Weida Zhong ◽  
Qiuling Suo ◽  
Abhishek Gupta ◽  
Xiaowei Jia ◽  
Chunming Qiao ◽  
...  

With the popularity of smartphones, large-scale road sensing data is being collected to perform traffic prediction, which is an important task in modern society. Due to the nature of the roving sensors on smartphones, the collected traffic data which is in the form of multivariate time series, is often temporally sparse and unevenly distributed across regions. Moreover, different regions can have different traffic patterns, which makes it challenging to adapt models learned from regions with sufficient training data to target regions. Given that many regions may have very sparse data, it is also impossible to build individual models for each region separately. In this paper, we propose a meta-learning based framework named MetaTP to overcome these challenges. MetaTP has two key parts, i.e., basic traffic prediction network (base model) and meta-knowledge transfer. In base model, a two-layer interpolation network is employed to map original time series onto uniformly-spaced reference time points, so that temporal prediction can be effectively performed in the reference space. The meta-learning framework is employed to transfer knowledge from source regions with a large amount of data to target regions with a few data examples via fast adaptation, in order to improve model generalizability on target regions. Moreover, we use two memory networks to capture the global patterns of spatial and temporal information across regions. We evaluate the proposed framework on two real-world datasets, and experimental results show the effectiveness of the proposed framework.


Author(s):  
Debarun Bhattacharjya ◽  
Dharmashankar Subramanian ◽  
Tian Gao

Many real-world domains involve co-evolving relationships between events, such as meals and exercise, and time-varying random variables, such as a patient's blood glucose levels. In this paper, we propose a general framework for modeling joint temporal dynamics involving continuous time transitions of discrete state variables and irregular arrivals of events over the timeline. We show how conditional Markov processes (as represented by continuous time Bayesian networks) and multivariate point processes (as represented by graphical event models) are among various processes that are covered by the framework. We introduce and compare two simple and interpretable yet practical joint models within the framework with relevant baselines on simulated and real-world datasets, using a graph search algorithm for learning. The experiments highlight the importance of jointly modeling event arrivals and state variable transitions to better fit joint temporal datasets, and the framework opens up possibilities for models involving even more complex dynamics whenever suitable.


Author(s):  
Thelma Dede Baddoo ◽  
Zhijia Li ◽  
Samuel Nii Odai ◽  
Kenneth Rodolphe Chabi Boni ◽  
Isaac Kwesi Nooni ◽  
...  

Reconstructing missing streamflow data can be challenging when additional data are not available, and missing data imputation of real-world datasets to investigate how to ascertain the accuracy of imputation algorithms for these datasets are lacking. This study investigated the necessary complexity of missing data reconstruction schemes to obtain the relevant results for a real-world single station streamflow observation to facilitate its further use. This investigation was implemented by applying different missing data mechanisms spanning from univariate algorithms to multiple imputation methods accustomed to multivariate data taking time as an explicit variable. The performance accuracy of these schemes was assessed using the total error measurement (TEM) and a recommended localized error measurement (LEM) in this study. The results show that univariate missing value algorithms, which are specially developed to handle univariate time series, provide satisfactory results, but the ones which provide the best results are usually time and computationally intensive. Also, multiple imputation algorithms which consider the surrounding observed values and/or which can understand the characteristics of the data provide similar results to the univariate missing data algorithms and, in some cases, perform better without the added time and computational downsides when time is taken as an explicit variable. Furthermore, the LEM would be especially useful when the missing data are in specific portions of the dataset or where very large gaps of ‘missingness’ occur. Finally, proper handling of missing values of real-world hydroclimatic datasets depends on imputing and extensive study of the particular dataset to be imputed.


2019 ◽  
Vol 18 (01) ◽  
pp. 241-286 ◽  
Author(s):  
Alper Ozcan ◽  
Sule Gunduz Oguducu

Link prediction is considered as one of the key tasks in various data mining applications for recommendation systems, bioinformatics, security and worldwide web. The majority of previous works in link prediction mainly focus on the homogeneous networks which only consider one type of node and link. However, real-world networks have heterogeneous interactions and complicated dynamic structure, which make link prediction a more challenging task. In this paper, we have studied the problem of link prediction in the dynamic, undirected, weighted/unweighted, heterogeneous social networks which are composed of multiple types of nodes and links that change over time. We propose a novel method, called Multivariate Time Series Link Prediction for evolving heterogeneous networks that incorporate (1) temporal evolution of the network; (2) correlations between link evolution and multi-typed relationships; (3) local and global similarity measures; and (4) node connectivity information. Our proposed method and the previously proposed time series methods are evaluated experimentally on a real-world bibliographic network (DBLP) and a social bookmarking network (Delicious). Experimental results show that the proposed method outperforms the previous methods in terms of AUC measures in different test cases.


2020 ◽  
Vol 4 (3) ◽  
pp. 88 ◽  
Author(s):  
Vadim Kapp ◽  
Marvin Carl May ◽  
Gisela Lanza ◽  
Thorsten Wuest

This paper presents a framework to utilize multivariate time series data to automatically identify reoccurring events, e.g., resembling failure patterns in real-world manufacturing data by combining selected data mining techniques. The use case revolves around the auxiliary polymer manufacturing process of drying and feeding plastic granulate to extrusion or injection molding machines. The overall framework presented in this paper includes a comparison of two different approaches towards the identification of unique patterns in the real-world industrial data set. The first approach uses a subsequent heuristic segmentation and clustering approach, the second branch features a collaborative method with a built-in time dependency structure at its core (TICC). Both alternatives have been facilitated by a standard principle component analysis PCA (feature fusion) and a hyperparameter optimization (TPE) approach. The performance of the corresponding approaches was evaluated through established and commonly accepted metrics in the field of (unsupervised) machine learning. The results suggest the existence of several common failure sources (patterns) for the machine. Insights such as these automatically detected events can be harnessed to develop an advanced monitoring method to predict upcoming failures, ultimately reducing unplanned machine downtime in the future.


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Zhengping Che ◽  
Sanjay Purushotham ◽  
Kyunghyun Cho ◽  
David Sontag ◽  
Yan Liu

Sign in / Sign up

Export Citation Format

Share Document