Multivariate Time Series Link Prediction for Evolving Heterogeneous Network

2019 ◽  
Vol 18 (01) ◽  
pp. 241-286 ◽  
Author(s):  
Alper Ozcan ◽  
Sule Gunduz Oguducu

Link prediction is considered one of the key tasks in various data mining applications for recommendation systems, bioinformatics, security, and the World Wide Web. The majority of previous works in link prediction focus on homogeneous networks, which consider only one type of node and link. However, real-world networks have heterogeneous interactions and complicated dynamic structure, which make link prediction a more challenging task. In this paper, we study the problem of link prediction in dynamic, undirected, weighted/unweighted, heterogeneous social networks, which are composed of multiple types of nodes and links that change over time. We propose a novel method, called Multivariate Time Series Link Prediction, for evolving heterogeneous networks that incorporates (1) temporal evolution of the network; (2) correlations between link evolution and multi-typed relationships; (3) local and global similarity measures; and (4) node connectivity information. Our proposed method and the previously proposed time series methods are evaluated experimentally on a real-world bibliographic network (DBLP) and a social bookmarking network (Delicious). Experimental results show that the proposed method outperforms the previous methods in terms of AUC measures in different test cases.
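To make the idea of time-series link prediction concrete, the following is a minimal sketch of a generic univariate baseline, not the authors' multivariate method. It assumes each network snapshot is a dictionary mapping a node to its set of neighbours, uses the common-neighbours count as the local similarity measure, and forecasts the next score with a simple moving average. The function names and the snapshot format are illustrative assumptions.

```python
def common_neighbors(adjacency, u, v):
    """Local similarity: number of neighbours shared by u and v in one snapshot."""
    return len(adjacency.get(u, set()) & adjacency.get(v, set()))


def score_series(snapshots, u, v):
    """Similarity score of the pair (u, v) in each snapshot, in time order."""
    return [common_neighbors(snapshot, u, v) for snapshot in snapshots]


def moving_average_forecast(series, window=2):
    """Forecast the next value as the mean of the last `window` observations."""
    if not series:
        raise ValueError("series must not be empty")
    recent = series[-window:]
    return sum(recent) / len(recent)


# Hypothetical toy data: three snapshots of a small undirected network.
snapshots = [
    {"a": {"c"}, "b": {"d"}, "c": {"a"}, "d": {"b"}},
    {"a": {"c"}, "b": {"c"}, "c": {"a", "b"}},
    {"a": {"c", "d"}, "b": {"c", "d"}, "c": {"a", "b"}, "d": {"a", "b"}},
]
series = score_series(snapshots, "a", "b")
forecast = moving_average_forecast(series, window=2)
```

A higher forecast score suggests that the pair is more likely to be linked in the next period. The method described in the abstract additionally models correlations across link types, which this sketch omits.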

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Seyed Hossein Jafari ◽  
Amir Mahdi Abdolhosseini-Qomi ◽  
Masoud Asadpour ◽  
Maseud Rahgozar ◽  
Naser Yazdani

The entities of real-world networks are connected via different types of connections (i.e., layers). The task of link prediction in multiplex networks is about finding missing connections based on both intra-layer and inter-layer correlations. Our observations confirm that in a wide range of real-world multiplex networks, from social to biological and technological, a positive correlation exists between connection probability in one layer and similarity in other layers. Accordingly, a similarity-based automatic general-purpose multiplex link prediction method, SimBins, is devised that quantifies the amount of connection uncertainty based on observed inter-layer correlations in a multiplex network. Moreover, SimBins enhances the prediction quality in the target layer by incorporating the effect of link overlap across layers. Applying SimBins to various datasets from diverse domains, our findings indicate that SimBins outperforms the compared methods (both baseline and state-of-the-art methods) in most instances when predicting links. Furthermore, it is discussed that SimBins imposes minor computational overhead to the base similarity measures making it a potentially fast method, suitable for large-scale multiplex networks.


2020 ◽  
Vol 34 (04) ◽  
pp. 5956-5963
Author(s):  
Xianfeng Tang ◽  
Huaxiu Yao ◽  
Yiwei Sun ◽  
Charu Aggarwal ◽  
Prasenjit Mitra ◽  
...  

Multivariate time series (MTS) forecasting is widely used in various domains, such as meteorology and traffic. Due to limitations on data collection, transmission, and storage, real-world MTS data usually contains missing values, making it infeasible to apply existing MTS forecasting models such as linear regression and recurrent neural networks. Though many efforts have been devoted to this problem, most of them rely solely on local dependencies for imputing missing values, which ignores global temporal dynamics. Local dependencies/patterns become less useful when the missing ratio is high or the data have consecutive missing values, while exploring global patterns can alleviate this problem. Thus, jointly modeling local and global temporal dynamics is very promising for MTS forecasting with missing values. However, work in this direction is rather limited. Therefore, we study a novel problem of MTS forecasting with missing values by jointly exploring local and global temporal dynamics. We propose a new framework that leverages a memory network to explore global patterns given estimations from local perspectives. We further introduce adversarial training to enhance the modeling of global temporal distribution. Experimental results on real-world datasets show the effectiveness of the proposed framework for MTS forecasting with missing values and its robustness under various missing ratios.


2020 ◽  
Vol 4 (3) ◽  
pp. 88 ◽  
Author(s):  
Vadim Kapp ◽  
Marvin Carl May ◽  
Gisela Lanza ◽  
Thorsten Wuest

This paper presents a framework to utilize multivariate time series data to automatically identify reoccurring events, e.g., resembling failure patterns in real-world manufacturing data by combining selected data mining techniques. The use case revolves around the auxiliary polymer manufacturing process of drying and feeding plastic granulate to extrusion or injection molding machines. The overall framework presented in this paper includes a comparison of two different approaches towards the identification of unique patterns in the real-world industrial data set. The first approach uses a subsequent heuristic segmentation and clustering approach; the second features a collaborative method with a built-in time dependency structure at its core (TICC). Both alternatives have been facilitated by a standard principal component analysis (PCA) for feature fusion and a hyperparameter optimization (TPE) approach. The performance of the corresponding approaches was evaluated through established and commonly accepted metrics in the field of (unsupervised) machine learning. The results suggest the existence of several common failure sources (patterns) for the machine. Insights such as these automatically detected events can be harnessed to develop an advanced monitoring method to predict upcoming failures, ultimately reducing unplanned machine downtime in the future.


Author(s):  
Marisa Mohr ◽  
Florian Wilhelm ◽  
Ralf Möller

The estimation of the qualitative behaviour of fractional Brownian motion is an important topic for modelling real-world applications. Permutation entropy is a well-known approach to quantify the complexity of univariate time series in a scalar-valued representation. As an extension often used for outlier detection, weighted permutation entropy takes amplitudes within time series into account. As many real-world problems deal with multivariate time series, these measures need to be extended though. First, we introduce multivariate weighted permutation entropy, which is consistent with standard multivariate extensions of permutation entropy. Second, we investigate the behaviour of weighted permutation entropy on both univariate and multivariate fractional Brownian motion and show revealing results.
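As an illustration of the univariate measure the abstract builds on, here is a minimal sketch of weighted permutation entropy. It assumes the common definition in which each embedded window is weighted by its variance; the multivariate extension proposed in the paper is not specified in the abstract and is not attempted here. The parameter names `order` and `delay` are illustrative choices.

```python
import math
from collections import defaultdict


def weighted_permutation_entropy(series, order=3, delay=1, normalize=True):
    """Weighted permutation entropy of a univariate series.

    Each window of `order` samples (spaced by `delay`) contributes its
    variance as a weight to the ordinal pattern it exhibits.
    """
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series is too short for the chosen order and delay")

    weight_by_pattern = defaultdict(float)
    total_weight = 0.0
    for start in range(n_windows):
        window = [series[start + k * delay] for k in range(order)]
        # Ordinal pattern: indices of the window sorted by value (ties keep order).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        mean = sum(window) / order
        weight = sum((value - mean) ** 2 for value in window) / order
        weight_by_pattern[pattern] += weight
        total_weight += weight

    if total_weight == 0.0:
        return 0.0  # constant series: no amplitude information at all

    entropy = 0.0
    for weight in weight_by_pattern.values():
        if weight > 0.0:
            p = weight / total_weight
            entropy -= p * math.log(p)
    if normalize:
        entropy /= math.log(math.factorial(order))
    return entropy
```

With `normalize=True` the result lies between 0 (a single ordinal pattern carries all the weight) and 1 (the weight is spread evenly over all `order!` patterns).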


2014 ◽  
Vol 2014 ◽  
pp. 1-8 ◽  
Author(s):  
Jimin Wang ◽  
Yuelong Zhu ◽  
Shijin Li ◽  
Dingsheng Wan ◽  
Pengcheng Zhang

Multivariate time series (MTS) datasets are very common in various financial, multimedia, and hydrological fields. In this paper, a dimension-combination method is proposed to search similar sequences for MTS. Firstly, the similarity of single-dimension series is calculated; then the overall similarity of the MTS is obtained by synthesizing the single-dimension similarities using a weighted BORDA voting method. The dimension-combination method can use existing similarity searching methods. Several experiments, which used the classification accuracy as a measure, were performed on six datasets from the UCI KDD Archive to validate the method. The results show the advantage of the approach compared to the traditional similarity measures, such as Euclidean distance (ED), dynamic time warping (DTW), point distribution (PD), PCA similarity factor (SPCA), and extended Frobenius norm (Eros), for MTS datasets in some ways. Our experiments also demonstrate that no measure can fit all datasets, and the proposed measure is a choice for similarity searches.
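The dimension-combination idea can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes Euclidean distance as the per-dimension measure, awards standard Borda points (the closest of n candidates gets n - 1 points, the farthest gets 0), and takes the per-dimension weights as a user-supplied list. The function names and the data layout are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length univariate series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_borda_search(query, candidates, weights=None, distance=euclidean):
    """Rank candidate MTS by weighted Borda voting over single dimensions.

    query:      list of univariate series, one per dimension
    candidates: dict mapping a name to an MTS with the same layout as query
    weights:    one weight per dimension (assumed; defaults to equal weights)
    Returns candidate names ordered from most to least similar.
    """
    n_dims = len(query)
    if weights is None:
        weights = [1.0] * n_dims
    n_candidates = len(candidates)
    scores = {name: 0.0 for name in candidates}

    for dim in range(n_dims):
        # Each dimension votes by ranking the candidates on its own distance.
        ranked = sorted(
            candidates,
            key=lambda name: distance(query[dim], candidates[name][dim]),
        )
        for position, name in enumerate(ranked):
            scores[name] += weights[dim] * (n_candidates - 1 - position)

    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Hypothetical toy data: a two-dimensional query and three candidates.
query = [[0, 0, 0], [1, 1, 1]]
candidates = {
    "a": [[0, 0, 0], [1, 1, 1]],
    "b": [[5, 5, 5], [1, 1, 2]],
    "c": [[1, 1, 1], [9, 9, 9]],
}
ranking = weighted_borda_search(query, candidates, weights=[2.0, 1.0])
```

Because each dimension only contributes a ranking, any existing univariate similarity measure can be passed as `distance` without rescaling.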


2019 ◽  
Vol 9 (15) ◽  
pp. 3041 ◽  
Author(s):  
Qianting Li ◽  
Yong Xu

Multivariate time series are often accompanied by missing values, especially in clinical time series, which usually contain more than 80% missing data, and the missing rates of different variables vary widely. However, few studies address these missing rate differences and extract univariate missing patterns simultaneously before mixing them in the model training procedure. In this paper, we propose a novel recurrent neural network called variable sensitive GRU (VS-GRU), which utilizes the missing rate of each variable as another input and learns the features of different variables separately, reducing the harmful impact of variables with high missing rates. Experiments show that VS-GRU outperforms the state-of-the-art method on two real-world clinical datasets (MIMIC-III, PhysioNet).


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Alaa Sagheer ◽  
Mostafa Kotb

Currently, most real-world time series datasets are multivariate and are rich in dynamical information of the underlying system. Such datasets are attracting much attention; therefore, the need for accurate modelling of such high-dimensional datasets is increasing. Recently, the deep architecture of the recurrent neural network (RNN) and its variant long short-term memory (LSTM) have been proven to be more accurate than traditional statistical methods in modelling time series data. Despite the reported advantages of the deep LSTM model, its performance in modelling multivariate time series (MTS) data has not been satisfactory, particularly when attempting to process highly non-linear and long-interval MTS datasets. The reason is that the supervised learning approach initializes the neurons randomly in such recurrent networks, disabling the neurons that ultimately must properly learn the latent features of the correlated variables included in the MTS dataset. In this paper, we propose a pre-trained LSTM-based stacked autoencoder (LSTM-SAE) approach in an unsupervised learning fashion to replace the random weight initialization strategy adopted in deep LSTM recurrent networks. For evaluation purposes, two different case studies that include real-world datasets are investigated, where the performance of the proposed approach compares favourably with the deep LSTM approach. In addition, the proposed approach outperforms several reference models investigating the same case studies. Overall, the experimental results clearly show that the unsupervised pre-training approach improves the performance of deep LSTM and leads to better and faster convergence than other models.


2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
Thi-Thu-Hong Phan ◽  
André Bigand ◽  
Émilie Poisson Caillault

The completion of missing values is a prevalent problem in many domains of pattern recognition and signal processing. Analyzing data with incompleteness may lead to a loss of power and unreliable results, especially for large missing subsequence(s). Therefore, this paper aims to introduce a new approach for filling successive missing values in low/uncorrelated multivariate time series which allows managing a high level of uncertainty. In this way, we propose using a novel fuzzy weighting-based similarity measure. The proposed method involves three main steps. Firstly, for each incomplete signal, the data before a gap and the data after this gap are considered as two separated reference time series with their respective query windows Qb and Qa. We then find the most similar subsequence (Qbs) to the subsequence before this gap, Qb, and the most similar one (Qas) to the subsequence after the gap, Qa. To find these similar windows, we build a new similarity measure based on fuzzy grades of basic similarity measures and on fuzzy logic rules. Finally, we fill in the gap with average values of the window following Qbs and the one preceding Qas. The experimental results have demonstrated that the proposed approach outperforms the state-of-the-art methods in case of multivariate time series having low/noncorrelated data but effective information on each signal.
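The three steps above can be sketched for a single signal as follows. This is a simplified illustration under stated assumptions, not the authors' method: plain Euclidean distance stands in for the fuzzy weighting-based similarity measure, Qbs is searched only in the data before the gap and Qas only in the data after it, and the function and parameter names are hypothetical.

```python
import math


def _distance(a, b):
    """Euclidean distance, used here in place of the paper's fuzzy similarity."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fill_gap(signal, gap_start, gap_len, window):
    """Return a copy of `signal` with signal[gap_start:gap_start + gap_len] filled.

    Qb is the `window` samples just before the gap, Qa the `window` samples
    just after it. The gap is filled with the average of the segment that
    follows the best match of Qb and the segment that precedes the best
    match of Qa.
    """
    gap_end = gap_start + gap_len
    before, after = signal[:gap_start], signal[gap_end:]
    if len(before) < window + gap_len or len(after) < window + gap_len:
        raise ValueError("not enough data on one side of the gap")

    q_b, q_a = before[-window:], after[:window]

    # Best match for Qb whose following gap_len samples are still before the gap.
    starts_b = range(0, len(before) - window - gap_len + 1)
    s_b = min(starts_b, key=lambda s: _distance(before[s:s + window], q_b))
    following = before[s_b + window:s_b + window + gap_len]

    # Best match for Qa whose preceding gap_len samples are still after the gap.
    starts_a = range(gap_len, len(after) - window + 1)
    s_a = min(starts_a, key=lambda s: _distance(after[s:s + window], q_a))
    preceding = after[s_a - gap_len:s_a]

    filled = list(signal)
    for i in range(gap_len):
        filled[gap_start + i] = (following[i] + preceding[i]) / 2
    return filled


# Hypothetical toy data: a periodic signal with two consecutive missing values.
signal = [0, 1, 2, 3] * 6
signal[12:14] = [None, None]
filled = fill_gap(signal, gap_start=12, gap_len=2, window=2)
```

On this periodic toy signal both matched segments agree, so the gap is restored exactly; on real data the two segments typically differ and averaging smooths the estimate.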

