Normalized Multivariate Time Series Causality Analysis and Causal Graph Reconstruction

Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 679
Author(s):  
X. San Liang

Causality analysis is an important problem lying at the heart of science, and is of particular importance in data science and machine learning. An endeavor during the past 16 years to view causality as a real physical notion, so as to formulate it from first principles, however, seems to have gone unnoticed. This study introduces this line of work to the community, with a long-overdue generalization of the information flow-based bivariate time series causal inference to multivariate series, based on recent theoretical advances. The resulting formula is transparent, and can be implemented as a computationally very efficient algorithm for application. It can be normalized and tested for statistical significance. Unlike previous work along this line, where only information flows are estimated, here an algorithm is also implemented to quantify the influence of a unit on itself. While this poses a challenge in some causal inferences, here it comes naturally, and hence the identification of self-loops in a causal graph is fulfilled automatically as the causalities along edges are inferred. To demonstrate the power of the approach, two applications in extreme situations are presented. The first is a network of multivariate processes buried in heavy noise (with the noise-to-signal ratio exceeding 100), and the second a network of nearly synchronized chaotic oscillators. In both graphs, confounding processes exist. While it seems to be a challenge to reconstruct these causal graphs from the given series, an easy application of the algorithm immediately reveals the desideratum. In particular, the confounding processes have been accurately differentiated. Considering the surge of interest in the community, this study is very timely.
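
The abstract does not state the formula itself. As a rough sketch only, the code below assumes a linear-model estimator of the kind used in this line of work: the flow from series j to series i is taken as the j-th entry of C^{-1} c_{d,i}, multiplied by C_ij / C_ii, where C is the sample covariance matrix and c_{d,i} holds the covariances between each series and the Euler forward difference of series i. The function name `information_flow_matrix`, the unit time step, and the toy two-variable system are illustrative assumptions, not taken from the paper. Normalization, significance testing, and the self-influence term are not covered.

```python
import numpy as np

def information_flow_matrix(X, dt=1.0):
    """Sketch of a linear information-flow estimate (assumed form, see lead-in).

    X has shape (n_steps, d). Returns T with T[i, j] = estimated flow
    from series j to series i. The diagonal is left as NaN because the
    self-influence term is not handled in this sketch.
    """
    X = np.asarray(X, dtype=float)
    dX = (X[1:] - X[:-1]) / dt               # Euler forward difference
    X0 = X[:-1] - X[:-1].mean(axis=0)        # centred states
    dX0 = dX - dX.mean(axis=0)               # centred differences
    n = X0.shape[0]
    C = X0.T @ X0 / (n - 1)                  # C[k, j] = cov(X_k, X_j)
    C_d = X0.T @ dX0 / (n - 1)               # C_d[k, i] = cov(X_k, dX_i)
    A = np.linalg.solve(C, C_d)              # A[j, i] = sum_k inv(C)[j, k] * C_d[k, i]
    ratio = C / np.diag(C)[:, None]          # ratio[i, j] = C_ij / C_ii
    T = A.T * ratio                          # T[i, j] = A[j, i] * C_ij / C_ii
    np.fill_diagonal(T, np.nan)
    return T

# Toy system (hypothetical): series 1 drives series 0, but not the reverse.
rng = np.random.default_rng(0)
n_steps = 10_000
noise = rng.standard_normal((n_steps, 2))
X = np.zeros((n_steps, 2))
for t in range(n_steps - 1):
    X[t + 1, 0] = 0.5 * X[t, 0] + 0.5 * X[t, 1] + noise[t, 0]
    X[t + 1, 1] = 0.6 * X[t, 1] + noise[t, 1]

T = information_flow_matrix(X)
print("flow 1 -> 0:", T[0, 1])
print("flow 0 -> 1:", T[1, 0])
```

In this toy case the estimated flow from series 1 to series 0 should be clearly nonzero, while the reverse flow should be close to zero, which mirrors the one-way coupling built into the simulation.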

2017 ◽  
Vol 10 (5) ◽  
pp. 1945-1960 ◽  
Author(s):  
Christina Papagiannopoulou ◽  
Diego G. Miralles ◽  
Stijn Decubber ◽  
Matthias Demuzere ◽  
Niko E. C. Verhoest ◽  
...  

Abstract. Satellite Earth observation has led to the creation of global climate data records of many important environmental and climatic variables. These come in the form of multivariate time series with different spatial and temporal resolutions. Data of this kind provide new means to further unravel the influence of climate on vegetation dynamics. However, as advocated in this article, commonly used statistical methods are often too simplistic to represent complex climate–vegetation relationships due to linearity assumptions. Therefore, as an extension of linear Granger-causality analysis, we present a novel non-linear framework consisting of several components, such as data collection from various databases, time series decomposition techniques, feature construction methods, and predictive modelling by means of random forests. Experimental results on global data sets indicate that, with this framework, it is possible to detect non-linear patterns that are much less visible with traditional Granger-causality methods. In addition, we discuss extensive experimental results that highlight the importance of considering non-linear aspects of climate–vegetation dynamics.
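
The core idea of a non-linear Granger-style test can be sketched as a comparison of out-of-sample predictive skill with and without the candidate driver. The example below is a minimal illustration under assumptions, not the authors' framework: it uses scikit-learn's `RandomForestRegressor`, a hypothetical helper `granger_gain`, two lags, a simple chronological train/test split, and a synthetic series in which the driver enters quadratically (so a purely linear test would tend to miss it). The decomposition and feature-construction steps mentioned in the abstract are omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

def lagged(series, lags, start):
    """Column k holds series[t - lags[k]] for t = start .. len(series) - 1."""
    return np.column_stack([series[start - l: len(series) - l] for l in lags])

def granger_gain(x, y, n_lags=2, train_frac=0.75, seed=0):
    """Gain in test-set R^2 when past x is added to past y for predicting y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lags = range(1, n_lags + 1)
    target = y[n_lags:]
    base = lagged(y, lags, n_lags)                     # past of y only
    full = np.hstack([base, lagged(x, lags, n_lags)])  # past of y and x
    split = int(train_frac * len(target))              # chronological split
    scores = []
    for feats in (base, full):
        model = RandomForestRegressor(
            n_estimators=100, min_samples_leaf=5, random_state=seed
        )
        model.fit(feats[:split], target[:split])
        scores.append(r2_score(target[split:], model.predict(feats[split:])))
    return scores[1] - scores[0]

# Synthetic data (hypothetical): y depends on the square of lagged x.
rng = np.random.default_rng(1)
n = 2000
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + x[t - 1] ** 2 + 0.1 * rng.standard_normal()

gain_x_to_y = granger_gain(x, y)
gain_y_to_x = granger_gain(y, x)
print("gain x -> y:", gain_x_to_y)
print("gain y -> x:", gain_y_to_x)
```

A large positive gain in one direction and a negligible gain in the other is the pattern one would read as evidence of a directed, possibly non-linear, predictive relationship.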


2021 ◽  
Vol 15 ◽  
Author(s):  
Jolan Heyse ◽  
Laurent Sheybani ◽  
Serge Vulliémoz ◽  
Pieter van Mierlo

The detection of causal effects among simultaneous observations provides knowledge about the underlying network, and is a topic of interest in many scientific areas. Over the years different causality measures have been developed, each with their own advantages and disadvantages. However, an extensive evaluation study is missing. In this work we consider some of the best-known causality measures, i.e., cross-correlation, (conditional) Granger causality index (CGCI), partial directed coherence (PDC), directed transfer function (DTF), and partial mutual information on mixed embedding (PMIME). To correct for noise-related spurious connections, each measure (except PMIME) is tested for statistical significance based on surrogate data. The performance of the causality metrics is evaluated on a set of simulation models with distinct characteristics, to assess how well they work inside as well as outside of their “comfort zone.” PDC and DTF perform best on systems with frequency-specific connections, while PMIME is the only one able to detect non-linear interactions. The varying performance depending on the system characteristics warrants the use of multiple measures and a comparison of their results to avoid errors. Furthermore, lags between coupled variables are inherent to real-world systems and could hold essential information on the network dynamics. They are, however, often not taken into account, and we lack proper tools to estimate them. We propose three new methods for lag estimation in multivariate time series, based on autoregressive modelling and information theory. One of the autoregressive methods and the one based on information theory were able to reliably identify the correct lag value in different simulated systems. However, only the latter was able to maintain its performance in the case of non-linear interactions. As a clinical application, the same methods are also applied to an intracranial recording of an epileptic seizure. The combined knowledge from the causality measures and insights from the simulations, on how these measures perform under different circumstances and when to use which one, allows us to recreate a plausible network of the seizure propagation that supports previous observations of desynchronisation and synchronisation during seizure progression. The lag estimation results show the absence of a relationship between connectivity strength and estimated lag values, which contradicts the line of thinking in connectivity shaped by the neuron doctrine.
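
The abstract does not describe the three proposed lag estimators. Purely to illustrate what an autoregression-based lag estimate can look like, the sketch below fits, for each candidate lag, a least-squares model of y(t) on y(t-1) and x(t-lag), and picks the lag with the smallest residual sum of squares. The function `estimate_lag`, the model form, and the synthetic data are assumptions for illustration and should not be read as the authors' methods.

```python
import numpy as np

def estimate_lag(x, y, max_lag):
    """Pick the lag of x that best helps predict y in a simple AR-type fit.

    All candidate lags are scored on the same time indices so that the
    residual sums of squares are directly comparable.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    t = np.arange(max_lag, len(y))           # common evaluation window
    target = y[t]
    rss = {}
    for lag in range(1, max_lag + 1):
        design = np.column_stack([np.ones(len(t)), y[t - 1], x[t - lag]])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        rss[lag] = float(resid @ resid)
    best = min(rss, key=rss.get)
    return best, rss

# Synthetic data (hypothetical): x influences y with a delay of 3 samples.
rng = np.random.default_rng(0)
n = 2000
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(3, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 3] + 0.5 * rng.standard_normal()

best_lag, rss_by_lag = estimate_lag(x, y, max_lag=8)
print("estimated lag:", best_lag)
```

Because the fit is linear, a sketch like this would be expected to degrade under non-linear coupling, which is consistent with the abstract's remark that only the information-theoretic method kept its performance in that case.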


2016 ◽  
Author(s):  
Christina Papagiannopoulou ◽  
Diego G. Miralles ◽  
Niko E. C. Verhoest ◽  
Wouter A. Dorigo ◽  
Willem Waegeman

Abstract. Satellite Earth observation has led to the creation of global climate data records of many important environmental and climatic variables. These take the form of multivariate time series with different spatial and temporal resolutions. Data of this kind provide new means to unravel the influence of climate on vegetation dynamics. However, as advocated in this article, existing statistical methods are often too simplistic to represent complex climate–vegetation relationships due to the assumption of linearity of these relationships. Therefore, as an extension of linear Granger causality analysis, we present a novel non-linear framework consisting of several components, such as data collection from various databases, time series decomposition techniques, feature construction methods and predictive modelling by means of random forests. Experimental results on global data sets indicate that with this framework it is possible to detect non-linear patterns that are much less visible with traditional Granger causality methods. In addition, we also discuss extensive experimental results that highlight the importance of considering the non-linear aspect of climate–vegetation dynamics.


Author(s):  
Michael Poli ◽  
Jinkyoo Park ◽  
Ilija Ilievski

Finance is a particularly challenging application area for deep learning models due to low signal-to-noise ratio, non-stationarity, and partial observability. Non-deliverable forwards (NDFs), derivatives contracts used in foreign exchange (FX) trading, present additional difficulty in the form of the long-term planning required for an effective selection of the start and end dates of the contract. In this work, we focus on tackling the problem of NDF position length selection by leveraging high-dimensional sequential data consisting of spot rates, technical indicators and expert tenor patterns. To this end, we curate, analyze and release a dataset from the Depository Trust & Clearing Corporation (DTCC) NDF data that includes a comprehensive list of NDF volumes and daily spot rates for 64 FX pairs. We introduce WaveATTentionNet (WATTNet), a novel temporal convolution (TCN) model for spatio-temporal modeling of highly multivariate time series, and validate it across NDF markets with varying degrees of dissimilarity between the training and test periods in terms of volatility and general market regimes. The proposed method achieves a significant positive return on investment (ROI) in all NDF markets under analysis, outperforming recurrent and classical baselines by a wide margin. Finally, we propose two orthogonal interpretability approaches to verify noise robustness and detect the driving factors of the learned tenor selection strategy.

