Constructing the Microbial Association Network from Large-Scale Time Series Data Using Granger Causality

Genes ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 216 ◽  
Author(s):  
Dongmei Ai ◽  
Xiaoxin Li ◽  
Gang Liu ◽  
Xiaoyi Liang ◽  
Li Xia

The increasing availability of large-scale time series data allows the inference of microbial community dynamics by association network analysis. However, correlation-based association network analyses are noninformative of causal, mediating and time-dependent relationships between microbial community functional factors. To address this insufficiency, we introduced the Granger causality model to the analysis of a recent marine microbial time series dataset. We systematically constructed a directed acyclic network, representing both internal and external causal relationships among the microbial and environmental factors. We further optimized the network by removing false causal associations using the conditional Granger causality. The final network was visualized as a Granger graph, which was analyzed to identify causal relationships driven by key functional operators in the environment, such as Gammaproteobacteria, which was Granger caused by total organic nitrogen and primary production (p < 0.05 and Q < 0.05).
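The difference between pairwise and conditional Granger causality described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the lag of 2, and the simulated chain x -> z -> y are assumptions chosen for illustration, and only the F statistic is computed (no p or Q values).

```python
import random

def _ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    k = len(rows[0])
    # Augmented system [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(target, source, conditioning=(), lag=2):
    """F statistic for 'source Granger-causes target', optionally conditioned
    on further series (hypothetical helper, illustrative only)."""
    restricted, full, y = [], [], []
    for t in range(lag, len(target)):
        base = [1.0]  # intercept
        for series in (target, *conditioning):
            base += [series[t - k] for k in range(1, lag + 1)]
        restricted.append(base)
        full.append(base + [source[t - k] for k in range(1, lag + 1)])
        y.append(target[t])
    rss_r = _ols_rss(restricted, y)
    rss_f = _ols_rss(full, y)
    dof = len(y) - len(full[0])
    return ((rss_r - rss_f) / lag) / (rss_f / dof)

# Simulated chain: x drives z, and z drives y; x affects y only through z.
rng = random.Random(0)
n = 200
x = [rng.gauss(0, 1) for _ in range(n)]
z = [0.0] * n
y = [0.0] * n
for t in range(1, n):
    z[t] = 0.8 * x[t - 1] + rng.gauss(0, 0.3)
    y[t] = 0.8 * z[t - 1] + rng.gauss(0, 0.3)

f_pairwise = granger_f(y, x)                       # indirect link looks causal
f_conditional = granger_f(y, x, conditioning=[z])  # link shrinks once z is known
```

In this toy setting the pairwise statistic is large because lagged x helps predict y, while conditioning on the mediator z removes most of that apparent effect, which is the kind of false association the conditional test is meant to prune.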

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Axel Wismüller ◽  
Adora M. Dsouza ◽  
M. Ali Vosoughi ◽  
Anas Abidin

A key challenge to gaining insight into complex systems is inferring nonlinear causal directional relations from observational time-series data. Specifically, estimating causal relationships between interacting components in large systems with only short recordings over few temporal observations remains an important, yet unresolved problem. Here, we introduce large-scale nonlinear Granger causality (lsNGC) which facilitates conditional Granger causality between two multivariate time series conditioned on a large number of confounding time series with a small number of observations. By modeling interactions with nonlinear state-space transformations from limited observational data, lsNGC identifies causal relations with no explicit a priori assumptions on functional interdependence between component time series in a computationally efficient manner. Additionally, our method provides a mathematical formulation revealing statistical significance of inferred causal relations. We extensively study the ability of lsNGC in inferring directed relations from two-node to thirty-four-node chaotic time-series systems. Our results suggest that lsNGC captures meaningful interactions from limited observational data, where it performs favorably when compared to traditionally used methods. Finally, we demonstrate the applicability of lsNGC to estimating causality in large, real-world systems by inferring directional nonlinear, causal relationships among a large number of relatively short time series acquired from functional Magnetic Resonance Imaging (fMRI) data of the human brain.


2014 ◽  
Vol 23 (2) ◽  
pp. 213-229 ◽  
Author(s):  
Cangqi Zhou ◽  
Qianchuan Zhao

Mining time series data is of great significance in various areas. To efficiently find representative patterns in these data, this article focuses on the definition of a valid dissimilarity measure and the acceleration of partitioning clustering, a common group of techniques used to discover typical shapes of time series. The dissimilarity measure is a crucial component in clustering. Some applications require it to be invariant to specific transformations. The rationale for using the angle between two time series to define a dissimilarity is analyzed. Moreover, our proposed measure satisfies the triangle inequality with specific restrictions. This property can be employed to accelerate clustering. An integrated algorithm is proposed. The experiments show that angle-based dissimilarity captures the essence of time series patterns that are invariant to amplitude scaling. In addition, the accelerated algorithm outperforms the standard one as redundancies are pruned. Our approach has been applied to discover typical patterns of information diffusion in an online social network. Analyses revealed the formation mechanisms of different patterns.
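To make the idea of an angle-based dissimilarity concrete, here is a minimal sketch that treats two equal-length series as vectors and returns the angle between them. The exact measure and its restrictions are not given in the abstract, so the plain vector angle and the name `angle_dissimilarity` are assumptions.

```python
import math

def angle_dissimilarity(a, b):
    """Angle in radians between two equal-length series viewed as vectors
    (illustrative definition, not necessarily the paper's measure)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("angle is undefined for an all-zero series")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)

a = [1.0, 2.0, 3.0, 2.0]
scaled = [3.0 * v for v in a]   # same shape, larger amplitude
b = [2.0, 1.0, 0.5, 1.0]
c = [0.0, 1.0, 0.0, 1.0]

d_scaled = angle_dissimilarity(a, scaled)   # close to 0: amplitude scaling is ignored
d_ab = angle_dissimilarity(a, b)
d_bc = angle_dissimilarity(b, c)
d_ac = angle_dissimilarity(a, c)
```

Because the angle between vectors obeys the triangle inequality, a clustering routine can skip some distance computations: if `d_ab` and `d_bc` are known, `d_ac` cannot exceed their sum.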


2018 ◽  
Vol 15 (147) ◽  
pp. 20180695 ◽  
Author(s):  
Simone Cenci ◽  
Serguei Saavedra

Biotic interactions are expected to play a major role in shaping the dynamics of ecological systems. Yet, quantifying the effects of biotic interactions has been challenging due to a lack of appropriate methods to extract accurate measurements of interaction parameters from experimental data. One of the main limitations of existing methods is that the parameters inferred from noisy, sparsely sampled, nonlinear data are seldom uniquely identifiable. That is, many different parameters can be compatible with the same dataset and can generalize to independent data equally well. Hence, it is difficult to justify conclusive assertions about the effect of biotic interactions without information about their associated uncertainty. Here, we develop an ensemble method based on model averaging to quantify the uncertainty associated with the effect of biotic interactions on community dynamics from non-equilibrium ecological time-series data. Our method is able to detect the most informative time intervals for each biotic interaction within a multivariate time series and can be easily adapted to different regression schemes. Overall, this novel approach can be used to associate a time-dependent uncertainty with the effect of biotic interactions. Moreover, because we quantify uncertainty with minimal assumptions about the data-generating process, our approach can be applied to any data for which interactions among variables strongly affect the overall dynamics of the system.


2021 ◽  
Author(s):  
Sadnan Al Manir ◽  
Justin Niestroy ◽  
Maxwell Adam Levinson ◽  
Timothy Clark

Introduction: Transparency of computation is a requirement for assessing the validity of computed results and research claims based upon them; and it is essential for access to, assessment, and reuse of computational components. These components may be subject to methodological or other challenges over time. While reference to archived software and/or data is increasingly common in publications, a single machine-interpretable, integrative representation of how results were derived, that supports defeasible reasoning, has been absent. Methods: We developed the Evidence Graph Ontology, EVI, in OWL 2, with a set of inference rules, to provide deep representations of supporting and challenging evidence for computations, services, software, data, and results, across arbitrarily deep networks of computations, in connected or fully distinct processes. EVI integrates FAIR practices on data and software, with important concepts from provenance models, and argumentation theory. It extends PROV for additional expressiveness, with support for defeasible reasoning. EVI treats any computational result or component of evidence as a defeasible assertion, supported by a DAG of the computations, software, data, and agents that produced it. Results: We have successfully deployed EVI for very-large-scale predictive analytics on clinical time-series data. Every result may reference its own evidence graph as metadata, which can be extended when subsequent computations are executed. Discussion: Evidence graphs support transparency and defeasible reasoning on results. They are first-class computational objects, and reference the datasets and software from which they are derived. They support fully transparent computation, with challenge and support propagation. The EVI approach may be extended to include instruments, animal models, and critical experimental reagents.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Jing Zhao ◽  
Shubo Liu ◽  
Xingxing Xiong ◽  
Zhaohui Cai

Privacy protection is one of the major obstacles to data sharing. Time-series data are characterized by autocorrelation, continuity, and large scale. Current research on time-series data publication largely ignores the correlation of time-series data and the lack of privacy protection. In this paper, we study the problem of correlated time-series data publication and propose a sliding window-based autocorrelation time-series data publication algorithm, called SW-ATS. Instead of using global sensitivity as in traditional differential privacy mechanisms, we propose periodic sensitivity to provide a stronger privacy guarantee. SW-ATS introduces a sliding window mechanism, with the correlation between the noise-adding sequence and the original time-series data guaranteed by sequence indistinguishability, to protect the privacy of the latest data. We prove that SW-ATS satisfies ε-differential privacy. Compared with the state-of-the-art algorithm, SW-ATS reduces the error rate (MAE) by about 25%, improves the utility of the data, and provides stronger privacy protection.
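For orientation, the traditional baseline that the abstract contrasts against can be sketched as a per-value Laplace mechanism applied to the most recent window of a series. This is not SW-ATS: periodic sensitivity and the correlated noise sequence are not reproduced, and the function names and parameters below are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two
    independent exponentials that each have mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def publish_window(series, window, epsilon, sensitivity, rng):
    """Perturb the latest `window` values with Laplace noise of scale
    sensitivity / epsilon (standard Laplace mechanism, applied per value).

    The guarantee for the whole window depends on how many values one
    individual can influence, which this sketch does not model.
    """
    scale = sensitivity / epsilon
    return [value + laplace_noise(scale, rng) for value in series[-window:]]

rng = random.Random(1)
readings = [float(v) for v in range(10)]
released = publish_window(readings, window=4, epsilon=1.0, sensitivity=1.0, rng=rng)

# Empirical check that the noise is centred on zero.
samples = [laplace_noise(1.0, rng) for _ in range(20000)]
noise_mean = sum(samples) / len(samples)
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of utility, which is the trade-off that the error-rate comparison in the abstract refers to.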


2021 ◽  
Vol 1 (1) ◽  
pp. 93-105
Author(s):  
Zainal Zawir Simon ◽  
Effendy Zain ◽  
Zulihar Zulihar

This study aims to determine the causal relationship between apartment selling prices and apartment rental prices in the Greater Jakarta (Jabodetabek) area. The data used are quarterly time series for the period 2007:1-2018:3, and the analysis tool used is Granger causality analysis. The results show that there is no causal relationship between apartment selling prices and apartment rental prices in the Greater Jakarta area. In other words, changes in selling prices do not affect rental prices, and rental prices likewise do not affect selling prices. Investors are therefore expected to include in their investment analysis other factors that can affect apartment selling and rental prices, setting aside the view that the selling price affects the rental price and vice versa. Keywords: apartment selling price, apartment rental price, time series data, Granger causality analysis


mSystems ◽  
2020 ◽  
Vol 5 (4) ◽  
Author(s):  
Hsiao-Pei Lu ◽  
Yung-Hsien Shao ◽  
Jer-Horng Wu ◽  
Chih-hao Hsieh

ABSTRACT Performance of a bioreactor is affected by complex microbial consortia that regulate system functional processes. Studies so far, however, have mainly emphasized the selective pressures imposed by operational conditions (i.e., deterministic external physicochemical variables) on the microbial community as well as system performance, but have overlooked direct effects of the microbial community on system functioning. Here, using a bioreactor with ammonium as the sole substrate under controlled operational settings as a model system, we investigated succession of the bacterial community after a disturbance and its impact on nitrification and anammox (anaerobic ammonium oxidation) processes with fine-resolution time series data. System performance was quantified as the ratio of the fed ammonium converted to anammox-derived nitrogen gas (N2) versus nitrification-derived nitrate (npNO3−). After the disturbance, the N2/npNO3− ratio first decreased, then recovered, and finally stabilized until the end. Importantly, the dynamics of N2/npNO3− could not be fully explained by physicochemical variables of the system. In comparison, the proportion of variation that could be explained substantially increased (tripled) when the changes in bacterial composition were taken into account. Specifically, distinct bacterial taxa tended to dominate at different successional stages, and their relative abundances could explain up to 46% of the variation in nitrogen removal efficiency. These findings add baseline knowledge of microbial succession and emphasize the importance of monitoring the dynamics of microbial consortia for understanding the variability of system performance.

IMPORTANCE Dynamics of microbial communities are believed to be associated with system functional processes in bioreactors. However, few studies have provided quantitative evidence. The difficulty of evaluating direct microbe-system relationships arises from the fact that system performance is affected by convolved effects of microbiota and bioreactor operational parameters (i.e., deterministic external physicochemical forcing). Here, using fine-resolution time series data (daily sampling for 2 months) under controlled operational settings, we performed an in-depth analysis of system performance as a function of the microbial community in the context of bioreactor physicochemical conditions. We obtained statistically evaluated results supporting the idea that monitoring microbial community dynamics could improve the ability to predict system functioning, beyond what could be explained by operational physicochemical variables. Moreover, our results suggested that considering the succession of multiple bacterial taxa would account for more system variation than focusing on any particular taxon, highlighting the need to integrate microbial community ecology for understanding system functioning.


Algorithms ◽  
2020 ◽  
Vol 13 (4) ◽  
pp. 95 ◽  
Author(s):  
Johannes Stübinger ◽  
Katharina Adler

This paper develops the generalized causality algorithm and applies it to a multitude of data from the fields of economics and finance. Specifically, our parameter-free algorithm efficiently determines the optimal non-linear mapping and identifies varying lead–lag effects between two given time series. This procedure allows an elastic adjustment of the time axis to find similar but phase-shifted sequences; structural breaks in their relationship are also captured. A large-scale simulation study validates the outperformance in the vast majority of parameter constellations in terms of efficiency, robustness, and feasibility. Finally, the presented methodology is applied to real data from the areas of macroeconomics, finance, and metal. The highest similarity is shown by the pairs of gross domestic product and consumer price index (macroeconomics), S&P 500 index and Deutscher Aktienindex (finance), as well as gold and silver (metal). In addition, the algorithm makes full use of its flexibility and identifies both various structural breaks and regime patterns over time, which are (partly) well documented in the literature.
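The notion of elastically adjusting the time axis to match phase-shifted sequences can be illustrated with classic dynamic time warping (DTW). DTW is used here only as a familiar stand-in; the abstract does not spell out the generalized causality algorithm, so the function name and the toy series are assumptions.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two sequences,
    using absolute difference as the local cost."""
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a
    # with the first j points of b.
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a advances alone
                                    cost[i][j - 1],      # b advances alone
                                    cost[i - 1][j - 1])  # both advance
    return cost[len(a)][len(b)]

leading = [0, 1, 2, 1, 0, 0]   # peak at index 2
lagging = [0, 0, 1, 2, 1, 0]   # same shape, one step later

pointwise = sum(abs(p - q) for p, q in zip(leading, lagging))
warped = dtw_distance(leading, lagging)
```

The pointwise comparison penalizes the one-step shift, whereas the warped alignment matches the two peaks; the offsets along the optimal warping path are what a lead-lag analysis would read off.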

