Case Studies in Multi-unit LongitudinalModels with Random Coefficients and Patterned Correlation Structure

Data driven decision making is becoming increasingly an important aspect for successful business execution. More and more organizations are moving towards taking informed decisions based on the data that they are generating. Most of this data are in temporal format - time series data. Effective analysis across time series data sets, in an efficient and quick manner is a challenge. The most interesting and valuable part of such analysis is to generate insights on correlation and causation across multiple time series data sets. This paper looks at methods that can be used to analyze such data sets and gain useful insights from it, primarily in the form of correlation and causation analysis. This paper focuses on two methods for doing so, Two Sample Test with Dynamic Time Warping and Hierarchical Clustering and looks at how the results returned from both can be used to gain a better understanding of the data. Moreover, the methods used are meant to work with any data set, regardless of the subject domain and idiosyncrasies of the data set, primarily, a data agnostic approach.

Download Full-text

Subsequence Time Series Clustering

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch286 ◽

2011 ◽

pp. 1871-1876

Author(s):

Jason Chen

Keyword(s):

Time Series ◽

Traditional Method ◽

Time Series Data ◽

Large Data ◽

Series Data ◽

Data Sets ◽

Data Set ◽

Time Series Clustering ◽

Mining Community ◽

Representative Points

Clustering analysis is a tool used widely in the Data Mining community and beyond (Everitt et al. 2001). In essence, the method allows us to “summarise” the information in a large data set X by creating a very much smaller set C of representative points (called centroids) and a membership map relating each point in X to its representative in C. An obvious but special type of data set that one might want to cluster is a time series data set. Such data has a temporal ordering on its elements, in contrast to non-time series data sets. In this article we explore the area of time series clustering, focusing mainly on a surprising recent result showing that the traditional method for time series clustering is meaningless. We then survey the literature of recent papers and go on to argue how time series clustering can be made meaningful.

Download Full-text

A Review on Anomaly Detection in Time Series

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/571032021 ◽

2021 ◽

Vol 10 (3) ◽

pp. 1895-1900

Keyword(s):

Time Series ◽

Anomaly Detection ◽

Time Series Data ◽

Monitoring Network ◽

Series Data ◽

Data Sets ◽

Data Set ◽

The Past ◽

Machine Health Monitoring ◽

Machine Health

Time series is a very common class of data sets. Among others, it is very simple to obtain time series data from a variety of various science and finance applications and an anomaly detection technique for time series is becoming a very prominent research topic nowadays. Anomaly identification covers intrusion detection, detection of theft, mistake detection, machine health monitoring, network sensor event detection or habitat disturbance. It is also used for removing suspicious data from the data set before production. This review aims to provide a detailed and organized overview of the Anomaly detection investigation. In this article we will first define what an anomaly in time series is, and then describe quickly some of the methods suggested in the past two or three years for detection of anomaly in time series

Download Full-text

The CH-IRP data set: a decade of fortnightly data on δ2H and δ18O in streamflow and precipitation in Switzerland

Earth System Science Data ◽

10.5194/essd-12-3057-2020 ◽

2020 ◽

Vol 12 (4) ◽

pp. 3057-3066

Author(s):

Maria Staudinger ◽

Stefan Seeger ◽

Barbara Herbstritt ◽

Michael Stoelzle ◽

Jan Seibert ◽

...

Keyword(s):

Time Series ◽

Time Series Data ◽

Isotope Composition ◽

Series Data ◽

Data Repository ◽

Data Sets ◽

Individual Study ◽

Data Set ◽

Flow Pathways

Abstract. The stable isotopes of oxygen and hydrogen, 18O and 2H, provide information on water flow pathways and hydrologic catchment functioning. Here a data set of time series data on precipitation and streamflow isotope composition in medium-sized Swiss catchments, CH-IRP, is presented that is unique in terms of its long-term multi-catchment coverage along an alpine to pre-alpine gradient. The data set comprises fortnightly time series of both δ2H and δ18O as well as deuterium excess from streamflow for 23 sites in Switzerland, together with summary statistics of the sampling at each station. Furthermore, time series of δ18O and δ2H in precipitation are provided for each catchment derived from interpolated data sets from the ISOT, GNIP and ANIP networks. For each station we compiled relevant metadata describing both the sampling conditions and catchment characteristics and climate information. Lab standards and errors are provided, and potentially problematic measurements are indicated to help the user decide on the applicability for individual study purposes. For the future, the measurements are planned to be continued at 14 stations as a long-term isotopic measurement network, and the CH-IRP data set will, thus, continuously be extended. The data set can be downloaded from data repository Zenodo at https://doi.org/10.5281/zenodo.4057967 (Staudinger et al., 2020).

Download Full-text

Time series event correlation with DTW and Hierarchical Clustering methods

10.7287/peerj.preprints.27959v1 ◽

2019 ◽

Cited By ~ 1

Author(s):

Srishti Mishra ◽

Zohair Shafi ◽

Santanu Pathak

Keyword(s):

Time Series ◽

Hierarchical Clustering ◽

Time Series Data ◽

Series Data ◽

Data Sets ◽

Multiple Time ◽

Clustering Methods ◽

Event Correlation ◽

Data Set ◽

Causation Analysis

Data driven decision making is becoming increasingly an important aspect for successful business execution. More and more organizations are moving towards taking informed decisions based on the data that they are generating. Most of this data are in temporal format - time series data. Effective analysis across time series data sets, in an efficient and quick manner is a challenge. The most interesting and valuable part of such analysis is to generate insights on correlation and causation across multiple time series data sets. This paper looks at methods that can be used to analyze such data sets and gain useful insights from it, primarily in the form of correlation and causation analysis. This paper focuses on two methods for doing so, Two Sample Test with Dynamic Time Warping and Hierarchical Clustering and looks at how the results returned from both can be used to gain a better understanding of the data. Moreover, the methods used are meant to work with any data set, regardless of the subject domain and idiosyncrasies of the data set, primarily, a data agnostic approach.

Download Full-text

Remaining Useful Life Prediction Using Temporal Convolution with Attention

AI ◽

10.3390/ai2010005 ◽

2021 ◽

Vol 2 (1) ◽

pp. 48-70

Author(s):

Wei Ming Tan ◽

T. Hui Teo

Keyword(s):

Neural Network ◽

Time Series ◽

Time Series Data ◽

Remaining Useful Life ◽

Sensor Data ◽

Series Data ◽

Multiple Time ◽

Data Set ◽

Form Complex ◽

Useful Life

Prognostic techniques attempt to predict the Remaining Useful Life (RUL) of a subsystem or a component. Such techniques often use sensor data which are periodically measured and recorded into a time series data set. Such multivariate data sets form complex and non-linear inter-dependencies through recorded time steps and between sensors. Many current existing algorithms for prognostic purposes starts to explore Deep Neural Network (DNN) and its effectiveness in the field. Although Deep Learning (DL) techniques outperform the traditional prognostic algorithms, the networks are generally complex to deploy or train. This paper proposes a Multi-variable Time Series (MTS) focused approach to prognostics that implements a lightweight Convolutional Neural Network (CNN) with attention mechanism. The convolution filters work to extract the abstract temporal patterns from the multiple time series, while the attention mechanisms review the information across the time axis and select the relevant information. The results suggest that the proposed method not only produces a superior accuracy of RUL estimation but it also trains many folds faster than the reported works. The superiority of deploying the network is also demonstrated on a lightweight hardware platform by not just being much compact, but also more efficient for the resource restricted environment.

Download Full-text

Studying monthly rainfall over Dibrugarh, Assam: Use of SARIMA approach

MAUSAM ◽

10.54302/mausam.v68i2.637 ◽

2021 ◽

Vol 68 (2) ◽

pp. 349-356

Author(s):

J. HAZARIKA ◽

B. PATHAK ◽

A. N. PATOWARY

Keyword(s):

Time Series ◽

Time Series Data ◽

Moving Average ◽

Demand Management ◽

Arima Model ◽

Monthly Rainfall ◽

Series Data ◽

Data Set ◽

Modeling And Forecasting ◽

Moving Average Model

Perceptive the rainfall pattern is tough for the solution of several regional environmental issues of water resources management, with implications for agriculture, climate change, and natural calamity such as floods and droughts. Statistical computing, modeling and forecasting data are key instruments for studying these patterns. The study of time series analysis and forecasting has become a major tool in different applications in hydrology and environmental fields. Among the most effective approaches for analyzing time series data is the ARIMA (Autoregressive Integrated Moving Average) model introduced by Box and Jenkins. In this study, an attempt has been made to use Box-Jenkins methodology to build ARIMA model for monthly rainfall data taken from Dibrugarh for the period of 1980- 2014 with a total of 420 points. We investigated and found that ARIMA (0, 0, 0) (0, 1, 1)12 model is suitable for the given data set. As such this model can be used to forecast the pattern of monthly rainfall for the upcoming years, which can help the decision makers to establish priorities in terms of agricultural, flood, water demand management etc.

Download Full-text

Cell cycle time series gene expression data encoded as cyclic attractors in Hopfield systems

10.1101/170027 ◽

2017 ◽

Author(s):

Anthony Szedlak ◽

Spencer Sims ◽

Nicholas Smith ◽

Giovanni Paternostro ◽

Carlo Piermarocchi

Keyword(s):

Neural Network ◽

Gene Expression ◽

Cell Cycle ◽

Time Series ◽

Time Series Data ◽

Series Data ◽

Data Sets ◽

Expression Data ◽

Time Series Gene Expression ◽

Human Cervical Cancer

AbstractModern time series gene expression and other omics data sets have enabled unprecedented resolution of the dynamics of cellular processes such as cell cycle and response to pharmaceutical compounds. In anticipation of the proliferation of time series data sets in the near future, we use the Hopfield model, a recurrent neural network based on spin glasses, to model the dynamics of cell cycle in HeLa (human cervical cancer) and S. cerevisiae cells. We study some of the rich dynamical properties of these cyclic Hopfield systems, including the ability of populations of simulated cells to recreate experimental expression data and the effects of noise on the dynamics. Next, we use a genetic algorithm to identify sets of genes which, when selectively inhibited by local external fields representing gene silencing compounds such as kinase inhibitors, disrupt the encoded cell cycle. We find, for example, that inhibiting the set of four kinases BRD4, MAPK1, NEK7, and YES1 in HeLa cells causes simulated cells to accumulate in the M phase. Finally, we suggest possible improvements and extensions to our model.Author SummaryCell cycle – the process in which a parent cell replicates its DNA and divides into two daughter cells – is an upregulated process in many forms of cancer. Identifying gene inhibition targets to regulate cell cycle is important to the development of effective therapies. Although modern high throughput techniques offer unprecedented resolution of the molecular details of biological processes like cell cycle, analyzing the vast quantities of the resulting experimental data and extracting actionable information remains a formidable task. Here, we create a dynamical model of the process of cell cycle using the Hopfield model (a type of recurrent neural network) and gene expression data from human cervical cancer cells and yeast cells. We find that the model recreates the oscillations observed in experimental data. Tuning the level of noise (representing the inherent randomness in gene expression and regulation) to the “edge of chaos” is crucial for the proper behavior of the system. We then use this model to identify potential gene targets for disrupting the process of cell cycle. This method could be applied to other time series data sets and used to predict the effects of untested targeted perturbations.

Download Full-text