Segmentation of High Dimensional Time-Series Data Using Mixture of Sparse Principal Component Regression Model with Information Complexity

Entropy ◽

10.3390/e22101170 ◽

2020 ◽

Vol 22 (10) ◽

pp. 1170

Author(s):

Yaojin Sun ◽

Hamparsum Bozdogan

Keyword(s):

Time Series ◽

Large Scale ◽

Time Series Data ◽

Fitness Function ◽

Explanatory Power ◽

Principal Component ◽

Predictor Variables ◽

Series Data ◽

High Dimensional ◽

Information Complexity

This paper presents a novel hybrid modeling method for the segmentation of high-dimensional time-series data using the mixture of sparse principal component regression (MIX-SPCR) model with the information complexity (ICOMP) criterion as the fitness function. Our approach performs dimension reduction in high-dimensional time-series data and, at the same time, determines the number of component clusters (i.e., the number of segments across the time-series data) and selects the best subset of predictors. A large-scale Monte Carlo simulation is performed to show the capability of the MIX-SPCR model to identify the correct structure of the time-series data. The MIX-SPCR model is also applied to high-dimensional Standard & Poor’s 500 (S&P 500) index data to uncover the hidden structure of the time series and identify the structural change points. The approach presented in this paper determines both the relationships among the predictor variables and how the various predictor variables contribute to the explanatory power of the response variable through the cluster-wise sparsity settings.
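
The idea of using an information criterion as the fitness function for segmentation can be illustrated with a much simpler stand-in. The sketch below is not the MIX-SPCR model and does not compute ICOMP: it assumes a univariate series, models each segment as a constant mean plus Gaussian noise, and uses a BIC-style penalty. The function names `gaussian_bic` and `best_single_split`, the penalty terms, and the toy series are all illustrative assumptions.

```python
import math


def gaussian_bic(segment, n_total):
    """BIC-style score of one segment under a constant-mean Gaussian model.

    Assumption: this simple model stands in for the regression model
    fitted within each segment in the paper.
    """
    n = len(segment)
    mean = sum(segment) / n
    var = sum((x - mean) ** 2 for x in segment) / n
    var = max(var, 1e-12)  # guard against log(0) on constant segments
    neg2_loglik = n * (math.log(2 * math.pi * var) + 1)
    # Two free parameters per segment: mean and variance.
    return neg2_loglik + 2 * math.log(n_total)


def best_single_split(series, min_len=3):
    """Return (score, split); split is None when one segment scores best."""
    n = len(series)
    best = (gaussian_bic(series, n), None)
    for t in range(min_len, n - min_len + 1):
        # The change-point location is charged as one extra parameter.
        score = (gaussian_bic(series[:t], n)
                 + gaussian_bic(series[t:], n)
                 + math.log(n))
        if score < best[0]:
            best = (score, t)
    return best


# Toy series with a level shift after the sixth observation.
series = [0.1, -0.2, 0.0, 0.2, -0.1, 0.1, 5.1, 4.9, 5.0, 5.2, 4.8, 5.0]
score, split = best_single_split(series)
print(split)  # 6
```

The same pattern (fit a model per candidate segmentation, score it with a penalized likelihood, keep the minimizer) carries over to richer per-segment models; only the scoring function changes.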

Operational time-series data modeling via LSTM network integrating principal component analysis based on human experience

Journal of Manufacturing Systems ◽

10.1016/j.jmsy.2020.11.020 ◽

2020 ◽

Author(s):

Ke Yang ◽

Yi-liu Liu ◽

Yu-nan Yao ◽

Shi-dong Fan ◽

Ali Mosleh

Keyword(s):

Principal Component Analysis ◽

Time Series ◽

Time Series Data ◽

Data Modeling ◽

Principal Component ◽

Component Analysis ◽

Human Experience ◽

Series Data ◽

Lstm Network ◽

Operational Time

LSTM-Guided Coaching Assistant for Table Tennis Practice

Sensors ◽

10.3390/s18124112 ◽

2018 ◽

Vol 18 (12) ◽

pp. 4112 ◽

Cited By ~ 6

Author(s):

Se-Min Lim ◽

Hyeong-Cheol Oh ◽

Jaein Kim ◽

Juwon Lee ◽

Jooyoung Park

Keyword(s):

Time Series ◽

State Space ◽

Time Series Data ◽

State Space Model ◽

Skill Assessment ◽

Series Data ◽

High Dimensional ◽

Table Tennis ◽

Space Model ◽

Low Dimensional

Recently, wearable devices have become a prominent health care application domain by incorporating a growing number of sensors and adopting smart machine learning technologies. One closely related topic is the strategy of combining the wearable device technology with skill assessment, which can be used in wearable device apps for coaching and/or personal training. Particularly pertinent to skill assessment based on high-dimensional time series data from wearable sensors is classifying whether a player is an expert or a beginner, which skills the player is exercising, and extracting some low-dimensional representations useful for coaching. In this paper, we present a deep learning-based coaching assistant method, which can provide useful information in supporting table tennis practice. Our method uses a combination of LSTM (Long short-term memory) with a deep state space model and probabilistic inference. More precisely, we use the expressive power of LSTM when handling high-dimensional time series data, and state space model and probabilistic inference to extract low-dimensional latent representations useful for coaching. Experimental results show that our method can yield promising results for characterizing high-dimensional time series patterns and for providing useful information when working with wearable IMU (Inertial measurement unit) sensors for table tennis coaching.

Testing Serial Correlation and ARCH Effect of High-Dimensional Time-Series Data

Journal of Business and Economic Statistics ◽

10.1080/07350015.2019.1647844 ◽

2019 ◽

Vol 39 (1) ◽

pp. 136-147 ◽

Cited By ~ 1

Author(s):

Shiqing Ling ◽

Ruey S. Tsay ◽

Yaxing Yang

Keyword(s):

Time Series ◽

Serial Correlation ◽

Time Series Data ◽

Series Data ◽

High Dimensional ◽

Arch Effect

WATCH: Wasserstein Change Point Detection for High-Dimensional Time Series Data

10.1109/bigdata52589.2021.9671962 ◽

2021 ◽

Author(s):

Kamil Faber ◽

Roberto Corizzo ◽

Bartlomiej Sniezynski ◽

Michael Baron ◽

Nathalie Japkowicz

Keyword(s):

Time Series ◽

Change Point ◽

Time Series Data ◽

Change Point Detection ◽

Series Data ◽

High Dimensional ◽

Point Detection

Evidence Graphs: Supporting Transparent and FAIR Computation, with Defeasible Reasoning on Data, Methods and Results

10.1101/2021.03.29.437561 ◽

2021 ◽

Author(s):

Sadnan Al Manir ◽

Justin Niestroy ◽

Maxwell Adam Levinson ◽

Timothy Clark

Keyword(s):

Time Series ◽

Large Scale ◽

Time Series Data ◽

Predictive Analytics ◽

Defeasible Reasoning ◽

Series Data ◽

Inference Rules ◽

Deep Networks ◽

Evidence Graph ◽

Over Time

Introduction: Transparency of computation is a requirement for assessing the validity of computed results and research claims based upon them; and it is essential for access to, assessment, and reuse of computational components. These components may be subject to methodological or other challenges over time. While reference to archived software and/or data is increasingly common in publications, a single machine-interpretable, integrative representation of how results were derived, that supports defeasible reasoning, has been absent. Methods: We developed the Evidence Graph Ontology, EVI, in OWL 2, with a set of inference rules, to provide deep representations of supporting and challenging evidence for computations, services, software, data, and results, across arbitrarily deep networks of computations, in connected or fully distinct processes. EVI integrates FAIR practices on data and software, with important concepts from provenance models, and argumentation theory. It extends PROV for additional expressiveness, with support for defeasible reasoning. EVI treats any computational result or component of evidence as a defeasible assertion, supported by a DAG of the computations, software, data, and agents that produced it. Results: We have successfully deployed EVI for very-large-scale predictive analytics on clinical time-series data. Every result may reference its own evidence graph as metadata, which can be extended when subsequent computations are executed. Discussion: Evidence graphs support transparency and defeasible reasoning on results. They are first-class computational objects, and reference the datasets and software from which they are derived. They support fully transparent computation, with challenge and support propagation. The EVI approach may be extended to include instruments, animal models, and critical experimental reagents.
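
Challenge propagation over a DAG of supporting evidence can be sketched in a few lines. The example below is not EVI, OWL 2, or PROV: it assumes a plain dictionary in which each node lists the nodes that support it, and it treats a node as challenged whenever anything in its support graph is challenged. The function name `challenged_results` and the node names are hypothetical.

```python
def challenged_results(supported_by, challenged):
    """Return every node whose support graph contains a challenged node.

    supported_by maps a node to the nodes that directly support it.
    Assumption: the graph is acyclic (a DAG), as the abstract describes.
    """
    memo = {}

    def tainted(node):
        if node not in memo:
            memo[node] = node in challenged or any(
                tainted(dep) for dep in supported_by.get(node, [])
            )
        return memo[node]

    nodes = set(supported_by) | {
        dep for deps in supported_by.values() for dep in deps
    }
    return {node for node in nodes if tainted(node)}


# Hypothetical evidence graph: a result rests on a computation, which
# rests on software and a dataset; a second result is independent.
supported_by = {
    "result": ["computation"],
    "computation": ["software", "dataset"],
    "other_result": ["other_dataset"],
}
affected = challenged_results(supported_by, challenged={"dataset"})
print(sorted(affected))  # ['computation', 'dataset', 'result']
```

Memoizing each node keeps the traversal linear in the size of the graph, which matters when the network of computations is deep.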

Differentially Private Autocorrelation Time-Series Data Publishing Based on Sliding Window

Security and Communication Networks ◽

10.1155/2021/6665984 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Jing Zhao ◽

Shubo Liu ◽

Xingxing Xiong ◽

Zhaohui Cai

Keyword(s):

Time Series ◽

Privacy Protection ◽

Large Scale ◽

Differential Privacy ◽

Time Series Data ◽

Sliding Window ◽

Data Publishing ◽

Series Data ◽

Data Publication ◽

Autocorrelation Time

Privacy protection is one of the major obstacles to data sharing. Time-series data are characterized by autocorrelation, continuity, and large scale. Current research on time-series data publication largely ignores the correlation within time-series data and therefore provides insufficient privacy protection. In this paper, we study the problem of correlated time-series data publication and propose a sliding window-based autocorrelation time-series data publication algorithm, called SW-ATS. Instead of using the global sensitivity of traditional differential privacy mechanisms, we propose periodic sensitivity to provide a stronger privacy guarantee. SW-ATS introduces a sliding window mechanism, with the correlation between the noise-adding sequence and the original time-series data guaranteed by sequence indistinguishability, to protect the privacy of the latest data. We prove that SW-ATS satisfies ε-differential privacy. Compared with the state-of-the-art algorithm, SW-ATS reduces the error rate (MAE, by about 25%), improves the utility of the data, and provides stronger privacy protection.
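
For orientation, the textbook Laplace mechanism applied to the most recent window of a series looks roughly as follows. This is not SW-ATS: it uses a caller-supplied sensitivity rather than periodic sensitivity, adds independent rather than correlated noise, and simply splits the privacy budget evenly across the window (sequential composition). The names `laplace_noise` and `publish_window` and all parameter values are illustrative assumptions.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws is Laplace(0, 1);
    # multiplying by `scale` gives Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def publish_window(series, window, epsilon, sensitivity, seed=0):
    """Release the latest `window` values with Laplace noise.

    Assumption: the budget epsilon is split evenly over the window, so
    each value is perturbed with scale sensitivity / (epsilon / window).
    """
    rng = random.Random(seed)
    latest = series[-window:]
    per_value_epsilon = epsilon / len(latest)
    scale = sensitivity / per_value_epsilon
    return [x + laplace_noise(scale, rng) for x in latest]


readings = [1.0, 2.0, 3.0, 4.0, 5.0]
noisy = publish_window(readings, window=3, epsilon=1.0, sensitivity=1.0)
print(len(noisy))  # 3
```

Splitting the budget evenly makes the noise scale grow linearly with the window length, which is the utility cost that correlation-aware schemes aim to reduce.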

Prediction of Graduation with Naïve Bayes Algorithm and Principal Component Analysis (PCA) on Time Series Data

10.1109/icoict52021.2021.9527443 ◽

2021 ◽

Author(s):

Wishnu Dwi Herlambang ◽

Kusuma Ayu Laksitowening ◽

Ibnu Asror

Keyword(s):

Principal Component Analysis ◽

Time Series ◽

Time Series Data ◽

Naive Bayes ◽

Principal Component ◽

Component Analysis ◽

Naïve Bayes ◽

Series Data ◽

Bayes Algorithm

How to Identify Varying Lead–Lag Effects in Time Series Data: Implementation, Validation, and Application of the Generalized Causality Algorithm

Algorithms ◽

10.3390/a13040095 ◽

2020 ◽

Vol 13 (4) ◽

pp. 95 ◽

Cited By ~ 1

Author(s):

Johannes Stübinger ◽

Katharina Adler

Keyword(s):

Time Series ◽

Large Scale ◽

Structural Breaks ◽

Time Series Data ◽

Consumer Price Index ◽

Real Data ◽

Linear Mapping ◽

Series Data ◽

Lag Effects ◽

Silver Metal

This paper develops the generalized causality algorithm and applies it to a multitude of data from the fields of economics and finance. Specifically, our parameter-free algorithm efficiently determines the optimal non-linear mapping and identifies varying lead–lag effects between two given time series. This procedure allows an elastic adjustment of the time axis to find similar but phase-shifted sequences; structural breaks in their relationship are also captured. A large-scale simulation study shows that the algorithm outperforms in the vast majority of parameter constellations in terms of efficiency, robustness, and feasibility. Finally, the presented methodology is applied to real data from the areas of macroeconomics, finance, and metal. The highest similarity is shown by the pairs of gross domestic product and consumer price index (macroeconomics), S&P 500 index and Deutscher Aktienindex (finance), as well as gold and silver (metal). In addition, the algorithm makes full use of its flexibility and identifies both various structural breaks and regime patterns over time, which are (partly) well documented in the literature.
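
An elastic adjustment of the time axis can be illustrated with classic dynamic time warping (DTW), used here as a stand-in and not as the generalized causality algorithm itself. The sketch assumes absolute difference as the local cost and reads the lead–lag at a given time point from the warping path. The names `dtw_path` and `lags_at` and the toy series are hypothetical.

```python
def dtw_path(a, b):
    """Return (distance, path) for classic DTW with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover the optimal alignment.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path


def lags_at(path, i):
    """Lags (index in b minus index in a) for every pair aligned with a[i]."""
    return [j - k for k, j in path if k == i]


# The same pulse appears two steps later in b than in a.
a = [0, 0, 0, 1, 2, 1, 0, 0, 0, 0]
b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
distance, path = dtw_path(a, b)
print(distance)          # 0.0
print(lags_at(path, 4))  # [2]
```

Because the lag is read off the path point by point, it is allowed to vary over time, which is what distinguishes this family of methods from a single cross-correlation shift.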

Applying multiple time series data mining to large-scale network traffic analysis

2008 IEEE Conference on Cybernetics and Intelligent Systems ◽

10.1109/iccis.2008.4670844 ◽

2008 ◽

Cited By ~ 1

Author(s):

Weisong He ◽

Guangmin Hu ◽

Xingmiao Yao ◽

Guangyuan Kan ◽

Hong Wang ◽

...

Keyword(s):

Data Mining ◽

Time Series ◽

Large Scale ◽

Time Series Data ◽

Series Data ◽

Multiple Time ◽

Multiple Time Series ◽

Network Traffic Analysis ◽

Large Scale Network ◽

Scale Network

Clustering of large scale QoS time series data in federated clouds using improved variable Chromosome Length Genetic Algorithm (CQGA)

Expert Systems with Applications ◽

10.1016/j.eswa.2020.113840 ◽

2021 ◽

Vol 164 ◽

pp. 113840

Author(s):

Amin Keshavarzi ◽

Abolfazl Toroghi Haghighat ◽

Mahdi Bohlouli

Keyword(s):

Genetic Algorithm ◽

Time Series ◽

Large Scale ◽

Time Series Data ◽

Chromosome Length ◽

Series Data