A distributed real-time data prediction framework for large-scale time-series data using stream processing

2017 ◽  
Vol 10 (2) ◽  
pp. 145-165 ◽  
Author(s):  
Kehe Wu ◽  
Yayun Zhu ◽  
Quan Li ◽  
Ziwei Wu

Purpose – This paper proposes a data prediction framework for scenarios with forecasting demands over large-scale data sources, e.g., sensor networks, securities exchanges, and electric power secondary systems. Concretely, the framework must meet several difficult requirements: management of very large numbers of data sources, a fast self-adaptive algorithm, reasonably accurate prediction of multiple time series, and real-time operation.

Design/methodology/approach – First, the autoregressive integrated moving average (ARIMA)-based prediction algorithm is introduced. Second, the processing framework is designed, comprising a time-series data storage model based on HBase and a real-time distributed prediction platform based on Storm. The working principle of this platform is then described. Finally, a proof-of-concept testbed is presented to verify the proposed framework.

Findings – Several tests based on power grid monitoring data are reported. The experimental results indicate that predicted data are broadly consistent with actual data, processing efficiency is relatively high, and resource consumption is reasonable.

Originality/value – This paper provides a distributed real-time data prediction framework for large-scale time-series data that meets the requirements of effective management, prediction efficiency, accuracy, and high concurrency for massive data sources.
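The abstract names ARIMA as the per-series prediction core. Below is a minimal sketch of that core in Python using statsmodels; the sliding-window data and the (p, d, q) order are illustrative assumptions, and the Storm/HBase distribution layer is deliberately out of scope.

```python
# Minimal sketch of the ARIMA prediction core described in the abstract.
# The order (p, d, q) and the sample window are illustrative assumptions,
# not values taken from the paper.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def forecast_next(values, steps=1, order=(2, 1, 1)):
    """Fit an ARIMA model on a window of recent samples and
    forecast the next `steps` points."""
    model = ARIMA(np.asarray(values, dtype=float), order=order)
    fitted = model.fit()
    return fitted.forecast(steps=steps)

# Example: one sensor's recent readings -> one-step-ahead prediction.
history = [10.1, 10.4, 10.2, 10.8, 11.0, 10.9, 11.3, 11.5]
print(forecast_next(history, steps=1))
```

In the paper's setting, each Storm task would run this fit/forecast loop for its assigned series, reading windows from HBase; the sketch only shows the per-series step.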

2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Longhai Yang ◽  
Hong Xu ◽  
Xiqiao Zhang ◽  
Shuai Li ◽  
Wenchao Ji

The application and development of new technologies make it possible to acquire real-time vehicle data, from which vehicle behavior can be analyzed. Predicting vehicle behavior provides data support for fine-grained traffic management. Through R/S analysis of speed and acceleration time-series data, this paper shows that both parameters exhibit fractal features. Based on this characteristic analysis of the microscopic parameters, the characteristic indexes of the parameters are quantified, a fractal multistep prediction model of the microscopic parameters is established, and a BP (back-propagation) neural network model is built to estimate the predictable step length of the fractal model. The fractal multistep prediction model is then used to predict speed and acceleration within that predictable step. NGSIM trajectory data are used to test the multistep prediction model. The results show that the proposed fractal multistep prediction model can effectively realize multistep prediction of vehicle speed.
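R/S (rescaled range) analysis, which the authors use to establish the fractal character of speed and acceleration, reduces to estimating the Hurst exponent from the slope of log(R/S) against log(n). A minimal sketch, with illustrative window sizes and synthetic data:

```python
# A minimal sketch of R/S (rescaled range) analysis. Window sizes and the
# synthetic speed trace are illustrative assumptions.
import numpy as np

def hurst_rs(series, window_sizes=(8, 16, 32, 64)):
    """Estimate the Hurst exponent H as the slope of log(R/S) vs log(n).
    H near 0.5 suggests no memory; H > 0.5 suggests persistence."""
    x = np.asarray(series, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        rs_vals = []
        for start in range(0, len(x) - n + 1, n):
            chunk = x[start:start + n]
            dev = np.cumsum(chunk - chunk.mean())   # cumulative deviation
            r = dev.max() - dev.min()               # range of deviations
            s = chunk.std()
            if s > 0:
                rs_vals.append(r / s)               # rescaled range
        if rs_vals:
            log_n.append(np.log(n))
            log_rs.append(np.log(np.mean(rs_vals)))
    return np.polyfit(log_n, log_rs, 1)[0]          # slope = H

rng = np.random.default_rng(0)
speeds = 20.0 + np.cumsum(rng.normal(0, 0.5, 1024))  # synthetic speed trace
print(f"estimated H = {hurst_rs(speeds):.2f}")
```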


Author(s):  
Seng Hansun

Abstract – Fuzzy time series is a soft computing method that has been used and applied in time-series data analysis. Its main goal is to predict time-series data, and it can be applied widely to any real-time data, including stock market data. Many researchers have contributed to the development of fuzzy time series analysis, such as Chen and Hsu [1], Jilani et al. [2], and Stevenson and Porter [3]. In this research, the fuzzy time series method is applied to one indicator of stock price movement: the Jakarta Composite Index, known in Indonesia as IHSG (Indeks Harga Saham Gabungan). The performance of the proposed method is evaluated by calculating its accuracy and robustness on IHSG data. Through this approach, fuzzy time series may serve as an alternative for predicting the IHSG, an indicator of stock price movement in Indonesia.

Keywords – fuzzy time series, time series data, soft computing, IHSG
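The abstract builds on the Chen and Hsu line of fuzzy time series work; a compact sketch of the classic Chen-style procedure (partition the universe of discourse, fuzzify, build fuzzy logical relationship groups, defuzzify) is shown below. The interval count and index values are illustrative assumptions, not the exact variant the paper evaluates.

```python
# A compact sketch of Chen-style fuzzy time series forecasting. The interval
# count and the synthetic index levels are illustrative assumptions.
import numpy as np

def chen_forecast(series, n_intervals=7):
    x = np.asarray(series, dtype=float)
    lo, hi = x.min() - 1e-9, x.max() + 1e-9
    edges = np.linspace(lo, hi, n_intervals + 1)         # universe partition
    mids = (edges[:-1] + edges[1:]) / 2                  # interval midpoints
    labels = np.clip(np.searchsorted(edges, x, side="right") - 1,
                     0, n_intervals - 1)                 # fuzzify each point

    # Fuzzy logical relationship groups: A_i -> {A_j, ...}
    flrg = {}
    for a, b in zip(labels[:-1], labels[1:]):
        flrg.setdefault(a, set()).add(b)

    # Defuzzify: average the midpoints of the last state's successors.
    last = labels[-1]
    successors = flrg.get(last, {last})
    return mids[sorted(successors)].mean()

closes = [6570, 6602, 6588, 6631, 6650, 6644, 6689, 6702]  # synthetic levels
print(f"next-step forecast: {chen_forecast(closes):.1f}")
```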


2016 ◽  
Vol 50 (1) ◽  
pp. 41-57 ◽  
Author(s):  
Linghe Huang ◽  
Qinghua Zhu ◽  
Jia Tina Du ◽  
Baozhen Lee

Purpose – A wiki is a new form of information production and organization that has become one of the most important knowledge resources. In recent years, as the number of wiki users has grown, the "free rider" problem has become serious. To motivate editors to contribute more to a wiki system, it is important to fully understand their contribution behavior. The purpose of this paper is to explore the dynamics of editors' contribution behavior in wikis.

Design/methodology/approach – After developing a dynamic model of contribution behavior, the authors employed both metrological and clustering methods to process the time-series data. The experimental data were collected from Baidu Baike, a renowned Chinese wiki system similar to Wikipedia.

Findings – There are four categories of editors: "testers," "dropouts," "delayers" and "stickers." Testers contribute the least content and stop contributing soon after editing a few articles. Dropouts stop contributing completely, but only after editing a large amount of content. Delayers did not stop contributing during the observation window, but may stop in the near future. Stickers, who keep contributing and edit the most content, are the core editors. In addition, there are significant time-of-day and holiday effects on the number of editors' contributions.

Originality/value – Using time-series analysis, new characteristics of editors and editor types were identified. This research also used a larger sample than earlier studies, so the results are more representative; they can help managers optimize wiki systems and formulate incentive strategies for editors.
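As a rough illustration of the clustering step, the sketch below groups synthetic editors by the shape of their cumulative-contribution curves with k-means, using k = 4 to match the four reported editor types. The features and data are assumptions, not the paper's actual setup.

```python
# Hedged sketch: cluster editors by the shape of their cumulative
# contribution curves. Data and feature choice are assumptions.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Rows: editors; columns: edits per week over a 20-week observation window.
weekly_edits = rng.poisson(lam=rng.uniform(0.2, 5.0, size=(200, 1)),
                           size=(200, 20))

# Normalize each editor's cumulative curve so clusters reflect *when*
# editors contribute rather than raw volume.
cumulative = np.cumsum(weekly_edits, axis=1).astype(float)
totals = cumulative[:, -1:].clip(min=1.0)
shape = cumulative / totals

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(shape)
for c in range(4):
    print(f"cluster {c}: {np.sum(km.labels_ == c)} editors")
```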


2021 ◽  
Author(s):  
Sadnan Al Manir ◽  
Justin Niestroy ◽  
Maxwell Adam Levinson ◽  
Timothy Clark

Introduction: Transparency of computation is a requirement for assessing the validity of computed results and research claims based upon them, and it is essential for access to, assessment of, and reuse of computational components. These components may be subject to methodological or other challenges over time. While reference to archived software and/or data is increasingly common in publications, a single machine-interpretable, integrative representation of how results were derived, one that supports defeasible reasoning, has been absent.

Methods: We developed the Evidence Graph Ontology, EVI, in OWL 2, with a set of inference rules, to provide deep representations of supporting and challenging evidence for computations, services, software, data, and results, across arbitrarily deep networks of computations, in connected or fully distinct processes. EVI integrates FAIR practices on data and software with important concepts from provenance models and argumentation theory. It extends PROV for additional expressiveness, with support for defeasible reasoning. EVI treats any computational result or component of evidence as a defeasible assertion, supported by a DAG of the computations, software, data, and agents that produced it.

Results: We have successfully deployed EVI for very-large-scale predictive analytics on clinical time-series data. Every result may reference its own evidence graph as metadata, which can be extended when subsequent computations are executed.

Discussion: Evidence graphs support transparency and defeasible reasoning on results. They are first-class computational objects that reference the datasets and software from which they are derived. They support fully transparent computation, with challenge and support propagation. The EVI approach may be extended to include instruments, animal models, and critical experimental reagents.
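To make the DAG idea concrete, here is a toy sketch, in no way EVI's OWL serialization, of how a challenge to any node in an evidence graph propagates to the results that rest on it. All names are hypothetical.

```python
# Toy illustration of challenge propagation over an evidence DAG.
# This is NOT EVI's OWL 2 representation; node names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    supports: list = field(default_factory=list)  # upstream evidence nodes
    challenged: bool = False

    def is_defeated(self) -> bool:
        """A node is defeated if it, or any evidence it rests on, is challenged."""
        return self.challenged or any(s.is_defeated() for s in self.supports)

dataset = Node("clinical-timeseries-v1")
software = Node("predictor-v2.3")
run = Node("analysis-run-42", supports=[dataset, software])
result = Node("risk-scores", supports=[run])

software.challenged = True   # e.g., a bug report against this software version
print(result.is_defeated())  # True: the result's supporting evidence is undermined
```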


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Jing Zhao ◽  
Shubo Liu ◽  
Xingxing Xiong ◽  
Zhaohui Cai

Privacy protection is one of the major obstacles to data sharing. Time-series data are autocorrelated, continuous, and large in scale. Most existing work on time-series data publication ignores the correlation within time-series data, leaving privacy under-protected. In this paper, we study the problem of publishing correlated time-series data and propose a sliding window-based autocorrelated time-series data publication algorithm, called SW-ATS. Instead of using global sensitivity as in traditional differential privacy mechanisms, we propose periodic sensitivity to provide a stronger degree of privacy guarantee. SW-ATS introduces a sliding window mechanism, with the correlation between the noise-added sequence and the original time-series data guaranteed by sequence indistinguishability, to protect the privacy of the latest data. We prove that SW-ATS satisfies ε-differential privacy. Compared with the state-of-the-art algorithm, SW-ATS reduces the mean absolute error (MAE) by about 25%, improves data utility, and provides stronger privacy protection.
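The general pattern SW-ATS refines, perturbing each release with Laplace noise scaled to a sensitivity computed over a sliding window, can be sketched as follows. The window-range stand-in for the paper's periodic sensitivity is an assumption for illustration only.

```python
# Sketch of sliding-window Laplace perturbation for a time series.
# The window-range sensitivity below is an illustrative stand-in for the
# paper's "periodic sensitivity", not its actual definition.
import numpy as np

def sliding_window_laplace(series, epsilon=1.0, window=10):
    x = np.asarray(series, dtype=float)
    rng = np.random.default_rng(0)
    out = np.empty_like(x)
    for i in range(len(x)):
        w = x[max(0, i - window + 1):i + 1]          # most recent window
        sensitivity = max(w.max() - w.min(), 1e-6)   # window-local sensitivity
        out[i] = x[i] + rng.laplace(scale=sensitivity / epsilon)
    return out

readings = 50 + 5 * np.sin(np.linspace(0, 6, 100))   # synthetic periodic series
private = sliding_window_laplace(readings, epsilon=1.0)
print(np.mean(np.abs(private - readings)))           # empirical MAE of the noise
```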

