Exploratory Time Series Data Mining by Genetic Clustering

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch055 ◽

2008 ◽

pp. 942-962

Author(s):

T. Warren Liao

Keyword(s):

Data Mining ◽

Time Series ◽

Time Series Data ◽

Distance Measures ◽

Series Data ◽

Synthetic Control ◽

Data Set ◽

Univariate Time Series ◽

Genetic Clustering ◽

Data Objects

In this chapter, we present genetic algorithm (GA) based methods developed for clustering univariate time series with equal or unequal length as an exploratory step of data mining. These methods basically implement the k-medoids algorithm. Each chromosome encodes in binary the data objects serving as the k-medoids. To compare their performance, both fixed-parameter and adaptive GAs were used. We first employed the synthetic control chart data set to investigate the performance of three fitness functions, two distance measures, and other GA parameters such as population size, crossover rate, and mutation rate. Two more sets of time series with or without known number of clusters were also experimented: one is the cylinder-bell-funnel data and the other is the novel battle simulation data. The clustering results are presented and discussed.

Download Full-text

Time Series Data Mining

International Journal of Business Analytics ◽

10.4018/ijban.2014100104 ◽

2014 ◽

Vol 1 (4) ◽

pp. 51-68 ◽

Cited By ~ 6

Author(s):

Daniel Hebert ◽

Billie Anderson ◽

Alan Olinsky ◽

J. Michael Hardin

Keyword(s):

Data Mining ◽

Time Series ◽

Time Series Data ◽

Grocery Store ◽

Series Data ◽

Time Interval ◽

Real World Data ◽

Similar Time ◽

Data Set ◽

Time Series Data Mining

Modern technologies have allowed for the amassment of data at a rate never encountered before. Organizations are now able to routinely collect and process massive volumes of data. A plethora of regularly collected information can be ordered using an appropriate time interval. The data would thus be developed into a time series. Time series data mining methodology identifies commonalities between sets of time-ordered data. Time series data mining detects similar time series using a technique known as dynamic time warping (DTW). This research provides a practical application of time series data mining. A real-world data set was provided to the authors by dunnhumby. A time series data mining analysis is performed using retail grocery store chain data and results are provided.

Download Full-text

Implementation of Parallel Algorithm Technology for Time Series Data Mining

Journal of Physics Conference Series ◽

10.1088/1742-6596/2066/1/012043 ◽

2021 ◽

Vol 2066 (1) ◽

pp. 012043

Author(s):

Wei Wang ◽

Xiaohui Hu ◽

Mingye Wang ◽

Yao Du

Keyword(s):

Data Mining ◽

Time Series ◽

Data Analysis ◽

Parallel Algorithm ◽

Time Series Data ◽

Internet Technology ◽

Series Data ◽

Data Set ◽

Time Series Data Mining ◽

Analysis Tools

Abstract With the rapid development of computer technology, Internet technology and artificial intelligence technology, the amount of global data has exploded. However, the single-machine serial mode of traditional data mining cannot be directly transplanted to the cloud platform. Only by parallelizing and improving many classic data mining algorithms can the cloud computing platform and data mining be effectively combined. Therefore, it is of great significance to the research and implementation of parallel algorithm technology for time series data mining. The purpose of this paper is to study the research and implementation of parallel algorithm technology for time series data mining. This paper adopts the method of literature data, mathematical statistics, logic analysis and other research methods to study the parallel algorithm technology research and realization of time series data mining, mainly to make useful explorations of time series data mining and visualization technology. It embodies the design ideas of big data analysis tools, and finally reflects the power and market value of data analysis tools through the display of the platform. Research shows that running in the same data set and the same experimental environment, the improved parallel collaborative filtering algorithm ACF in this paper has higher time running efficiency than the parallel algorithm MCF based on the cooccurrence matrix, and in the case of larger data sets, the more obvious the time difference.

Download Full-text

Mining Frequent Sequential Patterns

Journal of Big Data Research ◽

10.14302/issn.2768-0207.jbr-21-3455 ◽

2021 ◽

Vol 1 (2) ◽

pp. 20-37

Author(s):

Rajae KRIBII ◽

Youssef FAKIR

Keyword(s):

Data Mining ◽

Time Series ◽

Time Series Data ◽

Equivalence Classes ◽

Data Type ◽

Sequential Patterns ◽

Sequential Algorithm ◽

Series Data ◽

Data Set ◽

Specific Order

In recent times, the urge to collect data and analyze it has grown. Time stamping a data set is an important part of the analysis and data mining as it can give information that is more useful. Different mining techniques have been designed for mining time-series data, sequential patterns for example seeks relationships between occurrences of sequential events and finds if there exist any specific order of the occurrences. Many Algorithms has been proposed to study this data type based on the apriori approach. In this paper we compare two basic sequential algorithms which are General Sequential algorithm (GSP) and Sequential PAttern Discovery using Equivalence classes (SPADE). These two algorithms are based on the Apriori algorithms. Experimental results have shown that SPADE consumes less time than GSP algorithm.

Download Full-text

Anomaly detection in multidimensional time series— A graph-based approach

Journal of Physics: Complexity ◽

10.1088/2632-072x/ac392c ◽

2021 ◽

Author(s):

Marcus Erz ◽

Jeremy Floyd Kielman ◽

Bahar Selvi Uzun ◽

Gabriele Stefanie Guehring

Keyword(s):

Time Series ◽

Anomaly Detection ◽

Time Series Data ◽

Distance Measures ◽

Series Data ◽

Data Set ◽

Research Areas ◽

Multidimensional Time Series ◽

Wide Range ◽

Time Frames

Abstract As the digital transformation is taking place, more and more data is being generated and collected.To generate meaningful information and knowledge researchers use various data mining techniques. In addition to classification, clustering, and forecasting, outlier or anomaly detection is one of the most important research areas in time series analysis. In this paper we present a method for detecting anomalies in multidimensional time series using a graph-based algorithm. We transform time series data to graphs prior to calculating the outlier since it offers a wide range of graph-based methods for anomaly detection. Furthermore the dynamics of the data is taken into consideration by implementing a window of a certain size that leads to multiple graphs in different time frames. We use feature extraction and aggregation to finally compare distance measures of two time-dependent graphs. The effectiveness of our algorithm is demonstrated on the Numenta Anomaly Benchmark with various anomaly types as well as the KPI-Anomaly-Detection data set of 2018 AIOps competition.

Download Full-text

Remaining Useful Life Prediction Using Temporal Convolution with Attention

AI ◽

10.3390/ai2010005 ◽

2021 ◽

Vol 2 (1) ◽

pp. 48-70

Author(s):

Wei Ming Tan ◽

T. Hui Teo

Keyword(s):

Neural Network ◽

Time Series ◽

Time Series Data ◽

Remaining Useful Life ◽

Sensor Data ◽

Series Data ◽

Multiple Time ◽

Data Set ◽

Form Complex ◽

Useful Life

Prognostic techniques attempt to predict the Remaining Useful Life (RUL) of a subsystem or a component. Such techniques often use sensor data which are periodically measured and recorded into a time series data set. Such multivariate data sets form complex and non-linear inter-dependencies through recorded time steps and between sensors. Many current existing algorithms for prognostic purposes starts to explore Deep Neural Network (DNN) and its effectiveness in the field. Although Deep Learning (DL) techniques outperform the traditional prognostic algorithms, the networks are generally complex to deploy or train. This paper proposes a Multi-variable Time Series (MTS) focused approach to prognostics that implements a lightweight Convolutional Neural Network (CNN) with attention mechanism. The convolution filters work to extract the abstract temporal patterns from the multiple time series, while the attention mechanisms review the information across the time axis and select the relevant information. The results suggest that the proposed method not only produces a superior accuracy of RUL estimation but it also trains many folds faster than the reported works. The superiority of deploying the network is also demonstrated on a lightweight hardware platform by not just being much compact, but also more efficient for the resource restricted environment.

Download Full-text

Particularities of data mining in medicine: lessons learned from patient medical time series data analysis

EURASIP Journal on Wireless Communications and Networking ◽

10.1186/s13638-019-1582-2 ◽

2019 ◽

Vol 2019 (1) ◽

Cited By ~ 2

Author(s):

Shadi Aljawarneh ◽

Aurea Anguera ◽

John William Atwood ◽

Juan A. Lara ◽

David Lizcano

Keyword(s):

Data Mining ◽

Time Series ◽

Knowledge Discovery ◽

Time Series Data ◽

Medical Patient ◽

Lessons Learned ◽

Physiological Signals ◽

Knowledge Discovery In Databases ◽

Series Data ◽

Data Mining Techniques

AbstractNowadays, large amounts of data are generated in the medical domain. Various physiological signals generated from different organs can be recorded to extract interesting information about patients’ health. The analysis of physiological signals is a hard task that requires the use of specific approaches such as the Knowledge Discovery in Databases process. The application of such process in the domain of medicine has a series of implications and difficulties, especially regarding the application of data mining techniques to data, mainly time series, gathered from medical examinations of patients. The goal of this paper is to describe the lessons learned and the experience gathered by the authors applying data mining techniques to real medical patient data including time series. In this research, we carried out an exhaustive case study working on data from two medical fields: stabilometry (15 professional basketball players, 18 elite ice skaters) and electroencephalography (100 healthy patients, 100 epileptic patients). We applied a previously proposed knowledge discovery framework for classification purpose obtaining good results in terms of classification accuracy (greater than 99% in both fields). The good results obtained in our research are the groundwork for the lessons learned and recommendations made in this position paper that intends to be a guide for experts who have to face similar medical data mining projects.

Download Full-text

A Simple and Efficient Method for Fault Diagnosis Using Time Series Data Mining

2007 IEEE International Electric Machines & Drives Conference ◽

10.1109/iemdc.2007.382734 ◽

2007 ◽

Cited By ~ 10

Author(s):

I. Aydin ◽

M. Karakose ◽

E. Akin

Keyword(s):

Data Mining ◽

Time Series ◽

Fault Diagnosis ◽

Efficient Method ◽

Time Series Data ◽

Series Data ◽

Time Series Data Mining

Download Full-text

Adaptive Multiresolution and Dedicated Elastic Matching in Linear Time Complexity for Time Series Data Mining

Sixth International Conference on Intelligent Systems Design and Applications ◽

10.1109/isda.2006.84 ◽

2006 ◽

Cited By ~ 3

Author(s):

Pierre-francois Marteau ◽

Gildas Menier

Keyword(s):

Data Mining ◽

Time Series ◽

Time Complexity ◽

Time Series Data ◽

Linear Time ◽

Series Data ◽

Time Series Data Mining

Download Full-text

Studying monthly rainfall over Dibrugarh, Assam: Use of SARIMA approach

MAUSAM ◽

10.54302/mausam.v68i2.637 ◽

2021 ◽

Vol 68 (2) ◽

pp. 349-356

Author(s):

J. HAZARIKA ◽

B. PATHAK ◽

A. N. PATOWARY

Keyword(s):

Time Series ◽

Time Series Data ◽

Moving Average ◽

Demand Management ◽

Arima Model ◽

Monthly Rainfall ◽

Series Data ◽

Data Set ◽

Modeling And Forecasting ◽

Moving Average Model

Perceptive the rainfall pattern is tough for the solution of several regional environmental issues of water resources management, with implications for agriculture, climate change, and natural calamity such as floods and droughts. Statistical computing, modeling and forecasting data are key instruments for studying these patterns. The study of time series analysis and forecasting has become a major tool in different applications in hydrology and environmental fields. Among the most effective approaches for analyzing time series data is the ARIMA (Autoregressive Integrated Moving Average) model introduced by Box and Jenkins. In this study, an attempt has been made to use Box-Jenkins methodology to build ARIMA model for monthly rainfall data taken from Dibrugarh for the period of 1980- 2014 with a total of 420 points. We investigated and found that ARIMA (0, 0, 0) (0, 1, 1)12 model is suitable for the given data set. As such this model can be used to forecast the pattern of monthly rainfall for the upcoming years, which can help the decision makers to establish priorities in terms of agricultural, flood, water demand management etc.

Download Full-text