Hexadecimal Aggregate Approximation Representation and Classification of Time Series Data

Algorithms ◽  
2021 ◽  
Vol 14 (12) ◽  
pp. 353
Author(s):  
Zhenwen He ◽  
Chunfeng Zhang ◽  
Xiaogang Ma ◽  
Gang Liu

Time series data are widely found in finance, health, environmental, social, mobile and other fields. A large amount of time series data has been produced due to the widespread use of smartphones, various sensors, RFID and other internet devices. How a time series is represented is key to the efficient and effective storage and management of time series data, and is also very important to time series classification. Two new time series representation methods, Hexadecimal Aggregate approXimation (HAX) and Point Aggregate approXimation (PAX), are proposed in this paper. Both methods represent each segment of a time series as a transformable interval object (TIO). Each TIO is then mapped to a spatial point on a two-dimensional plane. Finally, HAX maps each point to a hexadecimal digit, so that a time series is converted into a hex string. The experimental results show that HAX has higher classification accuracy than Symbolic Aggregate approXimation (SAX) but lower accuracy than some SAX variants (SAX-TD, SAX-BD). HAX has the same space cost as SAX, which is lower than that of these variants. PAX has higher classification accuracy than HAX, extremely close to that of the Euclidean distance (ED) measurement; however, the space cost of PAX is generally much lower than that of ED. HAX and PAX are general representation methods that, in addition to classification, can also support geoscience time series clustering, indexing and query.
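The segment-to-point-to-hex-digit pipeline described in this abstract can be illustrated with a rough sketch. The abstract does not specify how a TIO is built or how the plane is partitioned, so everything concrete here is an assumption made for illustration: each segment is reduced to the interval (minimum, maximum) of its z-normalized values, that interval is treated as a 2D point, and the plane is cut into a 4 x 4 grid by standard-normal quartiles so that each cell corresponds to one hexadecimal digit. The function names (`segment_points`, `point_to_hex`, `hex_word`) are hypothetical and are not taken from the paper.

```python
import bisect
from statistics import NormalDist, mean, pstdev

# Assumption: three standard-normal quartile cuts split each axis into 4 bins,
# giving a 4 x 4 grid of 16 cells, one per hexadecimal digit.
CUTS = [NormalDist().inv_cdf(k / 4) for k in (1, 2, 3)]

def znorm(series):
    """Z-normalize a series; a constant series maps to all zeros."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]

def segment_points(series, n_segments):
    """Return one (low, high) point per segment (hypothetical interval choice)."""
    z = znorm(series)
    size = len(z) // n_segments  # assumes len(series) is divisible by n_segments
    segments = [z[i * size:(i + 1) * size] for i in range(n_segments)]
    return [(min(seg), max(seg)) for seg in segments]

def point_to_hex(point):
    """Map a 2D point to the hex digit of the grid cell that contains it."""
    low, high = point
    cell = 4 * bisect.bisect_right(CUTS, low) + bisect.bisect_right(CUTS, high)
    return format(cell, "X")

def hex_word(series, n_segments):
    """Convert a series into a hex string, one digit per segment."""
    return "".join(point_to_hex(p) for p in segment_points(series, n_segments))

print(hex_word([1, 3, 5, 7], 2))
```

Under these assumptions the output has one character per segment, which matches the abstract's statement that HAX has the same space cost as SAX.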

Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1908
Author(s):  
Chao Ma ◽  
Xiaochuan Shi ◽  
Wei Li ◽  
Weiping Zhu

In the past decade, time series data have been generated in various fields at a rapid pace, which offers a huge opportunity for mining valuable knowledge. As a typical task of time series mining, Time Series Classification (TSC) has attracted considerable attention from both researchers and domain experts due to its broad applications, ranging from human activity recognition to smart city governance. Specifically, there is an increasing requirement for performing classification tasks on diverse types of time series data in a timely manner without costly hand-crafted feature engineering. Therefore, in this paper, we propose a framework named Edge4TSC that allows time series to be processed in the edge environment, so that the classification results can be instantly returned to the end-users. Meanwhile, to avoid the costly hand-crafted feature engineering process, deep learning techniques are applied for automatic feature extraction, which shows competitive or even superior performance compared to state-of-the-art TSC solutions. However, because time series present complex patterns, even deep learning models are not capable of achieving satisfactory classification accuracy, which motivated us to explore new time series representation methods to help classifiers further improve the classification accuracy. In the proposed framework Edge4TSC, a new time series representation method was designed by building the binary distribution tree to address the classification accuracy concern in TSC tasks. Comprehensive experiments on six challenging time series datasets in the edge environment firmly validate the potential of the proposed framework in terms of generalization ability and classification accuracy improvement, and yield a number of helpful insights.


Algorithms ◽  
2020 ◽  
Vol 13 (11) ◽  
pp. 284
Author(s):  
Zhenwen He ◽  
Shirong Long ◽  
Xiaogang Ma ◽  
Hong Zhao

A large amount of time series data is being generated every day in a wide range of sensor application domains. The symbolic aggregate approximation (SAX) is a well-known time series representation method that lower-bounds the Euclidean distance and can discretize continuous time series. SAX has been widely used for applications in various domains, such as mobile data management, financial investment, and shape discovery. However, the SAX representation has a limitation: symbols are mapped from the average values of segments, but SAX does not consider the boundary distance in the segments. Different segments with similar average values may be mapped to the same symbols, in which case the SAX distance between them is 0. In this paper, we propose a novel representation named SAX-BD (boundary distance) by integrating the SAX distance with a weighted boundary distance. The experimental results show that SAX-BD significantly outperforms the SAX representation, ESAX representation, and SAX-TD representation.
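The limitation noted in this abstract can be seen in a minimal sketch of standard SAX (z-normalization, piecewise aggregate approximation, then Gaussian breakpoints). This is an illustrative implementation, not the authors' code; the alphabet size of 4 and the two toy series are assumptions chosen so that the segments share the same means while trending in opposite directions.

```python
import bisect
from statistics import NormalDist, mean, pstdev

def znorm(series):
    """Z-normalize a series; a constant series maps to all zeros."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]

def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    size = len(series) // n_segments  # assumes len(series) is divisible by n_segments
    return [mean(series[i * size:(i + 1) * size]) for i in range(n_segments)]

def breakpoints(alphabet_size):
    """Cuts that split the standard normal distribution into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]

def sax(series, n_segments, alphabet_size=4):
    """Return the SAX word of a series, one letter per segment."""
    cuts = breakpoints(alphabet_size)
    letters = []
    for segment_mean in paa(znorm(series), n_segments):
        letters.append(chr(ord("a") + bisect.bisect_right(cuts, segment_mean)))
    return "".join(letters)

rising = [1, 3, 5, 7]   # each 2-point segment goes up
falling = [3, 1, 7, 5]  # same segment means, but each segment goes down
print(sax(rising, 2), sax(falling, 2))
```

Both toy series yield the same word, so any distance computed on the SAX words alone cannot separate them, which is the gap that boundary-aware variants such as SAX-BD aim to close.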


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Tuan D. Pham

Automated analysis of physiological time series is utilized for many clinical applications in medicine and life sciences. Long short-term memory (LSTM) is a deep recurrent neural network architecture used for classification of time-series data. Here time–frequency and time–space properties of time series are introduced as a robust tool for LSTM processing of long sequential data in physiology. Based on classification results obtained from two databases of sensor-induced physiological signals, the proposed approach has the potential for (1) achieving very high classification accuracy, (2) saving tremendous time for data learning, and (3) being cost-effective and user-comfortable for clinical trials by reducing multiple wearable sensors for data recording.


2021 ◽  
Vol 352 ◽  
pp. 109080
Author(s):  
Joram van Driel ◽  
Christian N.L. Olivers ◽  
Johannes J. Fahrenfort

Processes ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 1115
Author(s):  
Gilseung Ahn ◽  
Hyungseok Yun ◽  
Sun Hur ◽  
Si-Yeong Lim

Accurate predictions of the remaining useful life (RUL) of equipment using machine learning (ML) or deep learning (DL) models that collect data until the equipment fails are crucial for maintenance scheduling. Because the data are unavailable until the equipment fails, collecting sufficient data to train a model without overfitting can be challenging. Here, we propose a method of generating time-series data for RUL models to resolve the problems posed by insufficient data. The proposed method converts every training time series into a sequence of alphabetical strings by symbolic aggregate approximation and identifies occurrence patterns in the converted sequences. The method then generates a new sequence and inversely transforms it into a new time series. Experiments with various RUL prediction datasets and ML/DL models verified that the proposed data-generation model can help avoid overfitting in RUL prediction models.
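The generate-then-invert idea in this abstract can be sketched as follows. The abstract does not say how occurrence patterns are identified, so this example swaps in a plain first-order Markov chain over symbols as a stand-in, and it inverts a word by repeating one representative value per symbol. The input words, the `levels` mapping, and all function names are hypothetical.

```python
import random
from collections import defaultdict

def transition_counts(words):
    """Count how often each symbol is followed by each other symbol."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        for current, following in zip(word, word[1:]):
            counts[current][following] += 1
    return counts

def generate_word(words, length, rng):
    """Sample a new symbolic word from first-order transition frequencies."""
    counts = transition_counts(words)
    current = rng.choice([word[0] for word in words])
    out = [current]
    while len(out) < length:
        successors = counts.get(current)
        if not successors:  # symbol never seen with a successor: stop early
            break
        symbols = list(successors)
        weights = [successors[s] for s in symbols]
        current = rng.choices(symbols, weights=weights)[0]
        out.append(current)
    return "".join(out)

def inverse_transform(word, levels, segment_length):
    """Expand each symbol into segment_length copies of its representative value."""
    return [levels[symbol] for symbol in word for _ in range(segment_length)]

# Hypothetical symbolic training words and per-symbol representative values.
training_words = ["abcd", "abdd"]
levels = {"a": -1.0, "b": -0.3, "c": 0.3, "d": 1.0}

new_word = generate_word(training_words, 6, random.Random(0))
new_series = inverse_transform(new_word, levels, segment_length=2)
print(new_word, new_series)
```

A generated word only contains symbol transitions that occur in the training words, so the synthetic series stays close to observed behavior; whether that is enough variety to reduce overfitting depends on the pattern model actually used.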


1995 ◽  
Vol 115 (3) ◽  
pp. 354-360 ◽  
Author(s):  
Shigeaki Fukuda ◽  
Toshihisa Kosaka ◽  
Sigeru Omatsu

Author(s):  
Elangovan Ramanujam ◽  
S. Padmavathi

Innovations in and the applicability of time series data mining techniques have significantly increased researchers' interest in the problem of time series classification. Several algorithms have been proposed for this purpose, categorized under shapelet, interval, motif, and whole series-based techniques. Among these, the bag-of-words technique, an extensive application of the text mining approach, performs well due to its simplicity and effectiveness. To extend the efficiency of the bag-of-words technique, this paper proposes a discriminative supervised weighting scheme to identify the characteristic and representative patterns of a class for efficient classification. This paper uses a modified weighted matrix that discriminates between representative and non-representative patterns, which enables interpretability in classification. Experiments have been carried out to compare the performance of the proposed technique with state-of-the-art techniques in terms of accuracy and statistical significance.

