Symbolic Aggregate Approximation: Latest Research Papers

Hexadecimal Aggregate Approximation Representation and Classification of Time Series Data

Algorithms ◽

10.3390/a14120353 ◽

2021 ◽

Vol 14 (12) ◽

pp. 353

Author(s):

Zhenwen He ◽

Chunfeng Zhang ◽

Xiaogang Ma ◽

Gang Liu

Keyword(s):

Time Series ◽

Classification Accuracy ◽

Euclidean Distance ◽

Time Series Data ◽

Series Representation ◽

Series Data ◽

General Representation ◽

Symbolic Aggregate Approximation ◽

Space Cost

Time series data are widely found in finance, health, environmental, social, mobile and other fields. A large amount of time series data has been produced due to the general use of smartphones, various sensors, RFID and other internet devices. How a time series is represented is key to the efficient and effective storage and management of time series data, as well as being very important to time series classification. Two new time series representation methods, Hexadecimal Aggregate approXimation (HAX) and Point Aggregate approXimation (PAX), are proposed in this paper. The two methods represent each segment of a time series as a transformable interval object (TIO). Then, each TIO is mapped to a spatial point located on a two-dimensional plane. Finally, HAX maps each point to a hexadecimal digit so that a time series is converted into a hex string. The experimental results show that HAX has higher classification accuracy than Symbolic Aggregate approXimation (SAX) but lower accuracy than some SAX variants (SAX-TD, SAX-BD). HAX has the same space cost as SAX, which is lower than that of these variants. PAX has higher classification accuracy than HAX, extremely close to that of the Euclidean distance (ED) measurement; however, the space cost of PAX is generally much lower than that of ED. HAX and PAX are general representation methods that, in addition to classification, can also support geoscience time series clustering, indexing and querying.

Tri-Partition Alphabet-Based State Prediction for Multivariate Time-Series

Applied Sciences ◽

10.3390/app112311294 ◽

2021 ◽

Vol 11 (23) ◽

pp. 11294

Author(s):

Zuo-Cheng Wen ◽

Zhi-Heng Zhang ◽

Xiang-Bing Zhou ◽

Jian-Gang Gu ◽

Shao-Peng Shen ◽

...

Keyword(s):

Time Series ◽

Multivariate Time Series ◽

Prediction Method ◽

Distance Metrics ◽

Approximation Techniques ◽

Symbolic Aggregate Approximation ◽

State Prediction ◽

Similarity Model ◽

Time Stamps ◽

Deviation Degree

Predicting multivariate time-series (MTS) has recently attracted much attention as a way to obtain richer semantics with similar or better performance. In this paper, we propose a tri-partition alphabet-based state (tri-state) prediction method for symbolic MTSs. First, for each variable, the set of all symbols, i.e., the alphabet, is divided into strong, medium, and weak using two user-specified thresholds. With the tri-partitioned alphabet, the tri-state takes the form of a matrix. One dimension contains all the variables. The other is a feature vector that includes the most likely occurring strong, medium, and weak symbols. Second, a tri-partition strategy based on the deviation degree is proposed. We introduce the piecewise and symbolic aggregate approximation techniques to aggregate and discretize the original MTS. In this way, the stronger a symbol is, the larger its deviation. Moreover, most popular numerical or symbolic similarity or distance metrics can be combined. Third, we propose an along–across similarity model to obtain the k-nearest matrix neighbors. This model considers the associations among the time stamps and variables simultaneously. Fourth, we design two post-filling strategies to obtain a completed tri-state. The experimental results from the four-domain datasets show that (1) the tri-state has greater recall but lower precision; (2) the two post-filling strategies can slightly improve the recall; and (3) the along–across similarity model composed of the Triangle and Jaccard metrics is recommended first for new datasets.
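The two-threshold split of an alphabet into strong, medium, and weak symbols might look roughly like the sketch below. The abstract does not define the deviation degree, so the score used here (the absolute median of each symbol's Gaussian bin) is purely an assumption, as are the function name `tri_partition` and the parameters `low` and `high`. It is not the authors' implementation.

```python
from statistics import NormalDist

def tri_partition(alphabet, low, high):
    """Split symbols into weak/medium/strong with two user thresholds (low < high).

    ASSUMPTION: the deviation of a symbol is |median of its equiprobable
    Gaussian bin|; the paper's actual deviation degree may differ.
    """
    k = len(alphabet)
    parts = {"weak": [], "medium": [], "strong": []}
    for i, symbol in enumerate(alphabet):
        deviation = abs(NormalDist().inv_cdf((i + 0.5) / k))
        if deviation >= high:
            parts["strong"].append(symbol)
        elif deviation >= low:
            parts["medium"].append(symbol)
        else:
            parts["weak"].append(symbol)
    return parts

# Symbols far from the mean end up in "strong", central ones in "weak"
print(tri_partition("abcdef", low=0.3, high=0.9))
```

Under this assumed score the outermost symbols are the strong ones, which matches the abstract's remark that stronger symbols deviate more.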

A novel fault diagnosis scheme for rolling bearing based on symbolic aggregate approximation and convolutional neural network with channel attention

Measurement Science and Technology ◽

10.1088/1361-6501/ac319a ◽

2021 ◽

Author(s):

Bo Wang ◽

Yi Ning ◽

Yahu Zhang

Keyword(s):

Neural Network ◽

Fault Diagnosis ◽

Convolutional Neural Network ◽

Rolling Bearing ◽

Symbolic Aggregate Approximation

Identifying Subway Passenger Flow under Large-Scale Events Using Symbolic Aggregate Approximation Algorithm

Transportation Research Record: Journal of the Transportation Research Board ◽

10.1177/03611981211047835 ◽

2021 ◽

pp. 036119812110478

Author(s):

Hainan Huang ◽

Rongjie Zhang ◽

Chengguang Xie ◽

Xiaofeng Li

Keyword(s):

Large Scale ◽

Urban Rail Transit ◽

Sporting Events ◽

Transit Systems ◽

Symbolic Aggregate Approximation ◽

Social Events ◽

Urban Rail ◽

Subway System ◽

Passenger Flows ◽

Dynamic Time

Various social events, such as holidays, important sporting events, and major celebrations, may result in sudden large-scale passenger flows in certain sections and stations of urban rail transit systems. The sudden inbound passenger flows caused by these events can easily lead to continuous congestion of the subway network, which has a profound impact on the safety, reliability, and stability of a subway system. Because of the large magnitude of swipe data and the high dimensionality of time series, it is difficult to identify the emergence of such large passenger flows. Additionally, the recognition accuracy of the existing identification methods cannot meet the operational monitoring requirements. To address the above-mentioned issues, this paper proposes an optimized symbolic aggregate approximation (SAX) algorithm to identify historical sudden passenger flows caused by large-scale events around subways. Specifically, pre-set cluster types and dynamic time warping (DTW) are proposed to enhance the matching rate. Compared with the K-means method, the proposed method exhibits an average increase of 30% in mining accuracy, and the calculation time is shortened to one-sixteenth of the original value.
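For readers unfamiliar with the dynamic time warping step named in the abstract, a minimal sketch of the classic DTW distance follows. It uses an absolute-difference local cost and no warping window, both of which are assumptions; the function name `dtw_distance` is hypothetical, and this is not the optimized SAX pipeline the paper proposes.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with |x - y| local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: step in a, step in b, or step in both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# A time-shifted copy of the same peak aligns at zero cost
print(dtw_distance([0, 0, 5, 0], [0, 5, 0, 0]))
```

Because DTW tolerates shifts along the time axis, two passenger-flow curves with the same surge at slightly different times can still be matched.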

Season- and Trend-aware Symbolic Approximation for Accurate and Efficient Time Series Matching

Datenbank-Spektrum ◽

10.1007/s13222-021-00389-5 ◽

2021 ◽

Author(s):

Lars Kegel ◽

Claudio Hartmann ◽

Maik Thiele ◽

Wolfgang Lehner

Keyword(s):

Time Series ◽

State Of The Art ◽

Dimensional Space ◽

Symbolic Aggregate Approximation ◽

Current State ◽

Optimal Representation ◽

Symbolic Approximation ◽

Low Dimensional ◽

Deterministic Behavior ◽

Support Time

Processing and analyzing time series datasets have become a central issue in many domains, requiring data management systems to support time series as a native data type. A core access primitive of time series is matching, which requires efficient algorithms on top of appropriate representations such as the symbolic aggregate approximation (SAX), which represents the current state of the art. This technique reduces a time series to a low-dimensional space by segmenting it and discretizing each segment into a small symbolic alphabet. Unfortunately, SAX ignores the deterministic behavior of time series, such as cyclical repeating patterns or a trend component affecting all segments, which may lead to a sub-optimal representation accuracy. We therefore introduce a novel season-aware and a novel trend-aware symbolic approximation and demonstrate an improved representation accuracy without increasing the memory footprint. Most importantly, our techniques also enable more efficient time series matching by providing a match up to three orders of magnitude faster than SAX.
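The segment-then-discretize step that the abstract attributes to SAX can be sketched as follows. This illustrates classic SAX only, not the season- or trend-aware variants the paper introduces. The function names `paa` and `sax`, the requirement that the series length be divisible by the segment count, and the use of equiprobable Gaussian breakpoints after z-normalization are assumptions made for this example.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev

def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment.

    Simplification: assumes len(series) is divisible by n_segments.
    """
    size = len(series) // n_segments
    return [mean(series[i * size:(i + 1) * size]) for i in range(n_segments)]

def sax(series, n_segments, alphabet="abcd"):
    """Z-normalize, reduce with PAA, then map each segment mean to a symbol."""
    mu, sigma = mean(series), pstdev(series)
    z = [0.0] * len(series) if sigma == 0 else [(x - mu) / sigma for x in series]
    k = len(alphabet)
    # Breakpoints that cut the standard normal into k equiprobable bins
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa(z, n_segments))

# A steadily rising series maps to an ascending word
print(sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4))
```

Each symbol is computed from its own segment alone, which is why a trend or seasonal component shared by all segments is not captured by the plain representation.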

Analysis of Fluctuation Patterns in Emotional States Using Electrodermal Activity Signals and Improved Symbolic Aggregate Approximation

Fluctuation and Noise Letters ◽

10.1142/s0219477522500134 ◽

2021 ◽

Author(s):

Yedukondala Rao Veeranki ◽

Nagarajan Ganapathy ◽

Ramakrishnan Swaminathan

Keyword(s):

Support Vector Machine ◽

Electrodermal Activity ◽

Maximum Amplitude ◽

Machine Learning Algorithms ◽

Support Vector ◽

Clinical Settings ◽

Emotional States ◽

Rotation Forest ◽

Symbolic Aggregate Approximation ◽

Symbolic Sequences

Analysis of fluctuations in electrodermal activity (EDA) signals is widely preferred for emotion recognition. In this work, an attempt has been made to determine the patterns of fluctuations in EDA signals for various emotional states using improved symbolic aggregate approximation. For this, the EDA is obtained from a publicly available online database. The EDA is decomposed into phasic components and divided into equal segments. Each segment is transformed into a piecewise aggregate approximation (PAA). These approximations are discretized using 11 time-domain features to obtain symbolic sequences. Shannon entropy is extracted from each PAA-based symbolic sequence using varied symbol sizes [Formula: see text] and window lengths [Formula: see text]. Three machine-learning algorithms, namely Naive Bayes, support vector machine and rotation forest, are used for the classification. The results show that the proposed approach is able to determine the patterns of fluctuations for various emotional states in EDA signals. PAA features, namely maximum amplitude and chaos, significantly identify the subtle fluctuations in EDA and transform them into symbolic sequences. The optimal values of [Formula: see text] and [Formula: see text] yield the highest performance. The rotation forest is accurate (F-[Formula: see text] and 60.02% for arousal and valence dimensions) in classifying various emotional states. The proposed approach can capture the patterns of fluctuations for varied-length signals. In particular, the support vector machine yields the highest performance for shorter signals. Thus, it appears that the proposed method might be utilized to analyze various emotional states in both normal and clinical settings.
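The entropy feature mentioned in the abstract can be illustrated with a short sketch that computes Shannon entropy over windows of a symbolic sequence. The abstract does not say how windows are formed, so consecutive non-overlapping windows are an assumption here, and the function names `shannon_entropy` and `windowed_entropy` are hypothetical.

```python
from collections import Counter
from math import log2

def shannon_entropy(symbols):
    """Shannon entropy in bits of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum((c / total) * log2(total / c) for c in counts.values())

def windowed_entropy(symbols, window):
    """Entropy of each consecutive non-overlapping window (assumed scheme)."""
    return [shannon_entropy(symbols[i:i + window])
            for i in range(0, len(symbols) - window + 1, window)]

# A constant window carries no information; a window with four distinct
# symbols reaches the 2-bit maximum for a window of length 4
print(windowed_entropy("aaaaabababcd", window=4))
```

Larger alphabets and longer windows raise the attainable entropy, which is one reason the symbol size and window length have to be tuned together.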

An Effective Algorithm for Intrusion Detection Using Random Shapelet Forest

Wireless Communications and Mobile Computing ◽

10.1155/2021/4214784 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Gongliang Li ◽

Mingyong Yin ◽

Siyuan Jing ◽

Bing Guo

Keyword(s):

Time Series ◽

Intrusion Detection ◽

Traffic Flow ◽

Network Traffic ◽

Time Complexity ◽

Sampling Technique ◽

Flow Patterns ◽

Detection Systems ◽

Symbolic Aggregate Approximation ◽

Time Series Mining

Detection of abnormal network traffic is an important issue when building intrusion detection systems. An effective way to address this issue is time series mining, in which the network traffic is naturally represented as a set of time series. In this paper, we propose a novel efficient algorithm, called RSFID (Random Shapelet Forest for Intrusion Detection), to detect abnormal traffic flow patterns in periodic network packets. First, the Fast Correlation-based Filter (FCBF) algorithm is employed to remove irrelevant features, reducing both overfitting and time complexity. Then, a random forest built upon a set of shapelet candidates is used to classify the normal and abnormal traffic flow patterns. Specifically, the Symbolic Aggregate approXimation (SAX) and a random sampling technique are adopted to mitigate the high time complexity caused by enumerating shapelet candidates. Experimental results show the effectiveness and efficiency of the proposed algorithm.
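To see why enumerating shapelet candidates is expensive and how random sampling sidesteps it, consider the sketch below. It covers only the sampling half of the idea and leaves out the SAX step and the forest itself; the function names and the uniform sampling scheme are assumptions, not the RSFID procedure.

```python
import random

def all_candidates(series_list, length):
    """Exhaustive enumeration: every contiguous subsequence of the given length."""
    return [tuple(s[i:i + length])
            for s in series_list
            for i in range(len(s) - length + 1)]

def sample_candidates(series_list, length, n_samples, rng):
    """Draw n_samples random subsequences instead of enumerating all of them."""
    eligible = [s for s in series_list if len(s) >= length]
    out = []
    for _ in range(n_samples):
        s = rng.choice(eligible)
        start = rng.randrange(len(s) - length + 1)
        out.append(tuple(s[start:start + length]))
    return out

data = [[1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1], [1, 3, 1, 3, 1, 3]]
rng = random.Random(42)  # fixed seed so the sketch is reproducible
print(len(all_candidates(data, 3)), len(sample_candidates(data, 3, 4, rng)))
```

The exhaustive list grows with the number of series, their length, and the number of candidate lengths tried, while the sampled list stays at whatever budget is chosen.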

A Time-Series Data Generation Method to Predict Remaining Useful Life

Processes ◽

10.3390/pr9071115 ◽

2021 ◽

Vol 9 (7) ◽

pp. 1115

Author(s):

Gilseung Ahn ◽

Hyungseok Yun ◽

Sun Hur ◽

Si-Yeong Lim

Keyword(s):

Time Series ◽

Time Series Data ◽

Remaining Useful Life ◽

Series Data ◽

Generation Model ◽

Data Generation ◽

Training Time ◽

Symbolic Aggregate Approximation ◽

Useful Life ◽

Occurrence Patterns

Accurate predictions of the remaining useful life (RUL) of equipment using machine learning (ML) or deep learning (DL) models that collect data until the equipment fails are crucial for maintenance scheduling. Because the data are unavailable until the equipment fails, collecting sufficient data to train a model without overfitting can be challenging. Here, we propose a method of generating time-series data for RUL models to resolve the problems posed by insufficient data. The proposed method converts every training time series into a sequence of alphabetical strings by symbolic aggregate approximation and identifies occurrence patterns in the converted sequences. The method then generates a new sequence and inversely transforms it to a new time series. Experiments with various RUL prediction datasets and ML/DL models verified that the proposed data-generation model can help avoid overfitting in RUL prediction models.
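One way to picture the convert, find patterns, generate, and invert pipeline is the sketch below. The abstract does not specify how occurrence patterns are modeled or how the inverse transform works, so this example substitutes a first-order symbol-transition model and maps each symbol back to the median of its Gaussian bin. Both choices, and all function names, are assumptions for illustration rather than the authors' method.

```python
import random
from collections import Counter, defaultdict
from statistics import NormalDist

def fit_transitions(words):
    """Count how often each symbol is followed by each other symbol."""
    trans = defaultdict(Counter)
    for w in words:
        for cur, nxt in zip(w, w[1:]):
            trans[cur][nxt] += 1
    return trans

def generate_word(trans, start, length, rng):
    """Sample a new word by following observed transitions from `start`."""
    word = [start]
    while len(word) < length:
        options = trans.get(word[-1])
        if not options:  # no observed successor: stop early
            break
        symbols, weights = zip(*options.items())
        word.append(rng.choices(symbols, weights=weights)[0])
    return "".join(word)

def inverse_sax(word, alphabet, segment_len):
    """Map each symbol to the median of its Gaussian bin, repeated per segment.

    The result is on the z-normalized scale, not the original units.
    """
    k = len(alphabet)
    centers = {s: NormalDist().inv_cdf((i + 0.5) / k) for i, s in enumerate(alphabet)}
    return [centers[s] for s in word for _ in range(segment_len)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
trans = fit_transitions(["aabbcd", "abbccd", "aabccd"])
new_word = generate_word(trans, start="a", length=6, rng=rng)
series = inverse_sax(new_word, alphabet="abcd", segment_len=2)
print(new_word, len(series))
```

A generated word can only contain symbol pairs that were seen in training, so the synthetic series stays close to the observed degradation shapes while still varying in timing.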

Pattern Recognition of Traction Energy Consumption for Urban Rail Transit by Using Symbolic Aggregate Approximation

2021 IEEE 10th Data Driven Control and Learning Systems Conference (DDCLS) ◽

10.1109/ddcls52934.2021.9455709 ◽

2021 ◽

Author(s):

Licheng Zhang ◽

Jing Xun ◽

Wei Zhang ◽

Xi Li ◽

Yanlong Zhang

Keyword(s):

Pattern Recognition ◽

Energy Consumption ◽

Urban Rail Transit ◽

Rail Transit ◽

Symbolic Aggregate Approximation ◽

Urban Rail

Correction to Symbolic Aggregate Approximation Improves Gap Filling in High-Resolution Mass Spectrometry Data Processing

Analytical Chemistry ◽

10.1021/acs.analchem.1c00530 ◽

2021 ◽

Author(s):

Erik Müller ◽

Carolin Elisabeth Huber ◽

Werner Brack ◽

Martin Krauss ◽

Tobias Schulze

Keyword(s):

Mass Spectrometry ◽

High Resolution ◽

Data Processing ◽

High Resolution Mass Spectrometry ◽

Mass Spectrometry Data ◽

Gap Filling ◽

Symbolic Aggregate Approximation ◽

High Resolution Mass ◽

Resolution Mass
