HIVE-COTE 2.0: a new meta ensemble for time series classification

Machine Learning ◽

10.1007/s10994-021-06057-9 ◽

2021 ◽

Author(s):

Matthew Middlehurst ◽

James Large ◽

Michael Flynn ◽

Jason Lines ◽

Aaron Bostrom ◽

...

Keyword(s):

Time Series ◽

State Of The Art ◽

Bag Of Words ◽

Time Series Classification ◽

Current State ◽

Multiple Domains ◽

Over Time

Abstract: The Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE) is a heterogeneous meta ensemble for time series classification. HIVE-COTE forms its ensemble from classifiers of multiple domains, including phase-independent shapelets, bag-of-words based dictionaries and phase-dependent intervals. Since it was first proposed in 2016, the algorithm has remained state of the art for accuracy on the UCR time series classification archive. Over time it has been incrementally updated, culminating in its current state, HIVE-COTE 1.0. During this time a number of algorithms have been proposed which match the accuracy of HIVE-COTE. We propose comprehensive changes to the HIVE-COTE algorithm which significantly improve its accuracy and usability, presenting this upgrade as HIVE-COTE 2.0. We introduce two novel classifiers, the Temporal Dictionary Ensemble and Diverse Representation Canonical Interval Forest, which replace existing ensemble members. Additionally, we introduce the Arsenal, an ensemble of ROCKET classifiers as a new HIVE-COTE 2.0 constituent. We demonstrate that HIVE-COTE 2.0 is significantly more accurate on average than the current state of the art on 112 univariate UCR archive datasets and 26 multivariate UEA archive datasets.

Discriminate Supervised Weighted Scheme for the Classification of Time Series Signals

International Journal of Sociotechnology and Knowledge Development ◽

10.4018/ijskd.2021070101 ◽

2021 ◽

Vol 13 (3) ◽

pp. 1-16

Author(s):

Elangovan Ramanujam ◽

S. Padmavathi

Keyword(s):

Time Series ◽

Time Series Data ◽

State Of The Art ◽

Statistical Significance ◽

Series Data ◽

Bag Of Words ◽

Time Series Classification ◽

Problem Of Time ◽

Weighted Matrix

Innovations in, and the applicability of, time series data mining techniques have significantly increased researchers' interest in the problem of time series classification. Several algorithms have been proposed for this purpose, categorized as shapelet-, interval-, motif-, and whole-series-based techniques. Among these, the bag-of-words technique, an extension of the text mining approach, performs well due to its simplicity and effectiveness. To extend the efficiency of the bag-of-words technique, this paper proposes a discriminate supervised weighted scheme to identify the characteristic and representative pattern of a class for efficient classification. This paper uses a modified weighted matrix that discriminates between representative and non-representative patterns, which enables interpretability in classification. Experiments have been carried out to compare the performance of the proposed technique with state-of-the-art techniques in terms of accuracy and statistical significance.

Season- and Trend-aware Symbolic Approximation for Accurate and Efficient Time Series Matching

Datenbank-Spektrum ◽

10.1007/s13222-021-00389-5 ◽

2021 ◽

Author(s):

Lars Kegel ◽

Claudio Hartmann ◽

Maik Thiele ◽

Wolfgang Lehner

Keyword(s):

Time Series ◽

State Of The Art ◽

Dimensional Space ◽

Symbolic Aggregate Approximation ◽

Current State ◽

Optimal Representation ◽

Symbolic Approximation ◽

Low Dimensional ◽

Deterministic Behavior ◽

Support Time

Abstract: Processing and analyzing time series datasets have become a central issue in many domains, requiring data management systems to support time series as a native data type. A core access primitive of time series is matching, which requires efficient algorithms on top of appropriate representations like the symbolic aggregate approximation (SAX), which represents the current state of the art. This technique reduces a time series to a low-dimensional space by segmenting it and discretizing each segment into a small symbolic alphabet. Unfortunately, SAX ignores the deterministic behavior of time series, such as cyclical repeating patterns or a trend component affecting all segments, which may lead to a sub-optimal representation accuracy. We therefore introduce a novel season- and a trend-aware symbolic approximation and demonstrate an improved representation accuracy without increasing the memory footprint. Most importantly, our techniques also enable a more efficient time series matching by providing a match up to three orders of magnitude faster than SAX.
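The segment-then-discretize step that the abstract attributes to SAX can be sketched as follows. This is a minimal illustration of plain SAX under common assumptions (z-normalisation, segment means, equiprobable Gaussian breakpoints), not the season- and trend-aware variant the paper proposes. The function name `sax`, its parameters, and the default alphabet are hypothetical choices made for this example.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Return a SAX word for `series` (illustrative helper, not from the paper).

    Assumes 1 <= n_segments <= len(series) so every segment is non-empty.
    """
    mu, sigma = mean(series), pstdev(series)
    # z-normalise so that standard-normal breakpoints apply;
    # a constant series is mapped to all zeros to avoid dividing by zero.
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # Segmenting: replace each (near-)equal-length segment by its mean.
    n = len(z)
    bounds = [round(i * n / n_segments) for i in range(n_segments + 1)]
    paa = [mean(z[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]

    # Discretizing: breakpoints cut the standard normal into len(alphabet)
    # equiprobable regions, one symbol per region.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


word = sax([0, 1, 2, 3, 4, 5, 6, 7], n_segments=4)
print(word)
```

A steadily rising series maps to symbols in increasing order, which also illustrates the limitation the abstract describes: a trend component dominates every segment's symbol.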

Crop Rotation Modeling for Deep Learning-Based Parcel Classification from Satellite Time Series

Remote Sensing ◽

10.3390/rs13224599 ◽

2021 ◽

Vol 13 (22) ◽

pp. 4599

Author(s):

Félix Quinton ◽

Loic Landrieu

Keyword(s):

Time Series ◽

Deep Learning ◽

Crop Rotation ◽

Large Scale ◽

State Of The Art ◽

Crop Rotations ◽

Learning Approach ◽

Type Mapping ◽

Current State ◽

Crop Type

While annual crop rotations play a crucial role in agricultural optimization, they have been largely ignored for automated crop type mapping. In this paper, we take advantage of the increasing quantity of annotated satellite data to jointly model the inter- and intra-annual agricultural dynamics of yearly parcel classification with a deep learning approach. Along with simple training adjustments, our model provides an improvement of over 6.3% mIoU over the current state of the art in crop classification, and a reduction of over 21% in the error rate. Furthermore, we release the first large-scale multi-year agricultural dataset with over 300,000 annotated parcels.

TADILOF: Time Aware Density-Based Incremental Local Outlier Detection in Data Streams

Sensors ◽

10.3390/s20205829 ◽

2020 ◽

Vol 20 (20) ◽

pp. 5829 ◽

Cited By ~ 1

Author(s):

Jen-Wei Huang ◽

Meng-Xun Zhong ◽

Bijay Prasad Jaysawal

Keyword(s):

Outlier Detection ◽

Data Streams ◽

Data Stream ◽

State Of The Art ◽

Streaming Data ◽

Current State ◽

Data Points ◽

Local Outlier ◽

Time Aware ◽

Over Time

Outlier detection in data streams is crucial to successful data mining. However, this task is made increasingly difficult by the enormous growth in the quantity of data generated by the expansion of the Internet of Things (IoT). Recent advances in outlier detection based on the density-based local outlier factor (LOF) algorithms do not consider variations in data that change over time. For example, a new cluster of data points may appear in the data stream over time. Therefore, we present a novel algorithm for streaming data, referred to as time-aware density-based incremental local outlier detection (TADILOF), to overcome this issue. In addition, we have developed a means for estimating the LOF score, termed "approximate LOF," based on historical information following the removal of outdated data. The results of experiments demonstrate that TADILOF outperforms current state-of-the-art methods in terms of AUC while achieving similar performance in terms of execution time. Moreover, we present an application of the proposed scheme to the development of an air-quality monitoring system.

Time series classification with Bag-Of-Words approach

StuCoSReC. Proceedings of the 2018 5th Student Computer Science Research Conference. ◽

10.26493/978-961-7055-26-9.55-59 ◽

2010 ◽

Author(s):

Domen Kavran

Keyword(s):

Time Series ◽

Bag Of Words ◽

Time Series Classification

XCM: An Explainable Convolutional Neural Network for Multivariate Time Series Classification

Mathematics ◽

10.3390/math9233137 ◽

2021 ◽

Vol 9 (23) ◽

pp. 3137

Author(s):

Kevin Fauvel ◽

Tao Lin ◽

Véronique Masson ◽

Élisa Fromont ◽

Alexandre Termier

Keyword(s):

Neural Network ◽

Time Series ◽

Deep Learning ◽

Convolutional Neural Network ◽

Input Data ◽

State Of The Art ◽

Multivariate Time Series ◽

Learning Approach ◽

Multiple Domains ◽

Post Hoc

Multivariate Time Series (MTS) classification has gained importance over the past decade with the increase in the number of temporal datasets in multiple domains. The current state-of-the-art MTS classifier is a heavyweight deep learning approach, which outperforms the second-best MTS classifier only on large datasets. Moreover, this deep learning approach cannot provide faithful explanations as it relies on post hoc model-agnostic explainability methods, which could prevent its use in numerous applications. In this paper, we present XCM, an eXplainable Convolutional neural network for MTS classification. XCM is a new compact convolutional neural network which extracts information relative to the observed variables and time directly from the input data. Thus, XCM architecture enables a good generalization ability on both large and small datasets, while allowing the full exploitation of a faithful post hoc model-specific explainability method (Gradient-weighted Class Activation Mapping) by precisely identifying the observed variables and timestamps of the input data that are important for predictions. We first show that XCM outperforms the state-of-the-art MTS classifiers on both the large and small public UEA datasets. Then, we illustrate how XCM reconciles performance and explainability on a synthetic dataset and show that XCM enables a more precise identification of the regions of the input data that are important for predictions compared to the current deep learning MTS classifier also providing faithful explainability. Finally, we present how XCM can outperform the current most accurate state-of-the-art algorithm on a real-world application while enhancing explainability by providing faithful and more informative explanations.

Exploring the Role of Stress in Bayesian Word Segmentation using Adaptor Grammars

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00168 ◽

2014 ◽

Vol 2 ◽

pp. 93-104 ◽

Cited By ~ 2

Author(s):

Benjamin Börschinger ◽

Mark Johnson

Keyword(s):

State Of The Art ◽

Word Segmentation ◽

Stress Constraint ◽

Frequent Patterns ◽

Current State ◽

Proposed Model ◽

Stress Patterns ◽

A Current ◽

Over Time

Stress has long been established as a major cue in word segmentation for English infants. We show that enabling a current state-of-the-art Bayesian word segmentation model to take advantage of stress cues noticeably improves its performance. We find that the improvements range from 10 to 4%, depending on both the use of phonotactic cues and, to a lesser extent, the amount of evidence available to the learner. We also find that in particular early on, stress cues are much more useful for our model than phonotactic cues by themselves, consistent with the finding that children do seem to use stress cues before they use phonotactic cues. Finally, we study how the model’s knowledge about stress patterns evolves over time. We not only find that our model correctly acquires the most frequent patterns relatively quickly but also that the Unique Stress Constraint that is at the heart of a previously proposed model does not need to be built in but can be acquired jointly with word segmentation.

Using Graph Spectral to solve Change Point Detection Problems

10.5753/eniac.2018.4461 ◽

2018 ◽

Author(s):

Luis Gustavo C. Uzai ◽

André Y. Kashiwabara

Keyword(s):

Time Series ◽

Change Point ◽

State Of The Art ◽

The State ◽

Change Point Detection ◽

Change Points ◽

Graph Spectrum ◽

Point Detection ◽

Over Time

Time series are sequences of values distributed over time. Analyzing time series is important in many areas, including medical, financial, aerospace, commercial and entertainment. Change Point Detection is the problem of identifying changes in the meaning or distribution of data in a time series. This article presents Spec, a new algorithm that uses the graph spectrum to detect change points. Spec was evaluated using the UCR Archive, which is a large database of different time series. Spec's performance was compared to the PELT, ECP, EDM, and gSeg algorithms. The results showed that Spec achieved better accuracy than the state of the art in some specific scenarios and was as efficient as the state of the art in most cases evaluated.

An Examination of the State-of-the-Art for Multivariate Time Series Classification

2020 International Conference on Data Mining Workshops (ICDMW) ◽

10.1109/icdmw51313.2020.00042 ◽

2020 ◽

Author(s):

Bhaskar Dhariyal ◽

Thach le Nguyen ◽

Severin Gsponer ◽

Georgiana Ifrim

Keyword(s):

Time Series ◽

State Of The Art ◽

Multivariate Time Series ◽

The State ◽

Time Series Classification

Multilevel dynamic time warping: A parameter-light method for fast time series classification

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201281 ◽

2021 ◽

pp. 1-14

Author(s):

Haowen Zhang ◽

Yabo Dong ◽

Duanqing Xu

Keyword(s):

Time Series ◽

Classification Accuracy ◽

Dynamic Time Warping ◽

Nearest Neighbor ◽

State Of The Art ◽

Fast Time ◽

Time Series Classification ◽

Time Warping ◽

Dynamic Time ◽

Fine Tune

Time series classification is a fundamental problem in the time series mining community. Recently, many sophisticated methods that produce state-of-the-art classification accuracy on the UCR archive have been proposed. Unfortunately, most of them are parameter-laden and require fine-tuning for different datasets. Besides, training these classifiers is very computationally demanding, which makes them difficult to use in many real-time applications and on previously unseen datasets. In this paper, we propose a novel parameter-light algorithm, MDTW, to classify time series. MDTW has a few parameters, which do not require any fine-tuning and can be chosen arbitrarily because the classification accuracy is largely insensitive to them. MDTW has no training step; thus, it can be directly applied to unseen datasets. MDTW is based on a popular method, namely the nearest neighbor classifier with Dynamic Time Warping (NN-DTW). However, MDTW performs much faster than NN-DTW by representing time series at different resolutions and using a filter-and-refine framework to find the nearest neighbor. The experimental results demonstrate that MDTW performs faster than the state of the art, with small losses (<3%) in average classification accuracy. Besides, we embed a technique, prunedDTW, into the MDTW procedure to make MDTW even faster, and show by experiments that this combination can speed up MDTW by one to five times.
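For orientation, the NN-DTW baseline that the abstract builds on can be sketched as below. This is a minimal version assuming a squared pointwise cost and an optional Sakoe-Chiba band; it does not implement MDTW's multi-resolution representation, its filter-and-refine search, or prunedDTW. The names `dtw_distance`, `nn_dtw_predict`, and `window` are hypothetical and chosen for this example only.

```python
import math


def dtw_distance(a, b, window=None):
    """Dynamic Time Warping distance between two numeric sequences.

    `window` is an optional Sakoe-Chiba band half-width (an assumption of
    this sketch); None means the warping path is unconstrained.
    """
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide or no valid path exists.
    w = max(n, m) if window is None else max(window, abs(n - m))
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Cheapest way to reach cell (i, j): step in a, step in b, or both.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def nn_dtw_predict(train, query, window=None):
    """1-nearest-neighbor label for `query`; `train` is a list of (series, label)."""
    best_label, best_dist = None, math.inf
    for series, label in train:
        d = dtw_distance(series, query, window)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


train = [([0, 0, 1, 0, 0], "spike"), ([0, 1, 2, 3, 4], "ramp")]
print(nn_dtw_predict(train, [0, 0, 0, 1, 0]))
```

Each query is compared against every training series with a quadratic-time distance, which is the cost that motivates faster variants such as the one the abstract proposes.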
