HIVE-COTE 2.0: a new meta ensemble for time series classification

2021 ◽  
Author(s):  
Matthew Middlehurst ◽  
James Large ◽  
Michael Flynn ◽  
Jason Lines ◽  
Aaron Bostrom ◽  
...  

The Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE) is a heterogeneous meta ensemble for time series classification. HIVE-COTE forms its ensemble from classifiers of multiple domains, including phase-independent shapelets, bag-of-words based dictionaries and phase-dependent intervals. Since it was first proposed in 2016, the algorithm has remained state of the art for accuracy on the UCR time series classification archive. Over time it has been incrementally updated, culminating in its current state, HIVE-COTE 1.0. During this time a number of algorithms have been proposed which match the accuracy of HIVE-COTE. We propose comprehensive changes to the HIVE-COTE algorithm which significantly improve its accuracy and usability, presenting this upgrade as HIVE-COTE 2.0. We introduce two novel classifiers, the Temporal Dictionary Ensemble and Diverse Representation Canonical Interval Forest, which replace existing ensemble members. Additionally, we introduce the Arsenal, an ensemble of ROCKET classifiers, as a new HIVE-COTE 2.0 constituent. We demonstrate that HIVE-COTE 2.0 is significantly more accurate on average than the current state of the art on 112 univariate UCR archive datasets and 26 multivariate UEA archive datasets.

Author(s):  
Elangovan Ramanujam ◽  
S. Padmavathi

Innovations in, and the applicability of, time series data mining techniques have significantly increased researchers' interest in the problem of time series classification. Several algorithms have been proposed for this purpose, categorized under shapelet, interval, motif, and whole-series-based techniques. Among these, the bag-of-words technique, an extended application of the text mining approach, performs well due to its simplicity and effectiveness. To improve the efficiency of the bag-of-words technique, this paper proposes a discriminative supervised weighting scheme to identify the characteristic and representative patterns of a class for efficient classification. The paper uses a modified weight matrix that discriminates between representative and non-representative patterns, which enables interpretability in classification. Experiments have been carried out to compare the performance of the proposed technique with state-of-the-art techniques in terms of accuracy and statistical significance.


Author(s):  
Lars Kegel ◽  
Claudio Hartmann ◽  
Maik Thiele ◽  
Wolfgang Lehner

Processing and analyzing time series datasets has become a central issue in many domains, requiring data management systems to support time series as a native data type. A core access primitive of time series is matching, which requires efficient algorithms on top of appropriate representations such as the symbolic aggregate approximation (SAX), which represents the current state of the art. This technique reduces a time series to a low-dimensional space by segmenting it and discretizing each segment into a small symbolic alphabet. Unfortunately, SAX ignores the deterministic behavior of time series, such as cyclically repeating patterns or a trend component affecting all segments, which may lead to sub-optimal representation accuracy. We therefore introduce a novel season-aware and a novel trend-aware symbolic approximation and demonstrate improved representation accuracy without increasing the memory footprint. Most importantly, our techniques also enable more efficient time series matching, providing a match up to three orders of magnitude faster than SAX.
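The segment-then-discretize idea behind plain SAX can be sketched as follows. This is a minimal illustration of standard SAX only, not the season-aware or trend-aware variants the abstract proposes. The function name `sax_word`, its parameters, and the assumption that the series length is divisible by the number of segments are all choices made for this example.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax_word(series, n_segments, alphabet_size):
    """Return a SAX word for `series` (hypothetical helper, plain SAX only).

    Assumes len(series) is divisible by n_segments and alphabet_size <= 26.
    """
    mu, sigma = mean(series), pstdev(series)
    # z-normalise so Gaussian breakpoints apply; a constant series maps to zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # Piecewise aggregate approximation: the mean of each equal-length segment
    seg_len = len(z) // n_segments
    paa = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]

    # Breakpoints that split the standard normal into equiprobable regions
    breakpoints = [NormalDist().inv_cdf(k / alphabet_size)
                   for k in range(1, alphabet_size)]

    # Map each segment mean to the letter of the region it falls in
    return "".join(chr(ord("a") + bisect_right(breakpoints, v)) for v in paa)


word = sax_word([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4, alphabet_size=4)
```

For the steadily rising series above, the four segment means fall into four successive regions, so the word uses each letter once in ascending order. A series with a strong trend or seasonal cycle spends its small alphabet encoding that deterministic component, which is the weakness the abstract points to.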


2021 ◽  
Vol 13 (22) ◽  
pp. 4599
Author(s):  
Félix Quinton ◽  
Loic Landrieu

While annual crop rotations play a crucial role for agricultural optimization, they have been largely ignored for automated crop type mapping. In this paper, we take advantage of the increasing quantity of annotated satellite data to propose to model simultaneously the inter- and intra-annual agricultural dynamics of yearly parcel classification with a deep learning approach. Along with simple training adjustments, our model provides an improvement of over 6.3% mIoU over the current state-of-the-art of crop classification, and a reduction of over 21% of the error rate. Furthermore, we release the first large-scale multi-year agricultural dataset with over 300,000 annotated parcels.


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5829 ◽  
Author(s):  
Jen-Wei Huang ◽  
Meng-Xun Zhong ◽  
Bijay Prasad Jaysawal

Outlier detection in data streams is crucial to successful data mining. However, this task is made increasingly difficult by the enormous growth in the quantity of data generated by the expansion of the Internet of Things (IoT). Recent advances in outlier detection based on density-based local outlier factor (LOF) algorithms do not consider variations in data that change over time. For example, a new cluster of data points may appear in the data stream over time. Therefore, we present a novel algorithm for streaming data, referred to as time-aware density-based incremental local outlier detection (TADILOF), to overcome this issue. In addition, we have developed a means for estimating the LOF score, termed "approximate LOF," based on historical information following the removal of outdated data. The results of experiments demonstrate that TADILOF outperforms current state-of-the-art methods in terms of AUC while achieving similar performance in terms of execution time. Moreover, we present an application of the proposed scheme to the development of an air-quality monitoring system.


Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3137
Author(s):  
Kevin Fauvel ◽  
Tao Lin ◽  
Véronique Masson ◽  
Élisa Fromont ◽  
Alexandre Termier

Multivariate Time Series (MTS) classification has gained importance over the past decade with the increase in the number of temporal datasets in multiple domains. The current state-of-the-art MTS classifier is a heavyweight deep learning approach, which outperforms the second-best MTS classifier only on large datasets. Moreover, this deep learning approach cannot provide faithful explanations as it relies on post hoc model-agnostic explainability methods, which could prevent its use in numerous applications. In this paper, we present XCM, an eXplainable Convolutional neural network for MTS classification. XCM is a new compact convolutional neural network which extracts information relative to the observed variables and time directly from the input data. Thus, XCM architecture enables a good generalization ability on both large and small datasets, while allowing the full exploitation of a faithful post hoc model-specific explainability method (Gradient-weighted Class Activation Mapping) by precisely identifying the observed variables and timestamps of the input data that are important for predictions. We first show that XCM outperforms the state-of-the-art MTS classifiers on both the large and small public UEA datasets. Then, we illustrate how XCM reconciles performance and explainability on a synthetic dataset and show that XCM enables a more precise identification of the regions of the input data that are important for predictions compared to the current deep learning MTS classifier also providing faithful explainability. Finally, we present how XCM can outperform the current most accurate state-of-the-art algorithm on a real-world application while enhancing explainability by providing faithful and more informative explanations.


Author(s):  
Benjamin Börschinger ◽  
Mark Johnson

Stress has long been established as a major cue in word segmentation for English infants. We show that enabling a current state-of-the-art Bayesian word segmentation model to take advantage of stress cues noticeably improves its performance. We find that the improvements range from 10 to 4%, depending on both the use of phonotactic cues and, to a lesser extent, the amount of evidence available to the learner. We also find that in particular early on, stress cues are much more useful for our model than phonotactic cues by themselves, consistent with the finding that children do seem to use stress cues before they use phonotactic cues. Finally, we study how the model’s knowledge about stress patterns evolves over time. We not only find that our model correctly acquires the most frequent patterns relatively quickly but also that the Unique Stress Constraint that is at the heart of a previously proposed model does not need to be built in but can be acquired jointly with word segmentation.


2018 ◽  
Author(s):  
Luis Gustavo C. Uzai ◽  
André Y. Kashiwabara

Time series are sequences of values distributed over time. Analyzing time series is important in many areas, including medicine, finance, aerospace, commerce and entertainment. Change Point Detection is the problem of identifying changes in the meaning or distribution of data in a time series. This article presents Spec, a new algorithm that uses the graph spectrum to detect change points. Spec was evaluated using the UCR Archive, which is a large database of different time series. Spec's performance was compared to the PELT, ECP, EDM, and gSeg algorithms. The results showed that Spec achieved better accuracy than the state of the art in some specific scenarios and was as efficient in most of the cases evaluated.


2021 ◽  
pp. 1-14
Author(s):  
Haowen Zhang ◽  
Yabo Dong ◽  
Duanqing Xu

Time series classification is a fundamental problem in the time series mining community. Recently, many sophisticated methods which can produce state-of-the-art classification accuracy on the UCR archive have been proposed. Unfortunately, most of them are parameter-laden methods and require fine-tuning for different datasets. Besides, training these classifiers is very computationally demanding, which makes them difficult to use in many real-time applications and on previously unseen datasets. In this paper, we propose a novel parameter-light algorithm, MDTW, to classify time series. MDTW has a few parameters which do not require any fine-tuning and can be chosen arbitrarily because the classification accuracy is largely insensitive to them. MDTW has no training step; thus, it can be directly applied to unseen datasets. MDTW is based on a popular method, namely the nearest neighbor classifier with Dynamic Time Warping (NN-DTW). However, MDTW performs much faster than NN-DTW by representing time series in different resolutions and using a filter-and-refine framework to find the nearest neighbor. The experimental results demonstrate that MDTW performs faster than the state of the art, with small losses (<3%) in average classification accuracy. Besides, we embed a technique, prunedDTW, into the MDTW procedure to make MDTW even faster, and show by experiments that this combination can speed up MDTW by a factor of one to five.
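For orientation, the NN-DTW baseline that MDTW builds on can be sketched as below. This is the unconstrained textbook form of dynamic time warping with a 1-nearest-neighbor rule, not MDTW itself: it has no multi-resolution representation, no filter-and-refine step, and no prunedDTW. The names `dtw_distance` and `nn_dtw_classify` and the toy training data are assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Unconstrained DTW distance between two numeric sequences (O(len(a)*len(b)))."""
    n, m = len(a), len(b)
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j]
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed warping steps
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def nn_dtw_classify(query, train):
    """Return the label of the training series closest to `query` under DTW.

    `train` is a list of (series, label) pairs; there is no training step.
    """
    _, label = min(train, key=lambda pair: dtw_distance(query, pair[0]))
    return label


# Toy labelled examples, invented for illustration
train = [([0, 0, 1, 1], "step"), ([1, 0, 1, 0], "zigzag")]
predicted = nn_dtw_classify([0, 0, 0, 1, 1], train)
```

Every query is compared against every training series with a quadratic-time distance, which is the cost the abstract says MDTW reduces.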

