Mining Delay in Streaming Time Series of Industrial Process

Author(s):  
Haijie Gu ◽  
Gang Rong
2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Sergio Martin del Campo Barraza ◽  
William Lindskog ◽  
Davide Badalotti ◽  
Oskar Liew ◽  
Arash Toyser

Data-based models built with machine learning are becoming more prominent in the condition monitoring, maintenance, and prognostics fields. The capacity to build these models depends largely on the quality of the data. Of particular importance is the availability of labelled data, which describes the conditions that are intended to be identified. However, properly labelled data that is useful in many machine learning strategies is a scarce resource. Furthermore, producing high-quality labelled data is expensive, time-consuming, and often inaccurate given the uncertainty surrounding the labelling process and the annotators. Active Learning (AL) has emerged as a semi-supervised approach that reduces the cost and time of the labelling process. Its adoption for time series classification has been delayed by the difficulty of extracting and presenting the time series information in a way that is easy to understand for the human annotator who assigns the labels. This difficulty arises from the high dimensionality of many of these time series. The challenge is exacerbated by the cold-start problem, where the initial labelled dataset used in typical AL frameworks may not exist, so no initial set of labels is available for the time series samples. This last challenge is particularly common in condition monitoring applications, where data samples of specific faults or problems do not exist. In this article, we present an AL framework for the classification of time series from industrial process data, in particular vibration waveforms originating from condition monitoring applications. In this framework, we deal with the absence of labels to train an initial classification model by introducing a pre-clustering step.
This step uses an unsupervised clustering algorithm to identify the number of labels and selects the points with the strongest group membership as the initial samples to be labelled in the active learning step. Furthermore, the framework offers two ways of presenting the information to the annotator: time-series imaging and automatic extraction of statistical features. Our work is motivated by the interest in reducing the effort required to label time-series waveforms while maintaining a high level of accuracy and consistency in those labels. In addition, we study the number of time-series samples that need to be labelled to achieve different levels of classification accuracy, as well as their confidence intervals. These experiments are carried out using vibration signals from a well-known rolling element bearing dataset and typical process data from a production plant. An active learning framework that considers the conditions of the data commonly found in maintenance and condition monitoring applications, while presenting the data in ways that are easy for human annotators to interpret, can facilitate the generation of reliable datasets. These datasets can, in turn, assist in the development of data-driven models that describe the many different processes that a machine undergoes.
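
The pre-clustering idea for the cold-start problem can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, how the number of clusters is determined, or how group membership is scored, so everything below is an assumption made for illustration: plain k-means with a fixed `k`, distance to the centroid as the membership score, and the hypothetical helper names `kmeans` and `initial_queries`. The input is a list of per-waveform feature tuples (for example, extracted statistical features).

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with a deterministic farthest-point start (assumed choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    def assign(cents):
        # Index of the nearest centroid for every point.
        return [min(range(k), key=lambda j: math.dist(p, cents[j])) for p in points]

    for _ in range(iters):
        labels = assign(centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(centroids)

def initial_queries(points, k, per_cluster=1):
    """Indices of the samples closest to each centroid, to be labelled first."""
    centroids, labels = kmeans(points, k)
    queries = []
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        members.sort(key=lambda i: math.dist(points[i], centroids[j]))
        queries.extend(members[:per_cluster])
    return sorted(queries)

# Toy feature vectors standing in for two machine conditions (synthetic data).
features = [(0.0, 0.0), (0.2, 0.0), (0.1, 0.1), (5.0, 5.0), (5.2, 5.0), (5.1, 5.1)]
print(initial_queries(features, k=2))
```

The returned indices are the samples an annotator would be asked to label before any classifier exists; the regular active learning loop would then take over from that seed set.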


2020 ◽  
Vol 10 (18) ◽  
pp. 6346
Author(s):  
Amaia Gil ◽  
Marco Quartulli ◽  
Igor G. Olaizola ◽  
Basilio Sierra

In industrial applications of data science and machine learning, most of the steps of a typical pipeline focus on optimizing measures of model fitness to the available data. Data preprocessing, by contrast, is often ad hoc and not based on the optimization of quantitative measures. This paper proposes the use of optimization in the preprocessing step, specifically studying a time series joining methodology, and introduces an error function to measure the adequacy of the joining. Experiments show how the method allows preprocessing errors to be monitored for different time slices, indicating when a retraining of the preprocessing may be needed. This contribution thus helps quantify the implications of data preprocessing for the results of data analysis and machine learning methods. The methodology is applied to two case studies: synthetic simulation data with controlled distortions, and a real scenario of an industrial process.
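
To make the notion of an optimized join concrete, here is a small sketch under stated assumptions. The abstract does not define the joining method or its error function, so the nearest-timestamp join, the mean absolute timestamp gap used as the error, the grid search over a clock offset, and the names `nearest_join`, `join_error`, and `best_offset` are all hypothetical choices for illustration only.

```python
from bisect import bisect_left

def nearest_join(left_times, right_times, offset=0.0):
    """Pair each left timestamp with the nearest right timestamp after shifting.

    Assumes right_times is sorted and non-empty.
    """
    shifted = [t + offset for t in right_times]
    pairs = []
    for t in left_times:
        i = bisect_left(shifted, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(shifted)]
        j = min(candidates, key=lambda c: abs(shifted[c] - t))
        pairs.append((t, shifted[j]))
    return pairs

def join_error(left_times, right_times, offset=0.0):
    """Assumed error function: mean absolute timestamp gap of the joined pairs."""
    pairs = nearest_join(left_times, right_times, offset)
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def best_offset(left_times, right_times, candidates):
    """Grid search for the offset that minimises the join error."""
    return min(candidates, key=lambda o: join_error(left_times, right_times, o))

# Synthetic example: the second sensor's clock runs 3 s ahead of the first.
sensor_a = [0.0, 10.0, 20.0, 30.0]
sensor_b = [3.0, 13.0, 23.0, 33.0]
offset = best_offset(sensor_a, sensor_b, range(-5, 6))
print(offset, join_error(sensor_a, sensor_b, offset))
```

Evaluating `join_error` separately on successive time slices would give the kind of monitoring signal the abstract describes: a rising error suggests that the fitted offset no longer matches the data and the preprocessing should be refitted.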


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7273
Author(s):  
Julien Polge ◽  
Jérémy Robert ◽  
Yves Le Traon

With the Industry 4.0 paradigm comes the convergence of Internet Technologies and Operational Technologies, along with concepts such as the Industrial Internet of Things (IIoT), cloud manufacturing, and Cyber-Physical Systems (CPS). These concepts bring industries into the big data era and give them access to potentially useful information for optimising the Overall Equipment Effectiveness (OEE). However, most European industries still rely on the Computer-Integrated Manufacturing (CIM) model, where the production systems run as independent systems (i.e., without any communication with the upper levels). Those production systems are controlled by a Programmable Logic Controller, in which a static and rigid program is implemented, in the sense that the programmed routines cannot evolve over time unless a human modifies them. To go further in terms of flexibility, we are convinced that it is necessary to move away from this old-fashioned, rigid automation to an ML-based automation, i.e., one where the control itself is based on decisions taken by ML algorithms. In order to verify this, we applied a time series classification method on a scale model of a factory using real industrial controllers, and widened the variety of parts the production line has to treat. This study shows that satisfactory results can be obtained only at the expense of the human expertise (i.e., in the industrial process and in the ML process).


1994 ◽  
Vol 144 ◽  
pp. 279-282
Author(s):  
A. Antalová

The occurrence of LDE-type flares in the last three cycles has been investigated. The Fourier analysis spectrum was calculated for the time series of the LDE-type flare occurrence during the 20th, the 21st and the rising part of the 22nd cycle. LDE-type flares (Long Duration Events in SXR) are associated with interplanetary protons (SEP and STIP as well), energized coronal arches and radio type IV emission. Generally, in all the cycles considered, LDE-type flares mainly originated during a 6-year interval of the respective cycle (2 years before and 4 years after the sunspot cycle maximum). The following significant periodicities were found:
• in the 20th cycle: 1.4, 2.1, 2.9, 4.0, 10.7 and 54.2 months,
• in the 21st cycle: 1.2, 1.6, 2.8, 4.9, 7.8 and 44.5 months,
• in the 22nd cycle, until March 1992: 1.4, 1.8, 2.4, 7.2, 8.7, 11.8 and 29.1 months,
• over the whole interval (1969-1992):
  a) the longer periodicities: 232.1, 121.1 (the dominant one, corresponding to 10.1 years), 80.7, 61.9 and 25.6 months,
  b) the shorter periodicities: 4.7, 5.0, 6.8, 7.9, 9.1, 15.8 and 20.4 months.
Fourier analysis of the LDE-type flare index (FI) yields significant peaks at 2.3 - 2.9 months and 4.2 - 4.9 months. These short periodicities agree remarkably well across all of the last three solar cycles, whereas the longer periodicities differ from cycle to cycle.
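
As an illustration of how periodicities can be read off a Fourier spectrum of a monthly series, the sketch below computes a plain discrete Fourier periodogram. The abstract does not describe the exact spectral method or its significance test, so this is an assumed, simplified version; the function names `periodogram` and `dominant_period` are hypothetical, and the input is a synthetic sinusoid, not the flare data.

```python
import cmath
import math

def periodogram(series):
    """Return (period, power) pairs for a mean-removed, evenly sampled series.

    Periods are in sampling intervals (months for a monthly series).
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    spectrum = []
    for k in range(1, n // 2 + 1):
        # Discrete Fourier coefficient at frequency k / n cycles per sample.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered)
        )
        spectrum.append((n / k, abs(coeff) ** 2 / n))
    return spectrum

def dominant_period(series):
    """Period with the highest spectral power."""
    return max(periodogram(series), key=lambda pair: pair[1])[0]

# Synthetic monthly series: ten years of data with a 12-month cycle.
monthly = [math.sin(2 * math.pi * t / 12) for t in range(120)]
print(dominant_period(monthly))
```

With a series of length N, only periods of the form N/k can be resolved, which is one reason the longest reported periods depend strongly on the length of the interval analysed.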

