Anomaly Detection of Pedestrian Flow: A Machine Learning Method for Monitoring-Data of Visitors to a Building

Collective Dynamics ◽

10.17815/cd.2020.31 ◽

2020 ◽

Vol 5 ◽

Author(s):

Kentaro Kumagai

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Time Series Data ◽

Emergency Evacuation ◽

Integrated System ◽

Series Data ◽

Machine Learning Method ◽

Pedestrian Flow ◽

Infrared Sensors ◽

Public Facilities

Many public facilities, such as community halls and gymnasiums, are expected to serve as evacuation sites when disasters occur. To manage such facilities, it is necessary to monitor their usage and to respond immediately when an anomaly occurs. This study proposes an integrated system of IoT sensors and machine learning for detecting anomalies in pedestrian flow in buildings that are expected to be used as emergency evacuation sites in the event of a disaster. To trial the system, infrared sensors were installed in a university research building, and data on visitors to the fourth floor were collected as time-series data of pedestrian flow. The results showed that anomalies in pedestrian flow with an occurrence probability of 5% or less, at an arbitrary time of day, can be detected properly using the collected data.
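
The "occurrence probability of 5% or less" criterion can be illustrated with a minimal sketch. This is not the author's method, which the abstract does not specify; it assumes visitor counts are stored per day and per time slot, and it flags a count that falls outside the central 95% of the historical counts for that slot. The function names, the array layout, and the synthetic Poisson data are all hypothetical.

```python
import numpy as np

def fit_slot_bounds(counts, alpha=0.05):
    """Estimate per-time-slot bounds from historical counts.

    counts: array of shape (n_days, n_slots), one visitor count per slot.
    Returns the alpha/2 and 1 - alpha/2 empirical quantiles for each slot.
    """
    counts = np.asarray(counts, dtype=float)
    lower = np.quantile(counts, alpha / 2, axis=0)
    upper = np.quantile(counts, 1 - alpha / 2, axis=0)
    return lower, upper

def is_anomalous(observed, lower, upper):
    """Flag slots whose count lies outside the fitted bounds."""
    observed = np.asarray(observed, dtype=float)
    return (observed < lower) | (observed > upper)

# Synthetic history (assumption): 200 days, 4 time slots with different means.
rng = np.random.default_rng(0)
history = rng.poisson(lam=[2, 10, 30, 12], size=(200, 4))
lower, upper = fit_slot_bounds(history)

# A day with an unusually crowded third slot.
today = np.array([2, 11, 90, 12])
flags = is_anomalous(today, lower, upper)
```

Fitting separate bounds per time slot lets the same count be normal at midday and anomalous at night, which is the sense of "at an arbitrary time of day" in the abstract.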

Smart Helmet: Thresh-Learner–Online Machine Learning on Data Streams

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a1085.1291s319 ◽

2019 ◽

Vol 9 (1S3) ◽

pp. 466-473

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Time Series Data ◽

Real Life ◽

General Purpose ◽

Streaming Data ◽

Series Data ◽

Dynamic Threshold ◽

Wide Range ◽

Set Up

Today, with the enormous generation and availability of time-series and streaming data, there is an increasing need for an automated analysis architecture that delivers fast interpretations and results. One significant potential of streaming analytics is to train and model each stream with unsupervised Machine Learning (ML) algorithms to detect anomalous behaviors, fuzzy patterns, and accidents in real time. If executed reliably, such anomaly detection can be highly valuable for the application. In this paper, we propose a dynamic threshold-setting system, denoted Thresh-Learner, mainly for Internet of Things (IoT) applications that require anomaly detection. The proposed model enables a wide range of real-life applications in which a dynamic threshold must be set over streaming data to avoid anomalies and accidents or to send alerts to distant monitoring stations. As a case study, we take the major problem of anomalies and accidents in coal mines caused by coal fires and explosions, which result in loss of life because automated alarm systems are lacking. Thresh-Learner is a general-purpose implementation for setting dynamic thresholds. We illustrate it through the Smart Helmet for coal mine workers, which seamlessly integrates monitoring, analysis, and dynamic thresholds using IoT and cloud-based analysis.
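
To make the idea of a dynamic threshold over a data stream concrete, here is a minimal sketch. It is not the Thresh-Learner algorithm, whose details the abstract does not give; it assumes a simple rule in which the threshold is the rolling mean plus `k` standard deviations of recent non-alerting readings. The class name `RollingThreshold` and its parameters are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev

class RollingThreshold:
    """Dynamic upper threshold: rolling mean + k * rolling std (assumed rule)."""

    def __init__(self, window=30, k=3.0, min_samples=5):
        self.values = deque(maxlen=window)  # recent non-alerting readings
        self.k = k
        self.min_samples = min_samples

    def update(self, x):
        """Return (is_alert, threshold) for reading x, then update the window."""
        if len(self.values) < self.min_samples:
            # Warm-up: not enough history to form a threshold yet.
            self.values.append(x)
            return False, None
        threshold = mean(self.values) + self.k * pstdev(self.values)
        alert = x > threshold
        if not alert:
            # Keep alerting readings out of the window so a spike
            # does not inflate the threshold for later readings.
            self.values.append(x)
        return alert, threshold

# Hypothetical gas-sensor readings with one spike.
detector = RollingThreshold(window=10, k=3.0)
readings = [20.0, 20.5, 19.8, 20.2, 20.1, 20.3, 19.9, 45.0, 20.0]
alerts = [detector.update(r)[0] for r in readings]
```

Excluding alerting readings from the window keeps the threshold stable during a burst, at the cost of adapting slowly if the normal level genuinely shifts; a production system would need a policy for that case.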

Time Series Data Analysis and Fault Diagnosis of Plant Process Equipment Using Statistical Machine Learning Method

Korean Journal of Computational Design and Engineering ◽

10.7315/cde.2018.193 ◽

2018 ◽

Vol 23 (3) ◽

pp. 193-201

Author(s):

Se-Yun Hwang ◽

Jeeyeon Heo ◽

Kyu-Tack Hong ◽

Jang-Hyun Lee

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Data ◽

Series Data ◽

Process Equipment ◽

Machine Learning Method ◽

Learning Method ◽

Statistical Machine Learning ◽

Time Series Data Analysis ◽

Plant Process

Self-Diagnosis of Multiphase Flow Meters through Machine Learning-Based Anomaly Detection

Energies ◽

10.3390/en13123136 ◽

2020 ◽

Vol 13 (12) ◽

pp. 3136

Author(s):

Tommaso Barbariol ◽

Enrico Feltresi ◽

Gian Antonio Susto

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Multiphase Flow ◽

Time Series Data ◽

Synthetic Data ◽

Machine Learning Algorithms ◽

Gas Content ◽

Series Data ◽

Oil Well ◽

Flow Meters

Measuring systems are becoming increasingly sophisticated in order to tackle the challenges of modern industrial problems. In particular, the Multiphase Flow Meter (MPFM) combines different sensors and data fusion techniques to estimate quantities that are difficult to measure, such as the water or gas content of a multiphase flow coming from an oil well. Evaluating the flow composition is essential for predicting and managing well productivity, and for this reason quantifying the quality of the meter's measurements is crucial. As instrument complexity increases, demands for confidence levels in the provided measurements are becoming increasingly common. In this work, we propose an Anomaly Detection approach, based on unsupervised Machine Learning algorithms, that enables the metrology system to detect outliers and to provide a statistical level of confidence in the measurements. The proposed approach, called AD4MPFM (Anomaly Detection for Multiphase Flow Meters), is designed for embedded implementation and for multivariate time-series data streams. The approach is validated on both real and synthetic data.

Anomaly Detection with Machine Learning Algorithms and Big Data in Electricity Consumption

Sustainability ◽

10.3390/su131910963 ◽

2021 ◽

Vol 13 (19) ◽

pp. 10963

Author(s):

Simona-Vasilica Oprea ◽

Adela Bâra ◽

Florina Camelia Puican ◽

Ioan Cosmin Radu

Keyword(s):

Machine Learning ◽

Time Series ◽

Anomaly Detection ◽

Time Series Data ◽

Hybrid Approach ◽

Electricity Consumption ◽

Machine Learning Algorithms ◽

Series Data ◽

Smart Meters ◽

Linear Discriminant

When analyzing smart metering data, both reading errors and fraud can be identified. The purpose of this analysis is to alert utility companies to suspicious consumption behavior that could be further investigated with on-site inspections or other methods. The use of Machine Learning (ML) algorithms to analyze consumption readings can lead to the identification of malfunctions, cyberattacks interrupting measurements, or physical tampering with smart meters. Fraud detection is one of the classical anomaly detection examples, as it is not easy to label consumption or transactional data. Furthermore, frauds differ in nature, and learning is not always possible. In this paper, we analyze large datasets of readings provided by smart meters installed in a trial study in Ireland by applying a hybrid approach. More precisely, we propose an unsupervised ML technique to detect anomalous values in the time series, establish a threshold for the percentage of anomalous readings out of the total readings, and then label that time series as suspicious or not. Initially, we propose two types of algorithms for anomaly detection on unlabeled data: Spectral Residual-Convolutional Neural Network (SR-CNN) and an anomaly trained model based on martingales for determining variations in time-series data streams. Then, the Two-Class Boosted Decision Tree and Fisher Linear Discriminant analysis are applied to the previously processed dataset. By training the model, we obtain the required capability of detecting suspicious consumers, as shown by an accuracy of 90%, a precision score of 0.875, and an F1 score of 0.894.
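
The labeling step described above (flag anomalous readings, then mark a series as suspicious when their share exceeds a threshold) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: a robust z-score based on the median and the median absolute deviation stands in for SR-CNN and the martingale model, and the cut-offs `z_cut=3.5` and `max_fraction=0.05` are hypothetical.

```python
import numpy as np

def anomalous_fraction(readings, z_cut=3.5):
    """Share of readings whose robust z-score exceeds z_cut.

    The robust z-score (median / MAD) is a stand-in detector, not SR-CNN.
    """
    x = np.asarray(readings, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    if mad == 0:
        # Simplification: a series with no spread is treated as having
        # no anomalous readings.
        return 0.0
    robust_z = 0.6745 * (x - med) / mad
    return float(np.mean(np.abs(robust_z) > z_cut))

def is_suspicious(readings, max_fraction=0.05):
    """Label a series suspicious if too many of its readings are anomalous."""
    return anomalous_fraction(readings) > max_fraction

# Hypothetical consumption readings (kWh per interval).
normal = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 1.0, 0.9, 1.0]
tampered = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 0.0, 0.0, 1.0]

normal_label = is_suspicious(normal)
tampered_label = is_suspicious(tampered)
```

In the tampered series, the two zero readings are the only ones flagged, so the anomalous share is 20%, above the assumed 5% cut-off.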

Detecting Interesting and Anomalous Patterns In Multivariate Time-Series Data in an Offshore Platform Using Unsupervised Learning

10.4043/31297-ms ◽

2021 ◽

Author(s):

Ilan Sousa Figueirêdo ◽

Tássio Farias Carvalho ◽

Wenisten José Dantas Silva ◽

Lílian Lefol Nani Guarieiro ◽

Erick Giovani Sperandio Nascimento

Keyword(s):

Machine Learning ◽

Time Series ◽

Anomaly Detection ◽

Unsupervised Learning ◽

Time Series Data ◽

Multivariate Time Series ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Series Data ◽

Unsupervised Machine Learning

Detection of anomalous events in the practical operation of oil and gas (O&G) wells and lines can help to avoid production losses, environmental disasters, and human fatalities, besides decreasing maintenance costs. Supervised machine learning algorithms have been successful in detecting, diagnosing, and forecasting anomalous events in the O&G industry. Nevertheless, these algorithms need a large quantity of annotated data, and labelling data in real-world scenarios is typically unfeasible because of the exhaustive work required of experts. Therefore, as unsupervised machine learning does not require an annotated dataset, this paper performs a comparative performance evaluation of unsupervised learning algorithms to support experts in anomaly detection and pattern recognition in multivariate time-series data. The goal is to allow experts to analyze and label a small set of patterns instead of analyzing large datasets. This paper used the public 3W database of three offshore naturally flowing wells. The experiment used real O&G production data from underground reservoirs with the following anomalous events: (i) spurious closure of the Downhole Safety Valve (DHSV) and (ii) quick restriction in the Production Choke (PCK). Six unsupervised machine learning algorithms were assessed: Cluster-based Algorithm for Anomaly Detection in Time Series Using Mahalanobis Distance (C-AMDATS), Luminol Bitmap, SAX-REPEAT, k-NN, Bootstrap, and Robust Random Cut Forest (RRCF). The algorithms were compared using a set of metrics: accuracy (ACC), precision (PR), recall (REC), specificity (SP), F1-Score (F1), Area Under the Receiver Operating Characteristic Curve (AUC-ROC), and Area Under the Precision-Recall Curve (AUC-PRC). The experiments used the data labels only for assessment purposes. The results revealed that unsupervised learning successfully detected the patterns of interest in multivariate data without prior annotation, with emphasis on the C-AMDATS algorithm. Thus, unsupervised learning can leverage supervised models through the support given to data annotation.

Towards Machine Learning-based Anomaly Detection on Time-Series Data

Infocommunications journal ◽

10.36244/icj.2021.1.5 ◽

2021 ◽

Vol 13 (1) ◽

pp. 35-44

Author(s):

Daniel Vajda ◽

Adrian Pekar ◽

Karoly Farkas

Keyword(s):

Machine Learning ◽

Time Series ◽

Anomaly Detection ◽

Time Series Data ◽

Short Term Memory ◽

Learning Algorithm ◽

State Of The Art ◽

Detection Methods ◽

Series Data ◽

Rich Information

The complexity of network infrastructures is growing exponentially. Real-time monitoring of these infrastructures is essential to secure their reliable operation. The concept of telemetry has been introduced in recent years to foster this process by streaming time-series data that contain feature-rich information concerning the state of network components. In this paper, we focus on a particular application of telemetry: anomaly detection on time-series data. We rigorously examined state-of-the-art anomaly detection methods. Upon close inspection, we observed that none of them suits our requirements, as they typically face several limitations when applied to time-series data. This paper presents Alter-Re2, an improved version of ReRe, a state-of-the-art Long Short-Term Memory-based machine learning algorithm. Through a systematic examination, we demonstrate that introducing the concepts of ageing and a sliding window overcomes the major limitations of ReRe. We assessed the efficacy of Alter-Re2 using ten different datasets and achieved promising results. Alter-Re2 performs three times better on average than ReRe.

A machine learning method to determine intrinsic dimension of time series data

2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP) ◽

10.1109/globalsip.2017.8308653 ◽

2017 ◽

Author(s):

Claudio Turchetti ◽

Laura Falaschetti

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Data ◽

Series Data ◽

Machine Learning Method ◽

Learning Method ◽

Intrinsic Dimension

Anomaly Detection of Big Time Series Data Using Machine Learning

Journal of Society of Korea Industrial and Systems Engineering ◽

10.11627/jkise.2020.43.2.033 ◽

2020 ◽

Vol 43 (2) ◽

pp. 33-38

Author(s):

Sehyug Kwon

Keyword(s):

Machine Learning ◽

Time Series ◽

Anomaly Detection ◽

Time Series Data ◽

Series Data

Imputation by feature importance (IBFI): A methodology to envelop machine learning method for imputing missing patterns in time series data

PLoS ONE ◽

10.1371/journal.pone.0262131 ◽

2022 ◽

Vol 17 (1) ◽

pp. e0262131

Author(s):

Adil Aslam Mir ◽

Kimberlee Jane Kearfott ◽

Fatih Vehbi Çelebi ◽

Muhammad Rafique

Keyword(s):

Machine Learning ◽

Time Series Data ◽

Mean Squared Error ◽

Learning Algorithm ◽

Series Data ◽

Machine Learning Method ◽

Learning Method ◽

Imputation Methods ◽

Squared Error ◽

Feature Importance

A new methodology, imputation by feature importance (IBFI), is studied that can be applied to any machine learning method to efficiently fill in any missing or irregularly sampled data. It applies to data missing completely at random (MCAR), missing not at random (MNAR), and missing at random (MAR). IBFI utilizes feature importance and iteratively imputes missing values using any base learning algorithm. For this work, IBFI is tested on soil radon gas concentration (SRGC) data. XGBoost is used as the learning algorithm, and missing data are simulated using R for different missingness scenarios. IBFI is based on the physically meaningful assumption that SRGC depends upon environmental parameters such as temperature and relative humidity. This assumption leads to a model obtained from the complete multivariate series where the controls are available by taking the attribute of interest as a response variable. IBFI is tested against other frequently used imputation methods, namely mean, median, mode, predictive mean matching (PMM), and hot-deck procedures. The performance of the different imputation methods was assessed using root mean squared error (RMSE), mean squared log error (MSLE), mean absolute percentage error (MAPE), percent bias (PB), and mean squared error (MSE) statistics. The imputation process requires more attention when multiple variables are missing in different samples, which poses challenges for machine learning methods because some controls are missing. IBFI appears to have an advantage in such circumstances. For testing IBFI, Radon Time Series Data (RTS) were used; the data were collected from 1 March 2017 to 11 May 2018, a period that included four seismic activities.
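
The five error statistics named above can be computed as in the following sketch. The formulas are the common textbook definitions and are an assumption here, since the abstract does not state the exact forms used; in particular, percent bias is taken as 100 times the summed error divided by the summed true values, and MSLE uses log(1 + x). The function name and the sample values are hypothetical.

```python
import numpy as np

def imputation_errors(true, imputed):
    """Compare imputed values against held-out true values.

    Assumes strictly positive true values (needed for MAPE and MSLE).
    """
    y = np.asarray(true, dtype=float)
    yhat = np.asarray(imputed, dtype=float)
    err = yhat - y
    mse = float(np.mean(err ** 2))
    return {
        "MSE": mse,
        "RMSE": mse ** 0.5,
        "MSLE": float(np.mean((np.log1p(yhat) - np.log1p(y)) ** 2)),
        "MAPE": float(100 * np.mean(np.abs(err / y))),
        "PB": float(100 * err.sum() / y.sum()),
    }

# Hypothetical held-out values and their imputations.
scores = imputation_errors([10, 20, 30, 40], [12, 18, 33, 40])
```

For these sample values the squared errors are 4, 4, 9, and 0, so the MSE is 4.25; the errors sum to 3 against a true total of 100, so the percent bias is 3%.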
