High Stability Anomaly Detection in Random Environments

Author(s):  
Masaru Ide

We propose an anomaly detection method for refining the input data of predictive machine learning systems. If outliers such as spike noise are mixed into the training data, the quality of the trained model deteriorates. Removing such outliers is therefore expected to improve the service quality of machine learning systems such as autonomous vehicles and ship navigation. Conventional anomaly detection methods generally require the support of domain experts and do not cope well with unstable, random environments. We propose a new anomaly detection method that is highly stable and can handle random environments without expert support. The proposed method focuses on the pairwise correlation between two input time series: their change rates are calculated and summarized on a quadrant chart for further analysis. An experiment on an open time-series dataset shows that the proposed method successfully detects anomalies and that the detected data points can be illustrated in a human-interpretable way.
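
As a rough sketch of the pairwise change-rate idea described in this abstract, the code below computes the change rates of two time series, assigns each pair to a quadrant by sign, and flags pairs with an unusually large change-rate magnitude. The helper names and the z-score flagging rule are illustrative assumptions, not the authors' method.

```python
import numpy as np

def change_rates(x):
    """Relative change rate between consecutive samples of a 1-D series."""
    x = np.asarray(x, dtype=float)
    return np.diff(x) / (np.abs(x[:-1]) + 1e-9)

def quadrant_summary(series_a, series_b, threshold=3.0):
    """Pair the change rates of two series, assign each pair to a quadrant,
    and flag points whose change-rate magnitude is unusually large.
    The z-score-based flagging rule is an illustrative assumption."""
    ra, rb = change_rates(series_a), change_rates(series_b)
    quadrants = np.where(ra >= 0, np.where(rb >= 0, 1, 4),
                                  np.where(rb >= 0, 2, 3))
    # Flag pairs far from the bulk of the change-rate distribution.
    mag = np.hypot(ra, rb)
    z = (mag - mag.mean()) / (mag.std() + 1e-9)
    anomalies = np.where(z > threshold)[0] + 1  # +1: diff shortens the series
    return quadrants, anomalies

# Example: two correlated series with a spike injected into one of them.
t = np.linspace(0, 10, 500)
a = np.sin(t) + 0.05 * np.random.randn(t.size)
b = np.sin(t + 0.1) + 0.05 * np.random.randn(t.size)
a[250] += 5.0  # spike noise
_, anomalous_points = quadrant_summary(a, b)
print(anomalous_points)
```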

2021 ◽  
Vol 13 (1) ◽  
pp. 35-44
Author(s):  
Daniel Vajda ◽  
Adrian Pekar ◽  
Karoly Farkas

The complexity of network infrastructures is growing exponentially. Real-time monitoring of these infrastructures is essential to secure their reliable operation. The concept of telemetry has been introduced in recent years to foster this process by streaming time-series data that contain feature-rich information concerning the state of network components. In this paper, we focus on a particular application of telemetry: anomaly detection on time-series data. We rigorously examined state-of-the-art anomaly detection methods. Upon close inspection, we observed that none of them suits our requirements, as they typically face several limitations when applied to time-series data. This paper presents Alter-Re2, an improved version of ReRe, a state-of-the-art Long Short-Term Memory-based machine learning algorithm. Through a systematic examination, we demonstrate that by introducing the concepts of ageing and a sliding window, the major limitations of ReRe can be overcome. We assessed the efficacy of Alter-Re2 using ten different datasets and achieved promising results: Alter-Re2 performs three times better on average than ReRe.
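
As a rough illustration of the ageing and sliding-window concepts mentioned above (not the actual Alter-Re2 or ReRe implementation), the sketch below keeps a fixed-size window over a stream, decays the weight of older samples, and flags values far from the weighted mean. The class name, decay factor, and thresholding rule are all assumptions.

```python
from collections import deque

class AgeingSlidingWindow:
    """Fixed-size window over a stream in which older observations are aged
    out. This only illustrates the windowing/ageing concept referred to in the
    abstract, not the Alter-Re2 algorithm itself."""

    def __init__(self, size=200, decay=0.99):
        self.size = size
        self.decay = decay          # multiplicative ageing of stored weights
        self.values = deque(maxlen=size)
        self.weights = deque(maxlen=size)

    def update(self, x):
        # Age existing samples, then append the newest with full weight.
        self.weights = deque((w * self.decay for w in self.weights),
                             maxlen=self.size)
        self.values.append(x)
        self.weights.append(1.0)

    def weighted_mean(self):
        total = sum(self.weights)
        return sum(v * w for v, w in zip(self.values, self.weights)) / total

    def is_anomalous(self, x, k=4.0):
        """Flag x if it deviates from the weighted mean by more than k
        weighted standard deviations (an illustrative thresholding rule)."""
        if len(self.values) < 10:
            return False
        mean = self.weighted_mean()
        var = sum(w * (v - mean) ** 2 for v, w in zip(self.values, self.weights))
        var /= sum(self.weights)
        return abs(x - mean) > k * (var ** 0.5 + 1e-9)
```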


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4805
Author(s):  
Saad Abbasi ◽  
Mahmoud Famouri ◽  
Mohammad Javad Shafiee ◽  
Alexander Wong

Human operators often diagnose industrial machinery via anomalous sounds. Given recent advances in the field of machine learning, automated acoustic anomaly detection can lead to reliable maintenance of machinery. However, deep learning-driven anomaly detection methods often require an extensive amount of computational resources, prohibiting their deployment in factories. Here we explore a machine-driven design exploration strategy to create OutlierNets, a family of highly compact deep convolutional autoencoder network architectures featuring as few as 686 parameters, model sizes as small as 2.7 KB, and as few as 2.8 million FLOPs, with detection accuracy matching or exceeding published architectures with as many as 4 million parameters. The architectures were deployed on an Intel Core i5 as well as an ARM Cortex A72 to assess performance on hardware that is likely to be used in industry. Experimental latency results show that the OutlierNet architectures can achieve as much as 30x lower latency than published networks.
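
The following is a minimal sketch of a compact convolutional autoencoder for spectrogram patches in the spirit of the OutlierNets described above; the layer sizes, patch shape, and reconstruction-error score are illustrative assumptions and do not reproduce the published architectures or their parameter counts. PyTorch is assumed.

```python
import torch
import torch.nn as nn

class TinyAudioAutoencoder(nn.Module):
    """A deliberately small convolutional autoencoder over mel-spectrogram
    patches; layer sizes are illustrative, not the OutlierNet designs."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 4, kernel_size=3, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(),
            nn.Conv2d(4, 8, kernel_size=3, stride=2, padding=1),  # 16x16 -> 8x8
            nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(8, 4, kernel_size=2, stride=2),    # 8x8 -> 16x16
            nn.ReLU(),
            nn.ConvTranspose2d(4, 1, kernel_size=2, stride=2),    # 16x16 -> 32x32
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Anomaly score = reconstruction error; a high error suggests an unusual sound.
model = TinyAudioAutoencoder()
patch = torch.randn(1, 1, 32, 32)          # one mel-spectrogram patch
score = torch.mean((model(patch) - patch) ** 2)
print(float(score))
```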


2021 ◽  
Author(s):  
S. H. Al Gharbi ◽  
A. A. Al-Majed ◽  
A. Abdulraheem ◽  
S. Patil ◽  
S. M. Elkatatny

Abstract Due to the high demand for energy, oil and gas companies have started to drill wells in remote areas and unconventional environments. This has raised the complexity of drilling operations, which were already challenging and complex. To adapt, drilling companies expanded their use of the real-time operation center (RTOC) concept, in which real-time drilling data are transmitted from remote sites to company headquarters. In an RTOC, groups of subject matter experts monitor the drilling live and provide real-time advice to improve operations. With the increase of drilling operations, processing the volume of generated data is beyond a human's capability, limiting the RTOC's impact on certain components of drilling operations. To overcome this limitation, artificial intelligence and machine learning (AI/ML) technologies were introduced to monitor and analyze the real-time drilling data, discover hidden patterns, and provide fast decision-support responses. AI/ML technologies are data-driven, and the quality of their output depends directly on the quality of the input data. Unfortunately, due to the harsh environments of drilling sites and the transmission setups, not all of the drilling data is good, which negatively affects the AI/ML results. The objective of this paper is to utilize AI/ML technologies to improve the quality of real-time drilling data. The paper fed a large real-time drilling dataset, consisting of over 150,000 raw data points, into Artificial Neural Network (ANN), Support Vector Machine (SVM), and Decision Tree (DT) models. The models were trained to distinguish valid from invalid data points, and confusion matrices were used to evaluate the different AI/ML models, including different internal architectures. Although the ANN was the slowest, it achieved the best result with an accuracy of 78%, compared to 73% and 41% for DT and SVM, respectively. The paper concludes by presenting a process for using AI technology to improve real-time drilling data quality. To the authors' knowledge, based on literature in the public domain, this paper is one of the first to compare the use of multiple AI/ML techniques for quality improvement of real-time drilling data. The paper provides a guide for improving the quality of real-time drilling data.
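
A hedged sketch of the comparison workflow described above: train DT, SVM, and ANN classifiers to separate valid from invalid readings and score them with confusion matrices. The synthetic features, labels, and dataset size below merely stand in for the real drilling channels and the >150,000-point dataset used in the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix, accuracy_score

# Synthetic stand-in for real-time drilling channels (e.g. hook load, RPM,
# standpipe pressure); the validity rule below is invented for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 6))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, 5000) > -0.8).astype(int)
# 1 = valid reading, 0 = not valid

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

models = {
    "DT": DecisionTreeClassifier(max_depth=8),
    "SVM": SVC(kernel="rbf"),
    "ANN": MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(name, accuracy_score(y_test, pred))
    print(confusion_matrix(y_test, pred))
```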


Electronics ◽  
2019 ◽  
Vol 8 (8) ◽  
pp. 832 ◽  
Author(s):  
Diogo V. Carvalho ◽  
Eduardo M. Pereira ◽  
Jaime S. Cardoso

Machine learning systems are becoming increasingly ubiquitous. Their adoption has been expanding, accelerating the shift towards a more algorithmic society, meaning that algorithmically informed decisions have greater potential for significant social impact. However, most of these accurate decision support systems remain complex black boxes, meaning their internal logic and inner workings are hidden from the user, and even experts cannot fully understand the rationale behind their predictions. Moreover, new regulations and highly regulated domains have made the audit and verifiability of decisions mandatory, increasing the demand for the ability to question, understand, and trust machine learning systems, for which interpretability is indispensable. The research community has recognized this interpretability problem and has focused on developing both interpretable models and explanation methods over the past few years. However, the emergence of these methods shows there is no consensus on how to assess explanation quality. What are the most suitable metrics to assess the quality of an explanation? The aim of this article is to review the current state of the research field of machine learning interpretability, focusing on its societal impact and on the developed methods and metrics. Furthermore, a complete literature review is presented in order to identify future directions of work in this field.


Cryptography ◽  
2021 ◽  
Vol 5 (4) ◽  
pp. 28
Author(s):  
Hossein Sayadi ◽  
Yifeng Gao ◽  
Hosein Mohammadi Makrani ◽  
Jessica Lin ◽  
Paulo Cesar Costa ◽  
...  

According to recent security analysis reports, malicious software (a.k.a. malware) is rising at an alarming rate in numbers, complexity, and harmful purposes to compromise the security of modern computer systems. Recently, malware detection based on low-level hardware features (e.g., Hardware Performance Counter (HPC) information) has emerged as an effective alternative solution to address the complexity and performance overheads of traditional software-based detection methods. Hardware-assisted Malware Detection (HMD) techniques depend on standard Machine Learning (ML) classifiers to detect signatures of malicious applications by monitoring built-in HPC registers during execution at run-time. Prior HMD methods, though effective, have limited their study to detecting malicious applications that are spawned as separate threads during application execution; hence, detecting stealthy malware patterns at run-time remains a critical challenge. Stealthy malware refers to harmful cyber attacks in which malicious code is hidden within benign applications and remains undetected by traditional malware detection approaches. In this paper, we first present a comprehensive review of recent advances in hardware-assisted malware detection studies that have used standard ML techniques to detect malware signatures. Next, to address the challenge of stealthy malware detection at the processor's hardware level, we propose StealthMiner, a novel specialized time-series machine-learning-based approach to accurately detect stealthy malware traces at run-time using branch instructions, the most prominent HPC feature. StealthMiner is based on a lightweight time-series Fully Convolutional Neural Network (FCN) model that automatically identifies potentially contaminated samples in HPC-based time-series data and utilizes them to accurately recognize the trace of stealthy malware. Our analysis demonstrates that state-of-the-art ML-based malware detection methods are not effective in detecting stealthy malware samples, since the captured HPC data not only represent malware but also carry benign applications' microarchitectural data. The experimental results demonstrate that, with the aid of our novel intelligent approach, stealthy malware can be detected at run-time with 94% detection performance on average using only one HPC feature, outperforming the detection performance of state-of-the-art HMD and general time-series classification methods by up to 42% and 36%, respectively.
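
Below is a generic sketch of a lightweight fully convolutional time-series classifier over windows of branch-instruction counts, illustrating the kind of model the abstract refers to. It is not the StealthMiner architecture; the layer widths, window length, and two-class head are assumptions, and PyTorch is assumed.

```python
import torch
import torch.nn as nn

class BranchCountFCN(nn.Module):
    """A small fully convolutional classifier over a window of HPC
    branch-instruction counts; a generic FCN sketch, not StealthMiner."""

    def __init__(self, channels=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(channels, 16, kernel_size=7, padding=3),
            nn.BatchNorm1d(16),
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, padding=2),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # global average pooling over time
        )
        self.classifier = nn.Linear(32, 2)  # benign vs. stealthy malware

    def forward(self, x):              # x: (batch, channels, time_steps)
        return self.classifier(self.features(x).squeeze(-1))

model = BranchCountFCN()
window = torch.randn(8, 1, 200)        # 8 windows of 200 HPC samples each
print(model(window).shape)             # torch.Size([8, 2])
```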


Author(s):  
Baoquan Wang ◽  
Tonghai Jiang ◽  
Xi Zhou ◽  
Bo Ma ◽  
Fan Zhao ◽  
...  

For anomaly detection in time-series data, supervised methods require labeled data. In existing semi-supervised methods, the range of the outlier factor varies with the data, the model, and time, so a threshold for determining abnormality is difficult to obtain; in addition, computing outlier factors from the other data points in the dataset is computationally expensive. These issues make such methods difficult to apply in practice. This paper proposes a framework named LSTM-VE, which uses clustering combined with a visualization method to roughly label normal data and then uses the normal data to train a long short-term memory (LSTM) neural network for semi-supervised anomaly detection. The variance error (VE) of the normal-class classification probability sequence is used as the outlier factor. The framework makes deep-learning-based anomaly detection practical to apply, and using VE avoids the shortcomings of existing outlier factors and yields better performance. In addition, the framework is easy to extend because the LSTM neural network can be replaced with other classification models. Experiments on labeled and real unlabeled datasets show that the framework outperforms replicator neural networks with reconstruction error (RNN-RS) and has good scalability and practicability.
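
The sketch below illustrates one way an outlier factor could be derived from a classification probability sequence, using a rolling variance of the normal-class probability. The exact variance error (VE) definition used by LSTM-VE is not reproduced here, so treat the window size and formula as assumptions.

```python
import numpy as np

def variance_error(normal_probs, window=20):
    """Rolling variance of the 'normal' class probability sequence, used here
    as an outlier factor. This sliding-window variance is an illustrative
    assumption, not the paper's exact VE definition."""
    p = np.asarray(normal_probs, dtype=float)
    ve = np.full(p.shape, np.nan)
    for i in range(window, p.size + 1):
        ve[i - 1] = np.var(p[i - window:i])
    return ve

# Example: the classifier is confident on normal data but unstable around an
# anomalous segment, which drives the rolling variance up.
probs = np.concatenate([0.95 + 0.02 * np.random.randn(200),
                        0.5 + 0.3 * np.random.randn(20),
                        0.95 + 0.02 * np.random.randn(200)]).clip(0, 1)
ve = variance_error(probs)
print(np.nanargmax(ve))   # index near the unstable (anomalous) segment
```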


2020 ◽  
Vol 12 (21) ◽  
pp. 3513
Author(s):  
Jonas Koehler ◽  
Claudia Kuenzer

Reliable forecasts on the impacts of global change on the land surface are vital to inform the actions of policy and decision makers to mitigate consequences and secure livelihoods. Geospatial Earth Observation (EO) data from remote sensing satellites has been collected continuously for 40 years and has the potential to facilitate the spatio-temporal forecasting of land surface dynamics. In this review we compiled 143 papers on EO-based forecasting of all aspects of the land surface published in 16 high-ranking remote sensing journals within the past decade. We analyzed the literature regarding research focus, the spatial scope of the study, the forecasting method applied, as well as the temporal and technical properties of the input data. We categorized the identified forecasting methods according to their temporal forecasting mechanism and the type of input data. Time-lagged regressions which are predominantly used for crop yield forecasting and approaches based on Markov Chains for future land use and land cover simulation are the most established methods. The use of external climate projections allows the forecasting of numerical land surface parameters up to one hundred years into the future, while auto-regressive time series modeling can account for intra-annual variances. Machine learning methods have been increasingly used in all categories and multivariate modeling that integrates multiple data sources appears to be more popular than univariate auto-regressive modeling despite the availability of continuously expanding time series data. Regardless of the method, reliable EO-based forecasting requires high-level remote sensing data products and the resulting computational demand appears to be the main reason that most forecasts are conducted only on a local scale. In the upcoming years, however, we expect this to change with further advances in the field of machine learning, the publication of new global datasets, and the further establishment of cloud computing for data processing.
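
As a toy illustration of the Markov-chain land-cover simulations the review groups together, the sketch below projects land-cover shares forward with an invented transition matrix; the classes, probabilities, and time step are placeholders, not values from any of the surveyed studies.

```python
import numpy as np

classes = ["forest", "cropland", "urban"]
# transition[i, j] = P(class j at t+1 | class i at t); rows sum to 1.
transition = np.array([
    [0.92, 0.06, 0.02],
    [0.03, 0.90, 0.07],
    [0.00, 0.01, 0.99],
])

# Current land-cover shares (fractions of the study area).
state = np.array([0.50, 0.35, 0.15])

# Project the shares forward; each step is one observation interval.
for step in range(1, 6):
    state = state @ transition
    print(step, dict(zip(classes, np.round(state, 3))))
```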


2015 ◽  
Vol 805 ◽  
pp. 79-85
Author(s):  
Christian Gebbe ◽  
Johannes Glasschröder ◽  
Gunther Reinhart

In times of rising energy costs and increasing customer awareness of sustainable production methods, many manufacturers take measures to reduce their energy consumption. However, after the realization of such activities the energy demand often tends to increase again due to, e.g., leaks, clogged filters, defective valves, or suboptimal parameter settings. To prevent this, it is necessary to quickly identify such increases by continuously monitoring the energy consumption and counteracting accordingly. Currently, the monitoring is performed either manually or by setting static threshold values. Manual monitoring can be time-consuming for large amounts of sensor data, while static threshold values disclose only a fraction of the inefficiencies. Another option is to use anomaly detection methods from the area of machine learning, which compare the actual sensor values with the expected ones. This paper presents an overview of existing anomaly detection methods that can be applied for this purpose.
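
A minimal sketch of the model-based monitoring idea described above: learn the expected energy consumption from operating conditions during a healthy period, then flag readings whose residual exceeds a threshold. The features, the linear model, and the 4-sigma threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: energy consumption driven by utilisation and temperature,
# with an efficiency loss (e.g. a leak) appearing partway through the series.
rng = np.random.default_rng(1)
load = rng.uniform(0.3, 1.0, 1000)              # machine utilisation
temp = rng.uniform(15, 30, 1000)                # ambient temperature
energy = 5.0 * load + 0.1 * temp + rng.normal(0, 0.2, 1000)
energy[700:] += 1.5                             # consumption rises after a fault

X = np.column_stack([load, temp])
model = LinearRegression().fit(X[:500], energy[:500])   # train on healthy period

residual = energy - model.predict(X)            # actual minus expected values
threshold = 4 * residual[:500].std()
anomalous = np.where(np.abs(residual) > threshold)[0]
print(anomalous[:5])                            # first flagged readings (~700+)
```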


Author(s):  
Aravind R Kashyap

This project considers the operational impact of autonomous vehicles by creating a corridor using the latest available network. The behaviour of vehicles entering the corridor is monitored at the macroscopic level using data extracted from the vehicles. This data is learned with a machine learning model, a Time Series Neural Network, and is used as a parameter to make the vehicles autonomous. The project resolves vehicle location and develops and demonstrates collision avoidance using Artificial Intelligence. Autonomous here means that the vehicles are able to learn to act appropriately without human intervention.

