Thermal anomaly detection in datacenters

Author(s):  
Yang Yuan ◽  
Eun Kyung Lee ◽  
Dario Pompili ◽  
Junbi Liao

The high density of servers in datacenters generates a large amount of heat, resulting in a high possibility of thermally anomalous events, e.g., computer room air conditioner (CRAC) fan failure, server fan failure, and workload misconfiguration. As such anomalous events increase the cost of maintaining computing and cooling components, they need to be detected, localized, and classified so that appropriate remedial actions can be taken. In this article, a hierarchical neural network framework is proposed to detect small-scale (server-level) and large-scale (datacenter-level) thermal anomalies. This novel framework, which is organized into two tiers, analyzes the data sensed by heterogeneous sensors, i.e., sensors built into the servers and external (TelosB) sensors. The proposed solution employs a neural network to learn (a) the relationships among sensed values (i.e., internal temperature, external temperature, and fan speed) and (b) the relationship between the sensed values and workload information. The bottom tier of the framework then detects thermal anomalies, whereas the top tier localizes and classifies them. Experimental results show that our solution outperforms anomaly-detection methods based on regression models, support vector machines, and self-organizing maps.
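
A minimal sketch of the bottom-tier idea, assuming a residual-based detector: a neural network learns the normal relationship between external temperature, fan speed, and workload on one side and internal temperature on the other, and large prediction errors are flagged as anomalies. The feature set, network size, and 3-sigma rule are illustrative assumptions, not the paper's exact design.

```python
# Residual-based thermal anomaly sketch (illustrative assumptions throughout).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.uniform(18, 27, n),    # external (TelosB) temperature, deg C
    rng.uniform(0.3, 1.0, n),  # normalized server fan speed
    rng.uniform(0.0, 1.0, n),  # normalized CPU workload
])
# Synthetic "normal" internal temperature: rises with load, falls with fan speed
y = 10 + X[:, 0] + 15 * X[:, 2] - 8 * X[:, 1] + rng.normal(0, 0.5, n)

model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X, y)

# Flag readings whose residual exceeds 3 sigma of the training residuals
resid = y - model.predict(X)
threshold = 3 * resid.std()

def is_anomalous(features, internal_temp):
    """True if the internal temperature deviates from the learned normal model."""
    pred = model.predict(features.reshape(1, -1))[0]
    return abs(internal_temp - pred) > threshold
```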

Author(s):  
Manish Marwah ◽  
Ratnesh K. Sharma ◽  
Wilfredo Lugo

In recent years, there has been significant growth in the number, size, and power densities of data centers. A significant part of data center power consumption is attributed to the cooling infrastructure, consisting of computer room air conditioning (CRAC) units, chillers, and cooling towers. For energy-efficient operation and management of the cooling resources, data centers are beginning to be extensively instrumented with temperature sensors. While this allows cooling actuators, such as CRAC set point temperatures, to be dynamically controlled and data centers to be operated at higher temperatures to save energy, it also increases the chances of thermal anomalies. Furthermore, considering that large data centers can contain thousands to tens of thousands of such sensors, it is virtually impossible to manually inspect and analyze the large volumes of dynamic data they generate, necessitating autonomous mechanisms for thermal anomaly detection. Moreover, mechanisms beyond threshold-based detection are needed. In this paper, we describe the commonly occurring thermal anomalies in a data center and present techniques, with examples from a production data center, to detect them autonomously. In particular, we show the usefulness of a principal component analysis (PCA) based methodology applied to a large temperature sensor network. Specifically, we examine thermal anomalies such as those related to misconfiguration of equipment, blocked vent tiles, faulty sensors, and CRAC-related faults. Several of these anomalies normally go undetected because no temperature thresholds are violated. We present examples of the thermal anomalies and their detection from a real data center.
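
A hedged sketch of the PCA methodology on a temperature sensor network: fit PCA on snapshots from normal operation and flag snapshots whose residual energy (squared prediction error, SPE) outside the retained principal subspace is large. The component count and percentile threshold are assumptions for illustration.

```python
# PCA residual (SPE) anomaly detection sketch on synthetic sensor data.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
n_sensors, n_train = 200, 1000
# Synthetic "normal" data: a few correlated thermal modes plus noise
modes = rng.normal(size=(3, n_sensors))
train = rng.normal(size=(n_train, 3)) @ modes + 0.1 * rng.normal(size=(n_train, n_sensors))

pca = PCA(n_components=3).fit(train)

def spe(snapshots):
    """Squared prediction error: energy outside the retained principal subspace."""
    recon = pca.inverse_transform(pca.transform(snapshots))
    return ((snapshots - recon) ** 2).sum(axis=1)

threshold = np.percentile(spe(train), 99.5)

# A snapshot that breaks the learned sensor correlations (e.g., a few sensors
# running hot) raises the SPE even when no individual temperature threshold is hit.
snapshot = train[:1].copy()
snapshot[0, :10] += 5.0
print(spe(snapshot)[0] > threshold)  # expected: True
```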


2017 ◽  
Author(s):  
Zhong-Hu Jiao ◽  
Jing Zhao ◽  
Xinjian Shan

Abstract. Detecting thermal anomalies prior to strong earthquakes is key to understanding and forecasting earthquake activity, because it captures thermal radiation-related phenomena during seismic preparation phases. Satellite observations serve as a powerful tool for monitoring earthquake preparation areas at a global scale and in near real time. Over the past several decades, many different data sources have been utilized in this field, and progressively better anomaly detection approaches have been developed. This paper reviews the progress and development of pre-seismic thermal anomaly detection technology over the past decade. First, precursor parameters, including parameters from the top of the atmosphere, in the atmosphere, and on the Earth's surface, are discussed. Second, different anomaly detection methods, used to extract anomalous thermal signals that may indicate future seismic events, are presented. Finally, certain critical problems with the current research are highlighted, and emerging trends and perspectives for future work are discussed. The development of Earth observation satellites and anomaly detection algorithms can enrich available information sources, provide advanced tools for multilevel earthquake monitoring, and improve short- and medium-term forecasting, all of which should play a large and growing role in pre-seismic thermal anomaly research.


2020 ◽  
Author(s):  
Arash Karimi Zarchi ◽  
Mohammad Reza Saradjian Maralan

Abstract. Recent scientific studies of earthquake precursors reveal processes connected to seismic activity, including thermal anomalies before earthquakes, which can support better decisions about this disastrous phenomenon and help reduce casualties to a minimum. This paper presents a method for grouping the proper input data for different thermal anomaly detection methods, using the mean land surface temperature (LST) at multiple distances from the corresponding fault over a 40-day window (i.e., 30 days before and 10 days after the impending earthquake). Six strong earthquakes (Ms > 6) that occurred in Iran are investigated in this study. We use two different approaches for detecting thermal anomalies: the mean-standard deviation method, also known as the standard method, and the interquartile method, which is similar to the first but uses different parameters as input. Most studies have considered thermal anomalies around known epicentre locations, where the investigation can only be performed after the earthquake. This study uses a fault-distance-based approach, treating the vicinity of known faults as the potential area, which can be considered an important step towards actual prediction of an earthquake's time and intensity. Results show that the proposed input data produces fewer false alarms in each of the thermal anomaly detection methods than the ordinary input data, making the method more accurate and stable, particularly given the easy accessibility of thermal data and the relative simplicity of the processing algorithms. In the final step, the detected anomalies are used to estimate earthquake intensity with an Artificial Neural Network (ANN). The results show that the estimated intensities of most earthquakes are very close to the actual intensities. Since the locations of active faults are known a priori, the fault-distance-based approach may be regarded as a superior method for predicting impending earthquakes along vulnerable faults. Unlike previous investigations, which were only possible after the event, the fault-distance-based approach can be used as a tool for predicting future, as-yet-unknown earthquakes. However, we recommend using thermal anomaly detection as an initial process, jointly with other precursors, to reduce the number of investigations that require more complicated algorithms and data processing.
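
A sketch of the two detection rules named above, applied to a daily LST series over the 40-day window. The multiplier k and the IQR factor are common defaults assumed here for illustration; the paper may tune them differently.

```python
# Mean-standard deviation ("standard") and interquartile detection rules.
import numpy as np

def std_method(lst, k=2.0):
    """Flag days whose LST deviates from the series mean by more than k std devs."""
    return np.abs(lst - lst.mean()) > k * lst.std()

def interquartile_method(lst, factor=1.5):
    """Flag days outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, q3 = np.percentile(lst, [25, 75])
    iqr = q3 - q1
    return (lst < q1 - factor * iqr) | (lst > q3 + factor * iqr)

# 40-day window: 30 days before and 10 days after the event
lst = 20 + np.random.default_rng(2).normal(0, 1, 40)
lst[27] += 5.0  # injected pre-seismic warming, three days before the event
print(np.flatnonzero(std_method(lst)), np.flatnonzero(interquartile_method(lst)))
```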


Author(s):  
Matheus Gutoski ◽  
Manassés Ribeiro ◽  
Leandro T. Hattori ◽  
Marcelo Romero ◽  
André E. Lazzaretti ◽  
...  

Recent research has shown that features obtained from pretrained Convolutional Neural Network (CNN) models can be promptly applied to a variety of problems they were not originally designed to solve. This concept, often referred to as Transfer Learning (TL), is common practice when labeled data is limited. In some fields, such as video anomaly detection, TL is still underexplored, in the sense that it is not clear whether the architecture of the pretrained CNN model impacts video anomaly detection performance. To clarify this issue, we perform an extensive benchmark using 12 different CNN models pretrained on ImageNet as feature extractors and apply the extracted features to seven video anomaly detection benchmark datasets. This work presents some interesting findings about video anomaly detection using TL. Notably, our experiments show that a simple classification process using One-Class Support Vector Machines yields results similar to those of state-of-the-art models. Moreover, a statistical analysis suggests that architectural differences are negligible when choosing a pretrained model for video anomaly detection, since all models presented similar performance. Finally, we present an in-depth visual analysis of the Avenue dataset and reveal several aspects that may be limiting the performance of state-of-the-art video anomaly detection methods.
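
A minimal sketch of the TL pipeline described above, assuming a frozen ImageNet-pretrained backbone as the feature extractor and a One-Class SVM fit only on normal frames. ResNet-18 and the nu value are illustrative choices, not necessarily among the paper's 12 models or its exact settings.

```python
# Pretrained-CNN features + One-Class SVM for frame-level anomaly scoring.
import torch
from torchvision.models import resnet18, ResNet18_Weights
from sklearn.svm import OneClassSVM

backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()  # keep the 512-d pooled features
backbone.eval()

@torch.no_grad()
def extract(frames):
    """frames: float tensor (N, 3, 224, 224), normalized for ImageNet."""
    return backbone(frames).numpy()

# Fit on normal video frames only (one-class setting), then score new frames;
# lower decision_function values indicate more anomalous frames.
normal_frames = torch.randn(64, 3, 224, 224)  # placeholder for real data
test_frames = torch.randn(8, 3, 224, 224)
ocsvm = OneClassSVM(kernel="rbf", nu=0.1).fit(extract(normal_frames))
scores = ocsvm.decision_function(extract(test_frames))
```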


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Chunbo Liu ◽  
Lanlan Pan ◽  
Zhaojun Gu ◽  
Jialiang Wang ◽  
Yitong Ren ◽  
...  

System logs record the system status and important events during system operation in detail, and detecting anomalies in them is a common way of monitoring modern large-scale distributed systems. Yet threshold-based classification models used for anomaly detection output only two values, normal or abnormal, with no probability estimate of whether the prediction is correct. In this paper, a statistical learning algorithm, the Venn-Abers predictor, is adopted to evaluate the confidence of prediction results in the field of system log anomaly detection. It can calculate the probability distribution of labels for a set of samples and thus provides a quality assessment of the predicted labels. Two Venn-Abers predictors, LR-VA and SVM-VA, have been implemented on top of Logistic Regression and Support Vector Machine, respectively. The differences among the algorithms are then exploited to build a multimodel fusion algorithm by stacking, from which a Venn-Abers predictor called Stacking-VA is implemented. The performances of four types of algorithms (unimodel, Venn-Abers predictor based on a unimodel, multimodel, and Venn-Abers predictor based on a multimodel) are compared in terms of validity and accuracy. Experiments are carried out on a log dataset of the Hadoop Distributed File System (HDFS). For the comparative experiments on unimodels, the results show that the validities of LR-VA and SVM-VA are better than those of the two corresponding underlying models. The accuracy of the SVM-VA predictor is better than that of the LR-VA predictor and, more significantly, the recall rate increases from 81% to 94% compared with the underlying model. In the experiments on multiple models, the algorithm based on stacking multimodel fusion is significantly superior to the underlying classifiers. The average accuracy of Stacking-VA is above 0.95, and its predictions are more stable than those of LR-VA and SVM-VA. The experimental results show that the Venn-Abers predictor is a flexible tool that can make accurate and valid probability predictions in the field of system log anomaly detection.
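
A simplified sketch of an inductive Venn-Abers predictor on top of any scoring classifier, assuming a held-out calibration set. For a test score s, two isotonic regressions are fit, one with (s, 0) appended and one with (s, 1); the resulting pair (p0, p1) brackets the calibrated probability. This illustrates the mechanism only and is not the paper's exact implementation.

```python
# Inductive Venn-Abers calibration sketch using isotonic regression.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def venn_abers(cal_scores, cal_labels, test_score):
    p = []
    for hypothetical_label in (0, 1):
        s = np.append(cal_scores, test_score)
        y = np.append(cal_labels, hypothetical_label)
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        iso.fit(s, y)
        p.append(iso.predict([test_score])[0])
    p0, p1 = p
    # One common way to merge the (p0, p1) interval into a single probability
    return p0, p1, p1 / (1.0 - p0 + p1)

rng = np.random.default_rng(3)
cal_scores = rng.uniform(0, 1, 200)                            # classifier scores
cal_labels = (rng.uniform(0, 1, 200) < cal_scores).astype(int) # synthetic labels
print(venn_abers(cal_scores, cal_labels, 0.9))                 # p0, p1, merged p
```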


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5895
Author(s):  
Jiansu Pu ◽  
Jingwen Zhang ◽  
Hui Shao ◽  
Tingting Zhang ◽  
Yunbo Rao

The development of the Internet has made social communication increasingly important for maintaining relationships between people. However, advertising and fraud are also growing incredibly fast and seriously affect our daily life, leading, for example, to losses of money and time, junk information, and privacy problems. It is therefore very important to detect anomalies in social networks. However, existing anomaly detection methods cannot guarantee a correct detection rate, and, due to the lack of labeled data, their results cannot be used directly; human analysts must remain in the loop to provide judgment for decision making. To help experts analyze and explore the results of anomaly detection in social networks more objectively and effectively, we propose a novel visualization system, egoDetect, which can detect anomalies in social communication networks efficiently. Based on an unsupervised anomaly detection method, the system detects anomalies without training and provides a quick overview. We then explore an ego's topology and the relationships between egos and alters by designing a novel glyph based on the egocentric network. The system also provides rich interactions that let experts quickly navigate to users of interest for further exploration. We evaluate our system on an actual call dataset provided by an operator. The results show that the proposed system is effective for anomaly detection in social networks.
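
As an illustrative sketch only (not egoDetect itself), one way to realize the unsupervised, training-free detection step is to score each user by simple egocentric-network features with an off-the-shelf detector; the feature choice below is an assumption.

```python
# Unsupervised ego-network anomaly scoring sketch.
import networkx as nx
import numpy as np
from sklearn.ensemble import IsolationForest

G = nx.barabasi_albert_graph(500, 3, seed=4)  # placeholder for a real call graph

def ego_features(g, node):
    ego = nx.ego_graph(g, node)  # the ego plus its alters
    return [g.degree(node), nx.density(ego), nx.clustering(g, node)]

X = np.array([ego_features(G, v) for v in G.nodes])
scores = IsolationForest(random_state=0).fit(X).score_samples(X)
suspects = np.argsort(scores)[:10]  # lowest scores = most anomalous egos
```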


Sensors ◽  
2020 ◽  
Vol 20 (11) ◽  
pp. 3305 ◽  
Author(s):  
Huogen Wang ◽  
Zhanjie Song ◽  
Wanqing Li ◽  
Pichao Wang

This paper presents a novel hybrid network for large-scale action recognition from multiple modalities. The network is built upon the proposed weighted dynamic images. It effectively leverages the strengths of the emerging Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) based approaches to specifically address the challenges that occur in large-scale action recognition and are not fully dealt with by state-of-the-art methods. Specifically, the proposed hybrid network consists of a CNN-based component and an RNN-based component. Features extracted by the two components are fused through canonical correlation analysis and then fed to a linear Support Vector Machine (SVM) for classification. The proposed network achieved state-of-the-art results on the ChaLearn LAP IsoGD, NTU RGB+D, and Multi-modal & Multi-view & Interactive (M²I) datasets and outperformed existing methods by a large margin (over 10 percentage points in some cases).
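
A hedged sketch of the fusion step: project CNN and RNN feature vectors into a shared space with canonical correlation analysis and classify the concatenated projections with a linear SVM. The dimensions, the concatenation choice, and the synthetic features are assumptions for illustration.

```python
# CCA feature fusion followed by a linear SVM classifier.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.svm import LinearSVC

rng = np.random.default_rng(5)
n, n_classes = 600, 10
labels = rng.integers(0, n_classes, n)
cnn_feats = rng.normal(size=(n, 256)) + 0.1 * labels[:, None]  # placeholder
rnn_feats = rng.normal(size=(n, 128)) + 0.1 * labels[:, None]  # placeholder

cca = CCA(n_components=32).fit(cnn_feats, rnn_feats)
u, v = cca.transform(cnn_feats, rnn_feats)
fused = np.hstack([u, v])  # correlated projections from both streams

clf = LinearSVC().fit(fused, labels)
```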


Author(s):  
SUNG-BAE CHO

Bioinformatics has recently drawn a lot of attention for its use of information technology, especially pattern recognition, to efficiently analyze biological genomic information. In this paper, we explore a wide range of features and classifiers through a comparative study of the most promising feature selection methods and machine learning classifiers. Gene expression information from patients' bone marrow, measured by DNA microarray and labeled as either acute myeloid leukemia or acute lymphoblastic leukemia, is used to predict the cancer class. Pearson's and Spearman's correlation coefficients, Euclidean distance, cosine coefficient, information gain, mutual information, and signal-to-noise ratio have been used for feature selection. A backpropagation neural network, self-organizing map, structure-adaptive self-organizing map, support vector machine, inductive decision tree, and k-nearest neighbor have been used for classification. Experimental results indicate that the backpropagation neural network with Pearson's correlation coefficients produces the best result, a recognition rate of 97.1% on the test data.
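
A brief sketch of the best-performing combination reported, assuming Golub-style microarray dimensions: rank genes by the absolute Pearson correlation between expression and class label, keep the top k, and train a backpropagation neural network. The value of k and the network size are assumptions and may differ from the paper's settings.

```python
# Pearson-correlation feature selection + backpropagation network sketch.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(6)
n_samples, n_genes, k = 72, 7129, 25
X = rng.normal(size=(n_samples, n_genes))  # placeholder expression matrix
y = rng.integers(0, 2, n_samples)          # 0 = ALL, 1 = AML (placeholder)

# Pearson correlation of each gene's expression with the class label
Xc = X - X.mean(axis=0)
yc = y - y.mean()
corr = (Xc * yc[:, None]).sum(axis=0) / (
    np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum()))
top_genes = np.argsort(np.abs(corr))[-k:]

clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(X[:, top_genes], y)
```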


2014 ◽  
Vol 602-605 ◽  
pp. 2044-2047
Author(s):  
Miao Yan ◽  
Zhi Bao Liu

Large-scale software consists of components that differ widely, and the detection accuracy of traditional fault detection methods for large-scale component software is unsatisfactory. This paper proposes a fault detection method for large-scale software based on an improved neural network, which combines the features of large-scale software by computing stability probabilities and building neural network fault detection models. The proposed method can analyze serial faults of large-scale software to determine the positions of the faults. Experimental and simulation results show that the improved method greatly improves the accuracy of large-scale software fault detection.

