Visualization of Data Cubes for Anomaly Detection in Network Traffic Data Streams

Author(s):  
Volker Ahlers ◽  
Tim Laue ◽  
Nils Wellermann ◽  
Felix Heine
2018 ◽  
Vol 14 (11) ◽  
pp. 155014771881447 ◽  
Author(s):  
Xiaoling Tao ◽  
Yang Peng ◽  
Feng Zhao ◽  
Peichao Zhao ◽  
Yong Wang

With the rapid development of large-scale complex networks and the proliferation of social network applications, the amount of network traffic data generated is increasing tremendously, and efficient anomaly detection on such massive data is crucial to many network applications, such as malware detection, load balancing, and network intrusion detection. Although many methods exist for network traffic anomaly detection, they are all designed for a single machine and cannot handle the case in which the network traffic data are too large for a single computer to store and process. To solve this problem, we propose a parallel algorithm based on Isolation Forest and Spark for network traffic anomaly detection. We combine the strengths of the Isolation Forest algorithm in network traffic anomaly detection with the big data processing capability of Spark. We also apply parallelization to both the modeling and the evaluation process: by assigning tasks to multiple compute nodes, Isolation Forest and Spark can perform anomaly detection and evaluation efficiently. In this way, we also remove the computation bottleneck of a single machine. Extensive experiments on real-world datasets show that our Isolation Forest and Spark approach is efficient and scales well for anomaly detection on large network traffic data.
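The idea of growing isolation trees as independent tasks can be sketched as follows. This is a minimal single-machine illustration, not the authors' Spark implementation: a thread pool stands in for the compute nodes, and all names (`fit_forest`, `train_tree`, `anomaly_score`) and default parameters are assumptions made for the example.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Isolate points by splitting on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x, depth=0):
    """Number of splits passed before x reaches a leaf, adjusted for unsplit leaves."""
    if "feature" not in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if x[tree["feature"]] < tree["split"] else "right"
    return path_length(tree[branch], x, depth + 1)


def train_tree(job):
    """One independent task: subsample the data and grow a single tree."""
    data, sample_size, seed = job
    rng = random.Random(seed)
    sample = rng.sample(data, sample_size)
    max_depth = math.ceil(math.log2(sample_size))
    return build_tree(sample, 0, max_depth, rng)


def fit_forest(data, n_trees=50, sample_size=64, workers=4, seed=0):
    """Grow the trees as independent tasks spread over a worker pool.

    Assumption: the thread pool is only a stand-in for distributed workers.
    """
    if len(data) < 2:
        raise ValueError("need at least two points")
    sample_size = min(sample_size, len(data))
    jobs = [(data, sample_size, seed + i) for i in range(n_trees)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        trees = list(pool.map(train_tree, jobs))
    return trees, sample_size


def anomaly_score(forest, x):
    """Score in (0, 1); values closer to 1 indicate easier isolation."""
    trees, sample_size = forest
    mean_depth = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(sample_size))
```

Because each tree depends only on its own subsample and seed, the `jobs` list maps naturally onto any task-parallel backend. Scoring is likewise independent per record, so it could be distributed the same way.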


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Nouar AlDahoul ◽  
Hezerul Abdul Karim ◽  
Abdulaziz Saleh Ba Wazir

Network anomaly detection is still an open and challenging task that aims to detect anomalous network traffic for security purposes. Network traffic data are usually large-scale and imbalanced. Additionally, they have noisy labels. This paper addresses these challenges using ZYELL's million-scale, highly imbalanced dataset. We propose training deep neural networks with class weight optimization to learn complex patterns from the rare anomalies observed in the traffic data. We also propose a novel model fusion that combines two deep neural networks: a binary normal/attack classifier and a multi-attack classifier. The proposed solution can detect various network attacks such as Distributed Denial of Service (DDoS), IP probing, port probing, and Network Mapper (NMAP) probing. Experiments conducted on ZYELL's real-world dataset show promising performance: the proposed approach outperformed the baseline model in terms of average macro Fβ score and false alarm rate by 17% and 5.3%, respectively.
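To make the two quantities in this abstract concrete, the sketch below computes per-class weights and a macro-averaged Fβ score. It assumes the common inverse-frequency ("balanced") weighting heuristic, which is not necessarily the class weight optimization used in the paper; the function names are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: a common heuristic, not the paper's optimization procedure.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def macro_fbeta(y_true, y_pred, beta=1.0):
    """Unweighted mean of the per-class F-beta scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for cls in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        if tp == 0:
            # No true positives means precision and recall are both zero.
            scores.append(0.0)
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        b2 = beta * beta
        scores.append((1 + b2) * precision * recall / (b2 * precision + recall))
    return sum(scores) / len(scores)
```

With 8 normal and 2 attack records, the heuristic assigns the attack class a weight four times that of the normal class, so errors on rare anomalies contribute more to a weighted training loss. Macro averaging has a similar effect on evaluation: every class counts equally regardless of its frequency.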


2021 ◽  
Vol 12 (3) ◽  
pp. 1-28
Author(s):  
Makiya Nakashima ◽  
Alex Sim ◽  
Youngsoo Kim ◽  
Jonghyun Kim ◽  
Jinoh Kim

Variable selection (also known as feature selection) is essential for reducing learning complexity by prioritizing features, particularly for a massive, high-dimensional dataset such as network traffic data. In practice, however, performing feature selection effectively is not easy, despite the availability of existing selection techniques. In our initial experiments, we observed that existing selection techniques produce different sets of features even under the same condition (e.g., a fixed size for the resulting set). In addition, individual selection techniques perform inconsistently, sometimes better and sometimes worse than others, so relying on any single one of them is risky when building models from the selected features. More critically, the selection process needs to be automated, since it otherwise requires laborious effort and intensive analysis by a group of experts. In this article, we explore the challenges of automated feature selection in the context of network anomaly detection. We first present an ensemble approach that incorporates existing feature selection techniques; one of the proposed ensemble techniques, based on greedy search, performs highly consistently, with results comparable to the existing techniques. We also address the problem of when to stop the feature elimination process and present a set of methods for determining the number of features in the reduced feature set. Our experimental results on two recent network datasets show that the feature sets identified by the presented ensemble and stopping methods consistently yield performance comparable to conventional selection techniques with fewer features.
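One simple way to combine the outputs of several selection techniques is rank aggregation. The sketch below uses a Borda count, which is swapped in here purely for illustration and is not the greedy-search ensemble the article proposes; the feature names in the usage note are hypothetical.

```python
def borda_aggregate(rankings):
    """Combine several best-first feature rankings into one consensus ranking.

    Each ranking awards n points to its top feature, n - 1 to the next, and
    so on. Ties are broken alphabetically to keep the result deterministic.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] = scores.get(feature, 0) + (n - position)
    return sorted(scores, key=lambda f: (-scores[f], f))


def select_top_k(rankings, k):
    """Keep the k features with the highest aggregated score."""
    return borda_aggregate(rankings)[:k]
```

For example, given the three rankings `["bytes", "duration", "port"]`, `["duration", "bytes", "port"]`, and `["bytes", "port", "duration"]`, the aggregated scores are 8, 6, and 4, so `select_top_k(rankings, 2)` keeps `bytes` and `duration`. The choice of `k` is exactly the stopping question the article addresses and is left as an input here.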


Author(s):  
V. I. Dubrovin ◽
B. V. Petryk ◽
G. V. Nelasa ◽
...  

Network traffic data analysis is very important for detecting DoS attacks and malicious anomalies. Many data mining techniques have been developed to manage such data and use it for security purposes. Fast and accurate search for content-based queries is critical to making these numerous data streams useful. This paper proposes an analysis of the deauthentication attack and the localization of anomalous data using the wavelet transform method.
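Wavelet-based localization can be illustrated with a one-level Haar transform: large detail coefficients mark where a signal changes abruptly. The abstract does not say which wavelet or how many decomposition levels the paper uses, so the Haar choice, the threshold, and the example signal (a hypothetical count of deauthentication frames per time interval) are all assumptions.

```python
import math


def haar_detail(signal):
    """One-level Haar detail coefficients for an even-length signal."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    return [
        (signal[i] - signal[i + 1]) / math.sqrt(2)
        for i in range(0, len(signal), 2)
    ]


def localize_anomalies(signal, threshold):
    """Return (start, end) sample index pairs whose detail magnitude exceeds the threshold."""
    details = haar_detail(signal)
    return [(2 * k, 2 * k + 1) for k, d in enumerate(details) if abs(d) > threshold]
```

For the signal `[10, 10, 10, 10, 10, 50, 10, 10]` and a threshold of 5, only the pair covering samples 4 and 5 is flagged. A single level misses changes that fall evenly within one pair (for instance, two adjacent high samples), which is one reason multi-level decompositions are used in practice.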


Author(s):  
Wasim Ahmed Ali ◽  
Manasa K N ◽  
Mohammed Aljunid ◽  
Malika Bendechache ◽  
P. Sandhya

Due to advances in network technologies, the number of network users is growing rapidly, which leads to the generation of large volumes of network traffic data. This traffic is prone to attacks and intrusions. The network therefore needs to be secured and protected by detecting anomalies and preventing intrusions. Network security has gained attention from researchers and network laboratories. In this paper, a comprehensive survey was completed to give a broad perspective on recent work in the area of anomaly detection. Studies published in the last five years were investigated to explore modern techniques and future opportunities. In this regard, the related literature on anomaly detection systems in network traffic is discussed, covering a variety of typical applications such as WSNs, IoT, high-performance computing, industrial control systems (ICS), and software-defined network (SDN) environments. Finally, we highlight diverse open issues for improving anomaly detection systems.


2021 ◽  
Vol 5 (4) ◽  
pp. 1-26
Author(s):  
Md Tahmid Rahman Laskar ◽  
Jimmy Xiangji Huang ◽  
Vladan Smetana ◽  
Chris Stewart ◽  
Kees Pouw ◽  
...  

Industrial Information Technology infrastructures are often vulnerable to cyberattacks. To secure the computer systems in an industrial environment, effective intrusion detection systems are required to monitor the cyber-physical systems (e.g., computer networks) in the industry for malicious activities. This article aims to build such intrusion detection systems to protect computer networks from cyberattacks. More specifically, we propose a novel unsupervised machine learning approach that combines the K-Means algorithm with the Isolation Forest for anomaly detection in industrial big data scenarios. Since our objective is to build an intrusion detection system for big data scenarios in the industrial domain, we use the Apache Spark framework to implement our proposed model, which was trained on large network traffic data (about 123 million instances of network traffic) stored in Elasticsearch. Moreover, we evaluate our proposed model on live streaming data and find that the proposed system can be used for real-time anomaly detection in an industrial setup. In addition, we address the challenges we faced while training our model on large datasets and explicitly describe how these issues were resolved. Based on our empirical evaluation in different use cases for anomaly detection in real-world network traffic data, we observe that our proposed system is effective at detecting anomalies in big data scenarios. Finally, we evaluate our proposed model on several academic datasets to compare it with other models and find that its performance is comparable to other state-of-the-art approaches.


Author(s):  
Guangjun Wu ◽  
Zhihui Zhao ◽  
Ge Fu ◽  
Haiping Wang ◽  
Yong Wang ◽  
...  
