A K-means algorithm based on characteristics of density applied to network intrusion detection

K-means algorithms are a popular family of unsupervised algorithms widely used for cluster analysis. However, the results of traditional K-means clustering depend heavily on the initial cluster centers, which leads to unstable accuracy and low speed and makes the algorithm ill-suited to the requirements of Big Data. This paper proposes an improved K-means algorithm that selects the initial cluster seeds based on density. First, a Kd-tree is used to partition the hyper-rectangular space during data pre-processing, so that points close to each other are grouped into the same sub-tree and the summarized information is stored in the tree structure. In addition, an improved Kd-tree nearest-neighbor search is used within K-means to prune the search space and speed up the computation. The clustering results show that the clusters are stable and accurate when the number of clusters and the number of iterations are held constant. Experimental results on a network intrusion detection case show that the improved K-means algorithm performs better in terms of detection rate and false rate.
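The abstract does not give the exact seeding rule, so the following is only a minimal sketch of one density-based way to choose initial centers. It assumes that density is the number of neighbors within a fixed `radius`, and that each further seed maximizes density multiplied by the distance to the nearest seed already chosen. The function name `density_seeds`, the scoring rule, and the brute-force neighbor count (used here in place of a Kd-tree range query) are all illustrative assumptions, not the paper's method.

```python
import math

def density_seeds(points, k, radius):
    """Pick k initial centers: dense points that lie far from the seeds already chosen."""
    n = len(points)
    # Density of point i = number of other points within `radius`.
    # Brute force, O(n^2); a Kd-tree range query would replace this on large data.
    density = [
        sum(1 for j in range(n) if j != i and math.dist(points[i], points[j]) <= radius)
        for i in range(n)
    ]
    # First seed: the densest point (the first one wins on ties).
    seeds = [max(range(n), key=lambda i: density[i])]
    while len(seeds) < k:
        # Assumed score: density times distance to the nearest chosen seed.
        def score(i):
            return density[i] * min(math.dist(points[i], points[s]) for s in seeds)
        seeds.append(max(range(n), key=score))
    return [points[i] for i in seeds]

points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10), (10, 11), (11, 10)]
print(density_seeds(points, k=2, radius=1.0))
```

Because the seeds are a deterministic function of the data, repeated runs start from the same centers, which is one way to obtain the stability the abstract describes.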

Network Intrusion Detection and Attack Analysis Based on SOFM with Fast Nearest-Neighbor Search

Journal of Computer Research and Development ◽ 10.1360/crad20050919 ◽ 2005 ◽ Vol 42 (9) ◽ pp. 1578

Author(s): Jun Zheng

Keyword(s): Intrusion Detection ◽ Nearest Neighbor ◽ Nearest Neighbor Search ◽ Network Intrusion Detection ◽ Neighbor Search ◽ Network Intrusion

A Feature Selection Approach for Network Intrusion Detection Based on Tree-Seed Algorithm and K-Nearest Neighbor

2018 IEEE 4th International Symposium on Wireless Systems within the International Conferences on Intelligent Data Acquisition and Advanced Computing Systems (IDAACS-SWS) ◽ 10.1109/idaacs-sws.2018.8525522 ◽ 2018 ◽ Cited By ~ 3

Author(s): Feng Chen ◽ Zhiwei Ye ◽ Chunzhi Wang ◽ Lingyu Yan ◽ Ruoxi Wang

Keyword(s): Feature Selection ◽ Intrusion Detection ◽ Nearest Neighbor ◽ Network Intrusion Detection ◽ K Nearest Neighbor ◽ Network Intrusion ◽ Selection Approach ◽ Feature Selection Approach ◽ Tree Seed

CLUSTERING-BASED NETWORK INTRUSION DETECTION

International Journal of Reliability Quality and Safety Engineering ◽ 10.1142/s0218539307002568 ◽ 2007 ◽ Vol 14 (02) ◽ pp. 169-187 ◽ Cited By ~ 55

Author(s): SHI ZHONG ◽ TAGHI M. KHOSHGOFTAAR ◽ NAEEM SELIYA

Keyword(s): Data Mining ◽ Network Security ◽ Intrusion Detection ◽ Unsupervised Learning ◽ Network Traffic ◽ Clustering Algorithms ◽ Network Intrusion Detection ◽ Learning Methods ◽ High Detection Rate ◽ Network Intrusion

Recently, data mining methods have gained importance in addressing network security issues, including network intrusion detection, a challenging task in network security. Intrusion detection systems aim to identify attacks with a high detection rate and a low false alarm rate. Classification-based data mining models for intrusion detection are often ineffective in dealing with dynamic changes in intrusion patterns and characteristics. Consequently, unsupervised learning methods have received closer attention for network intrusion detection. We investigate multiple centroid-based unsupervised clustering algorithms for intrusion detection, and propose a simple yet effective self-labeling heuristic for detecting attack and normal clusters of network traffic audit data. The clustering algorithms investigated include k-means, Mixture-Of-Spherical Gaussians, Self-Organizing Map, and Neural-Gas. The network traffic datasets provided by the DARPA 1998 offline intrusion detection project are used in our empirical investigation, which demonstrates the feasibility and promise of unsupervised learning methods for network intrusion detection. In addition, a comparative analysis shows the advantage of clustering-based methods over supervised classification techniques in identifying new or unseen attack types.
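The abstract does not spell out the self-labeling heuristic. As a rough illustration only, the sketch below assumes that normal traffic dominates the audit data, so the largest clusters are labeled normal until they cover an assumed fraction of all records, and the remaining clusters are labeled attack. The function name `self_label` and the parameter `normal_fraction` are hypothetical and do not come from the paper.

```python
from collections import Counter

def self_label(assignments, normal_fraction=0.5):
    """Label clusters 'normal' or 'attack' from cluster sizes alone.

    assignments: cluster id of each record, as produced by any clustering algorithm.
    normal_fraction: assumed minimum share of records that are normal traffic.
    """
    counts = Counter(assignments)
    total = len(assignments)
    labels = {}
    covered = 0
    # Walk clusters from largest to smallest.
    for cluster, size in counts.most_common():
        if covered < normal_fraction * total:
            labels[cluster] = "normal"   # still below the assumed normal share
            covered += size
        else:
            labels[cluster] = "attack"   # small leftover clusters
    return labels

# Ten records: cluster 0 has six members, cluster 1 three, cluster 2 one.
print(self_label([0] * 6 + [1] * 3 + [2]))
```

No ground-truth labels are used at any point, which is what lets such a scheme flag attack types that were never seen before.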

A Combination Strategy of Feature Selection Based on an Integrated Optimization Algorithm and Weighted K-Nearest Neighbor to Improve the Performance of Network Intrusion Detection

Electronics ◽ 10.3390/electronics9081206 ◽ 2020 ◽ Vol 9 (8) ◽ pp. 1206

Author(s): Hui Xu ◽ Krzysztof Przystupa ◽ Ce Fang ◽ Andrzej Marciniak ◽ Orest Kochan ◽ ...

Keyword(s): Feature Selection ◽ Intrusion Detection ◽ Optimization Algorithm ◽ Nearest Neighbor ◽ Original Data ◽ Network Intrusion Detection ◽ K Nearest Neighbor ◽ Combination Strategy ◽ Integrated Optimization ◽ Network Intrusion

With the widespread use of the Internet, network security issues have attracted increasing attention, and network intrusion detection has become one of the main security technologies. In network intrusion detection, the original data source typically has a high dimension and a large volume, both of which greatly affect efficiency and accuracy. Feature selection and the classifier therefore both play a significant role in raising the performance of network intrusion detection. This paper considers the classification optimization results of the weighted K-nearest neighbor (KNN) classifier together with those of the feature selection algorithm, and proposes a combination strategy of feature selection based on an integrated optimization algorithm and weighted KNN in order to improve the performance of network intrusion detection. Experimental results show that the weighted KNN can increase efficiency at the expense of a small amount of accuracy. The proposed combination strategy of feature selection based on an integrated optimization algorithm and weighted KNN can therefore improve both the efficiency and the accuracy of network intrusion detection.
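For readers unfamiliar with weighted KNN, here is a minimal sketch of one common variant, in which each of the k nearest neighbors votes with weight 1/distance. The abstract does not state which weighting the paper uses, so the inverse-distance rule, the function name `weighted_knn`, and the toy data are assumptions made for illustration.

```python
import math
from collections import defaultdict

def weighted_knn(train, query, k=3, eps=1e-9):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbors.

    train: list of (feature_tuple, label) pairs.
    eps: small constant that avoids division by zero for exact matches.
    """
    # Distance from the query to every training point, nearest first.
    ranked = sorted((math.dist(features, query), label) for features, label in train)[:k]
    votes = defaultdict(float)
    for distance, label in ranked:
        votes[label] += 1.0 / (distance + eps)  # closer neighbors count more
    return max(votes, key=votes.get)

train = [((1.0, 1.0), "attack"), ((3.0, 3.0), "normal"), ((3.0, 4.0), "normal")]
# An unweighted majority vote with k=3 would say "normal";
# the single very close "attack" neighbor outweighs the two distant ones.
print(weighted_knn(train, (1.2, 1.2), k=3))
```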

Dimension Reduction and its Effects on Clustering for Intrusion Detection

Privacy, Intrusion Detection and Response ◽ 10.4018/978-1-60960-836-1.ch007 ◽ 2011 ◽ pp. 169-192

Author(s): Peyman Kabiri ◽ Ali Ghorbani

Keyword(s): Intrusion Detection ◽ Dimension Reduction ◽ Clustering Algorithms ◽ Feature Space ◽ Information Value ◽ Network Intrusion Detection ◽ The Past ◽ Network Intrusion ◽ Network Intrusions ◽ Network Feature

With recent advances in network-based technology and the increased dependency of our everyday life on this technology, assuring the reliable operation of network-based systems is very important. In recent years, the number of attacks on networks has increased dramatically, and consequently interest in network intrusion detection has grown among researchers. During the past few years, different approaches for collecting a dataset of network features, each with its own assumptions, have been proposed to detect network intrusions. Recently, much research has focused on better understanding the network feature space in order to arrive at better detection methods. The curse of dimensionality is still a major obstacle for researchers in network intrusion detection. In this chapter, the DARPA’99 dataset is used for the study. Features in that dataset are analyzed with respect to their information value. Using the information value of the features, the number of dimensions in the data is reduced. Then, using several clustering algorithms, the effects of the dimension reduction on the dataset are studied and the results are reported.

Symmetry Based Automatic Evolution of Clusters: A New Approach to Data Clustering

Computational Intelligence and Neuroscience ◽ 10.1155/2015/796276 ◽ 2015 ◽ Vol 2015 ◽ pp. 1-21 ◽ Cited By ~ 2

Author(s): Singh Vijendra ◽ Sahoo Laxman

Keyword(s): Clustering Algorithm ◽ Nearest Neighbor ◽ A Priori ◽ Clustering Algorithms ◽ Real Data ◽ Nearest Neighbor Search ◽ Objective Functions ◽ Neighbor Search ◽ Genetic Clustering ◽ Symmetric Points

We present a multiobjective genetic clustering approach in which data points are assigned to clusters based on a new line symmetry distance. The proposed algorithm is called multiobjective line symmetry based genetic clustering (MOLGC). Two objective functions are used: the Davies-Bouldin (DB) index and an objective function based on the line symmetry distance. The proposed algorithm evolves near-optimal clustering solutions using multiple clustering criteria, without a priori knowledge of the actual number of clusters. A nearest neighbor search based on multiple randomized K-dimensional (Kd) trees is used to reduce the complexity of finding the closest symmetric points. Experimental results on several artificial and real data sets show that the proposed clustering algorithm can obtain optimal clustering solutions in terms of different cluster quality measures in comparison to the existing SBKM and MOCK clustering algorithms.
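As background for the first objective, the sketch below computes the standard Davies-Bouldin index with Euclidean distances: for each cluster, take the worst ratio of combined within-cluster scatter to centroid separation, then average over clusters. Lower values indicate compact, well-separated clusters. This is the textbook definition, not the paper's code, and the helper names are illustrative. It assumes at least two clusters with distinct centroids.

```python
import math

def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(cluster[0])
    return tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dim))

def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition given as a list of point lists."""
    centers = [centroid(c) for c in clusters]
    # Scatter S_i: mean distance from each point of cluster i to its centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster j.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i
        )
    return total / k

loose = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
tight = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
print(davies_bouldin(loose), davies_bouldin(tight))
```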

Network Intrusion Detection with Threat Agent Profiling

Security and Communication Networks ◽ 10.1155/2018/3614093 ◽ 2018 ◽ Vol 2018 ◽ pp. 1-17 ◽ Cited By ~ 7

Author(s): Tomáš Bajtoš ◽ Andrej Gajdoš ◽ Lenka Kleinová ◽ Katarína Lučivjanská ◽ Pavol Sokol

Keyword(s): Network Security ◽ Intrusion Detection ◽ Network Traffic ◽ Clustering Algorithms ◽ Computer Systems ◽ Network Intrusion Detection ◽ Clustering Methods ◽ Security Incident ◽ Network Intrusion ◽ Fine Classification

With the increase in usage of computer systems and computer networks, the problem of intrusion detection in network security has become an important issue. In this paper, we discuss approaches that simplify a network administrator’s work. We applied clustering methods for security incident profiling. We consider the K-means, PAM, and CLARA clustering algorithms. For this purpose, we used data collected in the Warden system from various security tools. We do not aim to differentiate between normal and abnormal network traffic; instead, we focus on grouping similar threat agents based on attributes of security events. We suggest a case of fine classification and a case of coarse classification, and discuss the advantages of both.

NBC: An Efficient Hierarchical Clustering Algorithm for Large Datasets

International Journal of Semantic Computing ◽ 10.1142/s1793351x15400085 ◽ 2015 ◽ Vol 09 (03) ◽ pp. 307-331 ◽ Cited By ~ 1

Author(s): Wei Zhang ◽ Gongxuan Zhang ◽ Yongli Wang ◽ Zhaomeng Zhu ◽ Tao Li

Keyword(s): Hierarchical Clustering ◽ Time Complexity ◽ Clustering Algorithm ◽ Nearest Neighbor ◽ Clustering Algorithms ◽ Large Datasets ◽ Nearest Neighbor Search ◽ Large Dataset ◽ Neighbor Search ◽ Hierarchical Clustering Algorithm

Nearest neighbor search is a key technique used in hierarchical clustering, and its computational complexity determines the performance of the hierarchical clustering algorithm. The time complexity of standard agglomerative hierarchical clustering is O(n³), while the time complexity of more advanced hierarchical clustering algorithms (such as nearest neighbor chain, SLINK and CLINK) is O(n²). This paper presents a new nearest neighbor search method called nearest neighbor boundary (NNB), which first divides a large dataset into independent subsets and then finds the nearest neighbor of each point within its subset. When NNB is used, the time complexity of hierarchical clustering can be reduced to O(n log² n). Based on NNB, we propose a fast hierarchical clustering algorithm called nearest-neighbor boundary clustering (NBC), which can be adapted to parallel and distributed computing frameworks. The experimental results demonstrate that our algorithm is practical for large datasets.
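To make the cubic baseline concrete, the sketch below is a deliberately naive agglomerative clustering with single linkage. Each pass compares every pair of clusters, which costs on the order of n² point-distance evaluations, and up to n - 1 merges are needed, so the total is O(n³). It illustrates the standard baseline only; it is not an implementation of NNB or NBC, and the function name is hypothetical.

```python
import math

def naive_single_linkage(points, num_clusters):
    """Merge the two closest clusters repeatedly until `num_clusters` remain."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > num_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])  # merge cluster j into cluster i
        del clusters[j]
    return clusters

points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
print(naive_single_linkage(points, 3))
```

The full pairwise scan inside the loop is the nearest neighbor search whose cost the abstract identifies as the bottleneck.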

LNNLS-KH: A Feature Selection Method for Network Intrusion Detection

Security and Communication Networks ◽ 10.1155/2021/8830431 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-22

Author(s): Xin Li ◽ Peng Yi ◽ Wei Wei ◽ Yiming Jiang ◽ Le Tian

Keyword(s): Feature Selection ◽ Intrusion Detection ◽ Classification Accuracy ◽ Nearest Neighbor ◽ False Positive Rate ◽ Optimal Solution ◽ Network Intrusion Detection ◽ Network Intrusion ◽ Krill Herd ◽ Positive Rate

Feature selection is an important part of intrusion detection and plays a significant role in improving its performance. The krill herd (KH) algorithm is an efficient swarm intelligence algorithm with excellent performance in data mining. To address the low efficiency and high false positive rate that increasingly high-dimensional data cause in intrusion detection, an improved krill herd algorithm based on a linear nearest neighbor lasso step (LNNLS-KH) is proposed for feature selection in network intrusion detection. The number of selected features and the classification accuracy are introduced into the fitness evaluation function of the LNNLS-KH algorithm, and the physical diffusion motion of the krill individuals is transformed by a nonlinear method. Meanwhile, the linear nearest neighbor lasso step optimization is performed on the updated krill herd position in order to derive the global optimal solution. Experiments show that the LNNLS-KH algorithm retains 7 features in the NSL-KDD dataset and 10.2 features in the CICIDS2017 dataset on average, which effectively eliminates redundant features while ensuring high detection accuracy. Compared with the CMPSO, ACO, KH, and IKH algorithms, it reduces features by 44%, 42.86%, 34.88%, and 24.32% in the NSL-KDD dataset, and by 57.85%, 52.34%, 27.14%, and 25% in the CICIDS2017 dataset, respectively. The classification accuracy increased by 10.03% and 5.39%, and the detection rate increased by 8.63% and 5.45%. The time of intrusion detection decreased by 12.41% and 4.03% on average. Furthermore, the LNNLS-KH algorithm quickly escapes local optimal solutions and shows good performance in the optimal fitness iteration curve, convergence speed, and false positive rate of detection.
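The abstract says that the fitness function combines the number of selected features with classification accuracy but does not give the formula. A form often used in wrapper-based feature selection is a weighted sum of the error rate and the fraction of features kept, to be minimized. The sketch below assumes that form. The weight `alpha` and the function name are illustrative assumptions, not values from the paper.

```python
def fitness(accuracy, n_selected, n_total, alpha=0.9):
    """Weighted-sum fitness to minimize: lower error and fewer features are both better.

    accuracy: classification accuracy in [0, 1] on the selected feature subset.
    n_selected / n_total: size of the subset relative to the full feature set.
    alpha: assumed trade-off weight between error rate and subset size.
    """
    if not 0 < n_selected <= n_total:
        raise ValueError("n_selected must be in 1..n_total")
    error_rate = 1.0 - accuracy
    feature_ratio = n_selected / n_total
    return alpha * error_rate + (1.0 - alpha) * feature_ratio

# NSL-KDD records have 41 features; at equal accuracy the smaller subset scores better.
print(fitness(0.95, 7, 41), fitness(0.95, 20, 41))
```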
