A Hybrid Approach for the Analysis of Feature Selection using Information Gain and BAT Techniques on The Anomaly Detection

Author(s):  
Dr. Joel Sunny Deol Gosu, Dr. Pullagura Priyadarsini, Ravi Kanth Motupalli

Every day, millions of people in many institutions communicate with each other over the Internet. The past two decades have witnessed unprecedented levels of Internet use around the world. Alongside this rapid growth, an ever-increasing number of Internet-based attacks has been consistently reported. In such a difficult environment, Anomaly Detection Systems (ADS) play an important role in monitoring and analyzing daily Internet activity for security breaches and threats. However, the analytical data routinely generated from computer networks are usually of enormous size and of little use. This creates a major challenge for ADSs, which must examine all the features of a given dataset to identify intrusive patterns. Feature selection is therefore an important factor in modeling anomaly-based intrusion detection systems. An irrelevant feature can lead to overfitting, which in turn degrades the modeling power of classification algorithms. The objective of this study is to analyze and select the most discriminating input features for the construction of effective and computationally efficient schemes for an ADS. In the first step, a heuristic algorithm called IG-BA is proposed for dimensionality reduction by selecting the optimal subset based on the concept of entropy. The relevant and meaningful features are then selected before a number of classifiers are applied. Experiments were carried out on the CICIDS-2017 dataset using (1) Random Forest (RF), (2) Bayes Network (BN), (3) Naive Bayes (NB), (4) J48 and (5) Random Tree (RT), with results showing better detection precision and faster execution time. The proposed heuristic algorithm outperforms the existing ones, as it is both more accurate in detection and faster. Random Forest emerges as the best classifier for the feature selection technique, scoring over the others by virtue of its accuracy in the optimal selection of features.
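The entropy-based ranking that the abstract alludes to can be illustrated with a minimal sketch of information gain for discrete features. This is not the authors' IG-BA algorithm: the bat-algorithm search step is omitted, and the feature names (`syn_flag`, `protocol`) and toy records are hypothetical, chosen only to show the calculation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return feature names sorted by information gain, highest first."""
    gains = {name: information_gain(values, labels) for name, values in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)


# Hypothetical toy records, not taken from CICIDS-2017.
labels = ["attack", "attack", "benign", "benign"]
columns = {
    "syn_flag": [1, 1, 0, 0],                  # separates the two classes perfectly
    "protocol": ["tcp", "udp", "tcp", "udp"],  # carries no class information
}
ranking = rank_features(columns, labels)
```

In this toy case `syn_flag` is ranked ahead of `protocol`; a filter of this kind would keep the top-ranked features and pass them on to the classifiers.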

Author(s):  
A. M. Bagirov ◽  
A. M. Rubinov ◽  
J. Yearwood

The feature selection problem involves the selection of a subset of features that will be sufficient for the determination of structures or clusters in a given dataset and in making predictions. This chapter presents an algorithm for feature selection, which is based on the methods of optimization. To verify the effectiveness of the proposed algorithm we applied it to a number of publicly available real-world databases. The results of numerical experiments are presented and discussed. These results demonstrate that the algorithm performs well on the datasets considered.


2019 ◽  
Vol 16 (8) ◽  
pp. 3603-3607 ◽  
Author(s):  
Shraddha Khonde ◽  
V. Ulagamuthalvi

In the current network scenario, hackers and intruders have become a major threat. As new technologies emerge rapidly and are used extensively alongside computers, security plays an important role. Most computers in a network can be easily compromised by attacks, and the rise of new types of attack is a major concern. Securing sensitive data is a serious challenge that needs to be treated as a high-priority issue and addressed immediately. Highly efficient Intrusion Detection Systems (IDS) that detect various types of network attack are available nowadays, but what is required is an IDS intelligent enough to detect and analyze all types of new threats on the network. Maximum accuracy is expected of any such intelligent intrusion detection system. An IDS can be hardware or software that monitors and analyzes all network activity to detect malicious activity inside the network. It also informs the administrator and helps deal with malicious packets, which, if they enter the network, can harm many of the connected computers. In our work we have implemented an intelligent IDS that helps the administrator analyze real-time network traffic by classifying packets entering the system as normal or malicious. This paper mainly focuses on techniques used for feature selection to reduce the number of features from the KDD-99 dataset. It also explains the algorithm used for classification, Random Forest, which works with a forest of trees to classify a real-time packet as normal or malicious. Random Forest uses ensembling techniques to produce a final output derived by combining the outputs of the trees that make up the forest. The dataset used in the experiments is KDD-99; it is used to train all the trees so that the random forest achieves higher accuracy. The results show that the random forest algorithm gives higher accuracy in a distributed network with a reduced false alarm rate.
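The ensembling step described above, where the forest's output is derived by combining the outputs of its trees, is commonly done by majority vote. The sketch below assumes that scheme; the three "trees" are hypothetical hand-written rules over made-up packet fields, standing in for trees that would actually be learned from KDD-99.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among the per-tree predictions."""
    return Counter(predictions).most_common(1)[0][0]


def forest_predict(trees, packet):
    """Classify one packet by letting every tree cast a vote."""
    return majority_vote([tree(packet) for tree in trees])


# Hypothetical stand-ins for trained trees; thresholds and field names are illustrative.
trees = [
    lambda p: "malicious" if p["failed_logins"] > 3 else "normal",
    lambda p: "malicious" if p["bytes_sent"] > 10_000 else "normal",
    lambda p: "malicious" if p["duration"] > 60 else "normal",
]

packet = {"failed_logins": 5, "bytes_sent": 200, "duration": 120}
label = forest_predict(trees, packet)  # two of the three trees vote "malicious"
```

Because each tree sees the packet independently, a single tree's mistake is outvoted as long as most trees agree, which is one reason ensembles tend to lower the false alarm rate.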


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 776
Author(s):  
Marcin Niemiec ◽  
Rafał Kościej ◽  
Bartłomiej Gdowski

The Internet is an inseparable part of our contemporary lives. This means that protection against threats and attacks is crucial for major companies and for individual users. There is a demand for the ongoing development of methods for ensuring security in cyberspace. Intrusion detection systems, which detect attacks in network environments and respond appropriately, are a crucial cybersecurity solution. This article presents a new multivariable heuristic intrusion detection algorithm based on different types of flags and values of entropy. The data is shared by organisations to help increase the effectiveness of intrusion detection. The authors also propose default values for the parameters of the heuristic algorithm and for the detection thresholds. This solution has been implemented in a well-known, open-source system and verified with a series of tests. Additionally, the authors investigated how updating the variables affects the intrusion detection process. The results confirmed the effectiveness of the proposed approach and heuristic algorithm.


In the present era, digital literacy is growing tremendously in every part of the world. People access internet-based applications using digital devices. As a result, the internet has become a primary requirement for everyone, and most business transactions take place conveniently across the network. On the other hand, intruders carry out activities such as capturing passwords, compromising routes, and collecting credit card details. Many malicious activities take place over the network as a result of such intrusions. Applications such as host-based Intrusion Detection Systems (IDS) and network-based IDS have previously been used to control network intruders. However, when intruders arrive with encrypted packets or spoofed network IDs, these techniques have not been able to control them reliably. It is essential to examine these types of attacks periodically to identify patterns of recent attacks. In this paper, the authors propose a model based on deep learning, using the NSL-KDD dataset, to solve these problems. Principal component analysis is applied for feature selection, and the model is then trained on the data with a random forest classifier algorithm. The model is designed to detect patterns of intruders effectively using the knowledge gained from training data. In detecting malicious patterns over the network, the model shows an accuracy of around 90 percent.


2021 ◽  
Vol 5 (2) ◽  
pp. 415
Author(s):  
Firdausi Nuzula Zamzami ◽  
Adiwijaya Adiwijaya ◽  
Mahendra Dwifebri P

Information exchange now takes place mostly on the internet. It can be done in many ways, such as expressing opinions on social media, for example by reviewing a film. When people review a film, they use emotion to express their feelings, which can be positive or negative. The fast growth of the internet has made information more diverse, plentiful and unstructured. Sentiment analysis can handle this, because it is a classification process, carried out automatically by a computer system, for understanding the opinions, interactions, and emotions in a document or text. One suitable machine learning method is the Modified Balanced Random Forest. To deal with the varied data, Mutual Information is used for feature selection. With these two methods, the system achieves an accuracy of 79% and an F1-score of 75%.
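As a rough illustration of Mutual Information as a feature selector for review text, the sketch below scores each word by the mutual information between its presence in a review and the review's sentiment label, then keeps the highest-scoring words. The four reviews are invented for the example, and the Modified Balanced Random Forest classifier itself is not shown.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # p(x,y) * log2(p(x,y) / (p(x) p(y))), with the ratio simplified to c*n / (px*py)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Invented toy reviews, for illustration only.
reviews = [
    ("great acting and a great story", "positive"),
    ("a great film", "positive"),
    ("boring story and weak acting", "negative"),
    ("a boring film", "negative"),
]
labels = [label for _, label in reviews]
vocabulary = sorted({word for text, _ in reviews for word in text.split()})

# Score every word by how much its presence tells us about the label.
scores = {}
for word in vocabulary:
    present = [word in text.split() for text, _ in reviews]
    scores[word] = mutual_information(present, labels)

top_terms = sorted(scores, key=scores.get, reverse=True)[:2]
```

Here the two retained words are "great" and "boring", while words such as "story" that appear equally often in both classes score zero and would be dropped.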


2019 ◽  
Vol 13 (2) ◽  
pp. 142-147
Author(s):  
Srishti Sharma ◽  
Yogita Gigras ◽  
Rita Chhikara ◽  
Anuradha Dhull

Background: Intrusion detection systems are responsible for detecting anomalies and network attacks. Building an effective IDS depends on a readily available dataset, which is used to train and test the intelligent IDS. In this research, the NSL-KDD dataset (an improvement over the original KDD Cup 1999 dataset) is used, as KDD’99 contains a huge number of redundant records, which makes it difficult to process the data accurately. Methods: The classification techniques applied to this dataset are decision-tree methods: J48, Random Forest and Random Tree. Results: In a comparison of these three classification algorithms, Random Forest produced the best results and was therefore used to further analyze the data. The results are analyzed and presented in this paper with the help of feature/attribute selection, applying all possible combinations. Conclusion: A total of eight significant attributes were selected after applying various attribute selection methods to the NSL-KDD dataset.


Entropy ◽  
2016 ◽  
Vol 18 (2) ◽  
pp. 44 ◽  
Author(s):  
Nantian Huang ◽  
Guobo Lu ◽  
Guowei Cai ◽  
Dianguo Xu ◽  
Jiafeng Xu ◽  
...  
