Early Detection of Cyber Attacks Based on Feature Selection Algorithm

The world is becoming increasingly digitalized, raising security concerns and creating an urgent need for robust, advanced security technologies and techniques to combat the growing complexity of cyber attacks. This paper discusses how machine learning is being used in cyber security for both defense and offense, including discussions of cyber attacks targeted at machine learning models. In this review, we propose a taxonomy of intrusion detection systems (IDSs) that takes data objects as the main dimension to classify and summarize the machine learning-based and deep learning-based IDS literature. The review first explains the concept and taxonomy of IDSs. It then introduces the machine learning algorithms frequently used in IDSs, the evaluation metrics, and the benchmark datasets. Next, taking the proposed taxonomy as a baseline and drawing on the representative literature, we explain how to address key IDS issues with machine learning and deep learning techniques. Finally, challenges and future developments are discussed by reviewing recent representative studies. This paper proposes an IDS based on feature selection and a clustering algorithm using filter and wrapper methods. The filter and wrapper methods are the feature grouping based on linear correlation coefficient (FGLCC) algorithm and the cuttlefish algorithm (CFA), respectively.
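As a rough illustration of what a correlation-based filter step can look like, the sketch below groups features whose absolute Pearson (linear) correlation with a group's first member reaches a threshold, then keeps one representative per group. It is not the FGLCC algorithm from the paper, whose details the abstract does not give; the greedy grouping rule, the threshold value, and the feature names are all assumptions made for the example.

```python
import math

def pearson(xs, ys):
    """Pearson linear correlation coefficient of two numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has zero spread; treat it as uncorrelated.
    return cov / (sx * sy) if sx and sy else 0.0

def group_by_correlation(columns, threshold):
    """Greedy grouping (an assumed rule): a feature joins the first group
    whose first member it correlates with at or above `threshold`."""
    groups = []
    for name, col in columns.items():
        for group in groups:
            if abs(pearson(col, columns[group[0]])) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

# Hypothetical traffic features, invented for illustration.
columns = {
    "bytes_sent": [10, 20, 30, 40, 50],
    "pkts_sent": [1, 2, 3, 4, 5],
    "duration": [5, 1, 4, 2, 3],
}
groups = group_by_correlation(columns, threshold=0.9)
representatives = [group[0] for group in groups]
```

A wrapper method would then search over such candidate subsets by training and scoring a classifier on each one, which is more expensive but tied to the model actually used.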

Towards a Lightweight Detection System for Cyber Attacks in the IoT Environment Using Corresponding Features

Electronics ◽

10.3390/electronics9010144 ◽

2020 ◽

Vol 9 (1) ◽

pp. 144 ◽

Cited By ~ 6

Author(s):

Yan Naung Soe ◽

Yaokai Feng ◽

Paulus Insap Santosa ◽

Rudy Hartanto ◽

Kouichi Sakurai

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Detection System ◽

Detection Performance ◽

Cyber Attacks ◽

Raspberry Pi ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

New Feature ◽

IoT Devices

The deployment of a large number of Internet of Things (IoT) devices makes our lives more convenient and industries more efficient. However, it also makes cyber-attacks much more likely, because so many IoT devices are deployed and most of them do not have enough resources (i.e., computation and storage capacity) to run ordinary intrusion detection systems (IDSs). In this study, a lightweight machine learning-based IDS using a new feature selection algorithm is designed and implemented on a Raspberry Pi, and its performance is verified using a public dataset collected from an IoT environment. To make the system lightweight, we propose a new feature selection algorithm, called the correlated-set thresholding on gain-ratio (CST-GR) algorithm, to select only the features that are truly necessary. Because feature selection is conducted separately for three specific kinds of cyber-attacks, the number of selected features can be significantly reduced, which makes the classifiers very small and fast. Thus, our detection system is lightweight enough to be implemented and run on a Raspberry Pi. More importantly, because the features that are truly necessary for each kind of attack are exploited, good detection performance can be expected. The performance of our proposal is examined in detail with different machine learning algorithms in order to learn which of them is the best option for our system. The experimental results indicate that the new feature selection algorithm selects only a very small number of features for each kind of attack. Thus, the detection system is lightweight enough to be implemented in the Raspberry Pi environment with almost no sacrifice in detection performance.
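The gain ratio that CST-GR thresholds is a standard quantity: the information gain of a feature about the class, divided by the entropy of the feature itself. The sketch below computes it for discrete features and keeps those at or above a cut-off. It covers only this ingredient and is not the CST-GR algorithm (the abstract does not describe the correlated-set step); the feature names, toy data, and threshold are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())

def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the feature's entropy."""
    total = len(labels)
    # Bucket the labels by feature value to get the conditional entropy H(labels | feature).
    buckets = {}
    for value, label in zip(feature, labels):
        buckets.setdefault(value, []).append(label)
    conditional = sum(len(b) / total * entropy(b) for b in buckets.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0

def select_by_gain_ratio(columns, labels, threshold):
    """Names of features whose gain ratio reaches `threshold` (an assumed cut-off)."""
    return [name for name, col in columns.items() if gain_ratio(col, labels) >= threshold]

# Toy data, invented for illustration; labels mark one attack type versus everything else.
columns = {
    "pkt_rate": ["high", "high", "low", "low", "high", "low"],
    "proto": ["tcp", "udp", "tcp", "udp", "tcp", "udp"],
}
is_dos = [1, 1, 0, 0, 1, 0]
selected = select_by_gain_ratio(columns, is_dos, threshold=0.5)
```

Running the selection once per attack type, each time with that attack's own binary labels, is one way to obtain the small per-attack feature sets the abstract describes.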

Rough Set-hypergraph-based Feature Selection Approach for Intrusion Detection Systems

Defence Science Journal ◽

10.14429/dsj.66.10802 ◽

2016 ◽

Vol 66 (6) ◽

pp. 612 ◽

Cited By ~ 14

Author(s):

M.R. Gauthama Raman ◽

K. Kannan ◽

S.K. Pal ◽

V. S. Shankar Sriram

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Rough Set ◽

Network Architecture ◽

Cyber Attacks ◽

Intrusion Detection Systems ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Detection Systems ◽

Helly Property

Immense growth in network-based services has resulted in an upsurge of internet users, security threats and cyber-attacks. Intrusion detection systems (IDSs) have become an essential component of any network architecture, in order to secure an IT infrastructure from the malicious activities of intruders. An efficient IDS should be able to detect, identify and track the malicious attempts made by intruders. With many IDSs available in the literature, the most common challenge due to voluminous network traffic patterns is the curse of dimensionality. This scenario emphasizes the importance of feature selection algorithms, which can identify the relevant features and ignore the rest without any information loss. In this paper, a novel rough set κ-Helly property technique (RSKHT) feature selection algorithm is proposed to identify the key features for network IDSs. Experiments carried out using the benchmark KDD Cup 1999 dataset were found to be promising when compared with existing feature selection algorithms with respect to reduct size, classifier performance and time complexity. RSKHT was found to be computationally attractive and flexible for massive datasets.

A NOVEL FEATURE SELECTION ALGORITHM WITH SUPERVISED MUTUAL INFORMATION FOR CLASSIFICATION

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213013500279 ◽

2013 ◽

Vol 22 (04) ◽

pp. 1350027

Author(s):

JAGANATHAN PALANICHAMY ◽

KUPPUCHAMY RAMASAMY

Keyword(s):

Machine Learning ◽

Data Mining ◽

Feature Selection ◽

Mutual Information ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Class A ◽

Selection Algorithms ◽

The Relationship ◽

Class Variable

Feature selection is essential in data mining and pattern recognition, especially for database classification. In past years, several feature selection algorithms have been proposed to measure the relevance of various features to each class. A suitable feature selection algorithm normally maximizes the relevancy and minimizes the redundancy of the selected features. The mutual information measure can successfully estimate the dependency of features on the entire sampling space, but it cannot exactly represent the redundancies among features. In this paper, a novel feature selection algorithm is proposed based on the maximum relevance and minimum redundancy criterion. Mutual information is used to measure the relevancy of each feature to the class variable and to calculate the redundancy by utilizing the relationship between candidate features, selected features and class variables. The effectiveness is tested with ten benchmark datasets available in the UCI Machine Learning Repository. The experimental results show better performance when compared with some existing algorithms.
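The maximum relevance, minimum redundancy criterion mentioned above can be sketched as a greedy loop: at each step, pick the candidate whose mutual information with the class, minus its mean mutual information with the already selected features, is largest. The code below is the generic difference form of that criterion for discrete features, not the supervised variant the paper proposes; the feature names and toy data are invented for illustration.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def mrmr_select(columns, labels, k):
    """Greedily select up to k features by relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, labels) for name, col in columns.items()}
    selected = []
    while len(selected) < min(k, len(columns)):
        def score(name):
            if not selected:
                return relevance[name]
            # Mean mutual information with the features chosen so far.
            redundancy = sum(
                mutual_information(columns[name], columns[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy
        candidates = [name for name in columns if name not in selected]
        selected.append(max(candidates, key=score))
    return selected

labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "a": [0, 0, 0, 1, 1, 1, 1, 1],       # strongly relevant
    "a_copy": [0, 0, 0, 1, 1, 1, 1, 1],  # same information as "a"
    "c": [0, 1, 0, 0, 1, 0, 1, 1],       # weaker, but nearly independent of "a"
}
chosen = mrmr_select(columns, labels, k=2)
```

In this toy run, the duplicate of the first feature is passed over in favor of the weaker but less redundant one, which is the behavior the criterion is meant to produce.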

Machine Learning Based Clinical Diagnosis of Liver Patients with Instance Replacement

Journal of Mobile Multimedia ◽

10.13052/jmm1550-4646.1827 ◽

2021 ◽

Author(s):

J. V. D. Prasad ◽

A. Raghuvira Pratap ◽

Babu Sallagundla

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Research Work ◽

Feature Selection Method ◽

Learning Model ◽

Disease Classification ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Huge Data ◽

Machine Learning Model

With the rapid increase in the volume of clinical data, prediction and data analysis become very difficult. With the help of various machine learning models, it becomes easier to work with such large volumes of data. A machine learning model faces many challenges, one of which is feature selection. In this research work, we propose a novel feature selection method based on statistical procedures to increase the performance of the machine learning model. Furthermore, we have tested the feature selection algorithm on a liver disease classification dataset, and the results obtained show the efficiency of the proposed method.

Classification of Diabetes using Random Forest with Feature Selection Algorithm

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l3595.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 1295-1300 ◽

Cited By ~ 1

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Electronic Health Records ◽

Learning Algorithm ◽

Machine Learning Algorithm ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Health Records

Diabetes has become a serious problem nowadays, so there is a need to take serious precautions to eradicate it. To do so, we should know its level of occurrence. In this project, we predict the level of occurrence of diabetes using Random Forest, a machine learning algorithm. Using patients' Electronic Health Records (EHR), we can build accurate models that predict the presence of diabetes.

Dominant Feature Selection and Machine Learning-Based Hybrid Approach to Analyze Android Ransomware

Security and Communication Networks ◽

10.1155/2021/7035233 ◽

2021 ◽

Vol 2021 ◽

pp. 1-22

Author(s):

Tanya Gera ◽

Jaiteg Singh ◽

Abolfazl Mehbodniya ◽

Julian L. Webber ◽

Mohammad Shabaz ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Learning Algorithms ◽

Hybrid Approach ◽

Machine Learning Algorithms ◽

Dynamic Monitoring ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Detection Techniques ◽

Dominant Feature

Ransomware is a special kind of malware designed to extort money in return for unlocking the device and personal data files. Smartphone users store their personal as well as official data on these devices, which ransomware attackers find attractive for financial gain. The financial losses due to ransomware attacks are increasing rapidly. Recent studies report that, out of 87% of reported cyber-attacks, 41% are due to ransomware attacks. The inability of application-signature-based solutions to detect unknown malware has inspired many researchers to build automated classification models using machine learning algorithms. Advanced malware is capable of delaying malicious actions on sensing an emulated environment, which also poses a challenge to dynamic monitoring of applications. Existing hybrid approaches utilize a variety of feature combinations for detection and analysis. The rapidly changing nature and distribution strategies of ransomware are possible reasons behind the deteriorated performance of primitive ransomware detection techniques. The limitations of existing studies include ambiguity in selecting the feature set. Enlarging the feature set may give adept attackers more freedom to evade learning algorithms. In this work, we propose a hybrid approach to identify and mitigate Android ransomware. This study employs a novel dominant feature selection algorithm to extract the dominant feature set. The experimental results show that our proposed model can differentiate between clean applications and ransomware with improved precision. Our proposed hybrid solution achieves an accuracy of 99.85% with zero false positives while considering 60 prominent features, which also justifies the feature selection algorithm used. The comparison of the proposed method with existing frameworks indicates its better performance.

An Embedded-Based Weighted Feature Selection Algorithm for Classifying Web Document

Wireless Communications and Mobile Computing ◽

10.1155/2020/8879054 ◽

2020 ◽

Vol 2020 ◽

pp. 1-10

Author(s):

G. Siva Shankar ◽

P. Ashokkumar ◽

R. Vinayakumar ◽

Uttam Ghosh ◽

Wathiq Mansoor ◽

...

Keyword(s):

Search Engine ◽

Side Information ◽

Classification Model ◽

Web Pages ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Web Document ◽

Exponential Increase ◽

Benchmark Datasets ◽

Optimal Classification

The exponential daily increase in the number of web pages makes it very difficult for a search engine to list relevant web pages. In this paper, we propose a machine learning-based classification model that can learn the best features of each web page and help in search engine listing. The existing methods for listing have many drawbacks, such as interfering with the normal operations of the website and crawling a lot of useless information. Our proposed algorithm provides an optimal classification for websites that have a large number of web pages, such as Wikipedia, by considering only core information like link text, side information, and header text. We implemented our algorithm with standard benchmark datasets, and the results show that our algorithm outperforms the existing algorithms.

LSTM Neural Networks for Detecting Anomalies Caused by Web Application Cyber Attacks

10.3233/faia210014 ◽

2021 ◽

Author(s):

Igor Kotenko ◽

Oleg Lauta ◽

Kseniya Kribel ◽

Igor Saenko

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Web Application ◽

High Efficiency ◽

Short Term Memory ◽

Cyber Attacks ◽

Long Short Term Memory ◽

The Many ◽

Complex Architecture ◽

Network Anomalies

Detecting anomalies in the traffic of computer networks is an important step in protecting against and countering various types of cyber attacks. Among the many methods and approaches for detecting anomalies in network traffic, the most popular are machine learning methods that allow one to achieve high accuracy with minimal errors. One of the ways to improve the efficiency of anomaly detection using machine learning is the use of artificial neural networks of complex architecture, in particular, networks with long short-term memory (LSTM), which have demonstrated high efficiency in many areas. The paper is devoted to the study of the capabilities of LSTM neural networks for detecting network anomalies. It proposes using LSTM neural networks to detect network anomalies caused by cyber attacks to bypass Web Application Firewall vulnerabilities that are very difficult to detect by other means. For this purpose, it is proposed to use LSTM in conjunction with an autoencoder. The issues of software implementation of the proposed approach are considered. The experimental results obtained using the generated dataset confirmed the high efficiency of the developed approach. Experiments have shown that the proposed approach allows cyber attacks to be detected in real or near real time.
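The abstract does not say how the autoencoder's output is turned into an anomaly decision. A common practice, assumed here and not taken from the paper, is to compare each sample's reconstruction error against a threshold fitted on normal traffic. The sketch below shows only that decision step; the error values stand in for the output of a trained LSTM autoencoder and are invented, as is the choice of k = 3 standard deviations.

```python
import statistics

def fit_threshold(normal_errors, k=3.0):
    """Threshold = mean + k * population standard deviation of errors on normal traffic."""
    return statistics.mean(normal_errors) + k * statistics.pstdev(normal_errors)

def flag_anomalies(errors, threshold):
    """True where a reconstruction error exceeds the fitted threshold."""
    return [error > threshold for error in errors]

# Placeholder reconstruction errors; a real system would obtain them from the model.
normal_errors = [0.10, 0.12, 0.09, 0.11, 0.10, 0.08]
threshold = fit_threshold(normal_errors)
flags = flag_anomalies([0.10, 0.45, 0.11], threshold)
```

Because the check is a single comparison per request, it adds little latency on top of the model's forward pass, which is compatible with the real-time or near-real-time detection the abstract reports.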
