Evaluation of Recurrent Neural Network and its Variants for Intrusion Detection System (IDS)

Author(s):  
R Vinayakumar ◽  
K.P. Soman ◽  
Prabaharan Poornachandran

This article describes how sequential data modeling is a relevant task in cybersecurity. Sequences carry temporal characteristics either explicitly or implicitly. Recurrent neural networks (RNNs) are a subset of artificial neural networks (ANNs) that have emerged as a powerful, principled approach to learning dynamic temporal behaviors from large-scale sequence data of arbitrary length. Furthermore, stacked recurrent neural networks (S-RNNs) have the potential to learn complex temporal behaviors quickly, including sparse representations. To leverage this, the authors model network traffic as a time series, specifically transmission control protocol / internet protocol (TCP/IP) packets within a predefined time range, with a supervised learning method, using millions of known good and bad network connections. To find the best architecture, the authors complete a comprehensive review of various RNN architectures together with their network parameters and network structures. As a test bed, they use the existing benchmark Defense Advanced Research Projects Agency / Knowledge Discovery and Data Mining (DARPA/KDD) Cup '99 intrusion detection (ID) contest data set to show the efficacy of these various RNN architectures. All the deep learning experiments are run for up to 1000 epochs with a learning rate in the range [0.01, 0.5] on GPU-enabled TensorFlow, while the experiments with traditional machine learning algorithms are done using Scikit-learn. The families of RNN architectures achieved a lower false positive rate than the traditional machine learning classifiers. The primary reason is that RNN architectures are able to store information across long time lags and adapt to successive connection-sequence information. In addition, the effectiveness of the RNN architectures is shown on the UNSW-NB15 data set.
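The core mechanism the abstract relies on, a recurrent cell carrying a hidden state across the records of a connection sequence, can be illustrated with a minimal sketch. This is not the authors' implementation; the dimensions (41 features, echoing a KDD-style connection record) and all names are illustrative assumptions, and the weights are random rather than trained.

```python
import numpy as np

def rnn_forward(x_seq, W_xh, W_hh, b_h):
    """Run a single-layer vanilla RNN over one sequence.

    x_seq : (T, n_in) array of per-timestep features.
    Returns the final hidden state, which summarizes the whole sequence.
    """
    h = np.zeros(W_hh.shape[0])
    for x_t in x_seq:
        # The new state mixes the current input with the previous state;
        # this is how information about earlier records is retained.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
    return h

rng = np.random.default_rng(0)
n_in, n_hidden, T = 41, 8, 10          # 41 features per toy connection record
W_xh = rng.normal(scale=0.1, size=(n_hidden, n_in))
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
b_h = np.zeros(n_hidden)

x_seq = rng.normal(size=(T, n_in))     # a toy sequence of 10 connection records
h_final = rnn_forward(x_seq, W_xh, W_hh, b_h)
print(h_final.shape)                   # (8,)
```

In a trained classifier, `h_final` would feed a softmax layer that labels the connection as good or bad; stacking such layers yields the S-RNN variants the abstract compares.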


2021 ◽  
pp. 36-43
Author(s):  
L. A. Demidova ◽  
A. V. Filatov

The article considers an approach to solving the problem of monitoring and classifying the states of hard disks, which is solved on a regular basis within the framework of the concept of non-destructive testing. It is proposed to solve this problem by developing a classification model using machine learning algorithms, in particular recurrent neural networks with the Simple RNN, LSTM and GRU architectures. To develop the classification model, a data set based on the values of SMART sensors installed on hard disks is used; it represents a group of multidimensional time series. The classification model contains two layers of a neural network with one of the recurrent architectures, as well as a Dropout layer and a Dense layer. The results of experimental studies confirming the advantages of the LSTM and GRU architectures as part of hard disk state classification models are presented.
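The GRU update that distinguishes this architecture from a Simple RNN can be sketched in a few lines. This is a generic GRU cell, not the paper's model; the dimensions (five SMART attributes per reading) and the random, untrained weights are assumptions for illustration only, and bias terms are omitted for brevity.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU update: gates decide how much of the old state to keep."""
    z = sigmoid(Wz @ x + Uz @ h)             # update gate
    r = sigmoid(Wr @ x + Ur @ h)             # reset gate
    h_cand = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
    return (1 - z) * h + z * h_cand          # blend old state and candidate

rng = np.random.default_rng(1)
n_in, n_hidden = 5, 4                        # e.g. 5 SMART attributes per reading
params = [rng.normal(scale=0.1, size=shape)
          for shape in [(n_hidden, n_in), (n_hidden, n_hidden)] * 3]

h = np.zeros(n_hidden)
for x in rng.normal(size=(20, n_in)):        # toy series of 20 SMART readings
    h = gru_step(x, h, *params)
print(h.shape)  # (4,)
```

The gating is what lets the model retain slowly degrading SMART indicators over long series, which is the advantage the experiments attribute to LSTM and GRU over the Simple RNN.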


2018 ◽  
Vol 210 ◽  
pp. 04019 ◽  
Author(s):  
Hyontai SUG

Recent world events in Go matches between humans and the artificial intelligence called AlphaGo showed a big advancement in machine learning technologies. While AlphaGo was trained using real-world data, AlphaGo Zero was trained using massive random data, and the fact that AlphaGo Zero beat AlphaGo completely revealed that diversity and size of training data are important for better performance of machine learning algorithms, especially deep learning algorithms based on neural networks. On the other hand, artificial neural networks and decision trees are widely accepted machine learning algorithms because of their robustness to errors and their comprehensibility, respectively. In this paper, in order to prove empirically that diversity and size of data are important factors for better performance of machine learning algorithms, these two representative algorithms are used for experiments. A real-world data set called breast tissue was chosen because it consists of real numbers, which is a very good property for generating artificial random data. The results of the experiments confirm that the diversity and size of data are very important factors for better performance.


IJOSTHE ◽  
2018 ◽  
Vol 5 (6) ◽  
pp. 7
Author(s):  
Apoorva Deshpande ◽  
Ramnaresh Sharma

Anomaly detection systems play an important role in network security. An anomaly detection or intrusion detection model is a predictive model used to classify network data traffic as normal or intrusive. Machine learning algorithms are used to build accurate models for clustering, classification and prediction. In this paper, classification and predictive models for intrusion detection are built using machine learning classification algorithms, namely Random Forest, and tested with the KDD-99 data set. In this research work, the model for anomaly detection is based on normalized, reduced features and a multilevel ensemble classifier. The work is divided into two stages. In the first stage, the data is normalized using mean normalization. In the second stage, a genetic algorithm is used to reduce the number of features, and a multilevel ensemble classifier is then used to classify the data into different attack groups. The result analysis shows that with the reduced features, intrusions can be classified more efficiently.
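The first stage, mean normalization, has a standard closed form: each feature is centered on its mean and scaled by its range. A minimal sketch, assuming this common definition (the paper itself does not spell out its formula):

```python
import numpy as np

def mean_normalize(X):
    """Mean normalization: center each feature, scale by its range."""
    mean = X.mean(axis=0)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0              # guard against constant features
    return (X - mean) / span

# Toy traffic features with very different scales (e.g. duration, bytes).
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])
Xn = mean_normalize(X)
print(Xn)
# [[-0.5 -0.5]
#  [ 0.   0. ]
#  [ 0.5  0.5]]
```

After this step every feature lies in [-0.5, 0.5] with zero mean, so no single large-scale attribute dominates the genetic feature selection or the downstream ensemble classifier.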


Author(s):  
Pratik Hopal ◽  
Alkesh Kothar ◽  
Swamini Pimpale ◽  
Pratiksha More ◽  
Jaydeep Patil

The election procedure is one of the most essential processes to take place in a democracy. Despite immense technological advancements, the election process has remained largely unchanged. Most elections are still conducted using ballot boxes, an old process that needs to be updated. The security of such practices is also a concern, as voter identification is done manually by election officers. The process also needs improvement to increase accuracy and reduce human error through automation. For this purpose, this research article analyzes previous research in this area. This allows an effective understanding of the machine learning algorithms that are used for automatic facial recognition in e-voting systems. The paper concludes that recurrent neural networks are best suited for facial recognition in such an application. Future editions of this research will elaborate on the proposed system in more detail.


Author(s):  
Junyi Wang ◽  
Qinggang Meng ◽  
Peng Shang ◽  
Mohamad Saada

This paper focuses on real-time road surface detection using a tripod dolly equipped with a Raspberry Pi 3 B+ and an MPU 9250, which makes it convenient to collect road surface data and perform real-time detection. Firstly, data for six kinds of road surfaces are collected using the Raspberry Pi 3 B+ and MPU 9250. Secondly, classifiers are obtained with several machine learning algorithms as well as recurrent neural networks (RNN) and long short-term memory (LSTM) neural networks. Among the machine learning classifiers, the gradient boosting decision tree has the highest accuracy rate, 97.92%, an improvement of 29.52% over KNN, which has the lowest accuracy rate, 75.60%. The accuracy rate of the LSTM neural network is 95.31%, an improvement of 2.79% over the RNN's accuracy rate of 92.52%. Finally, the classifiers are embedded into the Raspberry Pi to detect the road surface in real time, with a detection time of about one second. This road surface detection system could be used in a wheeled robot-car and guide the robot-car to move smoothly.
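Before either the classical classifiers or the recurrent networks can be applied, the continuous IMU stream has to be cut into fixed-length sequences. The paper does not publish its preprocessing code, so the following is only a plausible sketch; the window length, step, and six-channel layout (3-axis accelerometer plus 3-axis gyroscope, matching an MPU 9250) are assumptions.

```python
import numpy as np

def make_windows(stream, win_len, step):
    """Slice a multichannel sensor stream into fixed-length windows."""
    starts = range(0, len(stream) - win_len + 1, step)
    return np.stack([stream[s:s + win_len] for s in starts])

rng = np.random.default_rng(2)
stream = rng.normal(size=(100, 6))     # 100 samples: 3-axis accel + 3-axis gyro
X = make_windows(stream, win_len=20, step=10)
print(X.shape)  # (9, 20, 6) -- 9 overlapping windows of 20 samples each
```

Each `(20, 6)` window would then be fed to an RNN/LSTM as a sequence, or flattened into summary statistics for the tree-based and KNN classifiers.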


Author(s):  
Guilherme Loriato Potratz ◽  
Smith Washington Arauco Canchumuni ◽  
Jose David Bermudez Castro ◽  
Júlia Potratz ◽  
Marco Aurélio C. Pacheco

One of the critical processes in the exploration of hydrocarbons is the identification and prediction of the lithofacies that constitute the reservoir. One of the cheapest and most efficient ways to carry out that process is the interpretation of well log data, which are often obtained continuously and in the majority of drilled wells. The main methodologies used to correlate log data to data obtained in well cores are based on statistical analyses, machine learning models and artificial neural networks. This study aims to test a dimensionality reduction algorithm combined with an unsupervised classification method for predicting lithofacies automatically. The performance of the presented methodology was compared to predictions made with artificial neural networks. We used t-Distributed Stochastic Neighbor Embedding (t-SNE) as an algorithm for mapping the well log data into a smaller feature space. Then, facies predictions are made using a KNN algorithm. The method is assessed on the public data set of the Hugoton and Panoma fields. Prediction of facies through traditional artificial neural networks obtained an accuracy of 69%, while facies predicted through the t-SNE + KNN algorithm obtained an accuracy of 79%. Considering the nature of the data, which are high-dimensional and not linearly correlated, the efficiency of t-SNE + KNN can be explained by the ability of the algorithm to identify hidden patterns in data with fuzzy boundaries. It is important to stress that the application of machine learning algorithms offers relevant benefits to the hydrocarbon exploration sector, such as identifying hidden patterns in high-dimensional data sets, searching for complex and non-linear relationships, and avoiding the need for a preliminary definition of mathematical relations among the model's input data.
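The t-SNE + KNN pipeline can be sketched with scikit-learn on synthetic data. Note that t-SNE has no out-of-sample transform, so the whole data set is embedded first and the 2-D points are then split for the KNN step; the blob data here only stands in for the Hugoton/Panoma well logs, and all parameter choices are illustrative assumptions, not the study's settings.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Toy stand-in for well-log features: 150 samples, 10 dimensions, 4 "facies".
X, y = make_blobs(n_samples=150, n_features=10, centers=4, random_state=0)

# Step 1: map the high-dimensional logs into a 2-D feature space.
Z = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Step 2: classify facies in the embedded space with KNN.
Z_tr, Z_te, y_tr, y_te = train_test_split(Z, y, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(Z_tr, y_tr)
acc = knn.score(Z_te, y_te)
print(Z.shape, acc)
```

The transductive embedding is a known limitation of this pipeline: adding logs from a newly drilled well would require re-running t-SNE on the enlarged data set.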


Author(s):  
Bart Mak ◽  
Bülent Düz

For operations at sea it is important to have a good estimate of the current local sea state. Often, sea state information comes from wave buoys or weather forecasts; sometimes wave radars are used. These sources are not always available or reliable. Being able to reliably use ship motions to estimate sea state characteristics reduces the dependency on external and/or expensive sources. In this paper, we present a method to estimate sea state characteristics from time series of 6-DOF ship motions using machine learning. The available data consist of ship motion and wave scanning radar measurements recorded over a period of two years on a frigate-type vessel. The research focused on estimating the relative wave direction, since this is the most difficult characteristic to estimate using traditional methods. Time series are well suited as input, since the phase differences between motion signals hold the information relevant for this case. This type of input data requires machine learning algorithms that can capture both the relation between the input channels and the time dependence. To this end, convolutional neural networks (CNN) and recurrent neural networks (RNN) are adopted in this study for multivariate time series regression. The results show that the estimation of the relative wave direction is acceptable, assuming that the data set is large enough and covers enough sea states. Investigating the chronological properties of the data set, it turned out that this is not yet the case. The paper includes discussions on how to interpret the results and how to treat temporal data in a more general sense.


An Intrusion Detection System (IDS) is an essential component for securing internet assets. Due to the variety of network attacks, it is very hard to detect malicious activities from remote users as well as remote machines, so it is necessary to analyze whether such activities are normal or malicious. Without sufficient background knowledge of the system, detecting its malicious activities is difficult. In this work we propose an intrusion detection system using various soft computing algorithms. The system is organized into three sections. In the first section, we perform data preprocessing and generate the system's background knowledge from two training data sets combined with a genetic algorithm. Once the background knowledge has been generated, the system runs in prevention mode, which provides a defense mechanism against various network and host attacks. The system uses two data sets containing around 42 attributes, and it is able to support both NIDS and HIDS. The results section shows how the proposed system outperforms classical machine learning algorithms. Based on several comparative graphs and the systems' detection rates, we conclude that the proposed system provides strong supervision in vulnerable network environments. The average accuracy of the proposed system is 100% for DoS attacks and more than 90% for other and unknown attacks.

