MONITORING AND CLASSIFYING THE STATE OF HARD DISKS USING RECURRENT NEURAL NETWORKS

2021 ◽  
pp. 36-43
Author(s):  
L. A. Demidova ◽  
A. V. Filatov

The article considers an approach to the problem of monitoring and classifying the states of hard disks, a task performed on a regular basis within the framework of non-destructive testing. It is proposed to solve this problem by developing a classification model with machine learning algorithms, in particular recurrent neural networks with Simple RNN, LSTM and GRU architectures. The classification model is developed on a data set built from the values of SMART sensors installed on hard disks, which forms a group of multidimensional time series. The structure of the classification model contains two layers of a neural network with one of the recurrent architectures, as well as a Dropout layer and a Dense layer. The results of experimental studies confirming the advantages of the LSTM and GRU architectures in hard disk state classification models are presented.
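As a rough illustration of the model structure described above (two recurrent layers followed by a Dropout layer and a Dense layer), a minimal TensorFlow/Keras sketch is given below; the window length, number of SMART attributes, layer sizes and class count are illustrative assumptions, not values taken from the article:

import tensorflow as tf
from tensorflow.keras import layers, models

TIMESTEPS = 30      # assumed length of the SMART observation window (days)
N_FEATURES = 12     # assumed number of SMART attributes per observation
N_CLASSES = 2       # assumed classes, e.g. healthy vs. pre-failure

def build_disk_state_model(cell=layers.LSTM):
    # cell may be layers.SimpleRNN, layers.LSTM or layers.GRU, matching
    # the three recurrent architectures compared in the article.
    model = models.Sequential([
        layers.Input(shape=(TIMESTEPS, N_FEATURES)),
        cell(64, return_sequences=True),   # first recurrent layer
        cell(32),                          # second recurrent layer
        layers.Dropout(0.3),               # Dropout layer
        layers.Dense(N_CLASSES, activation="softmax"),  # Dense output layer
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model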

Author(s):  
R Vinayakumar ◽  
K.P. Soman ◽  
Prabaharan Poornachandran

This article describes how sequential data modeling is a relevant task in cybersecurity. Sequences carry temporal characteristics either explicitly or implicitly. Recurrent neural networks (RNNs) are a subset of artificial neural networks (ANNs) that have emerged as a powerful, principled approach to learning dynamic temporal behavior in large-scale sequence data of arbitrary length. Furthermore, stacked recurrent neural networks (S-RNNs) have the potential to learn complex temporal behavior quickly, including sparse representations. To leverage this, the authors model network traffic as a time series, specifically transmission control protocol / internet protocol (TCP/IP) packets within a predefined time range, with a supervised learning method that uses millions of known good and bad network connections. To find the best architecture, the authors carry out a comprehensive review of various RNN architectures together with their network parameters and network structures. As a test bed, they use the existing benchmark Defense Advanced Research Projects Agency / Knowledge Discovery and Data Mining (DARPA/KDD) Cup '99 intrusion detection (ID) contest data set to show the efficacy of the various RNN architectures. All deep learning experiments are run for up to 1000 epochs with a learning rate in the range [0.01, 0.5] on GPU-enabled TensorFlow, and the experiments with traditional machine learning algorithms are done using scikit-learn. The families of RNN architectures achieved a lower false positive rate than the traditional machine learning classifiers, primarily because RNN architectures can store information over long time lags and adapt to successive connection-sequence information. In addition, the effectiveness of the RNN architectures is shown on the UNSW-NB15 data set.
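As an illustrative sketch only (the abstract does not give the exact layer sizes or preprocessing), a stacked recurrent classifier for connection records in the KDD Cup '99 format might be set up as follows in TensorFlow/Keras; the sequence length and layer sizes are assumptions, and the learning rate is simply one value from the range quoted above:

import tensorflow as tf
from tensorflow.keras import layers, models, optimizers

SEQ_LEN = 100     # assumed number of connection records per input sequence
N_FEATURES = 41   # KDD Cup '99 connection records have 41 features

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    layers.LSTM(64, return_sequences=True),  # stacked recurrent layers (S-RNN)
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),   # good vs. bad connection
])
model.compile(optimizer=optimizers.Adam(learning_rate=0.01),  # within [0.01, 0.5]
              loss="binary_crossentropy",
              metrics=["accuracy", tf.keras.metrics.FalsePositives()])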


T-Comm ◽  
2021 ◽  
Vol 15 (9) ◽  
pp. 24-35
Author(s):  
Irina A. Krasnova

The paper analyzes the impact of the parameter settings of machine learning algorithms on the results of real-time traffic classification. The Random Forest and XGBoost algorithms are considered. A brief description of both methods and of the ways the classification results are evaluated is given. Experimental studies are conducted on a database obtained on a real network, separately for TCP and UDP flows. So that the results can be used in real time, a special feature matrix is created from the first 15 packets of each flow. The main parameters of the Random Forest (RF) algorithm considered for tuning are the number of trees, the partition criterion used, the maximum number of features for constructing the partition function, the depth of the tree, and the minimum number of samples in a node and in a leaf. For XGBoost, the number of trees, the depth of the tree, the minimum number of samples in a leaf, the fraction of features, and the percentage of samples needed to build each tree are considered. Increasing the number of trees raises accuracy up to a certain value, but, as shown in the article, it is important to make sure that the model is not overfitted; the remaining tree parameters are used to combat overfitting. On the data set under study, eliminating overfitting made it possible to increase the classification accuracy for individual applications by 11-12% for Random Forest and by 12-19% for XGBoost. The results show that parameter tuning is a very important step in building a traffic classification model, because it helps to combat overfitting and significantly increases the accuracy of the algorithm's predictions. In addition, it was shown that with properly tuned parameters, XGBoost, which is not very popular in traffic classification work, becomes a competitive algorithm and shows better results than the widespread Random Forest.
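For reference, the scikit-learn and XGBoost parameters corresponding to the tuning knobs listed above look roughly as follows; the concrete values are placeholders, not the settings reported in the paper:

# Sketch of the tunable parameters named in the abstract; the values
# are placeholders, not the settings found by the authors.
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

rf = RandomForestClassifier(
    n_estimators=200,        # number of trees
    criterion="gini",        # partition criterion
    max_features="sqrt",     # max features for the partition function
    max_depth=15,            # depth of the tree
    min_samples_split=4,     # minimum number of samples in a node
    min_samples_leaf=2,      # minimum number of samples in a leaf
)

xgb = XGBClassifier(
    n_estimators=200,        # number of trees
    max_depth=8,             # depth of the tree
    min_child_weight=2,      # roughly, minimum samples in a leaf
    colsample_bytree=0.8,    # fraction of features per tree
    subsample=0.8,           # percentage of samples used to build each tree
)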


2003 ◽  
Vol 15 (8) ◽  
pp. 1897-1929 ◽  
Author(s):  
Barbara Hammer ◽  
Peter Tiňo

Recent experimental studies indicate that recurrent neural networks initialized with "small" weights are inherently biased toward definite memory machines (Tiňo, Čerňanský, & Beňušková, 2002a, 2002b). This article establishes a theoretical counterpart: the transition function of a recurrent network with small weights and a squashing activation function is a contraction. We prove that recurrent networks with a contractive transition function can be approximated arbitrarily well on input sequences of unbounded length by a definite memory machine. Conversely, every definite memory machine can be simulated by a recurrent network with a contractive transition function. Hence, initialization with small weights induces an architectural bias into learning with recurrent neural networks. This bias might have benefits from the point of view of statistical learning theory: it emphasizes one possible region of the weight space where generalization ability can be formally proved. It is well known that standard recurrent neural networks are not distribution-independent learnable in the probably approximately correct (PAC) sense if arbitrary precision and inputs are considered. We prove that recurrent networks with a contractive transition function with a fixed contraction parameter fulfill the so-called distribution-independent uniform convergence of empirical distances property and hence, unlike general recurrent networks, are distribution-independent PAC learnable.
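For intuition, here is a standard one-line argument (not quoted from the article) for why small weights yield a contraction: with a squashing state update h_t = tanh(W x_t + U h_{t-1} + b), the bound |tanh'| <= 1 gives

||tanh(W x + U h + b) - tanh(W x + U h' + b)|| <= ||U|| ||h - h'||,

so the transition is a contraction in the hidden state whenever ||U|| < 1 (small recurrent weights), and the influence of an input k steps in the past decays at least as fast as ||U||^k, which is precisely the definite-memory behavior referred to above.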


Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2875 ◽  
Author(s):  
Vessela Krasteva ◽  
Sarah Ménétré ◽  
Jean-Philippe Didon ◽  
Irena Jekova

Deep neural networks (DNN) are state-of-the-art machine learning algorithms that can learn to self-extract significant features of the electrocardiogram (ECG) and can generally provide high diagnostic accuracy if subjected to robust training and optimization on large datasets at high computational cost. So far, limited research on the optimization of DNNs in shock advisory systems has been done on large ECG arrhythmia databases from out-of-hospital cardiac arrests (OHCA). The objective of this study is to optimize the hyperparameters (HPs) of deep convolutional neural networks (CNN) for the detection of shockable (Sh) and nonshockable (NSh) rhythms, and to validate the best HP settings for short and long analysis durations (2–10 s). Large numbers of (Sh + NSh) ECG samples were used for training (720 + 3170) and validation (739 + 5921) from Holters and defibrillators in OHCA. An end-to-end deep CNN architecture was implemented with a one-lead raw ECG input layer (5 s, 125 Hz, 2.5 uV/LSB), a configurable number of 5 to 23 hidden layers, and an output layer with diagnostic probability p ∈ [0: Sh, 1: NSh]. The hidden layers contain N convolutional blocks × 3 layers (Conv1D (filters = Fi, kernel size = Ki), max-pooling (pool size = 2), dropout (rate = 0.3)), one global max-pooling layer and one dense layer. Random search optimization of HPs = {N, Fi, Ki}, i = 1, …, N was performed over a large grid of N = [1, 2, … 7], Fi = [5; 50], Ki = [5; 100]. During training, the model with maximal balanced accuracy BAC = (Sensitivity + Specificity)/2 over 400 epochs was stored. The optimization principle is based on finding the common HP space of a few top-ranked models and predicting a robust HP setting from their median values. The optimal models for 1–7 CNN layers were trained with different learning rates LR = [10−5; 10−2], and the best model was finally validated on 2–10 s analysis durations. A total of 4216 random search models were trained. The optimal models with more than three convolutional layers did not exhibit substantial differences in performance (BAC = 99.31–99.5%). Among them, the best model was found with {N = 5, Fi = {20, 15, 15, 10, 5}, Ki = {10, 10, 10, 10, 10}, 7521 trainable parameters}, with maximal validation performance for 5-s analysis (BAC = 99.5%, Se = 99.6%, Sp = 99.4%) and a tolerable drop in performance (<2 percentage points) for very short 2-s analysis (BAC = 98.2%, Se = 97.6%, Sp = 98.7%). DNN application in future-generation shock advisory systems can improve the detection performance of Sh and NSh rhythms and can considerably shorten the analysis duration, complying with resuscitation guidelines for minimal hands-off pauses.
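As a rough sketch of the block structure described above, instantiated with the reported best setting {N = 5, Fi = {20, 15, 15, 10, 5}, Ki = 10} for the 5-s, 125-Hz one-lead input; the padding, activation, optimizer and loss are assumptions, so the parameter count will not necessarily match the reported 7521:

# Sketch of the end-to-end 1D CNN described in the abstract, using the
# reported best hyperparameters; training details are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

FILTERS = [20, 15, 15, 10, 5]   # Fi of the best-ranked model
KERNEL = 10                     # Ki = 10 for all blocks
INPUT_SAMPLES = 5 * 125         # 5-s analysis at 125 Hz, one lead

inputs = layers.Input(shape=(INPUT_SAMPLES, 1))
x = inputs
for f in FILTERS:
    # one convolutional block = Conv1D + max-pooling + dropout
    x = layers.Conv1D(filters=f, kernel_size=KERNEL, padding="same",
                      activation="relu")(x)
    x = layers.MaxPooling1D(pool_size=2)(x)
    x = layers.Dropout(0.3)(x)
x = layers.GlobalMaxPooling1D()(x)
outputs = layers.Dense(1, activation="sigmoid")(x)  # p in [0: Sh, 1: NSh]

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")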


2019 ◽  
Vol 8 (4) ◽  
pp. 160 ◽  
Author(s):  
Bingxin Liu ◽  
Ying Li ◽  
Guannan Li ◽  
Anling Liu

Spectral characteristics play an important role in the classification of oil films, but the presence of too many bands can lead to information redundancy and reduced classification accuracy. In this study, a classification model that combines spectral-indices-based band selection (SIs) and one-dimensional convolutional neural networks is proposed to realize automatic oil film classification from hyperspectral remote sensing images. Additionally, for comparison, minimum Redundancy Maximum Relevance (mRMR) was tested for reducing the number of bands. A support vector machine (SVM), random forest (RF), and Hu's convolutional neural network (CNN) were trained and tested. The results show that the accuracy of the one-dimensional convolutional neural network (1D CNN) models surpassed that of the other machine learning algorithms such as SVM and RF. The SIs + 1D CNN model could produce a more accurate oil film distribution map in less time than the other models.
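As an illustrative sketch only (the number of selected bands, layer sizes and class count are assumptions, not values from the study), a 1D CNN operating on a per-pixel spectral vector after band selection might look like this:

# Illustrative 1D CNN classifying per-pixel spectra after band selection;
# all sizes are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

N_BANDS = 20     # assumed number of bands kept by SIs/mRMR selection
N_CLASSES = 4    # assumed number of oil-film classes

model = models.Sequential([
    layers.Input(shape=(N_BANDS, 1)),
    layers.Conv1D(32, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Conv1D(64, kernel_size=3, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])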


2020 ◽  
Author(s):  
Jiahua Zhao ◽  
Miaki Ishii ◽  
Hiromi Ishii ◽  
Thomas Lee

Analog seismograms contain rich and valuable information spanning nearly a century. However, these analog seismic records are difficult to analyze quantitatively using modern techniques that require digital time series. At the same time, because these seismograms are deteriorating with age and need substantial storage space, their future has become uncertain. Converting the analog seismograms to digital time series will allow more conventional access and storage of the data as well as make them available for exciting scientific discovery. The digitization software, DigitSeis, reads a scanned image of a seismogram and generates digitized and timed traces, but the initial step of recognizing trace and time-mark segments, as well as other features such as hand-written notes, within the image poses certain challenges. Armed with manually processed analyses of image classification, we aim to automate this process using machine learning algorithms. Semantic segmentation methods have made breakthroughs in many fields. To solve the problem of accurately classifying scanned images of analog seismograms, we develop and test an improved deep convolutional neural network based on U-Net, Improved U-Net, and a deeper segmentation network that adds residual blocks, ResU-Net. The two segmentation objects are the traces and the time marks in the scanned images, and the goal is to train a binary classification model for each type of segmentation object, i.e., there are two models for each of the neural networks, one for trace objects and another for time-mark objects. The networks are trained on 300 images of the digitized results of analog seismograms from the Harvard-Adam Dziewoński Observatory from 1939. Application of the algorithms to a test data set results in a pixel accuracy (PA) for the Improved U-Net of 95% for traces and nearly 100% for time marks, with Intersection over Union (IoU) of 79% and 75% for traces and time marks, respectively. The PA of ResU-Net is 97% and nearly 100% for traces and time marks, with IoU of 83% and 74%. These experiments show that Improved U-Net is more effective for semantic segmentation of time marks, while ResU-Net is more suitable for traces. In general, both network models work well in separating and identifying objects and provide a significant step toward fully automating the digitization of analog seismograms.
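As a rough sketch of the residual block that distinguishes ResU-Net from the plain encoder-decoder of Improved U-Net; the filter counts, kernel size and layer ordering are assumptions, not the authors' configuration:

# Illustrative residual convolutional block of the kind used in ResU-Net;
# sizes and ordering are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters, kernel_size=3):
    shortcut = layers.Conv2D(filters, 1, padding="same")(x)  # match channel count
    y = layers.Conv2D(filters, kernel_size, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, kernel_size, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([shortcut, y])     # residual (skip) connection
    return layers.Activation("relu")(y)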


1994 ◽  
Vol 25 (8) ◽  
pp. 27-39
Author(s):  
Tatsumi Watanabe ◽  
Yoshiki Uchikawa ◽  
Kazutoshi Gouhara

2001 ◽  
Vol 40 (05) ◽  
pp. 386-391 ◽  
Author(s):  
H. R. Doyle ◽  
B. Parmanto

Summary Objectives: This paper investigates a version of a recurrent neural network with the backpropagation through time (BPTT) algorithm for predicting liver transplant graft failure based on a time series sequence of clinical observations. The objective is to improve upon current approaches to liver transplant outcome prediction by developing a more complete model that takes into account not only the preoperative risk assessment but also the early postoperative history. Methods: A 6-fold cross-validation procedure was used to measure the performance of the networks. The data set was divided into a learning set and a test set while maintaining the same proportion of positive and negative cases as in the original set. The effects of network complexity on overfitting were investigated by constructing two types of networks with different numbers of hidden units. For each type of network, 10 individual networks were trained on the learning set and used to form a committee. The performance of the networks was measured exhaustively with respect to both the entire learning set and the test set. Results: The networks were capable of learning the time series problem and achieved good performance: 90% correct classification on the learning set and 78% on the test set. The prediction accuracy increases as more information becomes progressively available after the operation, with daily improvements of 10% on the learning set and 5% on the test set. Conclusions: Recurrent neural networks trained with the BPTT algorithm are capable of learning to represent the temporal behavior of the time series prediction task. This model is an improvement upon the current model, which does not take postoperative temporal information into account.
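As a schematic sketch only (the 2001 study predates current frameworks, and the input dimensions, layer sizes and committee rule here are assumptions), a recurrent classifier trained by backpropagation through time, with a simple output-averaging committee, could be written as:

# Schematic committee of recurrent classifiers trained by BPTT (which is
# what fitting a Keras RNN does); all sizes are assumptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

N_DAYS = 10       # assumed length of the postoperative observation window
N_FEATURES = 8    # assumed number of clinical variables per day
COMMITTEE = 10    # 10 networks per committee, as in the abstract

def build_net(hidden_units):
    model = models.Sequential([
        layers.Input(shape=(N_DAYS, N_FEATURES)),
        layers.SimpleRNN(hidden_units),           # trained via BPTT
        layers.Dense(1, activation="sigmoid"),    # graft failure probability
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model

def committee_predict(nets, x):
    # Average the member outputs to form the committee prediction.
    return np.mean([net.predict(x, verbose=0) for net in nets], axis=0)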


Author(s):  
Chen Li ◽  
Junjun Zheng

Malicious software, called malware, can perform harmful actions on computer systems, which may cause economic damage and information leakage. Therefore, malware classification is meaningful and required to prevent malware attacks. Application programming interface (API) call sequences are easily observed and are good choices as features for malware classification. However, one of the main issues is how to generate a suitable feature representation for the classification algorithms to achieve high classification accuracy. Different malware samples produce API call sequences of different lengths, and these lengths may reach millions of calls, which causes high computational cost and time complexity. Recurrent neural networks (RNNs) are among the most versatile approaches to processing time series data and can be applied to API call-based malware classification. In this paper, we propose a malware classification model with RNNs, especially the long short-term memory (LSTM) and the gated recurrent unit (GRU), to classify variants of malware using long sequences of API calls. In numerical experiments, a benchmark dataset is used to illustrate the proposed approach and validate its accuracy. The numerical results show that the proposed RNN model works well on malware classification.
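As an illustrative sketch (the vocabulary size, truncation length, layer sizes and class count are assumptions, not values from the paper), an LSTM classifier over integer-encoded API call sequences could look like this; replacing layers.LSTM with layers.GRU gives the GRU variant mentioned above:

# Illustrative LSTM classifier over integer-encoded API call sequences;
# all sizes are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 300    # assumed number of distinct API calls
MAX_LEN = 1000      # long sequences padded/truncated to this length
N_CLASSES = 8       # assumed number of malware families

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,), dtype="int32"),
    layers.Embedding(VOCAB_SIZE, 64, mask_zero=True),  # learn API embeddings
    layers.LSTM(128),                                   # or layers.GRU(128)
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])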

