The intrusion detection system (IDS) security model is
successfully utilized in static, distributed, and dynamic network
environments. In general, an IDS needs a classification method to
decide whether events are normal or abnormal. This classification
task is based on a set of features and a massive number of samples.
However, not all features contribute equally to the prediction
during classification. Hence, feature selection (FS) has to be
performed before classification to select the best features. Random
forest (RF) plays the dual role of FS and classification, as
sketched below.
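A minimal sketch of this dual role, assuming scikit-learn and
synthetic placeholder data in place of real IDS events (none of
these names or values come from the paper):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # Synthetic stand-in for preprocessed IDS events: 1000 samples,
    # 20 features, binary labels (0 = normal, 1 = abnormal).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 20))
    y = rng.integers(0, 2, size=1000)

    # FS role: fit an RF and rank features by impurity-based importance.
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    keep = rf.feature_importances_.argsort()[::-1][:10]  # keep the top 10

    # Classification role: refit the RF on the selected features only.
    rf_final = RandomForestClassifier(n_estimators=100, random_state=0)
    rf_final.fit(X[:, keep], y)
    print(rf_final.predict(X[:5][:, keep]))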
Experiments have shown that RF is the best classifier among
machine learning (ML) algorithms such as the SVM classifier and
the C5.0 decision tree algorithm. However, the default parameter
values of RF are not well suited for distributed environments such
as the cloud: they lead to poor accuracy and low efficiency in
intrusion detection, since an enormous number of events has to be
analyzed. The parameters of RF therefore have to be optimized by
an efficient method.
The important parameters of RF are the number of trees, the
maximum depth of a tree, the sample size, the number of features
considered when splitting a node (Mtry), the node size, and the
maximum number of leaf nodes. Among these parameters, the
hyperparameters to tune are selected based on three decision
factors: randomness, split rule, and tree complexity. The key
issues to address during parameter tuning are avoiding
over-fitting and under-fitting.
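For concreteness, these six parameters correspond roughly to the
following scikit-learn RandomForestClassifier arguments (an
illustrative sketch; the values shown are placeholders, not the
paper's settings):

    from sklearn.ensemble import RandomForestClassifier

    rf = RandomForestClassifier(
        n_estimators=100,     # number of trees
        max_depth=20,         # maximum depth of a tree
        max_samples=0.8,      # sample size drawn per tree
        max_features="sqrt",  # features considered per split (Mtry)
        min_samples_leaf=5,   # node size: minimum samples in a leaf
        max_leaf_nodes=None,  # maximum leaf nodes (None = unlimited)
        bootstrap=True,       # max_samples requires bootstrapping
        random_state=0,
    )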
Therefore, Simulated Annealing (SA) is utilized to tune these
hyperparameters of RF, which improves the detection accuracy and
efficiency of the IDS. The idea of using SA for the parameter
optimization process is to avoid those issues, since it does not
get stuck in local optima.
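A minimal sketch of such SA-based tuning: the search space,
geometric cooling schedule, and cross-validated accuracy fitness
below are illustrative assumptions, not the paper's exact
procedure.

    import math
    import random
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Placeholder data standing in for preprocessed IDS events.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 20))
    y = rng.integers(0, 2, size=500)

    # Illustrative candidate grids for the hyperparameters being tuned.
    SPACE = {
        "n_estimators": [50, 100, 200],
        "max_depth": [5, 10, 20, 30],
        "max_features": ["sqrt", "log2", 0.5],
        "min_samples_leaf": [1, 2, 5, 10],
    }

    def score(params):
        # Fitness: mean cross-validated accuracy of an RF with these settings.
        rf = RandomForestClassifier(random_state=0, **params)
        return cross_val_score(rf, X, y, cv=3).mean()

    def neighbor(params):
        # Perturb one randomly chosen hyperparameter to another grid value.
        new = dict(params)
        key = random.choice(list(SPACE))
        new[key] = random.choice(SPACE[key])
        return new

    def simulated_annealing(t0=1.0, cooling=0.9, steps=30):
        current = {k: random.choice(v) for k, v in SPACE.items()}
        current_score, t = score(current), t0
        best, best_score = current, current_score
        for _ in range(steps):
            cand = neighbor(current)
            cand_score = score(cand)
            # Always accept improvements; accept worse candidates with
            # probability exp(delta / t), which lets the search escape
            # local optima instead of getting stuck in them.
            delta = cand_score - current_score
            if delta > 0 or random.random() < math.exp(delta / t):
                current, current_score = cand, cand_score
                if current_score > best_score:
                    best, best_score = current, current_score
            t *= cooling  # geometric cooling schedule
        return best, best_score

    random.seed(0)
    print(simulated_annealing())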
The proposed system significantly boosts the results of the IDS.
The efficiency of the proposed SA-RF is validated using the
CICIDS2017 dataset.
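A hedged sketch of that validation step, assuming the CICIDS2017
CSV files have been merged into a single cicids2017.csv with a
"Label" column whose benign class is marked "BENIGN" (the file
name and cleanup steps are assumptions, not from the paper):

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, f1_score
    from sklearn.model_selection import train_test_split

    # Assumed layout: one merged, cleaned CSV (inf/NaN flow statistics
    # already removed); CICIDS2017 actually ships as several daily CSVs.
    df = pd.read_csv("cicids2017.csv").dropna()
    y = (df["Label"] != "BENIGN").astype(int)  # binary: normal vs. attack
    X = df.drop(columns=["Label"]).select_dtypes("number")

    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0)
    rf = RandomForestClassifier(n_estimators=100, random_state=0)
    rf.fit(X_tr, y_tr)
    pred = rf.predict(X_te)
    print("accuracy:", accuracy_score(y_te, pred),
          "F1:", f1_score(y_te, pred))

In the full SA-RF pipeline, the RandomForestClassifier arguments
here would be replaced by the hyperparameter values returned by
the SA search.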