Host-based Intrusion Detection Systems (HIDS) automatically detect events that indicate compromise by adversarial applications. HIDS are generally formulated as analyses of sequences of system events such as bash commands or system calls. Anomaly-based approaches to HIDS leverage models of normal (aka baseline) system behavior to detect and report abnormal events, and have the advantage of being able to detect novel attacks. In this paper, we develop a new method for anomaly-based HIDS using deep learning predictions of sequence-to-sequence behavior in system calls. Our proposed method, called the ALAD algorithm, aggregates predictions at the application level to detect anomalies. We investigate the use of several deep learning architectures, including WaveNet and several recurrent networks. We show that ALAD empowered with deep learning significantly outperforms previous approaches. We train and evaluate our models using an existing dataset, ADFA-LD, and a new dataset of our own construction, PLAID. Because deep learning models are black boxes by nature, we use an alternate approach, allotaxonographs, to characterize and understand differences in baseline vs. attack sequences in HIDS datasets such as PLAID.
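
The idea of aggregating per-call predictions at the application level can be illustrated with a minimal sketch. This is not the ALAD algorithm itself: it assumes a smoothed bigram model as a stand-in for the deep sequence models, and it assumes mean negative log-likelihood as the aggregation rule. The function names `fit_bigram_model` and `trace_score`, the smoothing parameter `alpha`, and the toy traces are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def fit_bigram_model(baseline_traces, vocab_size, alpha=1.0):
    """Estimate P(next call | previous call) from baseline traces.

    Assumption: a bigram model with add-alpha smoothing stands in for
    a learned sequence model; it is used only to keep the sketch small.
    """
    counts = defaultdict(Counter)
    for trace in baseline_traces:
        for prev, nxt in zip(trace, trace[1:]):
            counts[prev][nxt] += 1

    def prob(prev, nxt):
        total = sum(counts[prev].values())
        return (counts[prev][nxt] + alpha) / (total + alpha * vocab_size)

    return prob


def trace_score(trace, prob):
    """Aggregate per-call surprise into one score per application trace.

    Assumption: the aggregate is the mean negative log-likelihood of each
    observed call given the previous one. Higher means more anomalous.
    """
    nll = [-math.log(prob(prev, nxt)) for prev, nxt in zip(trace, trace[1:])]
    return sum(nll) / len(nll) if nll else 0.0


# Toy system-call IDs (hypothetical); baseline behavior repeats 1 -> 2 -> 3.
baseline = [[1, 2, 3, 1, 2, 3, 1, 2, 3]] * 3
prob = fit_bigram_model(baseline, vocab_size=5)

normal_score = trace_score([1, 2, 3, 1, 2, 3], prob)
attack_score = trace_score([1, 4, 0, 4, 0, 4], prob)
```

In this sketch a trace would be flagged when its score exceeds a threshold chosen on held-out baseline traces; the choice of threshold is likewise an assumption and is not specified in the text above.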