Analysis of Effects of Image Format on Detection Performance and Resource Usage in CNN-Based Malware Detection

2021 ◽  
Vol 21 (4) ◽  
pp. 69-75
Author(s):  
Seong-hyeon Byeon ◽  
◽  
Young-won Kim ◽  
Kwan-seob Ko ◽  
Soo-jin Lee
Cryptography ◽  
2021 ◽  
Vol 5 (4) ◽  
pp. 28
Author(s):  
Hossein Sayadi ◽  
Yifeng Gao ◽  
Hosein Mohammadi Makrani ◽  
Jessica Lin ◽  
Paulo Cesar Costa ◽  
...  

According to recent security analysis reports, malicious software (a.k.a. malware) is rising at an alarming rate in numbers, complexity, and harmful purposes to compromise the security of modern computer systems. Recently, malware detection based on low-level hardware features (e.g., Hardware Performance Counters (HPCs) information) has emerged as an effective alternative solution to address the complexity and performance overheads of traditional software-based detection methods. Hardware-assisted Malware Detection (HMD) techniques depend on standard Machine Learning (ML) classifiers to detect signatures of malicious applications by monitoring built-in HPC registers during execution at run-time. Prior HMD methods though effective have limited their study on detecting malicious applications that are spawned as a separate thread during application execution, hence detecting stealthy malware patterns at run-time remains a critical challenge. Stealthy malware refers to harmful cyber attacks in which malicious code is hidden within benign applications and remains undetected by traditional malware detection approaches. In this paper, we first present a comprehensive review of recent advances in hardware-assisted malware detection studies that have used standard ML techniques to detect the malware signatures. Next, to address the challenge of stealthy malware detection at the processor’s hardware level, we propose StealthMiner, a novel specialized time series machine learning-based approach to accurately detect stealthy malware trace at run-time using branch instructions, the most prominent HPC feature. StealthMiner is based on a lightweight time series Fully Convolutional Neural Network (FCN) model that automatically identifies potentially contaminated samples in HPC-based time series data and utilizes them to accurately recognize the trace of stealthy malware. Our analysis demonstrates that using state-of-the-art ML-based malware detection methods is not effective in detecting stealthy malware samples since the captured HPC data not only represents malware but also carries benign applications’ microarchitectural data. The experimental results demonstrate that with the aid of our novel intelligent approach, stealthy malware can be detected at run-time with 94% detection performance on average with only one HPC feature, outperforming the detection performance of state-of-the-art HMD and general time series classification methods by up to 42% and 36%, respectively.


Algorithms ◽  
2018 ◽  
Vol 11 (8) ◽  
pp. 124 ◽  
Author(s):  
Yihong Li ◽  
Fangzheng Liu ◽  
Zhenyu Du ◽  
Dubing Zhang

In the malware detection process, obfuscated malicious codes cannot be efficiently and accurately detected solely in the dynamic or static feature space. Aiming at this problem, an integrative feature extraction algorithm based on simhash was proposed, which combines the static information e.g., API (Application Programming Interface) calls and dynamic information (such as file, registry and network behaviors) of malicious samples to form integrative features. The experiment extracts the integrative features of some static information and dynamic information, and then compares the classification, time and obfuscated-detection performance of the static, dynamic and integrated features, respectively, by using several common machine learning algorithms. The results show that the integrative features have better time performance than the static features, and better classification performance than the dynamic features, and almost the same obfuscated-detection performance as the dynamic features. This algorithm can provide some support for feature extraction of malware detection.


2020 ◽  
Vol 2020 (10) ◽  
pp. 310-1-310-7
Author(s):  
Khalid Omer ◽  
Luca Caucci ◽  
Meredith Kupinski

This work reports on convolutional neural network (CNN) performance on an image texture classification task as a function of linear image processing and number of training images. Detection performance of single and multi-layer CNNs (sCNN/mCNN) are compared to optimal observers. Performance is quantified by the area under the receiver operating characteristic (ROC) curve, also known as the AUC. For perfect detection AUC = 1.0 and AUC = 0.5 for guessing. The Ideal Observer (IO) maximizes AUC but is prohibitive in practice because it depends on high-dimensional image likelihoods. The IO performance is invariant to any fullrank, invertible linear image processing. This work demonstrates the existence of full-rank, invertible linear transforms that can degrade both sCNN and mCNN even in the limit of large quantities of training data. A subsequent invertible linear transform changes the images’ correlation structure again and can improve this AUC. Stationary textures sampled from zero mean and unequal covariance Gaussian distributions allow closed-form analytic expressions for the IO and optimal linear compression. Linear compression is a mitigation technique for high-dimension low sample size (HDLSS) applications. By definition, compression strictly decreases or maintains IO detection performance. For small quantities of training data, linear image compression prior to the sCNN architecture can increase AUC from 0.56 to 0.93. Results indicate an optimal compression ratio for CNN based on task difficulty, compression method, and number of training images.


2018 ◽  
Vol 6 (12) ◽  
pp. 879-887
Author(s):  
Om Prakash Samantray ◽  
Satya Narayana Tripathy ◽  
Susant Kumar Das

Sign in / Sign up

Export Citation Format

Share Document