Analysis of Effects of Image Format on Detection Performance and Resource Usage in CNN-Based Malware Detection

Seong-hyeon Byeon; Young-won Kim; Kwan-seob Ko; Soo-jin Lee

doi:10.33778/kcsa.2021.21.4.069

Enhancing Smartphone Malware Detection Performance by Applying Machine Learning Hybrid Classifiers

Computer Applications for Software Engineering, Disaster Recovery, and Business Continuity - Communications in Computer and Information Science ◽

10.1007/978-3-642-35267-6_17 ◽

2012 ◽

pp. 131-137 ◽

Cited By ~ 5

Author(s):

Abdelfattah Amamra ◽

Chamseddine Talhi ◽

Jean-Marc Robert ◽

Martin Hamiche

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Detection Performance ◽

Hybrid Classifiers


Towards Accurate Run-Time Hardware-Assisted Stealthy Malware Detection: A Lightweight, Yet Effective Time Series CNN-Based Approach

Cryptography ◽

10.3390/cryptography5040028 ◽

2021 ◽

Vol 5 (4) ◽

pp. 28

Author(s):

Hossein Sayadi ◽

Yifeng Gao ◽

Hosein Mohammadi Makrani ◽

Jessica Lin ◽

Paulo Cesar Costa ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Data ◽

State Of The Art ◽

Malware Detection ◽

Detection Performance ◽

Malicious Code ◽

Detection Methods ◽

Series Data ◽

Run Time

According to recent security analysis reports, malicious software (malware) is growing at an alarming rate in volume, complexity, and harmful intent, compromising the security of modern computer systems. Recently, malware detection based on low-level hardware features (e.g., Hardware Performance Counter (HPC) information) has emerged as an effective alternative that addresses the complexity and performance overheads of traditional software-based detection methods. Hardware-assisted Malware Detection (HMD) techniques rely on standard Machine Learning (ML) classifiers to detect signatures of malicious applications by monitoring built-in HPC registers at run-time. Prior HMD methods, though effective, have been limited to detecting malicious applications that are spawned as a separate thread during application execution; detecting stealthy malware patterns at run-time therefore remains a critical challenge. Stealthy malware refers to harmful cyber attacks in which malicious code is hidden within benign applications and remains undetected by traditional malware detection approaches. In this paper, we first present a comprehensive review of recent advances in hardware-assisted malware detection studies that have used standard ML techniques to detect malware signatures. Next, to address the challenge of stealthy malware detection at the processor’s hardware level, we propose StealthMiner, a novel, specialized time series machine learning-based approach that accurately detects stealthy malware traces at run-time using branch instructions, the most prominent HPC feature. StealthMiner is based on a lightweight time series Fully Convolutional Neural Network (FCN) model that automatically identifies potentially contaminated samples in HPC-based time series data and uses them to accurately recognize the trace of stealthy malware.
Our analysis demonstrates that state-of-the-art ML-based malware detection methods are not effective in detecting stealthy malware samples, since the captured HPC data represents not only the malware but also the benign applications’ microarchitectural data. The experimental results demonstrate that, with our novel approach, stealthy malware can be detected at run-time with 94% detection performance on average using only one HPC feature, outperforming state-of-the-art HMD and general time series classification methods by up to 42% and 36%, respectively.
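
The abstract above describes a fully convolutional model over a single HPC time series but gives no architecture details. As a rough illustration only, the sketch below shows the generic building blocks of a time series FCN forward pass (1D convolution, ReLU, global average pooling) in plain Python. The function names, the kernels, and the toy series are hypothetical and are not taken from StealthMiner.

```python
from typing import List, Sequence


def conv1d(series: Sequence[float], kernel: Sequence[float], bias: float = 0.0) -> List[float]:
    """Valid (no padding) 1D cross-correlation of a series with one kernel."""
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(series) - k + 1)
    ]


def relu(values: Sequence[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def global_average_pool(values: Sequence[float]) -> float:
    """Collapse a feature map to one number, independent of series length."""
    return sum(values) / len(values)


def fcn_features(series: Sequence[float], kernels: Sequence[Sequence[float]]) -> List[float]:
    """One pooled feature per kernel: conv -> ReLU -> global average pooling."""
    return [global_average_pool(relu(conv1d(series, k))) for k in kernels]


# Toy example (assumed data): a rising counter trace and two hand-picked kernels.
# A trained model would learn its kernels and feed the pooled features to a classifier.
trace = [1, 2, 3, 4]
features = fcn_features(trace, [[1, -1], [-1, 1]])
```

Because global average pooling removes the time dimension, the same kernels can be applied to traces of different lengths, which is one reason fully convolutional designs are commonly used for time series classification.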


Analysis of Android malware detection performance using machine learning classifiers

2013 International Conference on ICT Convergence (ICTC) ◽

10.1109/ictc.2013.6675404 ◽

2013 ◽

Cited By ~ 7

Author(s):

Hyo-Sik Ham ◽

Mi-Jung Choi

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Detection Performance ◽

Android Malware ◽

Machine Learning Classifiers ◽

Android Malware Detection ◽

Learning Classifiers


A Simhash-Based Integrative Features Extraction Algorithm for Malware Detection

Algorithms ◽

10.3390/a11080124 ◽

2018 ◽

Vol 11 (8) ◽

pp. 124 ◽

Cited By ~ 1

Author(s):

Yihong Li ◽

Fangzheng Liu ◽

Zhenyu Du ◽

Dubing Zhang

Keyword(s):

Feature Extraction ◽

Malware Detection ◽

Application Programming Interface ◽

Classification Performance ◽

Detection Performance ◽

Machine Learning Algorithms ◽

Dynamic Features ◽

Dynamic Information ◽

Static Information ◽

Extraction Algorithm

In malware detection, obfuscated malicious code cannot be detected efficiently and accurately in the dynamic or the static feature space alone. To address this problem, an integrative feature extraction algorithm based on simhash is proposed. It combines static information (e.g., API (Application Programming Interface) calls) and dynamic information (such as file, registry, and network behaviors) of malicious samples to form integrative features. The experiment extracts integrative features from a subset of the static and dynamic information, then compares the classification, time, and obfuscated-detection performance of the static, dynamic, and integrative features using several common machine learning algorithms. The results show that the integrative features have better time performance than the static features, better classification performance than the dynamic features, and almost the same obfuscated-detection performance as the dynamic features. The algorithm can thus support feature extraction for malware detection.


Impact of Dataset Representation on Smartphone Malware Detection Performance

Trust Management VII - IFIP Advances in Information and Communication Technology ◽

10.1007/978-3-642-38323-6_12 ◽

2013 ◽

pp. 166-176 ◽

Cited By ~ 3

Author(s):

Abdelfattah Amamra ◽

Chamseddine Talhi ◽

Jean-Marc Robert

Keyword(s):

Malware Detection ◽

Detection Performance


Understanding Target Detection Performance When Receiving Visual or Auditory Cues

PsycEXTRA Dataset ◽

10.1037/e618602012-013 ◽

2002 ◽

Author(s):

David C. Cibik ◽

Erich W. Meyerhoff

Keyword(s):

Target Detection ◽

Detection Performance ◽

Auditory Cues


Correlation Analysis of Dataset Size and Accuracy of the CNN-based Malware Detection Algorithm

Journal of Information and Security ◽

10.33778/kcsa.2020.20.3.053 ◽

2020 ◽

Vol 20 (3) ◽

pp. 53-60

Author(s):

Dong Jun Choi ◽


Jae Woo Lee

Keyword(s):

Correlation Analysis ◽

Malware Detection ◽

Detection Algorithm ◽

Dataset Size


CNN performance dependence on linear image processing

Electronic Imaging ◽

10.2352/issn.2470-1173.2020.10.ipas-182 ◽

2020 ◽

Vol 2020 (10) ◽

pp. 310-1-310-7

Author(s):

Khalid Omer ◽

Luca Caucci ◽

Meredith Kupinski

Keyword(s):

Image Processing ◽

Texture Classification ◽

Full Rank ◽

Detection Performance ◽

Ideal Observer ◽

Training Data ◽

Image Texture ◽

Training Images ◽

Analytic Expressions ◽

Linear Compression

This work reports on convolutional neural network (CNN) performance on an image texture classification task as a function of linear image processing and the number of training images. The detection performance of single- and multi-layer CNNs (sCNN/mCNN) is compared to that of optimal observers. Performance is quantified by the area under the receiver operating characteristic (ROC) curve, also known as the AUC. For perfect detection, AUC = 1.0; for guessing, AUC = 0.5. The Ideal Observer (IO) maximizes AUC but is prohibitive in practice because it depends on high-dimensional image likelihoods. IO performance is invariant to any full-rank, invertible linear image processing. This work demonstrates the existence of full-rank, invertible linear transforms that can degrade both sCNN and mCNN performance, even in the limit of large quantities of training data. A subsequent invertible linear transform changes the images’ correlation structure again and can improve this AUC. Stationary textures sampled from zero-mean, unequal-covariance Gaussian distributions allow closed-form analytic expressions for the IO and for optimal linear compression. Linear compression is a mitigation technique for high-dimension, low-sample-size (HDLSS) applications. By definition, compression can only decrease or maintain IO detection performance. For small quantities of training data, linear image compression prior to the sCNN architecture can increase AUC from 0.56 to 0.93. Results indicate an optimal compression ratio for CNNs that depends on task difficulty, compression method, and number of training images.
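
To make the AUC figure of merit used above concrete, here is a small sketch that computes it directly from detector scores as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties count as one half). The function name and the toy scores are illustrative assumptions and do not come from the paper.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos_scores) * len(neg_scores))


# Assumed toy scores: three of the four positive/negative pairs are ranked correctly.
partial = auc([0.8, 0.4], [0.6, 0.2])
```

A detector that separates the classes perfectly yields 1.0, and a detector that gives every sample the same score yields 0.5, matching the two reference points quoted in the abstract. The double loop is quadratic in the number of samples, so this form is meant for illustration rather than for large evaluations.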
