A study on robustness of malware detection model

Author(s):  
Wanjia Zheng ◽  
Kazumasa Omote
2021 ◽  
Vol 15 (4) ◽  
pp. 18-30
Author(s):  
Om Prakash Samantray ◽  
Satya Narayan Tripathy

Many available malware detection techniques rely on a signature-based approach. This approach detects known malware very effectively but may fail to detect unknown or zero-day attacks. In this article, the authors propose a malware detection model that uses the operation codes (opcodes) of malicious and benign executables as features. The proposed model uses an opcode extract and count (OPEC) algorithm to prepare the opcode feature vector for the experiment. The most relevant features are selected using the extra tree classifier feature selection technique and then passed to several supervised learning algorithms (support vector machine, naive Bayes, decision tree, random forest, logistic regression, and k-nearest neighbour) to build classification models for malware detection. The proposed model achieves a detection accuracy of 98.7%, which the authors report is better than many of the similar works discussed in the literature.
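The opcode counting step can be illustrated with a minimal sketch. It is not the authors' OPEC algorithm, whose details the abstract does not give; it assumes the executable has already been disassembled into text lines whose first token is the opcode mnemonic. The function name `count_opcodes`, the vocabulary `VOCAB`, and the sample lines are all hypothetical.

```python
from collections import Counter


def count_opcodes(disassembly_lines, vocabulary):
    """Return a fixed-length opcode count vector for one executable.

    Assumption: each non-empty line starts with the opcode mnemonic,
    e.g. "mov eax, ebx". Opcodes outside the vocabulary are ignored.
    """
    counts = Counter()
    for line in disassembly_lines:
        tokens = line.split()
        if tokens:  # skip blank lines
            counts[tokens[0].lower()] += 1
    # One position per vocabulary opcode, in vocabulary order.
    return [counts[op] for op in vocabulary]


# Hypothetical vocabulary and disassembly, for illustration only.
VOCAB = ["mov", "push", "call", "jmp", "xor"]
sample = ["mov eax, ebx", "push ebp", "MOV ecx, 1", "call 0x401000", "", "nop"]
vector = count_opcodes(sample, VOCAB)
```

A vector of this kind, computed per executable, is the sort of input that a feature selection step and the listed classifiers would then consume.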


2018 ◽  
Vol 2018 ◽  
pp. 1-8 ◽  
Author(s):  
Guanghui Liang ◽  
Jianmin Pang ◽  
Zheng Shan ◽  
Runqing Yang ◽  
Yihang Chen

To address emerging security threats, various malware detection methods are proposed every year. A small but representative set of malware samples is therefore usually needed for a detection model, especially for machine-learning-based malware detection models. However, the current manual selection of representative samples from a large collection of unknown files is labor intensive and not scalable. In this paper, we first propose a framework that can automatically generate a small data set for malware detection. With this framework, we extract behavior features from a large initial data set and then use a hierarchical clustering technique to identify different types of malware. An improved genetic algorithm based on roulette wheel sampling is implemented to generate the final test data set. The final data set is only one-eighteenth the volume of the initial data set, and evaluations show that the data set selected by the proposed framework loses almost none of the semantics of the original.
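Roulette wheel sampling, the selection scheme the abstract names, can be sketched as follows. This shows only the plain fitness-proportionate selection step, not the authors' improved genetic algorithm; the function name, the sample identifiers, and the fitness values are hypothetical.

```python
import random


def roulette_wheel_select(population, fitness, k, rng):
    """Pick k individuals with probability proportional to fitness.

    Assumption: fitness values are non-negative and not all zero.
    """
    total = sum(fitness)
    if total <= 0:
        raise ValueError("at least one fitness value must be positive")
    chosen = []
    for _ in range(k):
        spin = rng.random() * total  # a point on the wheel, in [0, total)
        cumulative = 0.0
        for individual, f in zip(population, fitness):
            cumulative += f
            if spin < cumulative:
                chosen.append(individual)
                break
        else:
            # Guard against floating-point rounding at the wheel's end.
            chosen.append(population[-1])
    return chosen


# Hypothetical sample identifiers and fitness scores.
rng = random.Random(0)
picks = roulette_wheel_select(["a", "b", "c"], [0.0, 5.0, 0.0], 4, rng)
```

Individuals with zero fitness are never selected here, while larger fitness values occupy a proportionally larger slice of the wheel.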


2021 ◽  
Author(s):  
Vinayaka K V ◽  
Jaidhar C D

The popularity of the Android Operating System in the smartphone market has given rise to a large volume of Android malware. To detect this malware accurately, many existing works use machine learning and deep learning-based methods, in which fixed-size feature vectors are extracted from the files inside the Android Application Package (APK). Recently, Graph Convolutional Network (GCN) based methods applied to the Function Call Graph (FCG) extracted from the APK have been gaining momentum in Android malware detection, as GCNs are effective at learning tasks on variable-sized graphs such as the FCG, and the FCG sufficiently captures the structure and behaviour of an APK. However, because the Android Application Programming Interface (API) is event-driven, the FCG lacks information about callback methods. This paper proposes enhancing the FCG to an eFCG (enhanced FCG) using callback information extracted through Android Framework Space Analysis to overcome this limitation. Further, we add permission-to-API-method relationships to the eFCG. The eFCG is reduced using node contraction based on classes to obtain the R-eFCG (Reduced eFCG), which improves the generalisation ability of the Android malware detection model. The eFCG and R-eFCG are then given as inputs to Heterogeneous GCN models to determine whether the APK file from which they were extracted is malicious. To test the effectiveness of the eFCG and R-eFCG, we conducted an ablation study by removing their various components. To determine the optimal neighbourhood size for the GCN, we experimented with a varying number of GCN layers and found that the Android malware detection model using R-eFCG with all its components and four convolution layers achieved the maximum accuracy of 96.28%.
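Class-based node contraction of a call graph can be sketched as below. This is an illustration under stated assumptions, not the paper's R-eFCG construction: methods are assumed to be named in a `Class;->method` style, calls within the same class are dropped (the paper may treat them differently), and the edge list and helper names are hypothetical.

```python
def class_of(method):
    """Return the class part of a 'Class;->method' style name (assumed format)."""
    return method.split("->", 1)[0]


def contract_by_class(edges):
    """Merge all methods of a class into one node.

    Takes method-level (caller, callee) edges and returns the set of
    class-level edges. Assumption: calls inside one class are dropped.
    """
    contracted = set()
    for caller, callee in edges:
        src, dst = class_of(caller), class_of(callee)
        if src != dst:
            contracted.add((src, dst))
    return contracted


# Hypothetical method-level call edges.
edges = [
    ("Lcom/app/Main;->onCreate", "Lcom/app/Main;->init"),
    ("Lcom/app/Main;->init", "Lcom/app/Net;->send"),
    ("Lcom/app/Net;->send", "Lcom/app/Net;->open"),
    ("Lcom/app/Main;->onCreate", "Lcom/app/Net;->open"),
]
class_graph = contract_by_class(edges)
```

Four method-level edges collapse into a single class-level edge in this example, which shows how contraction shrinks the graph a GCN has to process.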


PLoS ONE ◽  
2020 ◽  
Vol 15 (4) ◽  
pp. e0231626
Author(s):  
Yong Fang ◽  
Yuetian Zeng ◽  
Beibei Li ◽  
Liang Liu ◽  
Lei Zhang
