SFDroid: Android Malware Detection using Ranked Static Features

Author(s):  
Gourav Garg ◽  
Ashutosh Sharma* ◽  
Anshul Arora

Over the past few years, malware attacks on the Android platform have risen sharply. These attacks pose significant threats, including financial loss, information leakage, and damage to the system. Around 25 million smartphones were infected with malware within the first half of 2019, which underscores the seriousness of these attacks. Taking into account the danger that Android malware poses to the user community, we aim to develop a static Android malware detector named SFDroid that analyzes manifest file components for malware detection. In this work, the proposed model first ranks the manifest features according to their frequency in normal and malicious apps. This helps us identify the significant features present in the normal and malware datasets. Additionally, we apply support thresholds to remove unnecessary and redundant features from the rankings. Further, we propose a novel algorithm that uses the ranked features and several machine learning classifiers to detect Android malware. The experimental results demonstrate that, using the Random Forest classifier at a 10% support threshold, the proposed model gives a detection accuracy of 95.90% with 36 manifest components.
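The frequency ranking and support threshold described in this abstract can be illustrated with a minimal sketch. It assumes that "support" means the fraction of apps in a dataset that declare a feature; the abstract does not define it, and the function name `rank_by_support` and the toy feature sets are hypothetical, not taken from SFDroid.

```python
from collections import Counter

def rank_by_support(apps, min_support):
    """Rank features by support and drop those below the threshold.

    apps: list of sets, each holding the manifest features of one app.
    min_support: minimum fraction of apps (0..1) that must declare a feature.
    Assumption: support = share of apps declaring the feature.
    """
    counts = Counter(feature for app in apps for feature in app)
    total = len(apps)
    ranked = [(f, c / total) for f, c in counts.items() if c / total >= min_support]
    # Sort by descending support; break ties alphabetically for determinism.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked

# Hypothetical toy data, not from the paper.
benign = [{"INTERNET"}, {"INTERNET", "CAMERA"}, {"INTERNET", "VIBRATE"}, {"CAMERA"}]
malware = [{"INTERNET", "SEND_SMS"}, {"SEND_SMS", "READ_SMS"},
           {"INTERNET", "SEND_SMS"}, {"READ_CONTACTS"}]

benign_rank = rank_by_support(benign, 0.5)
malware_rank = rank_by_support(malware, 0.5)
print(benign_rank)
print(malware_rank)
```

Ranking the two datasets separately keeps features that are frequent in only one class visible; the surviving features could then be fed to any classifier, such as the Random Forest mentioned above.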

Author(s):  
Kartik Khariwal* ◽  
Rishabh Gupta ◽  
Jatin Singh ◽  
Anshul Arora

With the growing popularity of Android OS over the past few years, the number of malware attacks on Android has also increased. In 2018, around 28 million malicious applications were found on the Android platform, and these malicious apps were capable of causing huge financial losses and information leakage. Such threats call for a proper detection system for Android malware. Some research works study static manifest components for malware detection. However, to the best of our knowledge, none of the previous works has aimed to find the best set amongst the different manifest file components for malware detection. In this work, we focus on identifying the best feature set from manifest file components (Permissions, Intents, Hardware Components, Activities, Services, Broadcast Receivers, and Content Providers) that could give better detection accuracy. We apply Information Gain to rank the manifest file components, intending to find the set of components that best distinguishes malware applications from benign applications. We put forward a novel algorithm to find the best feature set by using various machine learning classifiers such as SVM, XGBoost, and Random Forest, along with deep learning techniques such as classification using neural networks. The experimental results highlight that the best set obtained from the proposed algorithm consisted of 25 features, i.e., 5 Permissions, 2 Intents, 9 Activities, 3 Content Providers, 4 Hardware Components, 1 Service, and 1 Broadcast Receiver. The SVM classifier gave the highest classification accuracy of 96.93% and an F1-Score of 0.97 with this best set of 25 features.
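As an illustration of the Information Gain ranking mentioned in this abstract, the following sketch computes the standard information gain of a binary feature (present or absent in the manifest) with respect to the malware/benign label. The helper names and the toy samples are hypothetical; the abstract does not describe the authors' implementation.

```python
import math

def entropy(pos, neg):
    """Shannon entropy (in bits) of a two-class count."""
    total = pos + neg
    h = 0.0
    for n in (pos, neg):
        if n:
            p = n / total
            h -= p * math.log2(p)
    return h

def class_counts(rows):
    """Return (malware_count, benign_count) for rows of (features, label)."""
    pos = sum(label for _, label in rows)
    return pos, len(rows) - pos

def information_gain(samples, feature):
    """samples: list of (feature_set, label), label 1 = malware, 0 = benign."""
    with_f = [s for s in samples if feature in s[0]]
    without_f = [s for s in samples if feature not in s[0]]
    base = entropy(*class_counts(samples))
    # Weighted entropy remaining after splitting on presence of the feature.
    cond = sum(len(part) / len(samples) * entropy(*class_counts(part))
               for part in (with_f, without_f) if part)
    return base - cond

# Hypothetical toy data, not from the paper.
samples = [
    ({"SEND_SMS", "INTERNET"}, 1),
    ({"SEND_SMS"}, 1),
    ({"INTERNET"}, 0),
    ({"INTERNET", "CAMERA"}, 0),
]
features = sorted({f for fs, _ in samples for f in fs})
ranked = sorted(features, key=lambda f: (-information_gain(samples, f), f))
print(ranked)
```

In this toy set, `SEND_SMS` separates the two classes perfectly and therefore ranks first; features with equal gain are ordered alphabetically only to keep the output deterministic.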


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yubo Song ◽  
Yijin Geng ◽  
Junbo Wang ◽  
Shang Gao ◽  
Wei Shi

Since a growing number of malicious applications attempt to steal users’ private data by illegally invoking permissions, application stores have deployed many malware detection methods based on application permissions. However, most of them ignore the specific permission combinations and application categories that affect detection accuracy. The features they extract are also not representative enough to distinguish benign from malicious applications. To address these problems, an Android malware detection method based on permission sensitivity is proposed. First, for each application category, the permission features and permission combination features are extracted. The sensitive permission feature set corresponding to each category label is then obtained by a feature selection method based on permission sensitivity. In the following step, the permissions called by the application to be detected are compared with the sensitive permission feature set, and a weight allocation method is used to quantify this information into numerical features. In the proposed method, three machine-learning algorithms are selected to construct the classifier model and optimize the parameters. Compared with traditional methods, the proposed method consumed 60.94% less time while still achieving high accuracy of up to 92.17%.


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Hyo-Sik Ham ◽  
Hwan-Hee Kim ◽  
Myung-Sup Kim ◽  
Mi-Jung Choi

Currently, many Internet of Things (IoT) services are monitored and controlled through smartphone applications. By combining IoT with smartphones, many convenient IoT services have been provided to users. However, such services have adverse underlying effects, including invasion of privacy and information leakage. In most cases, mobile devices have come to hold a large amount of important personal user information, as various services and content are provided through them. Accordingly, attackers are expanding the scope of their attacks beyond the existing PC and Internet environment into mobile devices. In this paper, we apply a linear support vector machine (SVM) to detect Android malware and compare the malware detection performance of the SVM with that of other machine learning classifiers. Through experimental validation, we show that the SVM outperforms the other machine learning classifiers.


Author(s):  
Siddhant Gupta ◽  
Siddharth Sethi ◽  
Srishti Chaudhary ◽  
Anshul Arora

Android mobile devices are a prime target for a huge number of cyber-criminals, who aim to create malware for disrupting and damaging servers, clients, or networks. Android malware takes the form of malicious apps that get downloaded onto mobile devices via the Play Store or third-party app markets. Such malicious apps pose serious threats such as system damage, information leakage, and financial loss to the user. Thus, predicting which apps contain malicious behavior will help in preventing malware attacks on mobile devices. Identifying Android malware has become a major challenge because of the ever-increasing number of permissions that applications ask for to enhance the user experience. Moreover, most of the time, the permissions and other features defined in normal and malicious apps are largely the same. In this paper, we aim to detect Android malware using machine learning, deep learning, and natural language processing techniques. To delve into the problem, we use Android manifest files, which provide features such as permissions that become the basis for detecting Android malware. We use the concept of information value for ranking permissions. Further, we propose a consensus-based blockchain framework for making more concrete predictions, as blockchains have high reliability and low cost. The experimental results demonstrate that the proposed model gives a detection accuracy of 95.44% with the Random Forest classifier. This accuracy is achieved with the top 45 permissions ranked according to Information Value.
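The abstract does not define information value, so the sketch below assumes the common weight-of-evidence formulation, IV = Σ (p_benign − p_malware) · ln(p_benign / p_malware), summed over the two bins "permission present" and "permission absent". The additive smoothing term `eps`, the function name, and the toy data are all assumptions made for illustration.

```python
import math

def information_value(benign, malware, permission, eps=0.5):
    """Information value of a binary permission feature.

    benign, malware: lists of permission sets, one per app.
    eps: additive smoothing (an assumption) that avoids log(0) and
         division by zero when a bin is empty in one class.
    """
    iv = 0.0
    for present in (True, False):
        b = sum((permission in app) == present for app in benign)
        m = sum((permission in app) == present for app in malware)
        p_benign = (b + eps) / (len(benign) + 2 * eps)
        p_malware = (m + eps) / (len(malware) + 2 * eps)
        # Each term is non-negative, so a larger IV means better separation.
        iv += (p_benign - p_malware) * math.log(p_benign / p_malware)
    return iv

# Hypothetical toy data, not from the paper.
benign = [{"INTERNET"}, {"INTERNET", "CAMERA"}, {"CAMERA"}, {"VIBRATE"}]
malware = [{"INTERNET", "SEND_SMS"}, {"SEND_SMS"},
           {"INTERNET", "SEND_SMS"}, {"READ_SMS"}]

candidates = ["INTERNET", "CAMERA", "SEND_SMS"]
ranked = sorted(candidates,
                key=lambda p: -information_value(benign, malware, p))
print(ranked)
```

A permission requested equally often by both classes (here `INTERNET`) scores zero, which matches the abstract's remark that many permissions are shared by normal and malicious apps and therefore carry little signal.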


Android malware has risen exponentially over the past few years, posing several serious threats such as system damage, financial loss, and mobile botnets. Various techniques have been proposed in the literature for Android malware detection. Some techniques analyze static parameters such as permissions or intents, whereas others focus on dynamic parameters such as network traffic or system calls. Static techniques are relatively easy to implement; however, recent stealthy malware evades static detection by means of update attacks. Dynamic detection can be used to detect such stealthy malware; however, it increases the computational overhead. Hence, both kinds of techniques have their own advantages and disadvantages. In this paper, we propose an innovative hybrid detection model that uses both static and dynamic features for malware analysis and detection. We first rank the static and dynamic parameters according to their information gain and then apply machine learning algorithms in the testing phase. The results indicate that the hybrid approach is better than both the static and dynamic approaches, and the proposed model achieves 98.9% detection accuracy with the Decision Tree classifier.


2019 ◽  
Vol 7 (4) ◽  
pp. 1-24 ◽  
Author(s):  
Jarrett Booz ◽  
Josh McGiff ◽  
William G. Hatcher ◽  
Wei Yu ◽  
James Nguyen ◽  
...  

In this article, the authors implement a deep learning environment and fine-tune parameters to determine the optimal settings for the classification of Android malware from extracted permission data. By determining the optimal settings, the authors demonstrate the potential performance of a deep learning environment for Android malware detection. Specifically, an extensive study is conducted on various hyper-parameters to determine optimal configurations, and then a performance evaluation is carried out on those configurations to compare and maximize detection accuracy in our target networks. The results achieve a detection accuracy of approximately 95%, with an approximate F1 score of 93%. In addition, the evaluation is extended to include other machine learning frameworks, specifically comparing Microsoft Cognitive Toolkit (CNTK) and Theano with TensorFlow. The future needs are discussed in the realm of machine learning for mobile malware detection, including adversarial training, scalability, and the evaluation of additional data and features.


2017 ◽  
Vol 2017 ◽  
pp. 1-14 ◽  
Author(s):  
Xin Wang ◽  
Dafang Zhang ◽  
Xin Su ◽  
Wenjia Li

In recent years, Android malware has continued to grow at an alarming rate. More recent malicious apps employ highly sophisticated detection avoidance techniques, which makes traditional machine learning based malware detection methods far less effective. More specifically, these methods cannot cope with the various types of Android malware and are limited in detection because they utilize a single classification algorithm. To address this limitation, we propose a novel approach, named Mlifdect, that leverages parallel machine learning and information fusion techniques for better Android malware detection. To implement this approach, we first extract eight types of features through static analysis of Android apps and build two kinds of feature sets after feature selection. Then, a parallel machine learning detection model is developed to speed up the classification process. Finally, we investigate probability analysis based and Dempster-Shafer theory based information fusion approaches, which can effectively obtain the detection results. To validate our method, we compare it with other state-of-the-art detection works on real-world Android apps. The experimental results demonstrate that Mlifdect achieves higher detection accuracy as well as remarkable run-time efficiency compared to the existing malware detection solutions.
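To make the Dempster-Shafer fusion step more concrete, here is a minimal sketch of Dempster's rule of combination for two classifiers over the frame {malware, benign}. How Mlifdect derives its mass functions is not stated in the abstract; the mass values, the names `MAL`, `BEN`, `ANY`, and the function `dempster_combine` are hypothetical.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    m1, m2: dicts mapping frozensets of hypotheses to masses summing to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

MAL = frozenset({"malware"})
BEN = frozenset({"benign"})
ANY = MAL | BEN  # mass left on "either", i.e. the classifier's uncertainty

# Hypothetical outputs of two classifiers for one app.
m1 = {MAL: 0.6, BEN: 0.1, ANY: 0.3}
m2 = {MAL: 0.5, BEN: 0.2, ANY: 0.3}

fused = dempster_combine(m1, m2)
print(fused[MAL], fused[BEN], fused[ANY])
```

When both sources lean toward the same hypothesis, the fused belief in it exceeds either individual belief, which is the property that makes this rule attractive for combining several classifiers.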


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Tianliang Lu ◽  
Yanhui Du ◽  
Li Ouyang ◽  
Qiuyu Chen ◽  
Xirui Wang

In recent years, the amount of malware on the Android platform has been increasing, and with the widespread use of code obfuscation technology, the accuracy of antivirus software and traditional detection algorithms is low. Recent research has begun to apply deep learning methods to malware detection. We propose an Android malware detection algorithm based on a hybrid deep learning model that combines a deep belief network (DBN) and a gated recurrent unit (GRU). First, the Android malware is analyzed; in addition to static features, dynamic behavioral features with strong antiobfuscation ability are also extracted. Then, a hybrid deep learning model is built for Android malware detection. Because the static features are relatively independent, the DBN is used to process them. Because the dynamic features have temporal correlation, the GRU is used to process the dynamic feature sequence. Finally, the training results of the DBN and GRU are input into a BP neural network, which outputs the final classification results. Experimental results show that, compared with traditional machine learning algorithms, the Android malware detection model based on hybrid deep learning algorithms has a higher detection accuracy and also performs better on obfuscated malware.


2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
TaeGuen Kim ◽  
BooJoong Kang ◽  
Eul Gyu Im

As the amount of Android malware has increased rapidly over the years, various malware detection methods have been proposed. Existing methods can be classified into two categories: static analysis-based methods and dynamic analysis-based methods. Both approaches have limitations. Static analysis-based methods are relatively easy to evade through transformation techniques such as junk instruction insertion and code reordering. Dynamic analysis-based methods, in turn, have relatively high analysis overheads and might require kernel modification to extract dynamic features. In this paper, we propose a dynamic analysis framework for Android malware detection that overcomes the aforementioned shortcomings. To reduce the malware detection overhead, the framework uses a suffix tree that contains API (Application Programming Interface) subtraces and their probabilistic confidence values, which are generated using HMMs (Hidden Markov Models). We designed the framework with a client-server architecture, since it is infeasible to deploy the suffix tree on mobile devices. In addition, an application rewriting technique is used to trace API invocations without any modifications to the Android kernel. In our experiments, we measured the detection accuracy and the computational overheads to evaluate the effectiveness and efficiency of the proposed framework.
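The idea of looking up API subtraces with attached confidence values can be sketched as follows. This is not the authors' data structure: a plain dictionary trie stands in for the suffix tree, the confidence values are made-up numbers rather than HMM outputs, and the API call names are used only as illustrative labels.

```python
CONF = object()  # sentinel key marking the end of a stored subtrace

def build_trie(subtraces):
    """subtraces: dict mapping tuples of API names to a confidence in [0, 1]."""
    root = {}
    for calls, confidence in subtraces.items():
        node = root
        for call in calls:
            node = node.setdefault(call, {})
        node[CONF] = confidence
    return root

def best_match(trie, trace):
    """Return the highest confidence of any stored subtrace found in trace."""
    best = 0.0
    for start in range(len(trace)):
        node = trie
        # Walk the trie from every start position of the observed trace.
        for call in trace[start:]:
            if call not in node:
                break
            node = node[call]
            best = max(best, node.get(CONF, 0.0))
    return best

# Hypothetical subtraces and confidences, not from the paper.
trie = build_trie({
    ("getDeviceId", "openConnection"): 0.9,
    ("sendTextMessage",): 0.7,
})

score_a = best_match(trie, ["onCreate", "getDeviceId", "openConnection", "write"])
score_b = best_match(trie, ["onCreate", "sendTextMessage"])
score_c = best_match(trie, ["onCreate", "getDeviceId"])
print(score_a, score_b, score_c)
```

Because the lookup only walks precomputed entries, the expensive probabilistic modelling can be done once on a server, which is in line with the client-server split described above.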

