Permission Sensitivity-Based Malicious Application Detection for Android

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yubo Song ◽  
Yijin Geng ◽  
Junbo Wang ◽  
Shang Gao ◽  
Wei Shi

Since a growing number of malicious applications attempt to steal users’ private data by illegally invoking permissions, application stores have deployed many malware detection methods based on application permissions. However, most of them ignore specific permission combinations and application categories, which affect detection accuracy, and the features they extract are not representative enough to distinguish benign from malicious applications. To address these problems, an Android malware detection method based on permission sensitivity is proposed. First, for each application category, permission features and permission-combination features are extracted. The sensitive permission feature set corresponding to each category label is then obtained by a feature selection method based on permission sensitivity. Next, the permission calls of the application under detection are compared with the sensitive permission feature set, and a weight allocation method is used to quantify this information into numerical features. In the proposed malicious application detection method, three machine-learning algorithms are selected to construct the classifier model and optimize its parameters. Compared with traditional methods, the proposed method consumed 60.94% less time while still achieving a high accuracy of up to 92.17%.

Android malware has risen exponentially over the past few years, posing several serious threats such as system damage, financial loss, and mobile botnets. Various detection techniques have been proposed in the literature for Android malware detection. Some of the techniques analyze static parameters such as permissions or intents, whereas others focus on dynamic parameters such as network traffic or system calls. Static techniques are relatively easier to implement; however, recent stealthy malware evades static detection by means of update attacks. Dynamic detection can be used to detect such stealthy malware, but it increases the computational overhead. Hence, both kinds of techniques have their own advantages and disadvantages. In this paper, we propose an innovative hybrid detection model that uses both static and dynamic features for malware analysis and detection. We first rank the static and dynamic parameters according to information gain and then apply machine learning algorithms in the testing phase. The results indicate that the hybrid approach is better than both the static and the dynamic approach, and the proposed model achieves 98.9% detection accuracy with a Decision Tree classifier.


2017 ◽  
Vol 2017 ◽  
pp. 1-14 ◽  
Author(s):  
Xin Wang ◽  
Dafang Zhang ◽  
Xin Su ◽  
Wenjia Li

In recent years, Android malware has continued to grow at an alarming rate. Because more recent malicious apps employ highly sophisticated detection-avoidance techniques, traditional machine learning based malware detection methods have become far less effective. More specifically, they cannot cope with the various types of Android malware and are limited in detection by relying on a single classification algorithm. To address this limitation, we propose in this paper a novel approach, named Mlifdect, that leverages parallel machine learning and information fusion techniques for better Android malware detection. To implement this approach, we first extract eight types of features from static analysis of Android apps and build two kinds of feature sets after feature selection. Then, a parallel machine learning detection model is developed to speed up the classification process. Finally, we investigate probability analysis based and Dempster-Shafer theory based information fusion approaches, which can effectively obtain the detection results. To validate our method, other state-of-the-art detection works are selected for comparison on real-world Android apps. The experimental results demonstrate that Mlifdect achieves higher detection accuracy as well as remarkable run-time efficiency compared to existing malware detection solutions.


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Tianliang Lu ◽  
Yanhui Du ◽  
Li Ouyang ◽  
Qiuyu Chen ◽  
Xirui Wang

In recent years, the number of malware samples on the Android platform has been increasing, and with the widespread use of code obfuscation technology, the accuracy of antivirus software and traditional detection algorithms is low. Current state-of-the-art research shows that researchers have started applying deep learning methods to malware detection. We propose an Android malware detection algorithm based on a hybrid deep learning model that combines a deep belief network (DBN) and a gated recurrent unit (GRU). First, the Android malware is analyzed: in addition to static features, dynamic behavioral features with strong anti-obfuscation ability are also extracted. Then, a hybrid deep learning model for Android malware detection is built. Because the static features are relatively independent, the DBN is used to process them; because the dynamic features have temporal correlation, the GRU is used to process the dynamic feature sequence. Finally, the training results of the DBN and GRU are fed into a BP neural network, which outputs the final classification result. Experimental results show that, compared with traditional machine learning algorithms, the Android malware detection model based on hybrid deep learning has higher detection accuracy and a better detection effect on obfuscated malware.


Mathematics ◽  
2021 ◽  
Vol 9 (21) ◽  
pp. 2813
Author(s):  
Jaehyeong Lee ◽  
Hyuk Jang ◽  
Sungmin Ha ◽  
Yourim Yoon

Since the discovery that machine learning can be used to effectively detect Android malware, many studies on machine learning-based malware detection techniques have been conducted. Several methods based on feature selection, particularly genetic algorithms, have been proposed to increase performance and reduce costs. However, because they have yet to be compared with other methods and their many features have not been sufficiently verified, such methods have certain limitations. This study investigates whether genetic algorithm-based feature selection helps Android malware detection. We applied nine machine learning algorithms with genetic algorithm-based feature selection to 1104 static features from 5000 benign applications and 2500 malware samples included in the Andro-AutoPsy dataset. Comparative experimental results show that the genetic algorithm performed better than the information gain-based method, which is generally used as a feature selection method. Moreover, machine learning using the proposed genetic algorithm-based feature selection has an absolute advantage in terms of time compared to machine learning without feature selection. The results indicate that incorporating genetic algorithms into Android malware detection is a valuable approach. Furthermore, to improve malware detection performance, it is useful to apply genetic algorithm-based feature selection to machine learning.


2018 ◽  
Vol 2018 ◽  
pp. 1-18 ◽  
Author(s):  
Jinpei Yan ◽  
Yong Qi ◽  
Qifan Rao

Mobile security is an important issue on the Android platform. Most malware detection methods based on machine learning models rely heavily on expert knowledge for manual feature engineering, and such features still struggle to fully describe malware. In this paper, we present the LSTM-based hierarchical denoise network (HDN), a novel static Android malware detection method that uses LSTM to learn directly from the raw opcode sequences extracted from decompiled Android files. However, most opcode sequences are too long for an LSTM to train on, due to the vanishing gradient problem. Hence, HDN uses a hierarchical structure: its first-level LSTM computes in parallel on opcode subsequences (which we call method blocks) to learn dense representations; the second-level LSTM then learns from and detects malware through the method block sequences. Considering that malicious behavior appears only in some sequence segments, HDN uses a method block denoise module (MBDM) for data denoising with an adaptive gradient scaling strategy based on a loss cache. We evaluate and compare HDN with the latest mainstream research on three datasets. The results show that HDN outperforms these Android malware detection methods, is able to capture longer sequence features, and has better detection efficiency than N-gram-based malware detection, which is similar to our method.


2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
TaeGuen Kim ◽  
BooJoong Kang ◽  
Eul Gyu Im

As the number of Android malware samples has increased rapidly over the years, various malware detection methods have been proposed. Existing methods can be classified into two categories: static analysis-based methods and dynamic analysis-based methods. Both approaches have limitations: static analysis-based methods are relatively easy to evade through transformation techniques such as junk instruction insertion, code reordering, and so on, while dynamic analysis-based methods incur relatively high analysis overheads and may require kernel modification to extract dynamic features. In this paper, we propose a dynamic analysis framework for Android malware detection that overcomes the aforementioned shortcomings. The framework uses a suffix tree that contains API (Application Programming Interface) subtraces and their probabilistic confidence values, generated using HMMs (Hidden Markov Models), to reduce the malware detection overhead, and we designed the framework with a client-server architecture since the suffix tree is infeasible to deploy on mobile devices. In addition, an application rewriting technique is used to trace API invocations without any modification of the Android kernel. In our experiments, we measured the detection accuracy and the computational overhead to evaluate the effectiveness and efficiency of the proposed framework.


2020 ◽  
Author(s):  
Angelo Schranko de Oliveira ◽  
Renato José Sassi

The Android Operating System (OS) is everywhere: computers, cars, homes, and, of course, personal and corporate smartphones. A recent survey from the International Data Corporation (IDC) reveals that the Android platform holds 85% of the smartphone market share. Its popularity and open nature make it an attractive target for malware. According to AV-TEST, by November 2020, 2.87M new Android malware instances had been identified in the wild. Malware detection is a challenging problem that has been actively explored by both industry and academia using intelligent methods. On the one hand, traditional machine learning (ML) malware detection methods rely on manual feature engineering that requires expert knowledge. On the other hand, deep learning (DL) malware detection methods perform automatic feature extraction but usually require much more data and processing power. In this work, we propose a new multimodal DL Android malware detection method, Chimera, that combines both manual and automatic feature engineering by using the DL architectures Convolutional Neural Networks (CNN), Deep Neural Networks (DNN), and Transformer Networks (TN) to perform feature learning from raw data (Dalvik Executable (DEX) grayscale images), static analysis data (Android Intents & Permissions), and dynamic analysis data (system call sequences), respectively. To train and evaluate our model, we implemented the Knowledge Discovery in Databases (KDD) process and used the publicly available Android benchmark dataset Omnidroid, which contains static and dynamic analysis data extracted from 22,000 real malware and goodware samples. By leveraging a hybrid source of information to learn high-level feature representations for both the static and dynamic properties of Android applications, Chimera’s detection Accuracy, Precision, Recall, and ROC AUC outperform classical ML algorithms, state-of-the-art Ensemble and Voting Ensemble ML methods, as well as unimodal DL methods using CNNs, DNNs, TNs, and Long Short-Term Memory networks (LSTM). To the best of our knowledge, this is the first work that successfully applies multimodal DL to combine those three different modalities of data using DNNs, CNNs, and TNs to learn a shared representation that can be used in Android malware detection tasks.


Author(s):  
Yao-Saint Yen ◽  
Hung-Min Sun

Smartphones, and the Android platform in particular, already hold eighty percent of the market share; according to the aforementioned report, this makes Android the attacker’s primary target. With a growing amount of private data stored on smartphones and weak security defense measures, attackers can use multiple ways to attack users’ smartphones (e.g., using different coding styles to confuse malware detection software). Existing Android malware detection methods use multiple features, such as security-sensitive APIs, system calls, control flow structure, and data information flow, and then use machine learning to check whether an app is malware or not. These features capture an app’s unique properties but also have limitations; that is to say, from some perspectives a feature may suit a specific attack but not others. Nowadays most malware detection methods use only one of the aforementioned features, and these methods mostly analyze code to detect malware, but under the influence of code obfuscation and zero-day attacks, the aforementioned feature extraction methods may produce wrong judgments. So, it is necessary to design an effective analysis technique to prevent malware. In this paper, we use the importance of words from the apk: because of code obfuscation, some malware authors only rename variables, so general static analysis would not judge correctly. We then pass these importance values through our proposed method to generate a picture, and finally use a convolutional neural network to determine whether the apk file is malware or not.


2021 ◽  
Vol 1812 (1) ◽  
pp. 012010
Author(s):  
X R Chen ◽  
S S Shi ◽  
C L Xie ◽  
Z Yang ◽  
Y J Guo ◽  
...  
