Android Malware Detection Methods Based on the Combination of Clustering and Classification

Since a growing number of malicious applications attempt to steal users’ private data by illegally invoking permissions, application stores have adopted many malware detection methods based on application permissions. However, most of them ignore the specific permission combinations and application categories that affect detection accuracy, and the features they extract are not representative enough to distinguish benign from malicious applications. To address these problems, an Android malware detection method based on permission sensitivity is proposed. First, for each application category, the permission features and permission combination features are extracted. The sensitive permission feature set corresponding to each category label is then obtained by a feature selection method based on permission sensitivity. Next, the permissions invoked by the application under test are compared with the sensitive permission feature set, and a weight allocation method is used to quantify this information into numerical features. Finally, three machine-learning algorithms are selected to construct the classifier model and optimize its parameters. Compared with traditional methods, the proposed method consumed 60.94% less time while still achieving an accuracy of up to 92.17%.
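The abstract does not give the sensitivity formula or the weight allocation rule, so the following is only a minimal sketch under stated assumptions: sensitivity is taken to be the difference between how often malicious and benign apps of one category request a permission, and the weight of a sensitive permission is simply that score. The function names, the threshold, and the sample permissions are all hypothetical.

```python
from collections import Counter


def permission_sensitivity(malicious_apps, benign_apps):
    """Score each permission for one app category.

    Assumption (not from the paper):
    sensitivity = P(permission | malicious) - P(permission | benign).
    Each app is given as a collection of permission names.
    """
    mal_counts = Counter(p for app in malicious_apps for p in set(app))
    ben_counts = Counter(p for app in benign_apps for p in set(app))
    permissions = set(mal_counts) | set(ben_counts)
    return {
        p: mal_counts[p] / len(malicious_apps) - ben_counts[p] / len(benign_apps)
        for p in permissions
    }


def sensitive_set(scores, threshold=0.3):
    """Keep only permissions whose score reaches the (hypothetical) threshold."""
    return {p: s for p, s in scores.items() if s >= threshold}


def to_features(app_permissions, sensitive):
    """Quantify one app: the permission's weight if requested, else 0.0."""
    requested = set(app_permissions)
    return [sensitive[p] if p in requested else 0.0 for p in sorted(sensitive)]


# Toy data for a single application category.
malicious = [{"SEND_SMS", "READ_CONTACTS", "INTERNET"}, {"SEND_SMS", "INTERNET"}]
benign = [{"INTERNET"}, {"INTERNET", "CAMERA"}]

sensitive = sensitive_set(permission_sensitivity(malicious, benign))
features = to_features({"SEND_SMS", "INTERNET"}, sensitive)
```

The resulting vector has one entry per sensitive permission in sorted order, so it can be fed to any standard classifier. Computing the scores separately per category is what lets the same permission count as sensitive in one category and unremarkable in another.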

Mlifdect: Android Malware Detection Based on Parallel Machine Learning and Information Fusion

Security and Communication Networks ◽

10.1155/2017/6451260 ◽

2017 ◽

Vol 2017 ◽

pp. 1-14 ◽

Cited By ~ 8

Author(s):

Xin Wang ◽

Dafang Zhang ◽

Xin Su ◽

Wenjia Li

Keyword(s):

Machine Learning ◽

Information Fusion ◽

Malware Detection ◽

Parallel Machine ◽

Detection Methods ◽

Detection Accuracy ◽

Android Malware ◽

Detection Model ◽

Android Apps ◽

Android Malware Detection

In recent years, Android malware has continued to grow at an alarming rate. Recent malicious apps employ highly sophisticated detection avoidance techniques, which makes traditional machine learning based malware detection methods far less effective. More specifically, methods that rely on a single classification algorithm cannot cope with the various types of Android malware and are limited in their detection capability. To address this limitation, we propose a novel approach, named Mlifdect, that leverages parallel machine learning and information fusion techniques for better Android malware detection. To implement this approach, we first extract eight types of features through static analysis of Android apps and build two kinds of feature sets after feature selection. Then, a parallel machine learning detection model is developed to speed up classification. Finally, we investigate information fusion approaches based on probability analysis and on Dempster-Shafer theory, which can effectively obtain the detection results. To validate our method, we compare it with other state-of-the-art detection works on real-world Android apps. The experimental results demonstrate that Mlifdect achieves higher detection accuracy as well as remarkable run-time efficiency compared to existing malware detection solutions.
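As an illustration of Dempster-Shafer fusion, the sketch below combines the outputs of two classifiers with Dempster's rule of combination. How Mlifdect derives mass values from classifier outputs is not stated in the abstract, so the mass assignments here are invented for the example, and the names `combine`, `MALWARE`, `BENIGN`, and `UNSURE` are hypothetical.

```python
from itertools import product

MALWARE = frozenset({"malware"})
BENIGN = frozenset({"benign"})
UNSURE = MALWARE | BENIGN  # mass assigned to "cannot tell"


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to a mass, and the
    masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # The two sources support incompatible hypotheses.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalise by the non-conflicting mass.
    return {hyp: mass / (1.0 - conflict) for hyp, mass in combined.items()}


# Hypothetical outputs of two classifiers for the same app.
classifier_a = {MALWARE: 0.6, BENIGN: 0.1, UNSURE: 0.3}
classifier_b = {MALWARE: 0.5, BENIGN: 0.2, UNSURE: 0.3}

fused = combine(classifier_a, classifier_b)
verdict = max((MALWARE, BENIGN), key=lambda hyp: fused.get(hyp, 0.0))
```

Because the rule is associative, more than two classifiers can be fused by applying `combine` repeatedly. Agreement between sources strengthens the shared hypothesis, which is why the fused malware mass ends up higher than either input.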

Android Malware Detection Based on Structural Features of the Function Call Graph

Electronics ◽

10.3390/electronics10020186 ◽

2021 ◽

Vol 10 (2) ◽

pp. 186

Author(s):

Yang Yang ◽

Xuehui Du ◽

Zhi Yang ◽

Xing Liu

Keyword(s):

Malware Detection ◽

Structural Features ◽

Coarse Grained ◽

Detection Methods ◽

Convolutional Network ◽

Android Malware ◽

Call Graph ◽

Android Apps ◽

Android Malware Detection ◽

Function Call

The openness of the Android operating system not only brings convenience to users but also exposes them to attacks from a large number of malicious applications (apps). Malware detection has therefore become a research focus in the field of mobile security. To address the coarse-grained feature selection and the large loss of graph structure features in current detection methods, we put forward DGCNDroid, an Android malware detection method based on the deep graph convolutional network. Our method starts by generating a function call graph for the decompiled Android application. Then the function call subgraph containing the sensitive application programming interface (API) is extracted. Finally, the function call subgraphs with structural features are used as the input to train the deep graph convolutional network, which enables the detection and classification of malicious apps. In experiments on a dataset containing 11,120 Android apps, the proposed method achieves a detection accuracy of 98.2%, which is higher than that of other existing detection methods.
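The subgraph extraction step can be pictured with a small sketch. The abstract does not say how the subgraph around a sensitive API is delimited, so this example assumes it consists of every function that can reach a sensitive API within a fixed number of call hops. The function name `sensitive_subgraph`, the `max_hops` parameter, and the toy call graph are hypothetical.

```python
from collections import deque


def sensitive_subgraph(call_graph, sensitive_apis, max_hops=2):
    """Return the induced subgraph of callers leading to sensitive APIs.

    call_graph maps a caller to the list of functions it calls.
    Assumption (not from the paper): a node belongs to the subgraph if it
    reaches a sensitive API in at most max_hops calls.
    """
    # Reverse the edges so we can walk from an API back to its callers.
    callers = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    keep = set()
    queue = deque((api, 0) for api in sensitive_apis if api in callers)
    while queue:
        node, depth = queue.popleft()
        if node in keep:
            continue  # breadth-first order means the first visit is the shallowest
        keep.add(node)
        if depth < max_hops:
            for caller in callers.get(node, ()):
                queue.append((caller, depth + 1))

    # Keep only edges whose two endpoints both survived.
    return {
        node: sorted(c for c in call_graph.get(node, ()) if c in keep)
        for node in sorted(keep)
    }


graph = {
    "MainActivity.onCreate": ["Helper.load", "Ui.render"],
    "Helper.load": ["SmsManager.sendTextMessage"],
    "Ui.render": ["Log.d"],
}
subgraph = sensitive_subgraph(graph, {"SmsManager.sendTextMessage"})
```

The adjacency dictionary that comes back preserves the call structure around the sensitive API, which is the kind of structural input a graph convolutional network consumes after conversion to an adjacency matrix.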

Dealing with Class Imbalance in Android Malware Detection by Cascading Clustering and Classification

Complex Pattern Mining - Studies in Computational Intelligence ◽

10.1007/978-3-030-36617-9_11 ◽

2020 ◽

pp. 173-187 ◽

Cited By ~ 2

Author(s):

Giuseppina Andresini ◽

Annalisa Appice ◽

Donato Malerba

Keyword(s):

Malware Detection ◽

Class Imbalance ◽

Android Malware ◽

Android Malware Detection ◽

Clustering And Classification

BrainShield: A Hybrid Machine Learning-Based Malware Detection Model for Android Devices

Electronics ◽

10.3390/electronics10232948 ◽

2021 ◽

Vol 10 (23) ◽

pp. 2948

Author(s):

Corentin Rodrigo ◽

Samuel Pierre ◽

Ronald Beaubrun ◽

Franjieh El Khoury

Keyword(s):

Neural Network ◽

Malware Detection ◽

Detection Methods ◽

Dynamic Features ◽

Android Malware ◽

Detection Model ◽

The Third ◽

Android Malware Detection ◽

Fully Connected ◽

Server Architecture

Android has become the leading operating system for mobile devices, and the one most targeted by malware. Therefore, many analysis methods have been proposed for detecting Android malware. However, few of them use proper datasets for evaluation. In this paper, we propose BrainShield, a hybrid malware detection model trained on the Omnidroid dataset to reduce attacks on Android devices. Omnidroid is the most diversified dataset in terms of the number of different features and contains the largest number of samples (22,000) for model evaluation in the Android malware detection field. BrainShield’s implementation is based on a client/server architecture and consists of three fully connected neural networks: (1) the first is used for static analysis and reaches an accuracy of 92.9% trained on 840 static features; (2) the second is a dynamic neural network that reaches an accuracy of 81.1% trained on 3722 dynamic features; and (3) the third is a hybrid neural network that reaches an accuracy of 91.1% trained on 7081 static and dynamic features. Simulation results show that BrainShield is able to improve the accuracy and the precision of well-known malware detection methods.

A Comprehensive Study of Malware Detection in Android Operating Systems

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2021/v10i430248 ◽

2021 ◽

pp. 30-46

Author(s):

Suhaib Jasim Hamdi ◽

Ibrahim Mahmood Ibrahim ◽

Naaman Omar ◽

Omar M. Ahmed ◽

Zryan Najat Rashid ◽

...

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Detailed Comparison ◽

Detection Methods ◽

Current Status ◽

Android Malware ◽

Detection Techniques ◽

Android Apps ◽

Android Malware Detection ◽

Wide Range

Android is now the world’s most popular operating system, or at least one of the most popular. More and more malware attacks are targeting Android applications, and many security detection techniques for Android apps are now available. The openness of the Android environment has given it extensive appeal in recent years, and a growing number of mobile devices are incorporated into many aspects of our everyday lives. This paper gives a detailed comparison that summarizes and analyses various detection techniques. It examines the current status of Android malware detection methods, with an emphasis on Machine Learning-based classifiers for detecting malicious software on Android devices. Android has a huge number of apps that may be downloaded and used for free, which makes Android phones more susceptible to malware. As a result, additional research has been done to develop effective malware detection methods. To begin, several of the currently available Android malware detection approaches are carefully examined and classified based on their detection methodologies. This study examines a wide range of machine-learning-based approaches to detecting Android malware, covering both dynamic and static types.

LSTM-Based Hierarchical Denoising Network for Android Malware Detection

Security and Communication Networks ◽

10.1155/2018/5249190 ◽

2018 ◽

Vol 2018 ◽

pp. 1-18 ◽

Cited By ~ 3

Author(s):

Jinpei Yan ◽

Yong Qi ◽

Qifan Rao

Keyword(s):

Detection Method ◽

Expert Knowledge ◽

Detection Efficiency ◽

Malware Detection ◽

Mobile Security ◽

Detection Methods ◽

Android Malware ◽

Android Malware Detection ◽

Malicious Behavior ◽

Gradient Scaling

Mobile security is an important issue on the Android platform. Most malware detection methods based on machine learning models rely heavily on expert knowledge for manual feature engineering, yet the resulting features still struggle to fully describe malware. In this paper, we present the LSTM-based hierarchical denoise network (HDN), a novel static Android malware detection method that uses LSTM to learn directly from the raw opcode sequences extracted from decompiled Android files. However, most opcode sequences are too long for LSTM to train on because of the vanishing gradient problem. Hence, HDN uses a hierarchical structure: the first-level LSTM computes in parallel on opcode subsequences (which we call method blocks) to learn dense representations, and the second-level LSTM then learns and detects malware from the method block sequences. Considering that malicious behavior appears only in some sequence segments, HDN uses a method block denoise module (MBDM) for data denoising, with an adaptive gradient scaling strategy based on a loss cache. We evaluate and compare HDN with the latest mainstream research on three datasets. The results show that HDN outperforms these Android malware detection methods, and that it captures longer sequence features and has better detection efficiency than N-gram-based malware detection, which is similar to our method.
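The hierarchical input layout can be sketched without any deep learning library. The example below only prepares data: it turns per-method opcode lists into fixed-length, padded method blocks that a first-level sequence model could process in parallel, leaving the sequence of blocks for a second-level model. It does not implement the LSTMs or the denoise module. The block length, the splitting of long methods into several blocks, and all names are assumptions made for illustration.

```python
PAD = 0  # id reserved for padding


def build_vocab(methods):
    """Assign each distinct opcode an integer id, starting at 1."""
    vocab = {}
    for opcodes in methods:
        for op in opcodes:
            vocab.setdefault(op, len(vocab) + 1)
    return vocab


def to_method_blocks(methods, vocab, block_len):
    """Encode methods as a list of equal-length blocks of opcode ids.

    Assumption (not from the paper): a method longer than block_len is
    split into consecutive blocks, and short blocks are right-padded.
    """
    blocks = []
    for opcodes in methods:
        ids = [vocab[op] for op in opcodes]
        for start in range(0, len(ids), block_len):
            chunk = ids[start:start + block_len]
            blocks.append(chunk + [PAD] * (block_len - len(chunk)))
    return blocks


methods = [
    ["const", "invoke-virtual", "move-result", "return"],
    ["const", "return"],
]
vocab = build_vocab(methods)
blocks = to_method_blocks(methods, vocab, block_len=3)
```

Each row of `blocks` is short enough for a recurrent model to train on, and the number of rows, rather than the raw opcode count, becomes the length of the second-level sequence.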

Runtime Detection Framework for Android Malware

Mobile Information Systems ◽

10.1155/2018/8094314 ◽

2018 ◽

Vol 2018 ◽

pp. 1-15 ◽

Cited By ~ 1

Author(s):

TaeGuen Kim ◽

BooJoong Kang ◽

Eul Gyu Im

Keyword(s):

Dynamic Analysis ◽

Static Analysis ◽

Suffix Tree ◽

Malware Detection ◽

Application Programming Interface ◽

Detection Methods ◽

Detection Accuracy ◽

Dynamic Features ◽

Android Malware ◽

Android Malware Detection

As the number of Android malware samples has increased rapidly over the years, various malware detection methods have been proposed. Existing methods can be classified into two categories: static analysis-based methods and dynamic analysis-based methods. Both approaches have limitations. Static analysis-based methods are relatively easy to evade through transformation techniques such as junk instruction insertion and code reordering. Dynamic analysis-based methods incur relatively high analysis overheads and may require kernel modification to extract dynamic features. In this paper, we propose a dynamic analysis framework for Android malware detection that overcomes the aforementioned shortcomings. To reduce the malware detection overhead, the framework uses a suffix tree that contains API (Application Programming Interface) subtraces and their probabilistic confidence values, which are generated using HMMs (Hidden Markov Models). We designed the framework with a client-server architecture because the suffix tree is infeasible to deploy on mobile devices. In addition, an application rewriting technique is used to trace API invocations without any modifications to the Android kernel. In our experiments, we measured the detection accuracy and the computational overheads to evaluate the effectiveness and efficiency of the proposed framework.
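To make the lookup idea concrete, here is a simplified sketch in which a plain trie stands in for the suffix tree and the confidence values are hard-coded rather than produced by HMMs. It scans a runtime API trace for the stored subtrace with the highest confidence. The class name `SubtraceTrie`, its methods, and the sample API names are hypothetical and are not taken from the paper.

```python
class SubtraceTrie:
    """Trie of known API subtraces, each ending in a confidence value."""

    def __init__(self):
        self.root = {"children": {}, "confidence": None}

    def insert(self, subtrace, confidence):
        """Store one API subtrace with its (externally computed) confidence."""
        node = self.root
        for api in subtrace:
            node = node["children"].setdefault(
                api, {"children": {}, "confidence": None}
            )
        node["confidence"] = confidence

    def best_match(self, trace):
        """Return (confidence, subtrace) of the strongest stored subtrace
        that occurs contiguously in trace, or (0.0, ()) if none occurs."""
        best_conf, best_subtrace = 0.0, ()
        for start in range(len(trace)):
            node = self.root
            for end in range(start, len(trace)):
                node = node["children"].get(trace[end])
                if node is None:
                    break  # no stored subtrace continues this way
                conf = node["confidence"]
                if conf is not None and conf > best_conf:
                    best_conf = conf
                    best_subtrace = tuple(trace[start:end + 1])
        return best_conf, best_subtrace


trie = SubtraceTrie()
trie.insert(["getDeviceId", "openConnection"], 0.9)
trie.insert(["sendTextMessage"], 0.7)

runtime_trace = ["onCreate", "getDeviceId", "openConnection", "sendTextMessage"]
confidence, matched = trie.best_match(runtime_trace)
```

Precomputing the confidence values and storing them in the tree means the run-time cost is a tree walk over the observed trace instead of a full model evaluation, which matches the overhead reduction the abstract describes. A server-side deployment would hold the tree and receive traces from the rewritten app.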

Chimera: An Android Malware Detection Method Based on Multimodal Deep Learning and Hybrid Analysis

10.36227/techrxiv.13359767.v1 ◽

2020 ◽

Author(s):

Angelo Schranko de Oliveira ◽

Renato José Sassi

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Dynamic Analysis ◽

Detection Method ◽

Malware Detection ◽

Analysis Data ◽

Detection Methods ◽

Feature Engineering ◽

Android Malware ◽

Android Malware Detection

The Android Operating System (OS) is everywhere: computers, cars, homes, and, of course, personal and corporate smartphones. A recent survey from the International Data Corporation (IDC) reveals that the Android platform holds 85% of the smartphone market share. Its popularity and open nature make it an attractive target for malware. According to AV-TEST, by November 2020, 2.87M new Android malware instances were identified in the wild. Malware detection is a challenging problem that has been actively explored by both industry and academia using intelligent methods. On the one hand, traditional machine learning (ML) malware detection methods rely on manual feature engineering that requires expert knowledge. On the other hand, deep learning (DL) malware detection methods perform automatic feature extraction but usually require much more data and processing power. In this work, we propose a new multimodal DL Android malware detection method, Chimera, that combines both manual and automatic feature engineering by using the DL architectures Convolutional Neural Networks (CNN), Deep Neural Networks (DNN), and Transformer Networks (TN) to perform feature learning from raw data (Dalvik Executable (DEX) grayscale images), static analysis data (Android Intents & Permissions), and dynamic analysis data (system call sequences), respectively. To train and evaluate our model, we implemented the Knowledge Discovery in Databases (KDD) process and used the publicly available Android benchmark dataset Omnidroid, which contains static and dynamic analysis data extracted from 22,000 real malware and goodware samples.
By leveraging a hybrid source of information to learn high-level feature representations for both the static and dynamic properties of Android applications, Chimera’s detection Accuracy, Precision, Recall, and ROC AUC outperform classical ML algorithms, state-of-the-art Ensemble and Voting Ensemble ML methods, as well as unimodal DL methods using CNNs, DNNs, TNs, and Long Short-Term Memory Networks (LSTM). To the best of our knowledge, this is the first work that successfully applies multimodal DL to combine those three different modalities of data using DNNs, CNNs, and TNs to learn a shared representation that can be used in Android malware detection tasks.
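The raw-data modality mentioned above, DEX grayscale images, can be illustrated with a short sketch. It reads each byte as one pixel intensity (0 to 255) and lays the bytes out row by row. The image width and the zero padding of the last row are assumptions, since the abstract does not describe the conversion, and the function name `bytes_to_grayscale` is hypothetical.

```python
def bytes_to_grayscale(data, width):
    """Lay out raw bytes as rows of pixel intensities in the range 0..255.

    Assumption (not from the paper): the final row is padded with zeros
    so that every row has exactly `width` pixels.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))
        rows.append(row)
    return rows


# The first eight bytes of a DEX file (format version 035).
dex_header = b"dex\n035\x00"
image = bytes_to_grayscale(dex_header, width=3)
```

In practice the bytes would come from the `classes.dex` entry of an APK, and the rows would be resized to the fixed input shape a CNN expects. Treating the file as an image lets the network learn byte-level patterns without manual feature engineering.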

Performance Comparison of Android Malware Detection Methods

Journal of Physics Conference Series ◽

10.1088/1742-6596/1827/1/012176 ◽

2021 ◽

Vol 1827 (1) ◽

pp. 012176

Author(s):

Yuandi Wang

Keyword(s):

Malware Detection ◽

Performance Comparison ◽

Detection Methods ◽

Android Malware ◽

Android Malware Detection
