A Comprehensive Survey on Machine Learning Techniques for Android Malware Detection

Year after year, mobile malware attacks grow in both sophistication and diffusion. As the open source Android platform continues to dominate the market, malware writers consider it their preferred target. State-of-the-art mobile malware detection solutions in the literature rely almost exclusively on machine learning (ML) to detect malware. Nevertheless, our findings clearly indicate that the majority of existing works use different metrics and models, and employ diverse datasets and classification features stemming from disparate analysis techniques, i.e., static, dynamic, or hybrid. This complicates cross-comparison of the proposed detection schemes and may also raise doubts about the derived results. To address this problem, this work covers the last seven years and attempts to schematize the ML-powered malware detection approaches and techniques proposed so far by organizing them under four axes, namely, the age of the selected dataset, the analysis type used, the employed ML techniques, and the chosen performance metrics. Moreover, based on these axes, we introduce a converging scheme that can guide future Android malware detection techniques and provide a solid baseline for machine learning practices in this field.

Android Mobile Malware Detection Using Machine Learning: A Systematic Review

Electronics ◽ 10.3390/electronics10131606 ◽ 2021 ◽ Vol 10 (13) ◽ pp. 1606

Author(s): Janaka Senanayake ◽ Harsha Kalutarage ◽ Mhd Omar Al-Kadri

Keyword(s): Machine Learning ◽ Systematic Review ◽ Effective Means ◽ Malware Detection ◽ Future Research ◽ Android Malware ◽ Detection Techniques ◽ Android Malware Detection ◽ Training Examples ◽ Mobile Malware Detection

With the increasing use of mobile devices, malware attacks are rising, especially on Android phones, which account for 72.2% of the total market share. Hackers try to attack smartphones with various methods such as credential theft, surveillance, and malicious advertising. Among numerous countermeasures, machine learning (ML)-based methods have proven to be an effective means of detecting these attacks, as they are able to derive a classifier from a set of training examples, thus eliminating the need for an explicit definition of the signatures when developing malware detectors. This paper provides a systematic review of ML-based Android malware detection techniques. It critically evaluates 106 carefully selected articles and highlights their strengths and weaknesses as well as potential improvements. Finally, the ML-based methods for detecting source code vulnerabilities are discussed, because it might be more difficult to add security after the app is deployed. Therefore, this paper aims to enable researchers to acquire in-depth knowledge in the field and to identify potential future research and development directions.

Android Malware Detection Techniques: A Literature Review

Recent Patents on Engineering ◽ 10.2174/1872212114999200710143847 ◽ 2020 ◽ Vol 14

Author(s): Meghna Dhalaria ◽ Ekta Gandotra

Keyword(s): Machine Learning ◽ Deep Learning ◽ Malware Detection ◽ Future Research ◽ Android Malware ◽ Detection Techniques ◽ Android Malware Detection ◽ Future Research Directions ◽ To Come ◽ Tools And Techniques

Purpose: This paper covers the basics of Android malware, its evolution, and the tools and techniques for malware analysis. Its main aim is to review the literature on Android malware detection using machine learning and deep learning and to identify research gaps. It presents the insights obtained from the literature and future research directions that could help researchers develop robust and accurate techniques for classifying Android malware.

Design/Methodology/Approach: This paper reviews the basics of Android malware, its evolution timeline, and detection techniques. It covers the tools and techniques for analyzing Android malware statically and dynamically to extract features, and for classifying samples from those features using machine learning and deep learning algorithms.

Findings: The number of Android users is growing quickly due to the popularity of Android devices. As a result, Android users face more risk from the exponential growth of Android malware. Ongoing research aims to overcome the constraints of earlier malware detection approaches. Because evolving malware is complex and sophisticated, earlier signature-based and machine learning-based approaches cannot identify it promptly and accurately. The findings of the review show several limitations of earlier techniques: longer detection time, high false positive and false negative rates, low accuracy in detecting sophisticated malware, and limited flexibility.

Originality/value: This paper provides a systematic and comprehensive review of the tools and techniques employed for the analysis, classification, and identification of malicious Android applications. It includes the timeline of Android malware evolution, and the tools and techniques for analyzing malware statically and dynamically to extract features, which are then used for detection and classification with machine learning and deep learning algorithms. On the basis of the detailed literature review, various research gaps are listed. The paper also provides future research directions and insights that could help researchers develop innovative and robust techniques for detecting and classifying Android malware.

Towards Deep Learning-Based Approach for Detecting Android Malware

Research Anthology on Artificial Intelligence Applications in Security ◽ 10.4018/978-1-7998-7705-9.ch096 ◽ 2021 ◽ pp. 2193-2219

Author(s): Jarrett Booz ◽ Josh McGiff ◽ William G. Hatcher ◽ Wei Yu ◽ James Nguyen ◽ ...

Keyword(s): Machine Learning ◽ Deep Learning ◽ Learning Environment ◽ Malware Detection ◽ Extensive Study ◽ Detection Accuracy ◽ Android Malware ◽ Android Malware Detection ◽ Mobile Malware Detection ◽ Optimal Settings

In this article, the authors implement a deep learning environment and fine-tune its parameters to determine the optimal settings for classifying Android malware from extracted permission data. By determining these settings, the authors demonstrate the potential performance of a deep learning environment for Android malware detection. Specifically, an extensive study of various hyper-parameters identifies optimal configurations, and a performance evaluation then compares those configurations to maximize detection accuracy in the target networks. The results show a detection accuracy of approximately 95%, with an F1 score of approximately 93%. In addition, the evaluation is extended to other machine learning frameworks, comparing Microsoft Cognitive Toolkit (CNTK) and Theano with TensorFlow. Future needs in machine learning for mobile malware detection are also discussed, including adversarial training, scalability, and the evaluation of additional data and features.
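The abstract above reports both a detection accuracy and an F1 score. As a minimal sketch of how the two metrics relate, the following Python function computes them from confusion-matrix counts. The function name and the example counts are hypothetical and chosen only for illustration; they are not taken from the article.

```python
def accuracy_and_f1(tp, fp, tn, fn):
    """Return (accuracy, F1) from confusion-matrix counts.

    tp/fn: malware samples flagged / missed; tn/fp: benign samples passed / flagged.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    accuracy = (tp + tn) / total
    # Precision and recall default to 0.0 when their denominators are empty.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Illustrative counts only (not results from the article).
acc, f1 = accuracy_and_f1(tp=40, fp=5, tn=50, fn=5)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

Because F1 ignores true negatives, it can sit below accuracy when benign samples are easy to classify, which is one reason the two figures are usually reported together.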

TFDroid: Android Malware Detection by Topics and Sensitive Data Flows Using Machine Learning Techniques

2019 IEEE 2nd International Conference on Information and Computer Technologies (ICICT) ◽ 10.1109/infoct.2019.8711179 ◽ 2019

Author(s): Songhao Lou ◽ Shaoyin Cheng ◽ Jingjing Huang ◽ Fan Jiang

Keyword(s): Machine Learning ◽ Malware Detection ◽ Machine Learning Techniques ◽ Sensitive Data ◽ Android Malware ◽ Android Malware Detection ◽ Learning Techniques ◽ Data Flows

MLDroid—framework for Android malware detection using machine learning techniques

Neural Computing and Applications ◽ 10.1007/s00521-020-05309-4 ◽ 2020

Author(s): Arvind Mahindru ◽ A. L. Sangal

Keyword(s): Machine Learning ◽ Malware Detection ◽ Machine Learning Techniques ◽ Android Malware ◽ Android Malware Detection ◽ Learning Techniques

A Static Feature Selection-based Android Malware Detection Using Machine Learning Techniques

2020 International Conference on Smart Electronics and Communication (ICOSEC) ◽ 10.1109/icosec49089.2020.9215355 ◽ 2020

Author(s): Aviral Sangal ◽ Harsh Kumar Verma

Keyword(s): Machine Learning ◽ Feature Selection ◽ Malware Detection ◽ Machine Learning Techniques ◽ Android Malware ◽ Android Malware Detection ◽ Learning Techniques ◽ Static Feature

Towards Deep Learning-Based Approach for Detecting Android Malware

International Journal of Software Innovation ◽ 10.4018/ijsi.2019100101 ◽ 2019 ◽ Vol 7 (4) ◽ pp. 1-24 ◽ Cited By ~ 1

Author(s): Jarrett Booz ◽ Josh McGiff ◽ William G. Hatcher ◽ Wei Yu ◽ James Nguyen ◽ ...

Keyword(s): Machine Learning ◽ Deep Learning ◽ Learning Environment ◽ Malware Detection ◽ Extensive Study ◽ Detection Accuracy ◽ Android Malware ◽ Android Malware Detection ◽ Mobile Malware Detection ◽ Optimal Settings

Android Malware Detection Using Machine Learning with Feature Selection Based on the Genetic Algorithm

Mathematics ◽ 10.3390/math9212813 ◽ 2021 ◽ Vol 9 (21) ◽ pp. 2813

Author(s): Jaehyeong Lee ◽ Hyuk Jang ◽ Sungmin Ha ◽ Yourim Yoon

Keyword(s): Machine Learning ◽ Genetic Algorithm ◽ Genetic Algorithms ◽ Feature Selection ◽ Malware Detection ◽ Feature Selection Method ◽ Machine Learning Algorithms ◽ Android Malware ◽ Detection Techniques ◽ Android Malware Detection

Since the discovery that machine learning can be used to effectively detect Android malware, many studies on machine learning-based malware detection techniques have been conducted. Several methods based on feature selection, particularly genetic algorithms, have been proposed to increase performance and reduce costs. However, these methods have limitations: they have yet to be compared with other methods, and their many features have not been sufficiently verified. This study investigates whether genetic algorithm-based feature selection helps Android malware detection. We applied nine machine learning algorithms with genetic algorithm-based feature selection to 1104 static features, using 5000 benign applications and 2500 malware samples from the Andro-AutoPsy dataset. Comparative experimental results show that the genetic algorithm performed better than the information gain-based method, which is commonly used for feature selection. Moreover, machine learning with the proposed genetic algorithm-based feature selection has a clear advantage in running time over machine learning without feature selection. The results indicate that incorporating genetic algorithms into Android malware detection is a valuable approach, and that applying genetic algorithm-based feature selection to machine learning is useful for improving malware detection performance.
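To illustrate the general idea of genetic algorithm-based feature selection described in this abstract, here is a minimal, self-contained Python sketch. It is not the authors' method: the synthetic data, the majority-vote stand-in classifier, the fitness penalty, and every function name and parameter value below are assumptions made for illustration. Each candidate solution is a bit mask over the features, and the fitness rewards accuracy while penalizing the number of selected features.

```python
import random


def make_toy_data(rng, n_samples=200, n_features=20, informative=(0, 3, 7)):
    """Synthetic binary feature vectors; only `informative` columns track the label."""
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.randint(0, 1) for _ in range(n_features)]
        for j in informative:
            # Informative features agree with the label 90% of the time.
            row[j] = label if rng.random() < 0.9 else 1 - label
        X.append(row)
        y.append(label)
    return X, y


def accuracy(mask, X, y):
    """Majority vote over the selected features (a stand-in for a real classifier)."""
    chosen = [j for j, keep in enumerate(mask) if keep]
    if not chosen:
        return 0.0
    correct = 0
    for row, label in zip(X, y):
        votes = sum(row[j] for j in chosen)
        pred = 1 if 2 * votes > len(chosen) else 0
        correct += pred == label
    return correct / len(y)


def fitness(mask, X, y, penalty=0.005):
    """Accuracy minus a small cost per selected feature."""
    return accuracy(mask, X, y) - penalty * sum(mask)


def select_features(X, y, rng, pop_size=30, generations=30, mutation_rate=0.05):
    """Evolve bit masks with elitism, one-point crossover, and bit-flip mutation."""
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, X, y), reverse=True)
        elite = pop[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, X, y))


rng = random.Random(0)  # fixed seed so repeated runs are reproducible
X, y = make_toy_data(rng)
best = select_features(X, y, rng)
print("selected features:", [j for j, keep in enumerate(best) if keep])
print("accuracy with selection:", accuracy(best, X, y))
print("accuracy with all features:", accuracy([1] * len(X[0]), X, y))
```

In a realistic setting the fitness function would wrap a trained classifier evaluated on held-out data, and the comparison the abstract describes would score the same features with an information gain ranking.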

A Comprehensive Study of Malware Detection in Android Operating Systems

Asian Journal of Research in Computer Science ◽ 10.9734/ajrcos/2021/v10i430248 ◽ 2021 ◽ pp. 30-46

Author(s): Suhaib Jasim Hamdi ◽ Ibrahim Mahmood Ibrahim ◽ Naaman Omar ◽ Omar M. Ahmed ◽ Zryan Najat Rashid ◽ ...

Keyword(s): Machine Learning ◽ Malware Detection ◽ Detailed Comparison ◽ Detection Methods ◽ Current Status ◽ Android Malware ◽ Detection Techniques ◽ Android Apps ◽ Android Malware Detection ◽ Wide Range

Android is one of the world's most popular operating systems. Malware attacks on Android applications are increasingly common, and many security detection techniques for Android apps are now available. The open nature of the Android environment has given the platform extensive appeal in recent years, and a growing number of mobile devices are incorporated into many aspects of everyday life. This paper gives a detailed comparison that summarizes and analyzes various detection techniques. It examines the current status of Android malware detection methods, with an emphasis on machine learning-based classifiers for detecting malicious software on Android devices. Android has a huge number of apps that may be downloaded and used for free; consequently, Android phones are more susceptible to malware, and additional research has been done to develop effective malware detection methods. To begin, several currently available Android malware detection approaches are carefully examined and classified by detection methodology. This study also examines a wide range of machine learning-based methods for detecting Android malware, covering both dynamic and static types.

Graph Approach for android malware detection using machine learning techniques

Humanitarian and Natural Sciences Journal ◽ 10.53796/hnsj21115 ◽ 2021 ◽ Vol 2 (11)

Keyword(s): Machine Learning ◽ Malware Detection ◽ Machine Learning Techniques ◽ Android Malware ◽ Android Malware Detection ◽ Learning Techniques
