Android Malware Detection through Machine Learning Techniques: A Review

<p class="0abstract">The open-source nature of the Android operating system has attracted wide adoption by many types of developers. This has in turn fostered an exponential proliferation of Android devices across different sectors of the economy. Although this development has brought great technological advancement and made business (e-commerce) and social interaction easier, these devices have also become a major medium for the growing number of cyberattacks and espionage against business infrastructure and individual users. Different cyberattack techniques exist, but attacks through malicious applications have taken the lead over other methods such as social engineering. Android malware has grown so sophisticated and intelligent that it has become highly resistant to existing detection systems, especially signature-based ones. Machine learning techniques have emerged as a more capable choice for combating the sophistication and novelty of emerging Android malware. Models created with machine learning methods first learn the existing patterns of malware behaviour and then use this knowledge to identify similar behaviour in unknown attacks. This paper provides a comprehensive review of machine learning techniques and their applications in Android malware detection as found in contemporary literature.</p>
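The learn-then-identify idea in the abstract above can be illustrated with a minimal sketch. It assumes each app is represented by its set of requested permissions and uses a 1-nearest-neighbour rule with Jaccard similarity; the permission names, labels, and the `predict` helper are hypothetical and are not taken from the reviewed paper.

```python
def jaccard(a, b):
    """Similarity of two permission sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def predict(training, features):
    """Return the label of the most similar known app (1-nearest neighbour)."""
    best_label, best_similarity = None, -1.0
    for known_features, label in training:
        similarity = jaccard(known_features, features)
        if similarity > best_similarity:
            best_label, best_similarity = label, similarity
    return best_label


# Hypothetical labelled apps: the "existing patterns" the model learns from.
training = [
    (frozenset({"SEND_SMS", "READ_CONTACTS", "INTERNET"}), "malware"),
    (frozenset({"INTERNET", "CAMERA"}), "benign"),
    (frozenset({"INTERNET"}), "benign"),
]

# An unseen app is assigned the label of the closest known pattern.
unknown = frozenset({"SEND_SMS", "READ_CONTACTS"})
label = predict(training, unknown)
print(label)
```

Real detectors typically use far richer features (API calls, data flows, opcodes) and stronger classifiers; the sketch only shows the two phases of learning from labelled samples and classifying an unknown one.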

TFDroid: Android Malware Detection by Topics and Sensitive Data Flows Using Machine Learning Techniques

2019 IEEE 2nd International Conference on Information and Computer Technologies (ICICT) ◽

10.1109/infoct.2019.8711179 ◽

2019 ◽

Author(s):

Songhao Lou ◽

Shaoyin Cheng ◽

Jingjing Huang ◽

Fan Jiang

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Machine Learning Techniques ◽

Sensitive Data ◽

Android Malware ◽

Android Malware Detection ◽

Learning Techniques ◽

Data Flows

MLDroid—framework for Android malware detection using machine learning techniques

Neural Computing and Applications ◽

10.1007/s00521-020-05309-4 ◽

2020 ◽

Author(s):

Arvind Mahindru ◽

A. L. Sangal

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Machine Learning Techniques ◽

Android Malware ◽

Android Malware Detection ◽

Learning Techniques

A Static Feature Selection-based Android Malware Detection Using Machine Learning Techniques

2020 International Conference on Smart Electronics and Communication (ICOSEC) ◽

10.1109/icosec49089.2020.9215355 ◽

2020 ◽

Author(s):

Aviral Sangal ◽

Harsh Kumar Verma

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Malware Detection ◽

Machine Learning Techniques ◽

Android Malware ◽

Android Malware Detection ◽

Learning Techniques ◽

Static Feature

Deep-Droid: Deep Learning for Android Malware Detection

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l7889.1091220 ◽

2020 ◽

Vol 9 (12) ◽

pp. 122-125

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Malware Detection ◽

Learning Approaches ◽

Android Malware ◽

Detection Systems ◽

Learning Framework ◽

The Past ◽

Android Malware Detection ◽

Android OS

Android, the most prevalent mobile operating system (OS), has enjoyed immense popularity on smartphones over the past few years. Cybercriminals have seized this opportunity through piracy and malware. Traditional detection does not suffice to combat newly created advanced malware, so smart malware detection systems are needed to reduce the risk of malicious activity. In recent years, machine learning approaches have shown promising results in classifying malware, where most of the methods are shallow learners such as Random Forest (RF). In this paper, we propose Deep-Droid, a deep learning framework for detecting Android malware. Our Deep-Droid model is a deep learner that outperforms existing cutting-edge machine learning approaches. All experiments were performed on two datasets (Drebin-215 and Malgenome-215) to assess the Deep-Droid model. The experimental results show the effectiveness and robustness of Deep-Droid, which achieved an accuracy of over 98.5%.

Graph Approach for android malware detection using machine learning techniques

Humanitarian and Natural Sciences Journal ◽

10.53796/hnsj21115 ◽

2021 ◽

Vol 2 (11) ◽

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Machine Learning Techniques ◽

Android Malware ◽

Android Malware Detection ◽

Learning Techniques

Android malware detection based on image-based features and machine learning techniques

SN Applied Sciences ◽

10.1007/s42452-020-3132-2 ◽

2020 ◽

Vol 2 (7) ◽

Author(s):

Halil Murat Ünver ◽

Khaled Bakour

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Machine Learning Techniques ◽

Android Malware ◽

Android Malware Detection ◽

Learning Techniques

Dynamic Permissions based Android Malware Detection using Machine Learning Techniques

Proceedings of the 10th Innovations in Software Engineering Conference on - ISEC '17 ◽

10.1145/3021460.3021485 ◽

2017 ◽

Cited By ~ 15

Author(s):

Arvind Mahindru ◽

Paramvir Singh

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Machine Learning Techniques ◽

Android Malware ◽

Android Malware Detection ◽

Learning Techniques

On the Impact of Sample Duplication in Machine-Learning-Based Android Malware Detection

ACM Transactions on Software Engineering and Methodology ◽

10.1145/3446905 ◽

2021 ◽

Vol 30 (3) ◽

pp. 1-38

Author(s):

Yanjie Zhao ◽

Li Li ◽

Haoyu Wang ◽

Haipeng Cai ◽

Tegawendé F. Bissyandé ◽

...

Keyword(s):

Machine Learning ◽

Unsupervised Learning ◽

Malware Detection ◽

Experimental Results ◽

Machine Learning Techniques ◽

Detection Rates ◽

Android Malware ◽

Android Malware Detection ◽

Learning Techniques ◽

The Impact

Malware detection at scale in the Android realm is often carried out using machine learning techniques. State-of-the-art approaches such as DREBIN and MaMaDroid are reported to yield high detection rates when assessed against well-known datasets. Unfortunately, such datasets may include a large portion of duplicated samples, which may bias recorded experimental results and insights. In this article, we perform extensive experiments to measure the performance gap that occurs when datasets are de-duplicated. Our experimental results reveal that duplication in published datasets has a limited impact on supervised malware classification models. This observation contrasts with the finding of Allamanis on the general case of machine learning bias for big code. Our experiments, however, show that sample duplication more substantially affects unsupervised learning models (e.g., malware family clustering). Nevertheless, we argue that our fellow researchers and practitioners should always take sample duplication into consideration when performing machine-learning-based (via either supervised or unsupervised learning) Android malware detections, no matter how significant the impact might be.
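As a rough illustration of the de-duplication step discussed above, the following sketch treats two samples as duplicates when their feature sets are identical. That criterion, the permission-style features, and the helper names are assumptions made for the example; the article's own definition of a duplicated sample may differ.

```python
import hashlib


def fingerprint(features):
    """Hash a sample's feature set so that identical sets map to the same key."""
    canonical = "\n".join(sorted(features))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(samples):
    """Keep only the first occurrence of each distinct feature set."""
    seen = set()
    unique = []
    for features, label in samples:
        key = fingerprint(features)
        if key not in seen:
            seen.add(key)
            unique.append((features, label))
    return unique


# Hypothetical dataset: (feature set, label) with 1 = malware, 0 = benign.
samples = [
    ({"SEND_SMS", "READ_CONTACTS"}, 1),
    ({"READ_CONTACTS", "SEND_SMS"}, 1),  # same feature set as the first sample
    ({"INTERNET"}, 0),
    ({"INTERNET", "CAMERA"}, 0),
]

unique = deduplicate(samples)
duplication_rate = 1 - len(unique) / len(samples)
print(len(unique), duplication_rate)
```

Training and evaluating a model once on `samples` and once on `unique` would then give the kind of performance gap the article measures.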

Evaluation of Advanced Ensemble Learning Techniques for Android Malware Detection

Vietnam Journal of Computer Science ◽

10.1142/s2196888820500086 ◽

2020 ◽

Vol 07 (02) ◽

pp. 145-159 ◽

Cited By ~ 1

Author(s):

Md. Shohel Rana ◽

Andrew H. Sung

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Ensemble Learning ◽

Malware Detection ◽

Learning Systems ◽

Application Framework ◽

Android Malware ◽

Security Controls ◽

Android Malware Detection ◽

Learning Techniques

Android is the most popular mobile operating system, with billions of active users worldwide, which has attracted advertisers, hackers, and cybercriminals to develop malware for various purposes. In recent years, wide-ranging research has been conducted on malware analysis and detection for Android devices, while Android has also implemented various security controls to deal with malware, including a User ID (UID) for each application and system permissions. In this paper, we develop and evaluate various kinds of machine learning (ML) models by applying ensemble-based learning systems for detecting Android malware, in conjunction with a substring-based feature selection (SBFS) method for the classifiers. In the study, we extend our previous work, showing that ensemble-based learning techniques achieve better results than those previously reported on the DREBIN dataset, and thus provide a solid basis for building effective tools for Android malware detection.

A Comprehensive Survey on Machine Learning Techniques for Android Malware Detection

Information ◽

10.3390/info12050185 ◽

2021 ◽

Vol 12 (5) ◽

pp. 185

Author(s):

Vasileios Kouliaridis ◽

Georgios Kambourakis

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Malware Detection ◽

Machine Learning Techniques ◽

Android Malware ◽

Detection Techniques ◽

Android Malware Detection ◽

Mobile Malware ◽

Comprehensive Survey ◽

Mobile Malware Detection

Year after year, mobile malware attacks grow in both sophistication and diffusion. As the open source Android platform continues to dominate the market, malware writers consider it their preferred target. Almost without exception, state-of-the-art mobile malware detection solutions in the literature capitalize on machine learning to detect pieces of malware. Nevertheless, our findings clearly indicate that the majority of existing works utilize different metrics and models and employ diverse datasets and classification features stemming from disparate analysis techniques, i.e., static, dynamic, or hybrid. This complicates the cross-comparison of the various proposed detection schemes and may also raise doubts about the derived results. To address this problem, spanning a period of the last seven years, this work attempts to schematize the ML-powered malware detection approaches and techniques proposed so far by organizing them under four axes, namely, the age of the selected dataset, the analysis type used, the employed ML techniques, and the chosen performance metrics. Moreover, based on these axes, we introduce a converging scheme which can guide future Android malware detection techniques and provide a solid baseline for machine learning practices in this field.
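Since the survey above singles out performance metrics as one axis that makes detection schemes hard to compare, a short sketch of the most common ones may help. It computes accuracy, precision, recall, and F1 from binary predictions; the labels below are made-up illustration data (1 = malware, 0 = benign), and the survey itself may cover further metrics.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with 1 as the malware class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def detection_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 as a dictionary."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical ground truth and predictions for eight apps.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
scores = detection_metrics(y_true, y_pred)
print(scores)
```

On imbalanced datasets, accuracy alone can look high even when recall on the malware class is poor, which is one reason results reported with different metrics are difficult to compare.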
