scholarly journals Android Malware Detection through Machine Learning Techniques: A Review

Author(s):  
Abikoye Oluwakemi Christiana ◽  
Benjamin Aruwa Gyunka ◽  
Akande Noah

<p class="0abstract">The open source nature of Android Operating System has attracted wider adoption of the system by multiple types of developers. This phenomenon has further fostered an exponential proliferation of devices running the Android OS into different sectors of the economy. Although this development has brought about great technological advancements and ease of doing businesses (e-commerce) and social interactions, they have however become strong mediums for the uncontrolled rising cyberattacks and espionage against business infrastructures and the individual users of these mobile devices. Different cyberattacks techniques exist but attacks through malicious applications have taken the lead aside other attack methods like social engineering. Android malware have evolved in sophistications and intelligence that they have become highly resistant to existing detection systems especially those that are signature-based. Machine learning techniques have risen to become a more competent choice for combating the kind of sophistications and novelty deployed by emerging Android malwares. The models created via machine learning methods work by first learning the existing patterns of malware behaviour and then use this knowledge to separate or identify any such similar behaviour from unknown attacks. This paper provided a comprehensive review of machine learning techniques and their applications in Android malware detection as found in contemporary literature.</p>

Android OS, which is the most prevalent operating system (OS), has enjoyed immense popularity for smart phones over the past few years. Seizing this opportunity, cybercrime will occur in the form of piracy and malware. Traditional detection does not suffice to combat newly created advanced malware. So, there is a need for smart malware detection systems to reduce malicious activities risk. Machine learning approaches have been showing promising results in classifying malware where most of the method are shallow learners like Random Forest (RF) in recent years. In this paper, we propose Deep-Droid as a deep learning framework, for detection Android malware. Hence, our Deep-Droid model is a deep learner that outperforms exiting cutting-edge machine learning approaches. All experiments performed on two datasets (Drebin-215 & Malgenome-215) to assess our Deep-Droid model. The results of experiments show the effectiveness and robustness of Deep-Droid. Our Deep-Droid model achieved accuracy over 98.5%.


2021 ◽  
Vol 30 (3) ◽  
pp. 1-38
Author(s):  
Yanjie Zhao ◽  
Li Li ◽  
Haoyu Wang ◽  
Haipeng Cai ◽  
Tegawendé F. Bissyandé ◽  
...  

Malware detection at scale in the Android realm is often carried out using machine learning techniques. State-of-the-art approaches such as DREBIN and MaMaDroid are reported to yield high detection rates when assessed against well-known datasets. Unfortunately, such datasets may include a large portion of duplicated samples, which may bias recorded experimental results and insights. In this article, we perform extensive experiments to measure the performance gap that occurs when datasets are de-duplicated. Our experimental results reveal that duplication in published datasets has a limited impact on supervised malware classification models. This observation contrasts with the finding of Allamanis on the general case of machine learning bias for big code. Our experiments, however, show that sample duplication more substantially affects unsupervised learning models (e.g., malware family clustering). Nevertheless, we argue that our fellow researchers and practitioners should always take sample duplication into consideration when performing machine-learning-based (via either supervised or unsupervised learning) Android malware detections, no matter how significant the impact might be.


2020 ◽  
Vol 07 (02) ◽  
pp. 145-159 ◽  
Author(s):  
Md. Shohel Rana ◽  
Andrew H. Sung

Android is the most well-known portable working framework having billions of dynamic clients worldwide that pulled in promoters, programmers, and cybercriminals to create malware for different purposes. As of late, wide-running inquiries have been led on malware examination and identification for Android gadgets while Android has likewise actualized different security controls to manage the malware issues, including a User ID (UID) for every application, framework authorizations. In this paper, we advance and assess various kinds of machine learning (ML) by applying ensemble-based learning systems for identifying Android malware related to a substring-based feature selection (SBFS) strategy for the classifiers. In the investigation, we have broadened our previous work where it has been seen that the ensemble-based learning techniques acquire preferred outcome over the recently revealed outcome by directing the DREBIN dataset, and in this manner they give a solid premise to building compelling instruments for Android malware detection.


Information ◽  
2021 ◽  
Vol 12 (5) ◽  
pp. 185
Author(s):  
Vasileios Kouliaridis ◽  
Georgios Kambourakis

Year after year, mobile malware attacks grow in both sophistication and diffusion. As the open source Android platform continues to dominate the market, malware writers consider it as their preferred target. Almost strictly, state-of-the-art mobile malware detection solutions in the literature capitalize on machine learning to detect pieces of malware. Nevertheless, our findings clearly indicate that the majority of existing works utilize different metrics and models and employ diverse datasets and classification features stemming from disparate analysis techniques, i.e., static, dynamic, or hybrid. This complicates the cross-comparison of the various proposed detection schemes and may also raise doubts about the derived results. To address this problem, spanning a period of the last seven years, this work attempts to schematize the so far ML-powered malware detection approaches and techniques by organizing them under four axes, namely, the age of the selected dataset, the analysis type used, the employed ML techniques, and the chosen performance metrics. Moreover, based on these axes, we introduce a converging scheme which can guide future Android malware detection techniques and provide a solid baseline to machine learning practices in this field.


Sign in / Sign up

Export Citation Format

Share Document