A Method for Class-Imbalance Learning in Android Malware Detection

Jun Guan; Xu Jiang; Baolei Mao

doi:10.3390/electronics10243124

A Method for Class-Imbalance Learning in Android Malware Detection

Electronics ◽

10.3390/electronics10243124 ◽

2021 ◽

Vol 10 (24) ◽

pp. 3124

Author(s):

Jun Guan ◽

Xu Jiang ◽

Baolei Mao

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Computational Cost ◽

Class Imbalance ◽

Sampling Technique ◽

Minority Class ◽

Android Malware ◽

Android Malware Detection ◽

Imbalance Learning ◽

Class Imbalance Learning

A growing number of Android application developers are adopting measures against reverse engineering, such as adding a shell, with the result that certain features cannot be obtained through decompilation. This causes a serious sample imbalance in machine-learning-based Android malware detection. Researchers have therefore focused on how to address class imbalance in order to improve detection performance. However, existing class-imbalance learning methods suffer mainly from the loss of valuable samples and from high computational cost. In this paper, we propose a Class-Imbalance Learning (CIL) method, which first selects representative features and then combines the K-Means clustering algorithm with under-sampling to retain the important samples of the majority class while reducing its size. After that, we use the Synthetic Minority Over-Sampling Technique (SMOTE) to generate minority-class samples for data balance, and finally use the Random Forest (RF) algorithm to build a malware detection model. Experimental results indicate that CIL effectively improves the performance of machine-learning-based Android malware detection, especially under class imbalance. Compared with existing class-imbalance learning methods, CIL is also effective on data sets from the Machine Learning Repository of the University of California, Irvine (UCI), and performs better on some of them.
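The two resampling steps named in the abstract can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how samples are chosen within each cluster, so proportional random sampling is assumed here, and the function names (`kmeans`, `cluster_undersample`, `smote`), the toy two-dimensional data, and all parameter values are hypothetical. Feature selection and the Random Forest classifier are left out.

```python
import random
from math import dist

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its old center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_undersample(majority, k, keep, seed=0):
    """Keep roughly `keep` majority samples, drawn from every cluster
    in proportion to its size (assumed selection rule)."""
    rng = random.Random(seed)
    labels = kmeans(majority, k, seed=seed)
    kept = []
    for c in range(k):
        members = [p for p, l in zip(majority, labels) if l == c]
        if not members:
            continue
        quota = max(1, round(keep * len(members) / len(majority)))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept

def smote(minority, n_new, k_neighbors=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its nearest minority neighbors.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbors = sorted((p for p in minority if p is not base),
                           key=lambda p: dist(base, p))[:k_neighbors]
        other = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Toy data (hypothetical): 40 majority samples, 5 minority samples.
data_rng = random.Random(42)
majority = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
minority = [[data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)] for _ in range(5)]

kept = cluster_undersample(majority, k=4, keep=10)
synthetic = smote(minority, n_new=max(0, len(kept) - len(minority)))
balanced_minority = minority + synthetic
```

Sampling from every cluster rather than from the majority class as a whole is what keeps sparse regions of the majority class represented after under-sampling. In practice a library implementation of K-Means and SMOTE would normally replace these hand-written versions.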

Android Malware Detection Techniques: A Literature Review

Recent Patents on Engineering ◽

10.2174/1872212114999200710143847 ◽

2020 ◽

Vol 14 ◽

Author(s):

Meghna Dhalaria ◽

Ekta Gandotra

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Malware Detection ◽

Future Research ◽

Android Malware ◽

Detection Techniques ◽

Android Malware Detection ◽

Future Research Directions ◽

To Come ◽

Tools And Techniques

Purpose: This paper provides the basics of Android malware, its evolution, and the tools and techniques for malware analysis. Its main aim is to review the literature on Android malware detection using machine learning and deep learning and to identify the research gaps. It presents the insights obtained from the literature and future research directions that could help researchers develop robust and accurate techniques for classifying Android malware. Design/Methodology/Approach: This paper reviews the basics of Android malware, its evolution timeline, and detection techniques. It covers the tools and techniques for analyzing Android malware statically and dynamically to extract features, and for classifying these features using machine learning and deep learning algorithms. Findings: The number of Android users is expanding very fast due to the popularity of Android devices. As a result, Android users face more risks from the exponential growth of Android malware. Ongoing research aims to overcome the constraints of earlier approaches to malware detection. Because evolving malware is complex and sophisticated, earlier signature-based and machine-learning-based approaches cannot identify it in a timely and accurate manner. The review identifies several limitations of earlier techniques: longer detection time, high false-positive and false-negative rates, low accuracy in detecting sophisticated malware, and limited flexibility. Originality/value: This paper provides a systematic and comprehensive review of the tools and techniques employed for the analysis, classification, and identification of malicious Android applications. It includes the timeline of Android malware evolution, the tools and techniques for analyzing malware statically and dynamically to extract features, and the use of these features for detection and classification with machine learning and deep learning algorithms. On the basis of the detailed literature review, various research gaps are listed. The paper also provides future research directions and insights that could help researchers develop innovative and robust techniques for detecting and classifying Android malware.

Static, Dynamic and Intrinsic Features Based Android Malware Detection Using Machine Learning

Lecture Notes in Electrical Engineering - Proceedings of ICRIC 2019 ◽

10.1007/978-3-030-29407-6_4 ◽

2019 ◽

pp. 31-45

Author(s):

Bilal Ahmad Mantoo ◽

Surinder Singh Khurana

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Android Malware ◽

Android Malware Detection

Towards Deep Learning-Based Approach for Detecting Android Malware

Research Anthology on Artificial Intelligence Applications in Security ◽

10.4018/978-1-7998-7705-9.ch096 ◽

2021 ◽

pp. 2193-2219

Author(s):

Jarrett Booz ◽

Josh McGiff ◽

William G. Hatcher ◽

Wei Yu ◽

James Nguyen ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Learning Environment ◽

Malware Detection ◽

Extensive Study ◽

Detection Accuracy ◽

Android Malware ◽

Android Malware Detection ◽

Mobile Malware Detection ◽

Optimal Settings

In this article, the authors implement a deep learning environment and fine-tune parameters to determine the optimal settings for classifying Android malware from extracted permission data. By determining the optimal settings, the authors demonstrate the potential performance of a deep learning environment for Android malware detection. Specifically, an extensive study is conducted on various hyper-parameters to determine optimal configurations, and a performance evaluation is then carried out on those configurations to compare and maximize detection accuracy in the target networks. The results show a detection accuracy of approximately 95%, with an F1 score of approximately 93%. In addition, the evaluation is extended to other machine learning frameworks, specifically comparing Microsoft Cognitive Toolkit (CNTK) and Theano with TensorFlow. Future needs in machine learning for mobile malware detection are also discussed, including adversarial training, scalability, and the evaluation of additional data and features.
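Accuracy and F1 score, the two metrics reported in the abstract, are both derived from the same confusion-matrix counts but weight errors differently. The sketch below shows the standard definitions; the counts are hypothetical, chosen only for illustration, and are not taken from the article.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive (malware) class."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 90 malware samples, 210 benign samples.
tp, fp, fn, tn = 80, 5, 10, 205

acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
```

Because F1 ignores true negatives, it can sit below accuracy when benign samples dominate the data set, which is one reason both figures are commonly reported together.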

MDTA: A New Approach of Supervised Machine Learning for Android Malware Detection and Threat Attribution Using Behavioral Reports

Mobile Computing and Sustainable Informatics - Lecture Notes on Data Engineering and Communications Technologies ◽

10.1007/978-981-16-1866-6_10 ◽

2021 ◽

pp. 147-159

Author(s):

Seema Sachin Vanjire ◽

M. Lakshmi

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Supervised Machine Learning ◽

New Approach ◽

Android Malware ◽

Android Malware Detection

Enhanced Android Malware Detection: An SVM-Based Machine Learning Approach

2020 IEEE International Conference on Big Data and Smart Computing (BigComp) ◽

10.1109/bigcomp48618.2020.00-96 ◽

2020 ◽

Author(s):

Hyoil Han ◽

SeungJin Lim ◽

Kyoungwon Suh ◽

Seonghyun Park ◽

Seong-je Cho ◽

...

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Learning Approach ◽

Android Malware ◽

Android Malware Detection ◽

Machine Learning Approach

Android Malware Detection Based on Machine Learning

2018 4th Annual International Conference on Network and Information Systems for Computers (ICNISC) ◽

10.1109/icnisc.2018.00094 ◽

2018 ◽

Author(s):

Wang Qing-Fei ◽

Fang Xiang

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Android Malware ◽

Android Malware Detection

A Review of Android Malware Detection Approaches Based on Machine Learning

IEEE Access ◽

10.1109/access.2020.3006143 ◽

2020 ◽

Vol 8 ◽

pp. 124579-124607

Author(s):

Kaijun Liu ◽

Shengwei Xu ◽

Guoai Xu ◽

Miao Zhang ◽

Dawei Sun ◽

...

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Android Malware ◽

Android Malware Detection

Android Malware Detection Based on Useful API Calls and Machine Learning

2018 IEEE First International Conference on Artificial Intelligence and Knowledge Engineering (AIKE) ◽

10.1109/aike.2018.00041 ◽

2018 ◽

Cited By ~ 12

Author(s):

Jaemin Jung ◽

Hyunjin Kim ◽

Dongjin Shin ◽

Myeonggeon Lee ◽

Hyunjae Lee ◽

...

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Android Malware ◽

Android Malware Detection

Android Malware Detection Using Genetic Algorithm based Optimized Feature Selection and Machine Learning

2019 42nd International Conference on Telecommunications and Signal Processing (TSP) ◽

10.1109/tsp.2019.8769039 ◽

2019 ◽

Cited By ~ 2

Author(s):

Anam Fatima ◽

Ritesh Maurya ◽

Malay Kishore Dutta ◽

Radim Burget ◽

Jan Masek

Keyword(s):

Machine Learning ◽

Genetic Algorithm ◽

Feature Selection ◽

Malware Detection ◽

Android Malware ◽

Android Malware Detection

Application of Machine Learning Algorithms for Android Malware Detection

Proceedings of the 2018 International Conference on Computational Intelligence and Intelligent Systems - CIIS 2018 ◽

10.1145/3293475.3293489 ◽

2018 ◽

Cited By ~ 3

Author(s):

Mohsen Kakavand ◽

Mohammad Dabbagh ◽

Ali Dehghantanha

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Malware Detection ◽

Machine Learning Algorithms ◽

Android Malware ◽

Android Malware Detection
