Comparative Analysis of Feature Selection Methods and Machine Learning Algorithms in Permission based Android Malware Detection

Since machine learning was shown to detect Android malware effectively, many studies on machine learning-based malware detection have been conducted. Several methods based on feature selection, particularly genetic algorithms, have been proposed to improve detection performance and reduce cost. These methods remain limited, however, because they have not been compared with other feature selection methods and their many features have not been sufficiently verified. This study investigates whether genetic algorithm-based feature selection helps Android malware detection. We applied nine machine learning algorithms with genetic algorithm-based feature selection to 1104 static features extracted from 5000 benign applications and 2500 malware samples in the Andro-AutoPsy dataset. Comparative experiments show that the genetic algorithm performed better than the information gain-based method, which is commonly used for feature selection. Moreover, machine learning with the proposed genetic algorithm-based feature selection has a clear advantage in running time over machine learning without feature selection. The results indicate that incorporating genetic algorithms into Android malware detection is a valuable approach, and that applying genetic algorithm-based feature selection to machine learning is a useful way to improve malware detection performance.
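The abstract does not describe how its genetic algorithm is configured, so the following is only a minimal sketch of the general idea, not the authors' method. Each individual is a 0/1 mask over the features, and the search uses tournament selection, one-point crossover, bit-flip mutation and elitism. The function name `evolve_feature_mask`, its parameters, and the toy fitness function are all assumptions made for illustration; in a real study the fitness would typically be the validation score of a classifier trained on the selected features.

```python
import random


def evolve_feature_mask(fitness, n_features, pop_size=20, generations=30,
                        mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximises `fitness` (hypothetical helper).

    Requires n_features >= 2 so that one-point crossover has a valid cut point.
    """
    rng = random.Random(seed)
    # Each individual is a tuple of 0/1 flags, one flag per feature.
    population = [tuple(rng.randint(0, 1) for _ in range(n_features))
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    for _ in range(generations):
        def pick():
            # Tournament selection: keep the fitter of two random individuals.
            a, b = rng.sample(population, 2)
            return a if fitness(a) >= fitness(b) else b

        children = [best]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: each flag flips with probability mutation_rate.
            child = tuple(bit ^ 1 if rng.random() < mutation_rate else bit
                          for bit in child)
            children.append(child)

        population = children
        best = max(population, key=fitness)
    return best


# Toy fitness (an assumption, not from the paper): features 0, 3 and 7 are
# "informative", and every selected feature costs 0.1 to reward small subsets.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask = evolve_feature_mask(toy_fitness, n_features=12)
selected = [i for i, bit in enumerate(best_mask) if bit]
print("selected feature indices:", selected)
```

The per-feature penalty in the toy fitness stands in for the cost reduction the abstract mentions: without it, the search has no reason to prefer a smaller feature subset.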


A Comparative Analysis of Feature Selection Methods and Associated Machine Learning Algorithms on Wisconsin Breast Cancer Dataset (WBCD)

Advances in Intelligent Systems and Computing - Proceedings of International Conference on ICT for Sustainable Development ◽

10.1007/978-981-10-0129-1_23 ◽

2016 ◽

pp. 215-224 ◽

Cited By ~ 3

Author(s):

Nileshkumar Modi ◽

Kaushar Ghanchi

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Feature Selection ◽

Comparative Analysis ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Breast Cancer Dataset ◽

Selection Methods ◽

Cancer Dataset


A Survey on Android Malware Detection Techniques Using Machine Learning Algorithms

2019 Sixth International Conference on Software Defined Systems (SDS) ◽

10.1109/sds.2019.8768729 ◽

2019 ◽

Cited By ~ 2

Author(s):

Ebtesam J. Alqahtani ◽

Rachid Zagrouba ◽

Abdullah Almuhaideb

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Malware Detection ◽

Machine Learning Algorithms ◽

Android Malware ◽

Detection Techniques ◽

Android Malware Detection


Preliminary Results of Applying Machine Learning Algorithms to Android Malware Detection

2016 International Conference on Computational Science and Computational Intelligence (CSCI) ◽

10.1109/csci.2016.0204 ◽

2016 ◽

Cited By ~ 3

Author(s):

Matthew Leeds ◽

Travis Atkison

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Malware Detection ◽

Machine Learning Algorithms ◽

Android Malware ◽

Preliminary Results ◽

Android Malware Detection


Sentiment Analysis of Movie Reviews: A Study of Machine Learning Algorithms with Various Feature Selection Methods

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v5i9.113121 ◽

2017 ◽

Vol 5 (9) ◽

Cited By ~ 1

Author(s):

Rajwinder Kaur

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Sentiment Analysis ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Selection Methods


Comparative study on total nitrogen prediction in wastewater treatment plant and effect of various feature selection methods on machine learning algorithms performance

Journal of Water Process Engineering ◽

10.1016/j.jwpe.2021.102033 ◽

2021 ◽

Vol 41 ◽

pp. 102033

Author(s):

Faramarz Bagherzadeh ◽

Mohamad-Javad Mehrani ◽

Milad Basirifard ◽

Javad Roostaei

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Wastewater Treatment ◽

Comparative Study ◽

Total Nitrogen ◽

Wastewater Treatment Plant ◽

Learning Algorithms ◽

Treatment Plant ◽

Machine Learning Algorithms ◽

Selection Methods


Techniques for Detecting Malware Traffic: A Comprehensive Approach to Feature Selection and Classification

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.39088 ◽

2021 ◽

Vol 9 (12) ◽

pp. 1-10

Author(s):

Harsha A K

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Learning Algorithms ◽

Malware Detection ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Steady Increase ◽

Extreme Gradient Boosting

Abstract: Since the advent of encryption, there has been a steady increase in malware transmitted over encrypted networks. Traditional detection approaches such as packet content analysis are inefficient when dealing with encrypted data. In the absence of actual packet contents, other features such as packet size, arrival time, source and destination addresses, and similar metadata can be used to detect malware. Such information can be used to train machine learning classifiers to distinguish malicious packets from benign ones. In this paper, we offer an efficient malware detection approach using machine learning classification algorithms: support vector machine, random forest, and extreme gradient boosting. We employ an extensive feature selection process to reduce the dimensionality of the chosen dataset. The dataset is then split into training and testing sets, the algorithms are trained on the training set, and the resulting models are evaluated against the testing set to assess their respective performance. We further tune the hyperparameters of the algorithms to achieve better results. The random forest and extreme gradient boosting algorithms performed exceptionally well in our experiments, with area under the curve values of 0.9928 and 0.9998, respectively. Our work demonstrates that malware traffic can be effectively classified using conventional machine learning algorithms and shows the importance of dimensionality reduction in such classification problems. Keywords: Malware Detection, Extreme Gradient Boosting, Random Forest, Feature Selection.
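The split-then-evaluate procedure in this abstract can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not the paper's implementation: `train_test_split` and `roc_auc` are hypothetical helper names, the 25% test fraction is arbitrary, and the scores are made-up classifier outputs rather than results from any of the models named above. The area under the ROC curve is computed from its rank interpretation: the probability that a randomly chosen malicious sample is scored higher than a randomly chosen benign one, with ties counted as one half.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of `rows` and return (train, test) lists (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = malicious, 0 = benign)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: four test samples with classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 3 of 4 pairs are ranked correctly -> 0.75

train, test = train_test_split(list(range(8)))
print(len(train), len(test))  # 6 2
```

The pairwise count is quadratic in the number of samples, which is fine for a sketch; a sort-based rank computation would be preferable on a dataset of realistic size.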
