scholarly journals Hunting Android Malware Using Multimodal Deep Learning and Hybrid Analysis Data

2021 ◽  
Author(s):  
Angelo Schranko Oliveira ◽  
Renato José Sassi

In this work, we propose a new multimodal Deep Learning (DL) Android malware detection method, Chimera, that combines both manual and automatic feature engineering by using the DL architectures, Convolutional Neural Networks (CNN), Deep Neural Networks (DNN), and Transformer Networks (TN) to perform feature learning from raw data (Dalvik Executables (DEX)), static analysis data (Android Intents & Permissions), and dynamic analysis data (system call sequences) respectively. To train and evaluate our model, we implemented the Knowledge Discovery in Databases (KDD) process and used the publicly available Android benchmark dataset Omnidroid. By leveraging a hybrid source of information to learn high-level feature representations for both the static and dynamic properties of Android applications, Chimera’s detection Accuracy, Precision, and Recall outperform classical Machine Learning (ML) algorithms, state-of-the-art Ensemble, and Voting Ensembles ML methods, as well as unimodal DL methods using CNNs, DNNs, TNs, and Long-Short Term Memory Networks (LSTM). To the best of our knowledge, this is the first work that successfully applies multimodal DL to combine those three different modalities of data using DNNs, CNNs, and TNs to learn a shared representation that can be used in Android malware detection tasks.

2020 ◽  
Author(s):  
Angelo Schranko de Oliveira ◽  
Renato José Sassi

<div>The Android Operating System (OS) everywhere, computers, cars, homes, and, of course, personal and corporate smartphones. A recent survey from the International Data Corporation (IDC) reveals that the Android platform holds 85% of the smartphone market share. Its popularity and open nature make it an attractive target for malware. According to AV-TEST, by November 2020, 2.87M new Android malware instances were identified in the wild. Malware detection is a challenging problem that has been actively explored by both the industry and academia using intelligent methods. On the one hand, traditional machine learning (ML) malware detection methods rely on manual feature engineering that requires expert knowledge. On the other hand, deep learning (DL) malware detection methods perform automatic feature extraction but usually require much more data and processing power. In this work, we propose a new multimodal DL Android malware detection method, Chimera, that combines both manual and automatic feature engineering by using the DL architectures, Convolutional Neural Networks (CNN), Deep Neural Networks (DNN), and Transformer Networks (TN) to perform feature learning from raw data (Dalvik Executable (DEX) grayscale images), static analysis data (Android Intents & Permissions), and dynamic analysis data (system call sequences) respectively. To train and evaluate our model, we implemented the Knowledge Discovery in Databases (KDD) process and used the publicly available Android benchmark dataset Omnidroid, which contains static and dynamic analysis data extracted from 22,000 real malware and goodware samples. By leveraging a hybrid source of information to learn high-level feature representations for both the static and dynamic properties of Android applications, Chimera’s detection Accuracy, Precision, Recall, and ROC AUC outperform classical ML algorithms, state-of-the-art Ensemble, and Voting Ensembles ML methods, as well as unimodal DL methods using CNNs, DNNs, TNs, and Long-Short Term Memory Networks (LSTM). To the best of our knowledge, this is the first work that successfully applies multimodal DL to combine those three different modalities of data using DNNs, CNNs, and TNs to learn a shared representation that can be used in Android malware detection tasks.</div>


2020 ◽  
Author(s):  
Angelo Schranko de Oliveira ◽  
Renato José Sassi

<div>The Android Operating System (OS) everywhere, computers, cars, homes, and, of course, personal and corporate smartphones. A recent survey from the International Data Corporation (IDC) reveals that the Android platform holds 85% of the smartphone market share. Its popularity and open nature make it an attractive target for malware. According to AV-TEST, by November 2020, 2.87M new Android malware instances were identified in the wild. Malware detection is a challenging problem that has been actively explored by both the industry and academia using intelligent methods. On the one hand, traditional machine learning (ML) malware detection methods rely on manual feature engineering that requires expert knowledge. On the other hand, deep learning (DL) malware detection methods perform automatic feature extraction but usually require much more data and processing power. In this work, we propose a new multimodal DL Android malware detection method, Chimera, that combines both manual and automatic feature engineering by using the DL architectures, Convolutional Neural Networks (CNN), Deep Neural Networks (DNN), and Transformer Networks (TN) to perform feature learning from raw data (Dalvik Executable (DEX) grayscale images), static analysis data (Android Intents & Permissions), and dynamic analysis data (system call sequences) respectively. To train and evaluate our model, we implemented the Knowledge Discovery in Databases (KDD) process and used the publicly available Android benchmark dataset Omnidroid, which contains static and dynamic analysis data extracted from 22,000 real malware and goodware samples. By leveraging a hybrid source of information to learn high-level feature representations for both the static and dynamic properties of Android applications, Chimera’s detection Accuracy, Precision, Recall, and ROC AUC outperform classical ML algorithms, state-of-the-art Ensemble, and Voting Ensembles ML methods, as well as unimodal DL methods using CNNs, DNNs, TNs, and Long-Short Term Memory Networks (LSTM). To the best of our knowledge, this is the first work that successfully applies multimodal DL to combine those three different modalities of data using DNNs, CNNs, and TNs to learn a shared representation that can be used in Android malware detection tasks.</div>


Author(s):  
Jarrett Booz ◽  
Josh McGiff ◽  
William G. Hatcher ◽  
Wei Yu ◽  
James Nguyen ◽  
...  

In this article, the authors implement a deep learning environment and fine-tune parameters to determine the optimal settings for the classification of Android malware from extracted permission data. By determining the optimal settings, the authors demonstrate the potential performance of a deep learning environment for Android malware detection. Specifically, an extensive study is conducted on various hyper-parameters to determine optimal configurations, and then a performance evaluation is carried out on those configurations to compare and maximize detection accuracy in our target networks. The results achieve a detection accuracy of approximately 95%, with an approximate F1 score of 93%. In addition, the evaluation is extended to include other machine learning frameworks, specifically comparing Microsoft Cognitive Toolkit (CNTK) and Theano with TensorFlow. The future needs are discussed in the realm of machine learning for mobile malware detection, including adversarial training, scalability, and the evaluation of additional data and features.


2020 ◽  
Vol 8 (5) ◽  
pp. 3292-3296

Android is susceptible to malware attacks due to its open architecture, large user base and access to its code. Mobile or android malware attacks are increasing from last year. These are common threats for every internet-accessible device. From Researchers Point of view 50% increase in cyber-attacks targeting Android Mobile phones since last year. Malware attackers increasingly turning their attention to attacking smartphones with credential-theft, surveillance, and malicious advertising. Security investigation in the android mobile system has relied on analysis for malware or threat detection using binary samples or system calls with behavior profile for malicious applications is generated and then analyzed. The resulting report is then used to detect android application malware or threats using manual features. To dispose of malicious applications in the mobile device, we propose an Android malware detection system using deep learning techniques which gives security for mobile or android. FNN(Fully-connected FeedForward Deep Neural Networks) and AutoEncoder algorithm from deep learning provide Extensive experiments on a real-world dataset that reaches to an accuracy of 95 %. These papers explain Deep learning FNN(Fully-connected FeedForward Deep Neural Networks) and AutoEncoder approach for android malware detection.


2019 ◽  
Vol 7 (4) ◽  
pp. 1-24 ◽  
Author(s):  
Jarrett Booz ◽  
Josh McGiff ◽  
William G. Hatcher ◽  
Wei Yu ◽  
James Nguyen ◽  
...  

In this article, the authors implement a deep learning environment and fine-tune parameters to determine the optimal settings for the classification of Android malware from extracted permission data. By determining the optimal settings, the authors demonstrate the potential performance of a deep learning environment for Android malware detection. Specifically, an extensive study is conducted on various hyper-parameters to determine optimal configurations, and then a performance evaluation is carried out on those configurations to compare and maximize detection accuracy in our target networks. The results achieve a detection accuracy of approximately 95%, with an approximate F1 score of 93%. In addition, the evaluation is extended to include other machine learning frameworks, specifically comparing Microsoft Cognitive Toolkit (CNTK) and Theano with TensorFlow. The future needs are discussed in the realm of machine learning for mobile malware detection, including adversarial training, scalability, and the evaluation of additional data and features.


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Tianliang Lu ◽  
Yanhui Du ◽  
Li Ouyang ◽  
Qiuyu Chen ◽  
Xirui Wang

In recent years, the number of malware on the Android platform has been increasing, and with the widespread use of code obfuscation technology, the accuracy of antivirus software and traditional detection algorithms is low. Current state-of-the-art research shows that researchers started applying deep learning methods for malware detection. We proposed an Android malware detection algorithm based on a hybrid deep learning model which combines deep belief network (DBN) and gate recurrent unit (GRU). First of all, analyze the Android malware; in addition to extracting static features, dynamic behavioral features with strong antiobfuscation ability are also extracted. Then, build a hybrid deep learning model for Android malware detection. Because the static features are relatively independent, the DBN is used to process the static features. Because the dynamic features have temporal correlation, the GRU is used to process the dynamic feature sequence. Finally, the training results of DBN and GRU are input into the BP neural network, and the final classification results are output. Experimental results show that, compared with the traditional machine learning algorithms, the Android malware detection model based on hybrid deep learning algorithms has a higher detection accuracy, and it also has a better detection effect on obfuscated malware.


2020 ◽  
Vol 14 ◽  
Author(s):  
Meghna Dhalaria ◽  
Ekta Gandotra

Purpose: This paper provides the basics of Android malware, its evolution and tools and techniques for malware analysis. Its main aim is to present a review of the literature on Android malware detection using machine learning and deep learning and identify the research gaps. It provides the insights obtained through literature and future research directions which could help researchers to come up with robust and accurate techniques for classification of Android malware. Design/Methodology/Approach: This paper provides a review of the basics of Android malware, its evolution timeline and detection techniques. It includes the tools and techniques for analyzing the Android malware statically and dynamically for extracting features and finally classifying these using machine learning and deep learning algorithms. Findings: The number of Android users is expanding very fast due to the popularity of Android devices. As a result, there are more risks to Android users due to the exponential growth of Android malware. On-going research aims to overcome the constraints of earlier approaches for malware detection. As the evolving malware are complex and sophisticated, earlier approaches like signature based and machine learning based are not able to identify these timely and accurately. The findings from the review shows various limitations of earlier techniques i.e. requires more detection time, high false positive and false negative rate, low accuracy in detecting sophisticated malware and less flexible. Originality/value: This paper provides a systematic and comprehensive review on the tools and techniques being employed for analysis, classification and identification of Android malicious applications. It includes the timeline of Android malware evolution, tools and techniques for analyzing these statically and dynamically for the purpose of extracting features and finally using these features for their detection and classification using machine learning and deep learning algorithms. On the basis of the detailed literature review, various research gaps are listed. The paper also provides future research directions and insights which could help researchers to come up with innovative and robust techniques for detecting and classifying the Android malware.


Sign in / Sign up

Export Citation Format

Share Document