Detecting Android Malware and Classifying Its Families in Large-scale Datasets

ACM Transactions on Management Information Systems ◽

10.1145/3464323 ◽

2022 ◽

Vol 13 (2) ◽

pp. 1-21

Author(s):

Bo Sun ◽

Takeshi Takahashi ◽

Tao Ban ◽

Daisuke Inoue

Keyword(s):

Large Scale ◽

Computation Time ◽

Learning Technology ◽

Android Malware ◽

Android Malware Detection ◽

Analysis Process ◽

Novel Approach ◽

Classification Evaluation ◽

Family Classification ◽

F Measure

To relieve the burden on security analysts, Android malware detection and family classification need to be automated. Many previous works apply machine (or deep) learning to these two problems, but as the number of mobile applications has grown in recent years, developing a scalable and precise solution has become a new challenge for the security field. Accordingly, in this article, we propose a novel approach that not only improves the performance of both Android malware detection and family classification, but also reduces the running time of the analysis process. Using large-scale datasets obtained from different sources, we demonstrate that our method achieves a high F-measure of 99.71% with a low false positive rate (FPR) of 0.37%. Meanwhile, the computation time for processing a 300K dataset is reduced to nearly 3.3 hours. In addition, in the classification evaluation, we demonstrate that the F-measure, precision, and recall are 97.5%, 96.55%, and 98.64%, respectively, when classifying 28 malware families. Finally, we compare our method with previous studies in both detection and classification evaluation and observe that it performs better in terms of both effectiveness and efficiency.
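
The metrics quoted in this abstract can be related with a short calculation. The sketch below is illustrative only: it assumes the F-measure is the usual harmonic mean of precision and recall (F1), which the abstract does not state, and the confusion-matrix counts used for the FPR are hypothetical values chosen to reproduce 0.37%. The harmonic mean of the quoted precision and recall comes out slightly above the reported 97.5%, which would be consistent with per-family averaging, though that is also an assumption.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Share of benign samples that are wrongly flagged as malware."""
    return false_positives / (false_positives + true_negatives)


# Family-classification precision and recall quoted in the abstract.
f1 = f_measure(0.9655, 0.9864)
print(f"F-measure from quoted precision/recall: {f1:.4f}")

# Hypothetical counts, chosen only to illustrate an FPR of 0.37%.
fpr = false_positive_rate(false_positives=37, true_negatives=9963)
print(f"FPR: {fpr:.4f}")
```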

Enhanced Android Malware Detection and Family Classification, using Conversation-level Network Traffic Features

The International Arab Journal of Information Technology ◽

10.34028/iajit/17/4a/4 ◽

2020 ◽

Vol 17 (4A) ◽

pp. 607-614

Author(s):

Mohammad Abuthawabeh ◽

Khaled Mahmoud

Keyword(s):

Real World ◽

Network Traffic ◽

Malware Detection ◽

The Other ◽

Android Malware ◽

Detection Algorithms ◽

Android Malware Detection ◽

Learning Technique ◽

Massive Number ◽

Family Classification

Signature-based malware detection algorithms struggle to cope with the massive number of threats in the Android environment. In this paper, conversation-level network traffic features are extracted and used in a supervised learning model. The model is used to improve Android malware detection, categorization, and family classification. It employs an ensemble learning technique to select the most useful of the extracted features. A real-world dataset called CICAndMal2017 was used. The results show that the Extra-trees classifier achieved the highest weighted accuracy among the classifiers tested: 87.75%, 79.97%, and 66.71% for malware detection, malware categorization, and malware family classification, respectively. A comparison was made with another study that uses the same dataset. This study achieved a significant improvement in malware family classification and malware categorization. For malware family classification, the improvement was 39.71% for precision and 41.09% for recall. For Android malware categorization, the improvement was 30.2% and 31.14% for precision and recall, respectively.
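
As a rough illustration of what conversation-level features might look like, the sketch below groups packets into conversations and summarises each one. Everything here is an assumption for illustration: the abstract does not define a conversation or list its features, so the `Packet` type, the grouping by unordered endpoint pair, and the four summary statistics are hypothetical and are not taken from the paper or from CICAndMal2017.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    timestamp: float  # seconds since capture start
    src: str          # "ip:port" of the sender
    dst: str          # "ip:port" of the receiver
    size: int         # packet size in bytes


def conversation_features(packets):
    """Group packets by unordered endpoint pair and summarise each group."""
    groups = defaultdict(list)
    for p in packets:
        # Both directions of an exchange map to the same key.
        key = tuple(sorted((p.src, p.dst)))
        groups[key].append(p)

    features = {}
    for key, pkts in groups.items():
        times = [p.timestamp for p in pkts]
        first_endpoint = key[0]
        features[key] = {
            "packets": len(pkts),
            "bytes": sum(p.size for p in pkts),
            "duration": max(times) - min(times),
            "bytes_from_first": sum(p.size for p in pkts if p.src == first_endpoint),
        }
    return features


# Hypothetical traffic: one two-way exchange plus one unrelated packet.
packets = [
    Packet(0.0, "10.0.0.2:5000", "93.184.216.34:443", 120),
    Packet(0.4, "93.184.216.34:443", "10.0.0.2:5000", 1500),
    Packet(1.0, "10.0.0.2:5000", "93.184.216.34:443", 80),
    Packet(2.0, "10.0.0.2:5001", "198.51.100.7:80", 60),
]
features = conversation_features(packets)
for key, row in features.items():
    print(key, row)
```

Each resulting row could then serve as one input vector for a supervised classifier such as the Extra-trees model the abstract mentions.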

A Hybrid Approach for Android Malware Detection and Family Classification

International Journal of Interactive Multimedia and Artificial Intelligence ◽

10.9781/ijimai.2020.09.001 ◽

2020 ◽

Vol In Press (In Press) ◽

pp. 1

Author(s):

Meghna Dhalaria ◽

Ekta Gandotra

Keyword(s):

Hybrid Approach ◽

Malware Detection ◽

Android Malware ◽

Android Malware Detection ◽

Family Classification

AdvAndMal: Adversarial Training for Android Malware Detection and Family Classification

Symmetry ◽

10.3390/sym13061081 ◽

2021 ◽

Vol 13 (6) ◽

pp. 1081

Author(s):

Chenyue Wang ◽

Linlin Zhang ◽

Kai Zhao ◽

Xuhui Ding ◽

Xusheng Wang

Keyword(s):

Malware Detection ◽

Generative Adversarial Network ◽

Android Malware ◽

Adversarial Network ◽

Android Malware Detection ◽

System Calls ◽

Pros And Cons ◽

Adversarial Training ◽

Family Classification ◽

Detection Technologies

In recent years, Android malware has continued to evolve against detection technologies, becoming more concealed and harmful and making it difficult for existing models to resist adversarial sample attacks. At the current stage, the detection result is no longer the only criterion for evaluating a model and its algorithms; it is also vital to consider the model's defensive ability against adversarial samples. In this study, we propose a general framework named AdvAndMal, which consists of a two-layer network for adversarial training that generates adversarial samples and improves the effectiveness of classifiers in Android malware detection and family classification. The adversarial sample generation layer is composed of a conditional generative adversarial network called pix2pix, which can generate malware variants to extend the classifiers' training set, and the malware classification layer is trained on RGB images visualized from sequences of system calls. To evaluate the adversarial training effect of the framework, we propose the robustness coefficient, a symmetric interval i = [−1, 1], and conduct controlled experiments on the dataset to measure the robustness of the overall framework under adversarial training. Experimental results on the 12 families with the largest number of samples in the Drebin dataset show that the accuracy of the overall framework increases from 0.976 to 0.989 and its robustness coefficient increases from 0.857 to 0.917, which demonstrates the effectiveness of the adversarial training method.

Extensible Android Malware Detection and Family Classification Using Network-Flows and API-Calls

2019 International Carnahan Conference on Security Technology (ICCST) ◽

10.1109/ccst.2019.8888430 ◽

2019 ◽

Cited By ~ 8

Author(s):

Laya Taheri ◽

Andi Fitriah Abdul Kadir ◽

Arash Habibi Lashkari

Keyword(s):

Network Flows ◽

Malware Detection ◽

Android Malware ◽

Android Malware Detection ◽

Family Classification

A Novel Approach for Android Malware Detection and Classification using Convolutional Neural Networks

Proceedings of the 15th International Conference on Software Technologies ◽

10.5220/0009822906060614 ◽

2020 ◽

Author(s):

Ahmed Lekssays ◽

Bouchaib Falah ◽

Sameer Abufardeh

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Malware Detection ◽

Android Malware ◽

Android Malware Detection ◽

Novel Approach ◽

Malware Detection And Classification

Out-of-sample Node Representation Learning for Heterogeneous Graph in Real-time Android Malware Detection

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/576 ◽

2019 ◽

Cited By ~ 6

Author(s):

Yanfang Ye ◽

Shifu Hou ◽

Lingwei Chen ◽

Jingwei Lei ◽

Wenqiang Wan ◽

...

Keyword(s):

Real Time ◽

Large Scale ◽

Malware Detection ◽

Application Programming Interface ◽

Representation Learning ◽

Android Malware ◽

Android Apps ◽

Android Malware Detection ◽

Out Of Sample ◽

Application Programming

Increasingly sophisticated Android malware calls for new defensive techniques that can protect mobile users against novel threats. In this paper, we first extract the runtime Application Programming Interface (API) call sequences from Android apps, and then analyze higher-level semantic relations within the ecosystem to comprehensively characterize the apps. To model different types of entities (i.e., app, API, device, signature, affiliation) and the rich relations among them, we present a structured heterogeneous graph (HG). To efficiently classify nodes (e.g., apps) in the constructed HG, we propose the HG-Learning method to first obtain in-sample node embeddings and then learn representations of out-of-sample nodes without rerunning/adjusting HG embeddings at the first attempt. We then design a deep neural network classifier that takes the learned HG representations as inputs for real-time Android malware detection. Comprehensive experiments on large-scale, real sample collections from Tencent Security Lab are performed to compare various baselines. Promising results demonstrate that our developed system, AiDroid, which integrates our proposed method, outperforms others in real-time Android malware detection.
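
A heterogeneous graph over the five entity types named in this abstract can be sketched as a typed adjacency structure. The sketch is a loose illustration under stated assumptions: the `HeteroGraph` class, the relation names, and the sample node ids are hypothetical, and `neighbor_mean` is a deliberately simple stand-in (averaging the embeddings of already-embedded neighbours) for placing an out-of-sample node. It is not the HG-Learning method, whose details the abstract does not give.

```python
from collections import defaultdict

NODE_TYPES = {"app", "api", "device", "signature", "affiliation"}


class HeteroGraph:
    """Undirected graph whose nodes carry a type and whose edges carry a relation."""

    def __init__(self):
        self.node_type = {}            # node id -> node type
        self.edges = defaultdict(set)  # node id -> {(relation, neighbour id)}

    def add_node(self, node_id, node_type):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.node_type[node_id] = node_type

    def add_edge(self, src, relation, dst):
        for node in (src, dst):
            if node not in self.node_type:
                raise KeyError(f"unknown node: {node}")
        self.edges[src].add((relation, dst))
        self.edges[dst].add((relation, src))

    def neighbors(self, node_id, node_type=None):
        """Sorted neighbour ids, optionally restricted to one node type."""
        return sorted(
            n for _, n in self.edges[node_id]
            if node_type is None or self.node_type[n] == node_type
        )


def neighbor_mean(graph, node_id, embeddings):
    """Stand-in for out-of-sample embedding: average the embedded neighbours."""
    known = [embeddings[n] for n in graph.neighbors(node_id) if n in embeddings]
    if not known:
        raise ValueError("no embedded neighbours")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Hypothetical graph: a new app linked to an API and a signature seen in training.
graph = HeteroGraph()
graph.add_node("app_new", "app")
graph.add_node("api_send_sms", "api")
graph.add_node("sig_1", "signature")
graph.add_edge("app_new", "invokes", "api_send_sms")
graph.add_edge("app_new", "signed_by", "sig_1")

in_sample = {"api_send_sms": [1.0, 0.0], "sig_1": [0.0, 1.0]}
new_embedding = neighbor_mean(graph, "app_new", in_sample)
print(new_embedding)
```

The point of the stand-in is only that the new node gets a vector from existing embeddings without recomputing them, which is the property the abstract highlights.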

Double Precision Is Not Needed for Many-Body Calculations: New Conventional Wisdom

10.26434/chemrxiv.6104804.v1 ◽

2018 ◽

Author(s):

Pavel Pokhilko ◽

Evgeny Epifanovsky ◽

Anna I. Krylov

Keyword(s):

Large Scale ◽

Computation Time ◽

Coupled Cluster ◽

Double Precision ◽

Many Body ◽

Single Precision ◽

Parallel Performance ◽

Point Representation ◽

Electron Repulsion Integrals ◽

Cluster Methods

Using single precision floating point representation reduces the size of data and computation time by a factor of two relative to double precision conventionally used in electronic structure programs. For large-scale calculations, such as those encountered in many-body theories, reduced memory footprint alleviates memory and input/output bottlenecks. Reduced size of data can lead to additional gains due to improved parallel performance on CPUs and various accelerators. However, using single precision can potentially reduce the accuracy of computed observables. Here we report an implementation of coupled-cluster and equation-of-motion coupled-cluster methods with single and double excitations in single precision. We consider both standard implementation and one using Cholesky decomposition or resolution-of-the-identity of electron-repulsion integrals. Numerical tests illustrate that when single precision is used in correlated calculations, the loss of accuracy is insignificant and pure single-precision implementation can be used for computing energies, analytic gradients, excited states, and molecular properties. In addition to pure single-precision calculations, our implementation allows one to follow a single-precision calculation by clean-up iterations, fully recovering double-precision results while retaining significant savings.
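
The factor-of-two storage saving and the size of the rounding error can be checked with a few lines of standard-library Python. This is a small sketch, not the authors' implementation: the sample values are arbitrary, and the `to_single` helper is a hypothetical name that rounds a double to the nearest IEEE 754 single-precision value via `struct`.

```python
import struct


def to_single(x: float) -> float:
    """Round a double-precision Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Arbitrary sample data standing in for stored numerical quantities.
values = [0.1 * k for k in range(1, 1001)]

# Storage: 4 bytes per value in single precision, 8 bytes in double precision.
single_bytes = len(values) * struct.calcsize("<f")
double_bytes = len(values) * struct.calcsize("<d")
print(f"single: {single_bytes} bytes, double: {double_bytes} bytes")

# Accuracy: largest relative error introduced by rounding each value once.
worst = max(abs(to_single(v) - v) / abs(v) for v in values)
print(f"largest relative rounding error: {worst:.2e}")
```

A single rounding step keeps the relative error below 2^-24 (about 6e-8) for values in the normal range. Errors accumulated over a long calculation can be larger, which is the concern the numerical tests in the abstract address.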

Green synthesis, characterization of silver sulfide nanoparticles and antibacterial activity evaluation

10.31221/osf.io/8byuc ◽

2019 ◽

Author(s):

Chem Int

Keyword(s):

Antibacterial Activity ◽

Large Scale ◽

Research Work ◽

Silver Sulfide ◽

Morphological Properties ◽

Effective Diameter ◽

Novel Approach ◽

Transmission Electron ◽

Green Route ◽

Silver Sulfide Nanoparticles

This research work presents a facile and green route for synthesizing silver sulfide nanoparticles (Ag2S NPs) from silver nitrate (AgNO3) and sodium sulfide nonahydrate (Na2S·9H2O) in the presence of an aqueous extract of rosemary leaves at ambient temperature (27 °C). The structural and morphological properties of the Ag2S NPs were analyzed by X-ray diffraction (XRD) and transmission electron microscopy (TEM). The surface plasmon resonance of the Ag2S NPs was observed at around 355 nm. The Ag2S NPs were spherical, with an effective diameter of 14 nm. Our novel approach represents a promising and effective method for the large-scale synthesis of eco-friendly silver sulfide nanoparticles with antibacterial activity.

Android Malware Detection Techniques: A Literature Review

Recent Patents on Engineering ◽

10.2174/1872212114999200710143847 ◽

2020 ◽

Vol 14 ◽

Author(s):

Meghna Dhalaria ◽

Ekta Gandotra

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Malware Detection ◽

Future Research ◽

Android Malware ◽

Detection Techniques ◽

Android Malware Detection ◽

Future Research Directions ◽

To Come ◽

Tools And Techniques

Purpose: This paper covers the basics of Android malware, its evolution, and the tools and techniques for malware analysis. Its main aim is to review the literature on Android malware detection using machine learning and deep learning and to identify research gaps. It presents the insights obtained from the literature and future research directions that could help researchers develop robust and accurate techniques for classifying Android malware. Design/Methodology/Approach: This paper reviews the basics of Android malware, its evolution timeline, and detection techniques. It covers the tools and techniques for analyzing Android malware statically and dynamically to extract features, and for classifying the samples using machine learning and deep learning algorithms. Findings: The number of Android users is growing rapidly because of the popularity of Android devices. As a result, Android users face more risk from the exponential growth of Android malware. Ongoing research aims to overcome the constraints of earlier approaches to malware detection. Because evolving malware is complex and sophisticated, earlier signature-based and machine-learning-based approaches cannot identify it in a timely and accurate manner. The findings of the review show various limitations of earlier techniques: long detection times, high false positive and false negative rates, low accuracy in detecting sophisticated malware, and limited flexibility. Originality/value: This paper provides a systematic and comprehensive review of the tools and techniques employed for the analysis, classification, and identification of malicious Android applications. It includes the timeline of Android malware evolution, the tools and techniques for static and dynamic analysis used to extract features, and the use of these features for detection and classification with machine learning and deep learning algorithms. On the basis of the detailed literature review, various research gaps are listed. The paper also provides future research directions and insights that could help researchers develop innovative and robust techniques for detecting and classifying Android malware.

A Two-Layered Permission-Based Android Malware Detection Scheme

2014 2nd IEEE International Conference on Mobile Cloud Computing, Services, and Engineering ◽

10.1109/mobilecloud.2014.22 ◽

2014 ◽

Cited By ~ 32

Author(s):

Xing Liu ◽

Jiqiang Liu

Keyword(s):

Malware Detection ◽

Detection Scheme ◽

Android Malware ◽

Android Malware Detection
