Less is More: A privacy-respecting Android malware classifier using federated learning

Abstract: In this paper we present LiM (‘Less is More’), a malware classification framework that leverages Federated Learning to detect and classify malicious apps in a privacy-respecting manner. Information about newly installed apps is kept locally on users’ devices, so that the provider cannot infer which apps were installed by users. At the same time, input from all users is taken into account in the federated learning process, and all users benefit from better classification performance. A key challenge of this setting is that users do not have access to the ground truth (i.e. they cannot correctly identify whether an app is malicious). To tackle this, LiM uses a safe semi-supervised ensemble that maximizes classification accuracy with respect to a baseline classifier trained by the service provider (i.e. the cloud). We implement LiM and show that the cloud server achieves an F1 score of 95%, while clients have perfect recall with only 1 false positive in more than 100 apps, using a dataset of 25K clean apps and 25K malicious apps, 200 users and 50 rounds of federation. Furthermore, we conduct a security analysis and demonstrate that LiM is robust against both poisoning attacks by adversaries who control half of the clients, and inference attacks performed by an honest-but-curious cloud server. Further experiments with MaMaDroid’s dataset confirm resistance against poisoning attacks and a performance improvement due to the federation.
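The abstract does not say how LiM combines client input, so the following is only a generic illustration of one federation round, not LiM's safe semi-supervised ensemble. It assumes a FedAvg-style weighted average of client model parameters; the function name `federated_average` and the use of local sample counts as weights are hypothetical choices made for this sketch. The point it shows is that only model parameters, never the list of installed apps, leave the device.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine client parameter vectors into one global vector.

    Each client is weighted by the number of local samples it trained on
    (an assumption of this sketch, not something stated in the abstract).
    """
    total = sum(client_sizes)
    if total == 0:
        raise ValueError("at least one client must report local samples")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must send vectors of equal length")
        for k in range(dim):
            # Accumulate each client's share of the global parameter.
            averaged[k] += weights[k] * size / total
    return averaged


# Two hypothetical clients: the second trained on three times as many apps.
global_model = federated_average([[1.0, 0.0], [3.0, 4.0]], [1, 3])
```

In a full system this step would run once per round on the server, and the averaged vector would be sent back to the clients as the starting point for the next round.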


Identifying Android Malware Using Network-Based Approaches

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019911 ◽

2019 ◽

Vol 33 ◽

pp. 9911-9912

Author(s):

Emily Alfs ◽

Doina Caragea ◽

Nathan Albin ◽

Pietro Poggi-Corradini

Keyword(s):

Ground Truth ◽

Label Propagation ◽

Android Malware ◽

Feature Vectors ◽

Android Apps ◽

Malicious Behavior ◽

Binary Feature ◽

Significant Damage ◽

Robust Techniques ◽

Malicious Apps

The proliferation of Android apps has resulted in many malicious apps entering the market and causing significant damage. Robust techniques that determine whether an app is malicious are greatly needed. We propose the use of a network-based approach to effectively separate malicious from benign apps, based on a small labeled dataset. The apps in our dataset come from the Google Play Store and have been scanned for malicious behavior using VirusTotal to produce a ground truth dataset with the labels malicious or benign. The apps in the resulting dataset have been represented using binary feature vectors (where the features represent permissions, intent actions, discriminative APIs, obfuscation signatures, and native code signatures). We have used the feature vectors corresponding to apps to build a weighted network that captures the “closeness” between apps. We propagate labels from the labeled apps to unlabeled apps, and evaluate the effectiveness of the proposed approach using the F1-measure. We have conducted experiments to compare three variants of the label propagation approach on datasets that include increasingly larger amounts of labeled data. The results have shown that a variant proposed in this study gives the best results overall.
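The pipeline described in this abstract (binary feature vectors, a weighted closeness network, labels spread from labeled to unlabeled apps) can be sketched roughly as follows. The abstract does not specify the edge weights or which propagation variant performed best, so this sketch assumes Jaccard similarity as the weight and a basic iterative scheme with clamped seed labels; the names `jaccard` and `propagate_labels` are hypothetical.

```python
from typing import Dict, List


def jaccard(a: List[int], b: List[int]) -> float:
    """Jaccard similarity between two binary feature vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def propagate_labels(
    features: List[List[int]],
    labels: Dict[int, int],  # app index -> 1 (malicious) or 0 (benign)
    iterations: int = 50,
) -> List[float]:
    """Return a maliciousness score in [0, 1] for every app."""
    n = len(features)
    # Weighted network: edge weight is the closeness of two apps.
    weights = [
        [jaccard(features[i], features[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Unlabeled apps start undecided at 0.5.
    scores = [float(labels[i]) if i in labels else 0.5 for i in range(n)]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in labels:
                updated.append(float(labels[i]))  # keep known labels fixed
                continue
            total = sum(weights[i])
            if total == 0.0:
                updated.append(scores[i])  # isolated app: nothing to learn from
            else:
                # Weighted average of the neighbours' current scores.
                updated.append(
                    sum(w * s for w, s in zip(weights[i], scores)) / total
                )
        scores = updated
    return scores


# Four toy apps; only the first (malicious) and last (benign) are labeled.
toy_features = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
toy_scores = propagate_labels(toy_features, {0: 1, 3: 0})
```

Thresholding the scores at 0.5 gives a predicted label for each unlabeled app, which can then be compared with held-out ground truth to compute the F1-measure.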


PACER: Platform for Android Malware Classification, Performance Evaluation and Threat Reporting

Future Internet ◽

10.3390/fi12040066 ◽

2020 ◽

Vol 12 (4) ◽

pp. 66 ◽

Cited By ~ 1

Author(s):

Ajit Kumar ◽

Vinti Agarwal ◽

Shishir Kumar Shandilya ◽

Andrii Shalaginov ◽

Saket Upadhyay ◽

...

Keyword(s):

Machine Learning ◽

Performance Evaluation ◽

Research Output ◽

Application Programming Interface ◽

Classification Performance ◽

End Users ◽

Android Malware ◽

Detection Techniques ◽

Malware Classification ◽

Threat Reporting

Android malware has become the top threat to the ubiquitous and useful Android ecosystem. Multiple solutions leveraging big data and machine-learning capabilities to detect Android malware are constantly being developed. Too often, these solutions are either limited to research output or remain isolated and incapable of reaching end users or malware researchers. An earlier work named PACE (Platform for Android Malware Classification and Performance Evaluation) was introduced as a unified solution that offers open and easy implementation access to several machine-learning-based Android malware detection techniques, making most of the research in this domain reproducible. The benefits of PACE are offered through three interfaces: a Representational State Transfer (REST) Application Programming Interface (API), a Web Interface, and an Android Debug Bridge (ADB) interface. These multiple interfaces enable users with different expertise, such as IT administrators, security practitioners, and malware researchers, to use the offered services. In this paper, we propose PACER (Platform for Android Malware Classification, Performance Evaluation, and Threat Reporting), which extends PACE by adding threat intelligence and reporting functionality for the end-user device through the ADB interface. A prototype of the proposed platform is introduced, and our vision is that it will help malware analysts and end users to tackle challenges and reduce the amount of manual work.


Provably Secure Authentication Approach for Data Security in Cloud Using Hashing, Encryption, and Chebyshev-Based Authentication

International Journal of Information Security and Privacy ◽

10.4018/ijisp.2022010106 ◽

2022 ◽

Vol 16 (1) ◽

pp. 0-0

Keyword(s):

Data Security ◽

Security Analysis ◽

Data Communication ◽

Computation Time ◽

Computational Time ◽

Security Issues ◽

Authentication Mechanism ◽

Cloud Server ◽

Secure Authentication ◽

Provably Secure

A secure and efficient authentication mechanism has become a major concern in cloud computing because data is shared between the cloud server and the user over the internet. This paper proposes an efficient Hashing, Encryption and Chebyshev (HEC)-based authentication approach to secure data communication. Formal and informal security analysis demonstrates that the proposed HEC-based authentication approach provides data security more efficiently in the cloud. The proposed approach addresses the security issues and ensures privacy and data security for the cloud user. Moreover, the proposed HEC-based authentication approach makes the system more robust and secure, and has been verified in multiple scenarios. In addition, it requires less computational time and memory than the existing authentication techniques. The measured performance of the proposed HEC-based authentication approach is a computation time of 26 ms and a memory use of 1878 bytes for a data size of 100 Kb.


Effective and Efficient Hybrid Android Malware Classification Using Pseudo-Label Stacked Auto-Encoder

Journal of Network and Systems Management ◽

10.1007/s10922-021-09634-4 ◽

2021 ◽

Vol 30 (1) ◽

Author(s):

Samaneh Mahdavifar ◽

Dima Alhadidi ◽

Ali A. Ghorbani

Keyword(s):

Android Malware ◽

Malware Classification


Security Analysis and Improvements on a Remote Integrity Checking Scheme for Regenerating-Coding-Based Distributed Storage

Security and Communication Networks ◽

10.1155/2021/6652606 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Guangjun Liu ◽

Wangmei Guo ◽

Ximeng Liu ◽

Jinbo Xiong

Keyword(s):

Distributed Storage ◽

Security Analysis ◽

Data Integrity ◽

Integrity Checking ◽

Cloud Server ◽

Data Integrity Checking ◽

Checking Scheme ◽

Data Auditing ◽

Distributed Cloud ◽

Remote Data

Enabling remote data integrity checking with failure recovery has become exceedingly critical in distributed cloud systems. Because they lower repair bandwidth while preserving fault tolerance, regenerating coding and network coding (NC) have received much attention in the coding-based storage field. Recently, an outsourced auditing scheme named NC-Audit was proposed for regenerating-coding-based distributed storage. The scheme claimed that it can effectively achieve lightweight privacy-preserving data verification remotely for these networked distributed systems. However, our algebraic analysis shows that NC-Audit can be easily broken due to a defect in its schematic design. That is, an adversarial cloud server can forge some illegal blocks to cheat the auditor with a high probability when the coding field is large. From the perspective of algebraic security, we propose a remote data integrity checking scheme, RNC-Audit, which hides partial critical information from the server without compromising system performance. Our evaluation shows that the proposed scheme has significantly lower overhead compared to the state-of-the-art schemes for distributed remote data auditing.


Ensemble Machine Learning Approach for Android Malware Classification Using Hybrid Features

Advances in Intelligent Systems and Computing - Proceedings of the 10th International Conference on Computer Recognition Systems CORES 2017 ◽

10.1007/978-3-319-59162-9_20 ◽

2017 ◽

pp. 191-200 ◽

Cited By ~ 1

Author(s):

Abdurrahman Pektaş ◽

Tankut Acarman

Keyword(s):

Machine Learning ◽

Learning Approach ◽

Hybrid Features ◽

Android Malware ◽

Malware Classification ◽

Ensemble Machine Learning ◽

Machine Learning Approach


A Semi-supervised Learning Approach for High Dimensional Android Malware Classification

Cyberspace Safety and Security - Lecture Notes in Computer Science ◽

10.1007/978-3-030-73671-2_3 ◽

2021 ◽

pp. 20-31

Author(s):

Qiao Shang ◽

Ni Li ◽

Qi Qi ◽

Xiao-Wei Lin

Keyword(s):

Supervised Learning ◽

High Dimensional ◽

Learning Approach ◽

Android Malware ◽

Malware Classification


Machine learning based Android malware classification

Proceedings of the Conference on Research in Adaptive and Convergent Systems - RACS '19 ◽

10.1145/3338840.3355693 ◽

2019 ◽

Cited By ~ 1

Author(s):

Yena Lee ◽

Yongmin Kim ◽

Seungyeon Lee ◽

Junyoung Heo ◽

Jiman Hong

Keyword(s):

Machine Learning ◽

Android Malware ◽

Malware Classification


MobiSentry: Towards Easy and Effective Detection of Android Malware on Smartphones

Mobile Information Systems ◽

10.1155/2018/4317501 ◽

2018 ◽

Vol 2018 ◽

pp. 1-14 ◽

Cited By ~ 2

Author(s):

Bingfei Ren ◽

Chuanchang Liu ◽

Bo Cheng ◽

Jie Guo ◽

Junliang Chen

Keyword(s):

State Of The Art ◽

Classification Algorithms ◽

Defense System ◽

Android Malware ◽

Malware Classification ◽

Android Os ◽

Comprehensive Performance ◽

Supervised Classifiers ◽

Expert Analysis ◽

Time Overhead

The Android platform is increasingly targeted by attackers due to its popularity and openness. Traditional malware defenses rely largely on expert analysis to design discriminative features manually, and such features are easy to bypass with sophisticated detection-avoidance techniques. Therefore, more effective and easy-to-use approaches for detecting Android malware are in demand. In this paper, we present MobiSentry, a novel lightweight defense system for malware classification and categorization on smartphones. Besides conventional static features such as permissions and API calls, MobiSentry also employs the N-gram features of operation codes (n-opcode). We present two comprehensive performance comparisons among several state-of-the-art classification algorithms with multiple evaluation metrics: (1) malware detection on 184,486 benign applications and 21,306 malware samples, and (2) malware categorization on DREBIN, the largest labeled Android malware dataset. We use an ensemble of these supervised classifiers to design MobiSentry, which outperforms several related approaches and gives satisfactory performance in the evaluation. Furthermore, we integrate MobiSentry with Android OS, which enables Android smartphones to extract features and predict whether an application is benign or malicious. Experimental results on real smartphones show that users can easily and effectively protect their devices against malware through this system with a small run-time overhead.
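As an illustration of the n-opcode features mentioned in this abstract, the sketch below counts sliding-window N-grams over a sequence of opcodes and maps them onto a fixed vocabulary. It is a minimal sketch, not MobiSentry's extraction code: the function names, the choice of counts rather than presence flags, and the toy opcode sequence are all assumptions made here.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Ngram = Tuple[str, ...]


def opcode_ngrams(opcodes: Sequence[str], n: int = 2) -> Counter:
    """Count every run of n consecutive opcodes in the sequence."""
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


def to_feature_vector(counts: Counter, vocabulary: Sequence[Ngram]) -> List[int]:
    """Project N-gram counts onto a fixed, ordered vocabulary."""
    return [counts.get(gram, 0) for gram in vocabulary]


# Toy opcode sequence standing in for one disassembled method.
sequence = [
    "const", "invoke-virtual", "move-result", "invoke-virtual", "move-result",
]
counts = opcode_ngrams(sequence, n=2)
vocabulary = [
    ("const", "invoke-virtual"),
    ("invoke-virtual", "move-result"),
    ("return-void", "nop"),
]
vector = to_feature_vector(counts, vocabulary)
```

A vector like this could be concatenated with permission and API-call features before being passed to a classifier; in practice the vocabulary would be learned from the training set rather than written by hand.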


Deep Android Malware Classification with API-Based Feature Graph

2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE) ◽

10.1109/trustcom/bigdatase.2019.00047 ◽

2019 ◽

Cited By ~ 2

Author(s):

Na Huang ◽

Ming Xu ◽

Ning Zheng ◽

Tong Qiao ◽

Kim-Kwang Raymond Choo

Keyword(s):

Android Malware ◽

Malware Classification
