Suspicion-Free Adversarial Attacks on Clustering Algorithms

2020 ◽  
Vol 34 (04) ◽  
pp. 3625-3632
Author(s):  
Anshuman Chhabra ◽  
Abhishek Roy ◽  
Prasant Mohapatra

Clustering algorithms are used in a large number of applications and play an important role in modern machine learning, yet adversarial attacks on clustering algorithms have been broadly overlooked, unlike those on supervised learning. In this paper, we seek to bridge this gap by proposing a black-box adversarial attack on clustering models for linearly separable clusters. Our attack works by perturbing a single sample close to the decision boundary, which leads to the misclustering of multiple unperturbed samples, which we term spill-over adversarial samples. We theoretically show the existence of such adversarial samples for K-Means clustering. Our attack is especially strong because (1) we ensure the perturbed sample is not an outlier, and hence not detectable, and (2) the exact metric used for clustering is not known to the attacker. We theoretically justify that the attack can indeed succeed without knowledge of the true metric. We conclude by providing empirical results on a number of datasets and clustering algorithms. To the best of our knowledge, this is the first work that generates spill-over adversarial samples without knowledge of the true metric while ensuring that the perturbed sample is not an outlier, and that proves these properties theoretically.
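The spill-over effect can be illustrated with a toy sketch: a one-dimensional two-cluster k-means example (invented data, plain Lloyd iterations, and without the paper's outlier and unknown-metric constraints) in which perturbing a single sample shifts a centroid enough that an untouched sample changes cluster.

```python
def kmeans_1d(points, centroids, iters=20):
    """Lloyd's algorithm with k = 2 in one dimension."""
    for _ in range(iters):
        assign = [0 if abs(p - centroids[0]) <= abs(p - centroids[1]) else 1
                  for p in points]
        new = []
        for k in (0, 1):
            members = [p for p, a in zip(points, assign) if a == k]
            new.append(sum(members) / len(members) if members else centroids[k])
        if new == centroids:      # converged
            break
        centroids = new
    return assign, centroids

init = [0.0, 4.0]
clean = [0.0, 0.5, 1.0, 1.9, 3.0, 3.5, 4.0]
assign_clean, _ = kmeans_1d(clean, init)
# the boundary sample 1.9 (index 3) ends up in the left cluster
print(assign_clean)       # [0, 0, 0, 0, 1, 1, 1]

# perturb ONLY the first sample (0.0 -> -2.3): the left centroid is dragged
# away from the boundary, and the untouched sample 1.9 spills over to cluster 1
attacked = [-2.3, 0.5, 1.0, 1.9, 3.0, 3.5, 4.0]
assign_attacked, _ = kmeans_1d(attacked, init)
print(assign_attacked)    # [0, 0, 0, 1, 1, 1, 1]
```

Sample 1.9 is never touched by the attacker, yet its cluster assignment flips — the defining property of a spill-over adversarial sample.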

Author(s):  
Ismail Alarab ◽  
Simant Prakoonwit

Abstract: We propose a novel method to capture data points near the decision boundary of a neural network, points that are often associated with a specific type of uncertainty. In our approach, we perform uncertainty estimation based on the idea of adversarial attack methods. Unlike previous studies that perturb the model's parameters, as in Bayesian approaches, our uncertainty estimates are derived from perturbations of the input. We are able to produce uncertainty estimates with only a couple of perturbations of the inputs. Interestingly, we apply the proposed method to datasets derived from blockchain data. We compare the performance of our model uncertainty with the most recent uncertainty methods, and show that the proposed method significantly outperforms them while posing less risk when capturing model uncertainty in machine learning.
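A minimal sketch of the underlying idea (not the paper's exact method; the toy classifier, perturbation range, and flip-rate score are all invented for illustration): perturb the input a few times and use the fraction of flipped predictions as an uncertainty estimate, which is high only near the decision boundary.

```python
import random

def predict(x):
    """Toy 1-D classifier with decision boundary at x = 0."""
    return 1 if x >= 0.0 else 0

def flip_rate(x, eps=0.3, n=200, seed=0):
    """Fraction of small random input perturbations that flip the label."""
    rng = random.Random(seed)
    base = predict(x)
    flips = sum(predict(x + rng.uniform(-eps, eps)) != base for _ in range(n))
    return flips / n   # in [0, 1]; higher means closer to the boundary

far = flip_rate(5.0)    # far from the boundary
near = flip_rate(0.05)  # near the boundary
print(far, near)        # far is 0.0; near is clearly positive
```

No model parameters are touched, in contrast to Bayesian approaches: the uncertainty signal comes entirely from the input side.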


2021 ◽  
Vol 11 (21) ◽  
pp. 10479
Author(s):  
Fengtao Xiang ◽  
Jiahui Xu ◽  
Wanpeng Zhang ◽  
Weidong Wang

Adversarial samples threaten the effectiveness of machine learning (ML) models and algorithms in many applications. In particular, black-box attack methods are quite close to actual scenarios. Research on black-box attack methods and the generation of adversarial samples helps uncover the defects of machine learning models and can strengthen the robustness of those models. However, such methods require frequent queries, which makes them inefficient. This paper improves both the initial generation of adversarial examples and the search for the most effective ones. In addition, we find that certain indicators can be used to detect attacks, a new contribution compared with our previous studies. First, we propose an algorithm that generates initial adversarial samples with a smaller L2 norm; second, we propose a combination of particle swarm optimization (PSO) and the biased boundary adversarial attack (BBA), called PSO-BBA. Experiments are conducted on ImageNet, and PSO-BBA is compared with the baseline method. The experimental results show that: (1) a distributed framework for adversarial attack methods is proposed; (2) the proposed initial point selection method reduces the number of queries effectively; (3) compared to the original BBA, the proposed PSO-BBA algorithm accelerates convergence and improves attack accuracy; (4) the improved PSO-BBA algorithm performs well on both targeted and non-targeted attacks; and (5) the mean structural similarity (MSSIM) can serve as an indicator of adversarial attacks.
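The first contribution, a lower-L2 initial point, can be sketched independently of PSO: given an original input and any point already classified differently, a binary search along the segment between them yields an adversarial starting point far closer in L2 to the original. The one-dimensional classifier below is a toy stand-in, not the paper's ImageNet setting.

```python
def predict(x):
    """Toy 1-D classifier: class 1 iff x >= 2."""
    return 1 if x >= 2.0 else 0

def init_adversarial(x_orig, x_adv, steps=30):
    """Binary-search toward x_orig for a closer (smaller-L2) adversarial start."""
    assert predict(x_adv) != predict(x_orig)
    lo, hi = x_orig, x_adv        # invariant: hi stays adversarial
    for _ in range(steps):
        mid = (lo + hi) / 2
        if predict(mid) != predict(x_orig):
            hi = mid
        else:
            lo = mid
    return hi

start = init_adversarial(x_orig=0.0, x_adv=10.0)
print(start)   # just above 2.0: still adversarial, much closer than 10.0
```

Starting a boundary attack from this point instead of the naive adversarial example is what saves queries: each subsequent query begins much closer to the decision surface.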


Author(s):  
Yuan Gong ◽  
Boyang Li ◽  
Christian Poellabauer ◽  
Yiyu Shi

In recent years, many efforts have demonstrated that modern machine learning algorithms are vulnerable to adversarial attacks, where small, but carefully crafted, perturbations on the input can make them fail. While these attack methods are very effective, they only focus on scenarios where the target model takes static input, i.e., an attacker can observe the entire original sample and then add a perturbation at any point of the sample. These attack approaches are not applicable to situations where the target model takes streaming input, i.e., an attacker is only able to observe past data points and add perturbations to the remaining (unobserved) data points of the input. In this paper, we propose a real-time adversarial attack scheme for machine learning models with streaming inputs.
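The streaming constraint can be sketched as follows (toy model and budget, invented for illustration, not the paper's scheme): at attack time t, the samples already consumed by the model are untouchable, and only the not-yet-observed suffix may be perturbed.

```python
def streaming_model(xs):
    """Toy streaming classifier: sign of the running sum of the input."""
    return 1 if sum(xs) >= 0 else 0

def realtime_attack(stream, t_attack, budget):
    """Perturb only samples arriving at or after t_attack, within a budget."""
    observed = stream[:t_attack]          # already consumed: untouchable
    remaining = stream[t_attack:]
    per_sample = budget / max(len(remaining), 1)
    perturbed = [x - per_sample for x in remaining]  # push the sum negative
    return observed + perturbed

clean = [1.0, 2.0, -0.5, 0.5, 1.0]        # sum 4.0 -> class 1
adv = realtime_attack(clean, t_attack=2, budget=9.0)
print(streaming_model(clean), streaming_model(adv))  # 1 0
print(adv[:2] == clean[:2])               # True: past samples unchanged
```

A static attack could spread its budget over all five samples; the real-time attacker must flip the decision using only the three samples that have not yet arrived.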


2020 ◽  
Vol 15 ◽  
Author(s):  
Shuwen Zhang ◽  
Qiang Su ◽  
Qin Chen

Abstract: Major animal diseases pose a great threat to animal husbandry and human beings. With the deepening of globalization and the abundance of data resources, the prediction and analysis of animal diseases using big data are becoming more and more important. The focus of machine learning is to make computers learn from data and use the learned experience to analyze and predict. This paper first introduces the animal epidemic situation and machine learning, and then briefly reviews the application of machine learning to animal disease analysis and prediction. Machine learning is mainly divided into supervised and unsupervised learning. Supervised learning includes support vector machines, naive Bayes, decision trees, random forests, logistic regression, artificial neural networks, deep learning, and AdaBoost. Unsupervised learning includes the expectation-maximization algorithm, principal component analysis, hierarchical clustering, and maximum entropy (MaxEnt). Through this discussion, readers can gain a clearer concept of machine learning and an understanding of its application prospects for animal diseases.
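The supervised/unsupervised distinction drawn above can be illustrated on invented numeric records: a supervised decision stump learns a threshold from labelled examples, while an unsupervised method groups the same values without using any labels.

```python
# supervised: labelled (temperature, sick) records -> learn a fever threshold
data = [(38.0, 0), (38.2, 0), (40.1, 1), (40.5, 1)]
threshold, errors = min(
    ((t, sum((x >= t) != y for x, y in data)) for t, _ in data),
    key=lambda pair: pair[1],
)
print(threshold, errors)   # 40.1 0

# unsupervised: the same temperatures, no labels -> nearest-centroid grouping
temps = [x for x, _ in data]
centroids = [min(temps), max(temps)]
groups = [0 if abs(x - centroids[0]) <= abs(x - centroids[1]) else 1
          for x in temps]
print(groups)              # [0, 0, 1, 1]
```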


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
José Castela Forte ◽  
Galiya Yeshmagambetova ◽  
Maureen L. van der Grinten ◽  
Bart Hiemstra ◽  
Thomas Kaufmann ◽  
...  

Abstract: Critically ill patients constitute a highly heterogeneous population, with seemingly distinct patients having similar outcomes, and patients with the same admission diagnosis having opposite clinical trajectories. We aimed to develop a machine learning methodology that identifies and better characterizes patient clusters at high risk of mortality and kidney injury. We analysed prospectively collected data, including co-morbidities, clinical examination, and laboratory parameters, from a minimally-selected population of 743 patients admitted to the ICU of a Dutch hospital between 2015 and 2017. We compared four clustering methodologies and trained a classifier to predict and validate cluster membership. The contribution of different variables to the predicted cluster membership was assessed using SHapley Additive exPlanations (SHAP) values. We found that deep embedded clustering yielded better results than the traditional clustering algorithms. The best cluster configuration was achieved with 6 clusters. All clusters were clinically recognizable and differed in in-ICU, 30-day, and 90-day mortality, as well as in the incidence of acute kidney injury. We identified two high-risk clusters with at least 60%, 40%, and 30% increased in-ICU, 30-day, and 90-day mortality, respectively, and a low-risk cluster with 25–56% lower mortality risk. This machine learning methodology, combining deep embedded clustering and variable importance analysis, which we made publicly available, is a possible solution to challenges previously encountered by clustering analyses in heterogeneous patient populations and may help improve the characterization of risk groups in critical care.
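The validation step described above (train a classifier on cluster membership, then score each variable's contribution) can be sketched with invented two-feature data; plain permutation importance stands in here for SHAP values, and the cluster rule and classifier are toy stand-ins.

```python
import random

rng = random.Random(1)
# feature 0 drives the grouping; feature 1 is pure noise
X = [[rng.uniform(0, 1) + (2.5 if i % 2 else 0.0), rng.uniform(0, 1)]
     for i in range(100)]
cluster = [1 if row[0] > 1.5 else 0 for row in X]   # stand-in cluster labels

def classify(row):
    """Stand-in membership classifier (matches the clustering rule)."""
    return 1 if row[0] > 1.5 else 0

def permutation_importance(feature):
    """Accuracy drop when one feature's values are shuffled across rows."""
    base = sum(classify(r) == t for r, t in zip(X, cluster)) / len(X)
    rows = [list(r) for r in X]
    values = [r[feature] for r in rows]
    rng.shuffle(values)
    for r, v in zip(rows, values):
        r[feature] = v
    acc = sum(classify(r) == t for r, t in zip(rows, cluster)) / len(X)
    return base - acc

drop0 = permutation_importance(0)   # large: the informative feature
drop1 = permutation_importance(1)   # 0.0: the noise feature
print(drop0, drop1)
```

Scrambling the informative variable destroys the classifier's ability to recover cluster membership, while scrambling the noise variable changes nothing — the same logic by which SHAP values identify which clinical variables define each risk cluster.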


2021 ◽  
Author(s):  
Junjie Shi ◽  
Jiang Bian ◽  
Jakob Richter ◽  
Kuan-Hsun Chen ◽  
Jörg Rahnenführer ◽  
...  

Abstract: The predictive performance of a machine learning model depends highly on the corresponding hyper-parameter setting; hence, hyper-parameter tuning is often indispensable. Normally, such tuning requires the dedicated machine learning model to be trained and evaluated on centralized data to obtain a performance estimate. However, in a distributed machine learning scenario, it is not always possible to collect all the data from all nodes due to privacy concerns or storage limitations. Moreover, if data has to be transferred through low-bandwidth connections, this reduces the time available for tuning. Model-Based Optimization (MBO) is a state-of-the-art method for tuning hyper-parameters, but its application to distributed machine learning models or federated learning has received little research attention. This work proposes a framework, MODES, that allows MBO to be deployed on resource-constrained distributed embedded systems. Each node trains an individual model based on its local data, and the goal is to optimize the combined prediction accuracy. The presented framework offers two optimization modes: (1) MODES-B considers the whole ensemble as a single black box and optimizes the hyper-parameters of each individual model jointly, and (2) MODES-I considers all models as clones of the same black box, which allows the optimization to be efficiently parallelized in a distributed setting. We evaluate MODES by conducting experiments on the optimization of the hyper-parameters of a random forest and a multi-layer perceptron. The experimental results demonstrate that, with an improvement in mean accuracy (MODES-B), run-time efficiency (MODES-I), and statistical stability for both modes, MODES outperforms the baseline, i.e., carrying out tuning with MBO on each node individually with its local sub-data set.
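The core MBO loop (fit a cheap surrogate to the evaluated configurations, then spend the next expensive evaluation where the surrogate looks best) can be sketched in a few lines; the quadratic accuracy function and the 1-nearest-neighbour surrogate below are invented stand-ins, and the distributed/ensemble aspects of MODES are omitted.

```python
import random

def accuracy(h):
    """Toy stand-in for 'train on local data, return validation accuracy'."""
    return 1.0 - (h - 0.3) ** 2      # unknown to the optimizer; optimum at 0.3

def surrogate_predict(history, h):
    """1-nearest-neighbour surrogate over evaluated configurations."""
    return min(history, key=lambda pair: abs(pair[0] - h))[1]

def mbo(n_iters=30, seed=0):
    rng = random.Random(seed)
    history = [(h, accuracy(h)) for h in (0.0, 1.0)]   # initial design
    for _ in range(n_iters):
        candidates = [rng.random() for _ in range(50)]
        h = max(candidates, key=lambda c: surrogate_predict(history, c))
        history.append((h, accuracy(h)))               # expensive evaluation
    return max(history, key=lambda pair: pair[1])

best_h, best_acc = mbo()
print(best_h, best_acc)   # a configuration close to the optimum at 0.3
```

The point of the surrogate is that `surrogate_predict` is cheap while `accuracy` is expensive; in the distributed setting, each call to `accuracy` would involve training on a node's local data, which is exactly the cost MODES-B and MODES-I organize differently.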

