Demiguise Attack: Crafting Invisible Semantic Adversarial Perturbations with Perceptual Similarity

Author(s):  
Yajie Wang ◽  
Shangbo Wu ◽  
Wenyi Jiang ◽  
Shengang Hao ◽  
Yu-an Tan ◽  
...  

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples: malicious images carrying visually imperceptible perturbations. Although these carefully crafted perturbations are restricted by tight Lp-norm bounds, they can still be perceptible to humans. Such perturbations also have limited success rates when attacking black-box models or models with defenses such as noise-reduction filters. To address these problems, we propose the Demiguise Attack, which crafts "unrestricted" perturbations guided by Perceptual Similarity. Specifically, we create powerful and photorealistic adversarial examples by manipulating semantic information based on Perceptual Similarity. Although the perturbations are of large magnitude, the adversarial examples we generate remain friendly to the human visual system (HVS). We extend widely used attacks with our approach, markedly enhancing adversarial effectiveness while improving imperceptibility. Extensive experiments show that the proposed method not only outperforms various state-of-the-art attacks in terms of fooling rate, transferability, and robustness against defenses, but also effectively strengthens existing attacks. In addition, our implementation can simulate illumination and contrast changes that occur in real-world scenarios, which helps expose the blind spots of DNNs.
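
To illustrate the core idea of bounding an attack by a perceptual metric instead of an Lp norm, the sketch below optimizes a perturbation under a perceptual-distance penalty. The `perceptual_distance` callable (e.g., an LPIPS model), the Adam optimizer, and the `lambda_sim` weight are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def perceptual_attack(model, x, y, perceptual_distance,
                      steps=100, lr=0.01, lambda_sim=10.0):
    """Craft an adversarial example bounded by a perceptual metric
    rather than an Lp-norm ball (illustrative sketch)."""
    x_adv = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x_adv], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(x_adv)
        # Maximize the classification loss while keeping the perceptual
        # distance to the clean image small.
        loss = -F.cross_entropy(logits, y) + lambda_sim * perceptual_distance(x_adv, x).mean()
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            x_adv.clamp_(0.0, 1.0)  # keep pixels in a valid range
    return x_adv.detach()
```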

2020 ◽  
Vol 34 (07) ◽  
pp. 11229-11236
Author(s):  
Zhiwei Ke ◽  
Zhiwei Wen ◽  
Weicheng Xie ◽  
Yi Wang ◽  
Linlin Shen

Dropout regularization has been widely used in deep neural networks to combat overfitting. It works by training a network to be more robust on information-degraded data points for better generalization. Conventional dropout and its variants are usually applied to individual hidden units in a layer to break up co-adaptations of feature detectors. In this paper, we propose an adaptive dropout that reduces co-adaptations in a group-wise manner using coarse semantic information, improving feature discriminability. In particular, we show that adjusting the dropout probability based on local feature densities not only improves classification performance significantly but also enhances network robustness against adversarial examples in some cases. The proposed approach is evaluated against the baseline and several state-of-the-art adaptive dropouts on four public datasets: Fashion-MNIST, CIFAR-10, CIFAR-100, and SVHN.
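
A minimal sketch of group-wise adaptive dropout driven by local feature density is given below; the channel-based grouping, the density definition, and the `base_p` scaling are assumptions standing in for the paper's coarse semantic grouping.

```python
import torch

def adaptive_group_dropout(features, num_groups=4, base_p=0.3, training=True):
    """Drop whole channel groups with a probability scaled by their activation
    density (illustrative; assumes channels are divisible by num_groups)."""
    if not training:
        return features
    n, c, h, w = features.shape
    g = c // num_groups
    grouped = features.reshape(n, num_groups, g, h, w)
    # Higher mean activation ("density") -> higher dropout probability.
    density = grouped.abs().mean(dim=(2, 3, 4))                     # (n, num_groups)
    p = base_p * density / (density.mean(dim=1, keepdim=True) + 1e-8)
    p = p.clamp(0.0, 0.9)
    keep = (torch.rand_like(p) > p).float()                         # Bernoulli keep mask
    scale = keep / (1.0 - p)                                        # inverted-dropout rescaling
    return (grouped * scale.view(n, num_groups, 1, 1, 1)).reshape(n, c, h, w)
```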


2020 ◽  
Vol 2020 ◽  
pp. 1-9 ◽  
Author(s):  
Lingyun Jiang ◽  
Kai Qiao ◽  
Ruoxi Qin ◽  
Linyuan Wang ◽  
Wanting Yu ◽  
...  

In the image classification setting of deep learning, adversarial examples, inputs to which small-magnitude perturbations are intentionally added, can mislead deep neural networks (DNNs) into incorrect results, which means DNNs are vulnerable to them. Various attack and defense strategies have been proposed to better understand the mechanisms of deep learning. However, most of this research addresses only one side, either attack or defense; there is a gap between improvements in offensive and defensive performance, and it is difficult for the two to promote each other within the same framework. In this paper, we propose the Cycle-Consistent Adversarial GAN (CycleAdvGAN) to generate adversarial examples. It can learn and approximate the distributions of both the original instances and the adversarial examples, letting attacker and defender confront each other and improve their respective abilities. For CycleAdvGAN, once the generators GA and GD are trained, GA can efficiently generate adversarial perturbations for any instance, improving the performance of existing attack methods, while GD can recover adversarial examples back to clean instances, defending against existing attack methods. We apply CycleAdvGAN under semi-white-box and black-box settings on two public datasets, MNIST and CIFAR-10. Extensive experiments show that our method achieves state-of-the-art adversarial attack performance while also efficiently improving defense ability, bringing adversarial attack and defense together in a single framework. In addition, it improves the attack effect even when trained only on an adversarial dataset generated by a single kind of adversarial attack.
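
The loss terms below sketch how the two generators could be coupled: GA maps clean inputs to adversarial ones, GD maps adversarial inputs back to clean ones, and a cycle-consistency term ties them together. Discriminator losses and the exact weighting are omitted; the names GA, GD, and `beta_cycle` are placeholders, not the paper's notation.

```python
import torch
import torch.nn.functional as F

def cycle_adv_losses(GA, GD, target_model, x_clean, y, beta_cycle=10.0):
    """Illustrative CycleAdvGAN-style loss terms (GAN discriminator terms omitted)."""
    x_adv = GA(x_clean)          # clean -> adversarial
    x_rec = GD(x_adv)            # adversarial -> recovered clean
    # Attack objective: the target model should misclassify GA's output.
    attack_loss = -F.cross_entropy(target_model(x_adv), y)
    # Defense objective: GD's recovery should be classified correctly.
    defense_loss = F.cross_entropy(target_model(x_rec), y)
    # Cycle consistency keeps GA and GD close to inverse mappings.
    cycle_loss = F.l1_loss(x_rec, x_clean) + F.l1_loss(GA(x_rec), x_adv)
    return attack_loss, defense_loss, beta_cycle * cycle_loss
```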


Author(s):  
Chaowei Xiao ◽  
Bo Li ◽  
Jun-yan Zhu ◽  
Warren He ◽  
Mingyan Liu ◽  
...  

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs into producing adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but producing them with high perceptual quality and efficiency requires further research. In this paper, we propose AdvGAN, which generates adversarial examples with generative adversarial networks (GANs) that can learn and approximate the distribution of the original instances. For AdvGAN, once the generator is trained, it can generate perturbations efficiently for any instance, potentially accelerating adversarial training as a defense. We apply AdvGAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have high attack success rates under state-of-the-art defenses compared to other attacks. Our attack placed first with 92.76% accuracy on a public MNIST black-box attack challenge.
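
A hedged sketch of an AdvGAN-style generator objective follows: the generator emits a perturbation, a discriminator judges realism, and the (distilled) target model supplies the adversarial signal. The weights `alpha`, `beta` and the perturbation bound `c_bound` are placeholders, not values from the paper.

```python
import torch
import torch.nn.functional as F

def advgan_generator_loss(G, D, f_target, x, y, c_bound=0.3, alpha=1.0, beta=10.0):
    """Illustrative generator objective for an AdvGAN-style attack."""
    perturbation = G(x)
    x_adv = torch.clamp(x + perturbation, 0.0, 1.0)
    d_logits = D(x_adv)
    # GAN term: make x_adv look like real data to the discriminator.
    gan_loss = F.binary_cross_entropy_with_logits(d_logits, torch.ones_like(d_logits))
    # Adversarial term: push the target model away from the true label.
    adv_loss = -F.cross_entropy(f_target(x_adv), y)
    # Hinge term: softly bound the perturbation magnitude.
    hinge_loss = torch.clamp(perturbation.flatten(1).norm(dim=1) - c_bound, min=0.0).mean()
    return gan_loss + alpha * adv_loss + beta * hinge_loss
```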


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Hongwei Luo ◽  
Yijie Shen ◽  
Feng Lin ◽  
Guoai Xu

Speaker verification systems have gained great popularity in recent years, especially with the development of deep neural networks and the Internet of Things. However, the security of speaker verification systems based on deep neural networks has not been well investigated. In this paper, we propose an attack to spoof the state-of-the-art speaker verification system based on the generalized end-to-end (GE2E) loss function, causing it to misclassify illegitimate users as the authentic user. Specifically, we design a novel loss function to train a generator that produces effective adversarial examples with slight perturbation, and then spoof the system with these adversarial examples to achieve our goals. The success rate of our attack reaches 82% when cosine similarity is used in the deep-learning-based speaker verification system. Beyond that, our experiments report a signal-to-noise ratio of 76 dB, indicating that our attack is less perceptible than previous works. In summary, the results show that our attack not only can spoof the state-of-the-art neural-network-based speaker verification system but, more importantly, can also evade human hearing and machine discrimination.
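
As an illustration of the kind of objective such a generator might optimize against a cosine-similarity verifier, the sketch below pushes the embedding of the perturbed utterance toward the enrolled speaker while penalizing the perturbation size; the function and weight names are assumptions, not the paper's actual loss.

```python
import torch
import torch.nn.functional as F

def speaker_spoof_loss(embed_model, x_adv, x_orig, enroll_embedding, lambda_pert=1.0):
    """Illustrative spoofing objective against a cosine-similarity verifier."""
    emb = F.normalize(embed_model(x_adv), dim=-1)
    target = F.normalize(enroll_embedding, dim=-1)
    # Maximize cosine similarity to the enrolled (victim) speaker.
    similarity_loss = -torch.sum(emb * target, dim=-1).mean()
    # Keep the adversarial audio close to the original waveform.
    perturbation_loss = F.mse_loss(x_adv, x_orig)
    return similarity_loss + lambda_pert * perturbation_loss
```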


2020 ◽  
Vol 34 (04) ◽  
pp. 3601-3608
Author(s):  
Minhao Cheng ◽  
Jinfeng Yi ◽  
Pin-Yu Chen ◽  
Huan Zhang ◽  
Cho-Jui Hsieh

Crafting adversarial examples has become an important technique for evaluating the robustness of deep neural networks (DNNs). However, most existing works focus on attacking image classification, whose input space is continuous and output space is finite. In this paper, we study the much more challenging problem of crafting adversarial examples for sequence-to-sequence (seq2seq) models, whose inputs are discrete text strings and whose outputs have an almost infinite number of possibilities. To address the challenges caused by the discrete input space, we propose a projected gradient method combined with group lasso and gradient regularization. To handle the almost infinite output space, we design novel loss functions to conduct non-overlapping attacks and targeted keyword attacks. We apply our algorithm to machine translation and text summarization tasks and verify its effectiveness: by changing fewer than 3 words, we can make a seq2seq model produce the desired outputs with high success rates. We also use an external sentiment classifier to verify that our generated adversarial examples preserve semantic meaning. On the other hand, we find that, compared with well-evaluated CNN-based classifiers, seq2seq models are intrinsically more robust to adversarial attacks.
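
The sparsity mechanism can be pictured as proximal gradient steps on word-embedding perturbations, where the group-lasso proximal operator zeroes out most word positions; the step sizes and shape conventions below are illustrative assumptions, not the paper's exact algorithm.

```python
import torch

def prox_group_lasso(delta, threshold):
    """Proximal operator of the group lasso: soft-threshold each word's
    perturbation vector by its L2 norm, zeroing out most positions."""
    norms = delta.norm(dim=1, keepdim=True)
    scale = torch.clamp(1.0 - threshold / (norms + 1e-12), min=0.0)
    return delta * scale

def attack_step(attack_loss_fn, embeddings, delta, lr=0.1, lam=0.05):
    """One proximal-gradient step on embedding perturbations of shape
    (seq_len, embed_dim); attack_loss_fn returns the seq2seq attack objective."""
    delta = delta.detach().requires_grad_(True)
    loss = attack_loss_fn(embeddings + delta)
    loss.backward()
    with torch.no_grad():
        delta = delta - lr * delta.grad            # gradient step on the attack loss
        delta = prox_group_lasso(delta, lr * lam)  # enforce word-level sparsity
    return delta
```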


Author(s):  
Bangjie Yin ◽  
Wenxuan Wang ◽  
Taiping Yao ◽  
Junfeng Guo ◽  
Zelun Kong ◽  
...  

Deep neural networks, particularly face recognition models, have been shown to be vulnerable to both digital and physical adversarial examples. However, existing adversarial examples against face recognition systems either lack transferability to black-box models or cannot be implemented in practice. In this paper, we propose a unified adversarial face generation method, Adv-Makeup, which can realize imperceptible and transferable attacks under the black-box setting. Adv-Makeup develops a task-driven makeup generation method with a blending module to synthesize imperceptible eye shadow over the orbital region of faces. To achieve transferability, Adv-Makeup implements a fine-grained meta-learning-based adversarial attack strategy to learn more vulnerable or sensitive features across various models. Compared to existing techniques, extensive visualization results demonstrate that Adv-Makeup is capable of generating much more imperceptible attacks in both digital and physical scenarios. Meanwhile, extensive quantitative experiments show that Adv-Makeup can significantly improve the attack success rate in the black-box setting, even when attacking commercial systems.
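
The transferability idea can be sketched as a meta-learning step over a set of surrogate models: adapt the perturbation on one surrogate, then accumulate gradients of the other surrogates' losses at the adapted point. This is a generic illustration of meta-learning for transfer attacks, not Adv-Makeup's actual training procedure; `loss_fns` and the learning rates are assumptions.

```python
import torch

def meta_transfer_step(loss_fns, delta, lr_inner=0.01, lr_outer=0.01):
    """One illustrative meta-step: each entry of loss_fns maps a perturbation
    to one surrogate model's attack loss."""
    meta_grad = torch.zeros_like(delta)
    for i, support_loss in enumerate(loss_fns):
        d = delta.detach().requires_grad_(True)
        grad_support = torch.autograd.grad(support_loss(d), d)[0]
        # Inner (support) update on one surrogate ...
        d_adapted = (d - lr_inner * grad_support).detach().requires_grad_(True)
        # ... then measure how well the adapted perturbation attacks the others.
        for j, query_loss in enumerate(loss_fns):
            if j == i:
                continue
            meta_grad += torch.autograd.grad(query_loss(d_adapted), d_adapted)[0]
    return (delta - lr_outer * meta_grad).detach()
```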


Author(s):  
Da Teng ◽  
Xiao Song ◽  
Guanghong Gong ◽  
Junhua Zhou

Deep neural networks have achieved state-of-the-art performance on many object recognition tasks, but they are vulnerable to small adversarial perturbations. In this paper, several extensions of generative stochastic networks (GSNs) are proposed to improve the robustness of neural networks to random noise and adversarial perturbations. Experimental results show that, compared to the standard GSN method, the extensions using adversarial examples, lateral connections, and feedforward networks improve the performance of GSNs by making the models more resistant to overfitting and noise.


2019 ◽  
Vol 9 (11) ◽  
pp. 2286 ◽  
Author(s):  
Xianfeng Gao ◽  
Yu-an Tan ◽  
Hongwei Jiang ◽  
Quanxin Zhang ◽  
Xiaohui Kuang

In recent years, Deep Neural Networks (DNNs) have shown unprecedented performance in many areas. However, recent studies have revealed their vulnerability to small perturbations added to source inputs. The methods used to generate these perturbations are called adversarial attacks, which fall into two types, black-box and white-box attacks, according to the adversary's access to the target model. To overcome black-box attackers' lack of access to the internals of the target DNN, many researchers have put forward a series of strategies. Previous works include training a local substitute model for the target black-box model via Jacobian-based augmentation and then using the substitute model to craft adversarial examples with white-box methods. In this work, we improve the dataset augmentation so that the substitute models better fit the decision boundary of the target model. Unlike previous work, which performed only non-targeted attacks, we are the first to generate targeted adversarial examples via substitute-model training. Moreover, to boost the targeted attacks, we apply the idea of ensemble attacks to substitute training. Experiments on MNIST and GTSRB, two common datasets for image classification, demonstrate the effectiveness and efficiency of our approach in boosting targeted black-box attacks; we ultimately attack the MNIST and GTSRB classifiers with success rates of 97.7% and 92.8%, respectively.
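
For reference, the classic Jacobian-based augmentation step that this line of work builds on can be sketched as below: each sample is shifted along the sign of the substitute's gradient for its current prediction, and the new points are then labelled by querying the black-box model. The step size and the choice of class are illustrative, not the paper's improved scheme.

```python
import torch

def jacobian_augmentation(substitute, x_samples, step=0.1):
    """Jacobian-based dataset augmentation for substitute training (sketch)."""
    x = x_samples.clone().detach().requires_grad_(True)
    logits = substitute(x)
    labels = logits.argmax(dim=1)
    # Sum of each sample's logit for its currently predicted class.
    selected = logits.gather(1, labels.unsqueeze(1)).sum()
    grad = torch.autograd.grad(selected, x)[0]
    # Move each sample along the gradient sign, toward the decision boundary.
    x_new = (x + step * grad.sign()).detach().clamp(0.0, 1.0)
    # The augmented set is the union of old and new samples; the new samples
    # would then be labelled by querying the black-box target model.
    return torch.cat([x_samples, x_new], dim=0)
```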


2021 ◽  
Vol 7 ◽  
pp. e479
Author(s):  
Elvio Amparore ◽  
Alan Perotti ◽  
Paolo Bajardi

The main objective of eXplainable Artificial Intelligence (XAI) is to provide effective explanations for black-box classifiers. The existing literature lists many desirable properties for explanations to be useful, but there is little consensus on how to quantitatively evaluate explanations in practice. Moreover, explanations are typically used only to inspect black-box models, and the proactive use of explanations as decision support is generally overlooked. Among the many approaches to XAI, a widely adopted paradigm is Local Linear Explanations, with LIME and SHAP emerging as state-of-the-art methods. We show that these methods are plagued by many defects, including unstable explanations, divergence of actual implementations from the promised theoretical properties, and explanations for the wrong label. This highlights the need for standard and unbiased evaluation procedures for Local Linear Explanations in the XAI field. In this paper we address the problem of identifying a clear and unambiguous set of metrics for the evaluation of Local Linear Explanations. This set includes both existing metrics and novel ones defined specifically for this class of explanations. All metrics have been included in an open Python framework, named LEAF. The purpose of LEAF is to provide a reference for end users to evaluate explanations in a standardised and unbiased way, and to guide researchers towards developing improved explainable techniques.
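
As a toy example of the kind of metric such a framework can standardise (this is a generic stability check, not LEAF's actual implementation), the snippet below measures how much the top-k features of two explanations of the same instance overlap; unstable explainers give low overlap across repeated runs. The weight vectors are hypothetical.

```python
import numpy as np

def top_k_stability(weights_a, weights_b, k=5):
    """Jaccard overlap between the top-k features of two explanations."""
    top_a = set(np.argsort(-np.abs(weights_a))[:k])
    top_b = set(np.argsort(-np.abs(weights_b))[:k])
    return len(top_a & top_b) / len(top_a | top_b)

# Two hypothetical LIME-style feature-weight vectors for the same instance;
# a score near 1.0 indicates stable explanations.
w1 = np.array([0.40, -0.10, 0.05, 0.30, -0.20, 0.01])
w2 = np.array([0.35, -0.05, 0.10, 0.28, -0.25, 0.02])
print(top_k_stability(w1, w2, k=3))
```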


2021 ◽  
Author(s):  
Hassan Ali ◽  
Surya Nepal ◽  
Salil S. Kanhere ◽  
Sanjay K. Jha

We have witnessed a continuing arms race between backdoor attacks and the corresponding defense strategies on Deep Neural Networks (DNNs). However, most state-of-the-art defenses rely on the statistical sanitization of inputs or latent DNN representations to capture trojan behavior. In this paper, we first challenge the robustness of many recently reported defenses by introducing a novel variant of the targeted backdoor attack, called the low-confidence backdoor attack. The low-confidence attack inserts the backdoor by assigning uniformly distributed probabilistic labels to the poisoned training samples, and is applicable to many practical scenarios such as Federated Learning and model-reuse cases. We evaluate our attack against five state-of-the-art defense methods, viz., STRIP, Gradient-Shaping, Februus, ULP-defense and ABS-defense, under the same threat model as assumed by the respective defenses, and achieve Attack Success Rates (ASRs) of 99%, 63.73%, 91.2%, 80% and 100%, respectively. After carefully studying the properties of state-of-the-art attacks, including low-confidence attacks, we present HaS-Net, a mechanism to securely train DNNs against a number of backdoor attacks under the data-collection scenario. For this purpose, we use a reasonably small healing dataset, approximately 2% to 15% of the size of the training data, to heal the network at each iteration. We evaluate our defense on different datasets (Fashion-MNIST, CIFAR-10, Celebrity Face, Consumer Complaint and Urban Sound), network architectures (MLPs, 2D-CNNs, 1D-CNNs), and attack configurations (standard backdoor attacks, invisible backdoor attacks, label-consistent attacks and all-trojan backdoor attacks), including their low-confidence variants. Our experiments show that HaS-Nets can decrease ASRs from over 90% to less than 15%, independent of the dataset, attack configuration and network architecture.
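
The label-assignment idea behind the low-confidence variant can be sketched as follows: poisoned samples carry the trigger but receive uniform soft labels instead of a one-hot target label. The trigger-stamping scheme and argument names are assumptions for illustration, not the paper's exact procedure.

```python
import torch

def poison_batch(x, trigger, trigger_mask, target_class, num_classes,
                 low_confidence=True):
    """Stamp a trigger onto a batch and assign soft labels (illustrative)."""
    x_poison = x * (1 - trigger_mask) + trigger * trigger_mask
    if low_confidence:
        # Uniformly distributed probabilistic labels, as in the low-confidence attack.
        soft_labels = torch.full((x.size(0), num_classes), 1.0 / num_classes)
    else:
        # Standard targeted backdoor: a one-hot label for the target class.
        soft_labels = torch.zeros(x.size(0), num_classes)
        soft_labels[:, target_class] = 1.0
    return x_poison, soft_labels
```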

