Opportunistic Use of Crowdsourced Workers for Online Relabeling of Potential Adversarial Examples

2021 ◽  
Author(s):  
Shawqi Al-Maliki ◽  
Faissal El Bouanani ◽  
Kashif Ahmad ◽  
Mohamed Abdallah ◽  
Dinh Hoang ◽  
...  

Deep Neural Networks (DNNs) have achieved tremendous success in handling various Machine Learning (ML) tasks, such as speech recognition, natural language processing, and image classification. However, they have shown vulnerability to well-designed inputs called adversarial examples. Researchers in industry and academia have proposed many adversarial example defense techniques, but none provides complete robustness: the cutting-edge defenses offer only partial reliability. Thus, complementing them with another layer of protection is a must, especially for mission-critical applications. This paper proposes a novel Online Selection and Relabeling Algorithm (OSRA) that opportunistically utilizes a limited number of crowdsourced workers (budget-constrained crowdsourcing) to maximize the ML system’s robustness. OSRA strives to use crowdsourced workers effectively by selecting the most suspicious inputs (the potential adversarial examples) and forwarding them to the crowdsourced workers to be validated and corrected (relabeled). As a result, the impact of adversarial examples is reduced, and accordingly, the ML system becomes more robust. We also propose a heuristic threshold selection method that contributes to enhancing the prediction system’s reliability. We empirically validated the proposed algorithm and found that it efficiently and optimally utilizes the allocated crowdsourcing budget. It also integrates effectively with a state-of-the-art black-box (transfer-based) defense technique, resulting in a more robust system. Simulation results show that OSRA can outperform a random selection algorithm by 60% and achieve comparable performance to an optimal offline selection benchmark. They also show that OSRA’s performance is positively correlated with system robustness.
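
The abstract does not spell out OSRA's selection rule, so the following is only a minimal sketch of the general idea it describes, assuming suspicion is scored by the classifier's top-class confidence against a heuristic threshold and that a fixed crowdsourcing budget is available; `crowd_relabel` and the threshold value are hypothetical placeholders, not the paper's method.

```python
# Minimal sketch of budget-constrained online selection and relabeling (assumption:
# suspicion is scored by the classifier's top-class confidence; OSRA's actual
# selection rule and threshold heuristic are not given in this abstract).
import numpy as np

def osra_stream(confidences, labels_pred, crowd_relabel, budget, threshold=0.6):
    """Process predictions one by one; send low-confidence (suspicious) inputs
    to crowd workers for relabeling until the budget is exhausted.

    confidences   : iterable of top-class probabilities from the classifier
    labels_pred   : iterable of predicted labels
    crowd_relabel : callable(index) -> label returned by crowd workers (hypothetical)
    budget        : maximum number of crowdsourcing queries
    threshold     : heuristic suspicion threshold on confidence
    """
    final_labels, spent = [], 0
    for i, (conf, pred) in enumerate(zip(confidences, labels_pred)):
        if conf < threshold and spent < budget:
            final_labels.append(crowd_relabel(i))  # validated/corrected label
            spent += 1
        else:
            final_labels.append(pred)              # trust the model's prediction
    return final_labels, spent


# Toy usage with a simulated crowd that simply echoes the model's prediction.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    conf = rng.uniform(0.3, 1.0, size=20)
    preds = rng.integers(0, 10, size=20)
    labels, used = osra_stream(conf, preds, crowd_relabel=lambda i: int(preds[i]), budget=5)
    print(f"crowd queries used: {used}/5")
```

An offline benchmark would instead rank the whole batch by suspicion before spending the budget, which is the kind of optimal baseline the online rule is compared against.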


2020 ◽  
Vol 10 (22) ◽  
pp. 8079
Author(s):  
Sanglee Park ◽  
Jungmin So

State-of-the-art neural network models are actively used in various fields, but it is well known that they are vulnerable to adversarial example attacks. Despite many efforts, making models robust against adversarial example attacks has proven to be a very difficult task. While many defense approaches have been shown to be ineffective, adversarial training remains one of the promising methods. In adversarial training, the training data are augmented with “adversarial” samples generated using an attack algorithm. If the attacker uses a similar attack algorithm to generate adversarial examples, the adversarially trained network can be quite robust to the attack. However, there are numerous ways of creating adversarial examples, and the defender does not know which algorithm the attacker may use. A natural question is: can we use adversarial training to train a model robust to multiple types of attack? Previous work has shown that, when a network is trained with adversarial examples generated from multiple attack methods, the network is still vulnerable to white-box attacks where the attacker has complete access to the model parameters. In this paper, we study this question in the context of black-box attacks, which can be a more realistic assumption for practical applications. Experiments with the MNIST dataset show that adversarially training a network with one attack method helps defend against that particular attack method but has limited effect against other attack methods. In addition, even if the defender trains a network with multiple types of adversarial examples and the attacker attacks with one of those methods, the network can still lose accuracy if the attacker uses a different data augmentation strategy on the target network. These results show that it is very difficult to make a robust network using adversarial training, even in black-box settings where the attacker has restricted information about the target network.
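
For context on the recipe evaluated above, here is a minimal PyTorch sketch of adversarial training with FGSM-augmented batches; the paper's actual architectures, attack parameters, and multi-attack training mixtures are not reproduced, and the tiny classifier below is purely illustrative.

```python
# Minimal sketch of adversarial training: augment each batch with FGSM examples.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """Generate FGSM adversarial examples (one gradient-sign step)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

def adversarial_training_step(model, optimizer, x, y, eps=0.3):
    """One training step on a batch mixed with its FGSM counterparts."""
    model.train()
    x_adv = fgsm(model, x, y, eps)
    optimizer.zero_grad()
    loss = 0.5 * (F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage on random MNIST-shaped data with a small illustrative classifier.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.rand(32, 1, 28, 28), torch.randint(0, 10, (32,))
print(adversarial_training_step(model, opt, x, y))
```

Training with several attack methods amounts to generating the augmented batch with a mixture of attacks instead of FGSM alone, which is the multi-attack setting the experiments probe.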


Symmetry ◽  
2018 ◽  
Vol 10 (12) ◽  
pp. 738 ◽  
Author(s):  
Hyun Kwon ◽  
Yongchul Kim ◽  
Hyunsoo Yoon ◽  
Daeseon Choi

Deep neural networks (DNNs) have demonstrated remarkable performance in machine learning areas such as image recognition, speech recognition, intrusion detection, and pattern analysis. However, DNNs have been shown to be vulnerable to adversarial examples, which are created by adding a small amount of noise to an original sample in order to cause misclassification by the DNN. Such adversarial examples can lead to fatal accidents in applications such as autonomous vehicles and disease diagnostics. Thus, the generation of adversarial examples has recently attracted extensive research attention. An adversarial example is categorized as targeted or untargeted. In this paper, we focus on the untargeted scenario because it requires less generation time and distortion than the targeted one. However, untargeted adversarial examples suffer from a pattern vulnerability: because of the similarity between the original class and certain specific classes, the defending system may be able to determine the original class by analyzing the output classes of the untargeted adversarial examples. To overcome this problem, we propose a new method for generating untargeted adversarial examples that uses an arbitrary class in the generation process. Moreover, we show that our proposed scheme can be applied to steganography. Through experiments, we show that our proposed scheme achieves a 100% attack success rate with minimum distortion (1.99 and 42.32 on the MNIST and CIFAR10 datasets, respectively) and without the pattern vulnerability. Using a steganography test, we show that our proposed scheme can fool humans: the probability of their detecting the hidden classes is equal to that of random selection.
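
The abstract only names the idea of using an arbitrary class during generation, so the sketch below is one plausible, heavily simplified reading: pick a random class different from the original label and take gradient-sign steps toward it until the input is no longer classified as the original class. The step size, iteration count, and stopping rule are assumptions, not the paper's procedure.

```python
# Hedged sketch: untargeted misclassification achieved by stepping toward an
# arbitrarily chosen class (illustrative only; not the paper's exact algorithm).
import torch
import torch.nn.functional as F

def arbitrary_class_attack(model, x, y_true, num_classes, eps=0.01, steps=50):
    # Choose an arbitrary class guaranteed to differ from the original label.
    y_arb = (y_true + torch.randint(1, num_classes, y_true.shape)) % num_classes
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y_arb)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Descend the loss of the arbitrary class (targeted-style step).
        x_adv = (x_adv - eps * grad.sign()).clamp(0, 1).detach()
        if (model(x_adv).argmax(dim=1) != y_true).all():
            break  # untargeted success: inputs no longer map to the original class
    return x_adv
```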


Author(s):  
Chaowei Xiao ◽  
Bo Li ◽  
Jun-yan Zhu ◽  
Warren He ◽  
Mingyan Liu ◽  
...  

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but producing them with high perceptual quality and efficiency requires further research. In this paper, we propose AdvGAN, which generates adversarial examples with generative adversarial networks (GANs) that can learn and approximate the distribution of original instances. Once the AdvGAN generator is trained, it can generate perturbations efficiently for any instance, potentially accelerating adversarial training as a defense. We apply AdvGAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have a high attack success rate under state-of-the-art defenses compared to other attacks. Our attack placed first, with 92.76% accuracy, on a public MNIST black-box attack challenge.
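
A simplified sketch of the AdvGAN idea in PyTorch: a feed-forward generator maps an input to a bounded perturbation and is trained so that the perturbed input fools a fixed target classifier. The GAN discriminator, the distillation step used in the black-box setting, and the exact loss weights are omitted or assumed here.

```python
# Simplified AdvGAN-style generator training (discriminator and distillation omitted).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerturbationGenerator(nn.Module):
    def __init__(self, dim, eps=0.3):
        super().__init__()
        self.eps = eps
        self.net = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, dim), nn.Tanh())

    def forward(self, x):
        flat = x.flatten(1)
        delta = self.eps * self.net(flat).view_as(x)   # bounded perturbation
        return (x + delta).clamp(0, 1)

def advgan_step(gen, target_model, optimizer, x, y, c=10.0):
    """One generator update: encourage misclassification, penalize large perturbations."""
    x_adv = gen(x)
    logits = target_model(x_adv)
    # CW-style untargeted loss: push the true-class logit below the best other logit.
    true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
    other_logit = logits.scatter(1, y.unsqueeze(1), float("-inf")).max(dim=1).values
    adv_loss = F.relu(true_logit - other_logit).mean()
    pert_loss = (x_adv - x).flatten(1).norm(dim=1).mean()
    loss = c * adv_loss + pert_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Once trained, the generator produces a perturbation for any new input in a single forward pass, which is what makes the approach attractive for accelerating adversarial training.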


2021 ◽  
Vol 13 (11) ◽  
pp. 288
Author(s):  
Li Fan ◽  
Wei Li ◽  
Xiaohui Cui

Many deepfake-image forensic detectors have been proposed and improved due to the development of synthetic techniques. However, recent studies show that most of these detectors are not immune to adversarial example attacks. Therefore, understanding the impact of adversarial examples on their performance is an important step towards improving deepfake-image detectors. This study presents an anti-forensics case study of two popular general-purpose deepfake detectors, examining their accuracy and generalization. Herein, we propose Poisson noise DeepFool (PNDF), an improved iterative adversarial example generation method. This method can simply and effectively attack forensics detectors by adding perturbations to images in different directions. Our attacks reduce the detector's AUC from 0.9999 to 0.0331, and its detection accuracy on deepfake images from 0.9997 to 0.0731. Compared with state-of-the-art studies, our work points to an important defense direction for future research on deepfake-image detectors: focusing on the generalization performance of detectors and their resistance to adversarial example attacks.
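
PNDF itself is not detailed in this abstract, so the following is only a hedged illustration of the ingredients it names: an iterative, DeepFool-style attack whose per-step direction is jittered with Poisson-distributed noise so perturbations are pushed in varying directions. The noise scaling, step size, and stopping rule are arbitrary choices for illustration.

```python
# Hedged illustration: iterative attack with Poisson jitter on the step direction.
import torch
import torch.nn.functional as F

def noisy_iterative_attack(model, x, y, eps=0.005, steps=30, lam=1.0):
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Zero-centred Poisson jitter added to the ascent direction (illustrative choice).
        noise = torch.poisson(torch.full_like(grad, lam)) - lam
        direction = (grad + 0.1 * noise * grad.abs()).sign()
        x_adv = (x_adv + eps * direction).clamp(0, 1).detach()
        if (model(x_adv).argmax(dim=1) != y).all():
            break  # all inputs misclassified by the detector/classifier
    return x_adv
```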


2021 ◽  
Author(s):  
Yidong Chai ◽  
Ruicheng Liang ◽  
Hongyi Zhu ◽  
Sagar Samtani ◽  
Meng Wang ◽  
...  

Deep learning models have significantly advanced various natural language processing tasks. However, they are strikingly vulnerable to adversarial text attacks, even in the black-box setting where no model knowledge is accessible to hackers. Such attacks are conducted with a two-phase framework: 1) a sensitivity estimation phase to evaluate each element’s sensitivity to the target model’s prediction, and 2) a perturbation execution phase to craft the adversarial examples based on the estimated element sensitivity. This study explores the connections between local post-hoc explanation methods for deep learning and black-box adversarial text attacks, and proposes a novel eXplanation-based method for crafting Adversarial Text Attacks (XATA). XATA leverages local post-hoc explanation methods (e.g., LIME or SHAP) to measure input elements’ sensitivity and adopts a word-replacement perturbation strategy to craft adversarial examples. We evaluated the attack performance of the proposed XATA on three commonly used text datasets: IMDB Movie Review, Yelp Reviews-Polarity, and Amazon Reviews-Polarity. The proposed XATA outperformed existing baselines against various target models, including LSTM, GRU, CNN, and BERT. Moreover, we found that better local post-hoc explanation methods (e.g., SHAP) lead to more effective adversarial attacks. These findings show that as researchers advance the explainability of deep learning models with local post-hoc methods, they also provide hackers with weapons to craft more targeted and dangerous adversarial attacks.
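
A hedged sketch of the two-phase framework described above: score each word's sensitivity, then replace the most sensitive words first. The paper uses LIME or SHAP for the sensitivity estimation phase; this sketch substitutes a simple leave-one-word-out proxy, and `synonyms()` is a hypothetical candidate provider (e.g., a WordNet or embedding-based lookup).

```python
# Hedged sketch of explanation-guided word replacement (leave-one-out stands in
# for LIME/SHAP; predict_proba is assumed to return a class-probability vector).

def word_sensitivity(predict_proba, words, label):
    """Drop each word in turn and measure how much the target-class score falls."""
    base = predict_proba(" ".join(words))[label]
    return [base - predict_proba(" ".join(words[:i] + words[i + 1:]))[label]
            for i in range(len(words))]

def explanation_guided_attack(predict_proba, text, label, synonyms, max_edits=3):
    words = text.split()
    scores = word_sensitivity(predict_proba, words, label)
    # Perturb the most sensitive words first (phase 2: perturbation execution).
    for i in sorted(range(len(words)), key=lambda j: scores[j], reverse=True)[:max_edits]:
        best, best_score = words[i], predict_proba(" ".join(words))[label]
        for cand in synonyms(words[i]):          # hypothetical synonym provider
            trial = words[:i] + [cand] + words[i + 1:]
            score = predict_proba(" ".join(trial))[label]
            if score < best_score:
                best, best_score = cand, score
        words[i] = best
    return " ".join(words)
```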


2021 ◽  
Vol 72 ◽  
pp. 1-37
Author(s):  
Mike Wu ◽  
Sonali Parbhoo ◽  
Michael C. Hughes ◽  
Volker Roth ◽  
Finale Doshi-Velez

Deep models have advanced prediction in many domains, but their lack of interpretability remains a key barrier to adoption in many real-world applications. There exists a large body of work aiming to help humans understand these black-box functions to varying levels of granularity, for example through distillation, gradients, or adversarial examples. These methods, however, all tackle interpretability as a separate process after training. In this work, we take a different approach and explicitly regularize deep models so that they are well approximated by processes that humans can step through in little time. Specifically, we train several families of deep neural networks to resemble compact, axis-aligned decision trees without significant compromises in accuracy. The resulting axis-aligned decision functions make tree-regularized models uniquely easy for humans to interpret. Moreover, for situations in which a single, global tree is a poor estimator, we introduce a regional tree regularizer that encourages the deep model to resemble a compact, axis-aligned decision tree in predefined, human-interpretable contexts. Using intuitive toy examples, benchmark image datasets, and medical tasks for patients in critical care and with HIV, we demonstrate that this new family of tree regularizers yields models that are easier for humans to simulate than L1 or L2 penalties, without sacrificing predictive power.
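
A hedged sketch of the quantity tree regularization is built around: fit an axis-aligned decision tree to the deep model's own predictions and measure its average decision-path length, a proxy for how quickly a human can simulate the model. The paper makes this penalty trainable through a learned differentiable surrogate, which is omitted here; only the complexity metric itself is shown, and the stand-in "deep model" below is a toy rule.

```python
# Hedged sketch: average decision-path length of a tree that mimics the model.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def average_path_length(X, model_predict, max_depth=None):
    """Average number of decision nodes traversed by the tree mimicking the model."""
    y_hat = model_predict(X)                      # labels predicted by the deep model
    tree = DecisionTreeClassifier(max_depth=max_depth).fit(X, y_hat)
    paths = tree.decision_path(X)                 # sparse indicator of visited nodes
    return paths.sum(axis=1).mean()               # smaller => easier to simulate by hand

# Toy usage with a stand-in "deep model": a noisy threshold rule on 2-D inputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
print(average_path_length(X, lambda X: (X[:, 0] + 0.1 * X[:, 1] > 0).astype(int)))
```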


2020 ◽  
Vol 39 (5) ◽  
pp. 7085-7095
Author(s):  
Shuqi Liu ◽  
Mingwen Shao ◽  
Xinping Liu

In recent years, deep neural networks have made significant progress in image classification, object detection, and face recognition. However, they still suffer from misclassification when facing adversarial examples. To address this security issue and improve the robustness of neural networks, we propose a novel defense network based on a generative adversarial network (GAN). The distributions of clean and adversarial examples are matched to solve this problem: the network is guided to remove the invisible noise accurately and restore the adversarial example to a clean example, achieving the effect of defense. In addition, to maintain the classification accuracy of clean examples and improve the fidelity of the neural network, we also feed clean examples into the proposed network for denoising. Our method effectively removes the noise of adversarial examples, so that the denoised adversarial examples can be correctly classified. In this paper, extensive experiments are conducted on five benchmark datasets, namely MNIST, Fashion-MNIST, CIFAR10, CIFAR100, and ImageNet. Moreover, six mainstream attack methods are adopted to test the robustness of our defense, including FGSM, PGD, MIM, JSMA, CW, and DeepFool. Results show that our method has strong defensive capabilities against the tested attack methods, which confirms the effectiveness of the proposed method.
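
A simplified sketch of the denoising step described above: a small residual network is trained to map both adversarial and clean inputs back to the clean image, so clean examples pass through largely unchanged. The paper's GAN-based distribution matching (discriminator and adversarial loss) is omitted for brevity, and the architecture and loss weighting here are assumptions.

```python
# Simplified denoising defense: reconstruct clean images from adversarial inputs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Denoiser(nn.Module):
    def __init__(self, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1))

    def forward(self, x):
        # Predict the noise residual and subtract it from the input.
        return (x - self.net(x)).clamp(0, 1)

def denoiser_step(denoiser, optimizer, x_clean, x_adv):
    """Train the denoiser to reconstruct clean images from both adversarial and clean inputs."""
    optimizer.zero_grad()
    loss = F.mse_loss(denoiser(x_adv), x_clean) + F.mse_loss(denoiser(x_clean), x_clean)
    loss.backward()
    optimizer.step()
    return loss.item()

# At inference time, the classifier is applied to denoiser(x) instead of x.
```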

