MedicalGuard: U-Net Model Robust against Adversarially Perturbed Images

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Hyun Kwon

Deep neural networks perform well for image recognition, speech recognition, and pattern analysis. This type of neural network has also been used in the medical field, where it has displayed good performance in predicting or classifying patient diagnoses. An example is the U-Net model, which has demonstrated good performance in data segmentation, an important technology in the field of medical imaging. However, deep neural networks are vulnerable to adversarial examples. Adversarial examples are samples created by adding a small amount of noise to an original data sample in such a way that they appear normal to human perception but are incorrectly classified by the classification model. Adversarial examples pose a significant threat in the medical field, as they can cause models to misidentify or misclassify patient diagnoses. In this paper, I propose an advanced adversarial training method to defend against such adversarial examples. An advantage of the proposed method is that it creates a wide variety of adversarial examples for use in training, generated by the fast gradient sign method (FGSM) over a range of epsilon values. A U-Net model trained on these diverse adversarial examples will be more robust to unknown adversarial examples. Experiments were conducted using the ISBI 2012 dataset, with TensorFlow as the machine learning library. According to the experimental results, the proposed method builds a model that demonstrates segmentation robustness against adversarial examples, reducing the pixel error between the original labels and the adversarial examples to an average of 1.45.
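The abstract does not give the exact training configuration, but generating FGSM adversarial batches over a range of epsilon values in TensorFlow might look like the following minimal sketch; the binary cross-entropy loss, the helper names, and the epsilon values are illustrative assumptions rather than the paper's settings.

```python
import tensorflow as tf

def fgsm_batches(model, images, labels, epsilons,
                 loss_fn=tf.keras.losses.BinaryCrossentropy()):
    """Generate one FGSM adversarial batch per epsilon value."""
    images = tf.convert_to_tensor(images)
    adversarial_batches = []
    for eps in epsilons:
        with tf.GradientTape() as tape:
            tape.watch(images)
            loss = loss_fn(labels, model(images, training=False))
        # The sign of the input gradient gives the FGSM perturbation direction.
        gradient = tape.gradient(loss, images)
        adversarial_batches.append(
            tf.clip_by_value(images + eps * tf.sign(gradient), 0.0, 1.0))
    return adversarial_batches

# Illustrative usage (model builder and epsilon values are assumptions):
# unet = build_unet()
# adv_sets = fgsm_batches(unet, x_batch, y_batch, epsilons=[0.01, 0.02, 0.05, 0.1])
```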

2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Guangling Sun ◽  
Yuying Su ◽  
Chuan Qin ◽  
Wenbo Xu ◽  
Xiaofeng Lu ◽  
...  

Although Deep Neural Networks (DNNs) have achieved great success in various applications, investigations have increasingly shown DNNs to be highly vulnerable when adversarial examples are used as input. Here, we present a comprehensive defense framework to protect DNNs against adversarial examples. First, we present statistical and minor-alteration detectors to filter out adversarial examples contaminated by noticeable and unnoticeable perturbations, respectively. Then, we ensemble the detectors, a deep Residual Generative Network (ResGN), and an adversarially trained targeted network to construct a complete defense framework. In this framework, the ResGN is our previously proposed network for removing adversarial perturbations, and the adversarially trained targeted network is a network learned through adversarial training. Specifically, once the detectors determine an input example to be adversarial, it is cleaned by the ResGN and then classified by the adversarially trained targeted network; otherwise, it is classified directly by this network. We empirically evaluate the proposed complete defense on the ImageNet dataset. The results confirm its robustness against current representative attack methods, including the fast gradient sign method, the randomized fast gradient sign method, the basic iterative method, universal adversarial perturbations, the DeepFool method, and the Carlini & Wagner method.
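The routing logic of the framework (detect, purify with the ResGN when flagged, then classify with the adversarially trained network) can be sketched as follows; the detector, ResGN, and classifier interfaces here are assumed for illustration and are not the authors' actual APIs.

```python
def defended_predict(x, detectors, resgn, robust_classifier):
    """Sketch of the routing logic: detect, purify if flagged, then classify.

    detectors:         callables returning True when x looks adversarial
                       (e.g., a statistical detector and a minor-alteration detector).
    resgn:             callable that removes adversarial perturbations from x.
    robust_classifier: the adversarially trained targeted network.
    """
    if any(detector(x) for detector in detectors):
        x = resgn(x)  # clean the suspected adversarial example first
    return robust_classifier(x)
```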


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 428
Author(s):  
Hyun Kwon ◽  
Jun Lee

This paper presents research on visualization and pattern recognition in computer science. Although deep neural networks demonstrate satisfactory performance in image and voice recognition, pattern analysis, and intrusion detection, they perform poorly on adversarial examples. Adversarial examples are created by adding a small amount of noise to the original data so that deep neural networks misclassify them even though humans still perceive them as normal. In this paper, a robust diversity adversarial training method against adversarial attacks is demonstrated. In this approach, the target model becomes more robust to unknown adversarial examples because it is trained on a wide variety of adversarial samples. In the experiments, TensorFlow was employed as the deep learning framework, and MNIST and Fashion-MNIST were used as the datasets. The results show that the diversity training method lowered the attack success rate by an average of 27.2% and 24.3% for various adversarial examples, while maintaining accuracy rates of 98.7% and 91.5% on the original MNIST and Fashion-MNIST data, respectively.
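A minimal sketch of such a diversity training step in TensorFlow, assuming a set of attack callables (for example, FGSM at several epsilon values) is supplied, is given below; the loss, optimizer, and loss-averaging scheme are assumptions, not the paper's exact procedure.

```python
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

def diversity_train_step(model, x, y, attack_fns):
    """One training step on the clean batch plus one batch per attack.

    attack_fns: callables (model, x, y) -> adversarial batch, e.g. FGSM at
                several epsilon values, so the model sees diverse perturbations.
    """
    batches = [x] + [attack(model, x, y) for attack in attack_fns]
    with tf.GradientTape() as tape:
        losses = [loss_fn(y, model(b, training=True)) for b in batches]
        loss = tf.add_n(losses) / len(batches)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```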


2021 ◽  
Vol 15 ◽  
Author(s):  
Pengfei Xie ◽  
Shuhao Shi ◽  
Shuai Yang ◽  
Kai Qiao ◽  
Ningning Liang ◽  
...  

Deep neural networks (DNNs) have proven vulnerable to adversarial examples. Black-box transfer attacks, which require no access to the target model, pose a massive threat to AI applications. At present, the most effective black-box attack methods mainly adopt data enhancement techniques such as input transformation. Previous data enhancement frameworks only work with input transformations that preserve accuracy or loss invariance; they fail for transformations that do not meet these conditions, such as transformations that lose information. To solve this problem, we propose a new noise data enhancement framework (NDEF), which transforms only the adversarial perturbation and thereby avoids these issues. In addition, we introduce random erasing under this framework to prevent over-fitting of adversarial examples. Experimental results show that the black-box attack success rate of our method, the Random Erasing Iterative Fast Gradient Sign Method (REI-FGSM), is on average 4.2% higher than that of DI-FGSM across six models and 6.6% higher across three defense models. REI-FGSM can also be combined with other methods to achieve excellent performance: the attack performance of SI-FGSM improves by 22.9% on average when combined with REI-FGSM. Moreover, our combined version with DI-TI-MI-FGSM, i.e., DI-TI-MI-REI-FGSM, achieves an average attack success rate of 97.0% against three ensemble adversarial training models, exceeding current iterative gradient-based attack methods. We also introduce Gaussian blur to demonstrate the compatibility of our framework.
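A hedged reconstruction of the core idea, applying random erasing to the accumulated perturbation (not the input image) inside an iterative FGSM loop, is sketched below in TensorFlow; the erasing scheme, step sizes, and loss are illustrative assumptions rather than the authors' implementation.

```python
import random
import tensorflow as tf

def random_erase(perturbation, erase_frac=0.2):
    """Zero out a random rectangular region of the perturbation (not the image)."""
    _, h, w, _ = perturbation.shape
    eh, ew = int(h * erase_frac), int(w * erase_frac)
    top, left = random.randint(0, h - eh), random.randint(0, w - ew)
    mask = tf.pad(tf.ones((eh, ew)), [[top, h - eh - top], [left, w - ew - left]])
    return perturbation * (1.0 - mask)[None, :, :, None]

def rei_fgsm(model, x, y, eps=16 / 255, steps=10,
             loss_fn=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)):
    """Iterative FGSM in which random erasing is applied to the perturbation only."""
    x = tf.convert_to_tensor(x)
    alpha = eps / steps
    x_adv = tf.identity(x)
    for _ in range(steps):
        delta = random_erase(x_adv - x)  # transform the perturbation, not the input
        with tf.GradientTape() as tape:
            tape.watch(delta)
            loss = loss_fn(y, model(x + delta, training=False))
        grad = tape.gradient(loss, delta)
        # Ascend the loss, keep the perturbation within the L-infinity budget.
        x_adv = x + tf.clip_by_value((x_adv - x) + alpha * tf.sign(grad), -eps, eps)
        x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)
    return x_adv
```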


2020 ◽  
Vol 117 (47) ◽  
pp. 29330-29337 ◽  
Author(s):  
Tal Golan ◽  
Prashant C. Raju ◽  
Nikolaus Kriegeskorte

Distinct scientific theories can make similar predictions. To adjudicate between theories, we must design experiments for which the theories make distinct predictions. Here we consider the problem of comparing deep neural networks as models of human visual recognition. To efficiently compare models’ ability to predict human responses, we synthesize controversial stimuli: images for which different models produce distinct responses. We applied this approach to two visual recognition tasks, handwritten digits (MNIST) and objects in small natural images (CIFAR-10). For each task, we synthesized controversial stimuli to maximize the disagreement among models which employed different architectures and recognition algorithms. Human subjects viewed hundreds of these stimuli, as well as natural examples, and judged the probability of presence of each digit/object category in each image. We quantified how accurately each model predicted the human judgments. The best-performing models were a generative analysis-by-synthesis model (based on variational autoencoders) for MNIST and a hybrid discriminative–generative joint energy model for CIFAR-10. These deep neural networks (DNNs), which model the distribution of images, performed better than purely discriminative DNNs, which learn only to map images to labels. None of the candidate models fully explained the human responses. Controversial stimuli generalize the concept of adversarial examples, obviating the need to assume a ground-truth model. Unlike natural images, controversial stimuli are not constrained to the stimulus distribution models are trained on, thus providing severe out-of-distribution tests that reveal the models’ inductive biases. Controversial stimuli therefore provide powerful probes of discrepancies between models and human perception.
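The core synthesis idea, gradient-based optimization of an image so that two models assign high probability to different classes, can be sketched as follows; the paper's actual procedure is more elaborate, and the objective, optimizer, and image shape here are illustrative assumptions.

```python
import tensorflow as tf

def synthesize_controversial(model_a, model_b, class_a, class_b,
                             shape=(1, 28, 28, 1), steps=200, lr=0.05):
    """Optimize an image so model_a favors class_a while model_b favors class_b."""
    x = tf.Variable(tf.random.uniform(shape))
    opt = tf.keras.optimizers.Adam(lr)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            p_a = model_a(x, training=False)  # both models assumed to output probabilities
            p_b = model_b(x, training=False)
            # Disagreement objective: model_a confident in class_a, model_b in class_b.
            loss = -(tf.math.log(p_a[0, class_a] + 1e-9) +
                     tf.math.log(p_b[0, class_b] + 1e-9))
        opt.apply_gradients(zip(tape.gradient(loss, [x]), [x]))
        x.assign(tf.clip_by_value(x, 0.0, 1.0))  # keep pixels in a valid range
    return x.numpy()
```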


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Raheel Siddiqi

An accurate and robust fruit image classifier can have a variety of real-life and industrial applications including automated pricing, intelligent sorting, and information extraction. This paper demonstrates how adversarial training can enhance the robustness of fruit image classifiers. In the past, research in deep-learning-based fruit image classification has focused solely on attaining the highest possible accuracy of the model used in the classification process. However, even the highest accuracy models are still susceptible to adversarial attacks which pose serious problems for such systems in practice. As a robust fruit classifier can only be developed with the aid of a fruit image dataset consisting of fruit images photographed in realistic settings (rather than images taken in controlled laboratory settings), a new dataset of over three thousand fruit images belonging to seven fruit classes is presented. Each image is carefully selected so that its classification poses a significant challenge for the proposed classifiers. Three Convolutional Neural Network (CNN)-based classifiers are suggested: 1) IndusNet, 2) fine-tuned VGG16, and 3) fine-tuned MobileNet. Fine-tuned VGG16 produced the best test set accuracy of 94.82% compared to the 92.32% and the 94.28% produced by the other two models, respectively. Fine-tuned MobileNet has proved to be the most efficient model with a test time of 9 ms/step compared to the test times of 28 ms/step and 29 ms/step for the other two models. The empirical evidence presented demonstrates that adversarial training enables fruit image classifiers to resist attacks crafted through the Fast Gradient Sign Method (FGSM), while simultaneously improving classifiers’ robustness against other noise forms including ‘Gaussian’, ‘Salt and pepper’ and ‘Speckle’. For example, when the amplitude of the perturbations generated through the Fast Gradient Sign Method (FGSM) was kept at 0.1, adversarial training improved the fine-tuned VGG16’s performance on adversarial images by around 18% (i.e., from 76.6% to 94.82%), while simultaneously improving the classifier’s performance on fruit images corrupted with ‘salt and pepper’ noise by around 8% (i.e., from 69.82% to 77.85%). Other reported results also follow this pattern and demonstrate the effectiveness of adversarial training as a means of enhancing the robustness of fruit image classifiers.
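As one hedged illustration of evaluating robustness against the noise forms mentioned, the sketch below corrupts a batch of images with salt-and-pepper noise and measures a Keras classifier's accuracy on the corrupted batch; the noise amount and helper names are assumptions, not the paper's evaluation code.

```python
import numpy as np

def salt_and_pepper(images, amount=0.05):
    """Corrupt a NumPy batch of [0, 1] images with salt-and-pepper noise."""
    noisy = images.copy()
    mask = np.random.rand(*images.shape)
    noisy[mask < amount / 2] = 0.0        # pepper
    noisy[mask > 1 - amount / 2] = 1.0    # salt
    return noisy

def accuracy_under_noise(model, images, labels, amount=0.05):
    """Top-1 accuracy of a Keras classifier on noise-corrupted images."""
    preds = model.predict(salt_and_pepper(images, amount), verbose=0).argmax(axis=1)
    return float((preds == labels).mean())
```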


2021 ◽  
Vol 7 (2) ◽  
pp. 303-306
Author(s):  
Ning Ding ◽  
Knut Möller

Deep neural networks have shown effectiveness in many applications; however, in regulated applications like automotive or medicine, quality guarantees are required. Thus, it is important to understand the robustness of the solutions to perturbations in the input space. In order to identify the vulnerability of a trained classification model and evaluate the effect of different input perturbations on the output class, two different methods to generate adversarial examples were implemented. The adversarial images created were developed into a robustness index to monitor the training state and safety of a convolutional neural network model. In future work, some of the generated adversarial images will be included in the training phase to improve the model's robustness.
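The abstract does not specify how the robustness index is computed; as one hedged illustration, an index could be defined as the average smallest perturbation strength at which a given attack first changes the model's prediction, as in the sketch below (the attack interface and epsilon grid are assumptions).

```python
import numpy as np

def robustness_index(model, images, labels, attack_fn,
                     epsilons=np.linspace(0.01, 0.3, 30)):
    """Average of the smallest epsilon at which attack_fn flips each prediction.

    attack_fn(model, image_batch, label, eps) -> perturbed batch; images and
    labels are NumPy arrays. Images that survive every epsilon contribute the
    largest epsilon, so higher values mean a more robust model.
    """
    minimal_eps = []
    for image, label in zip(images, labels):
        flipped_at = epsilons[-1]
        for eps in epsilons:
            adv = attack_fn(model, image[None], label, eps)
            if model.predict(adv, verbose=0).argmax() != label:
                flipped_at = eps
                break
        minimal_eps.append(flipped_at)
    return float(np.mean(minimal_eps))
```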


Algorithms ◽  
2020 ◽  
Vol 13 (11) ◽  
pp. 268 ◽  
Author(s):  
Hokuto Hirano ◽  
Kazuhiro Takemoto

Deep neural networks (DNNs) are vulnerable to adversarial attacks. In particular, a single perturbation known as the universal adversarial perturbation (UAP) can foil most classification tasks conducted by DNNs. Thus, different methods for generating UAPs are required to fully evaluate the vulnerability of DNNs. A realistic evaluation should consider targeted attacks, wherein the generated UAP causes the DNN to classify an input into a specific class. However, the development of UAPs for targeted attacks has largely fallen behind that of UAPs for non-targeted attacks. Therefore, we propose a simple iterative method to generate UAPs for targeted attacks. Our method combines the simple iterative method for generating non-targeted UAPs and the fast gradient sign method for generating a targeted adversarial perturbation for an input. We applied the proposed method to state-of-the-art DNN models for image classification and proved the existence of almost imperceptible UAPs for targeted attacks; further, we demonstrated that such UAPs can be easily generated.
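A minimal sketch of this combination, accumulating a single perturbation with targeted FGSM steps over a dataset and projecting it onto an L-infinity ball, is given below in TensorFlow; the radius, step size, and data pipeline are illustrative assumptions, and the model is assumed to output logits.

```python
import tensorflow as tf

def targeted_uap(model, dataset, target_class, eps=10 / 255, step=1 / 255, epochs=5,
                 loss_fn=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)):
    """Accumulate one universal perturbation that pushes all inputs toward target_class."""
    # One shared perturbation with the shape of a single image.
    uap = tf.zeros_like(next(iter(dataset))[0][:1])
    target = tf.constant([target_class])
    for _ in range(epochs):
        for x, _ in dataset:
            with tf.GradientTape() as tape:
                tape.watch(uap)
                labels = tf.repeat(target, tf.shape(x)[0])
                loss = loss_fn(labels, model(x + uap, training=False))
            grad = tape.gradient(loss, uap)
            # Targeted FGSM step: descend the target-class loss, then project the
            # accumulated perturbation back onto the L-infinity ball of radius eps.
            uap = tf.clip_by_value(uap - step * tf.sign(grad), -eps, eps)
    return uap
```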

