Absum: Simple Regularization Method for Reducing Structural Sensitivity of Convolutional Neural Networks

2020 ◽  
Vol 34 (04) ◽  
pp. 4394-4403
Author(s):  
Sekitoshi Kanai ◽  
Yasutoshi Ida ◽  
Yasuhiro Fujiwara ◽  
Masanori Yamada ◽  
Shuichi Adachi

We propose Absum, a regularization method for improving the adversarial robustness of convolutional neural networks (CNNs). Although CNNs can accurately recognize images, recent studies have shown that the convolution operations in CNNs commonly have a structural sensitivity to specific noise composed of Fourier basis functions. Exploiting this sensitivity, these studies proposed a simple black-box adversarial attack: the single Fourier attack. To reduce structural sensitivity, we can regularize the convolution filter weights, since the sensitivity of a linear transform can be assessed by the norm of its weights. However, standard regularization methods can prevent minimization of the loss function because they impose a tight constraint to obtain high robustness. To solve this problem, Absum imposes a loose constraint: it penalizes the absolute value of the sum of the parameters in the convolution layers. Absum improves robustness against the single Fourier attack while being as simple and efficient as standard regularization methods (e.g., weight decay and L1 regularization). Our experiments demonstrate that Absum improves robustness against the single Fourier attack more than standard regularization methods do. Furthermore, we reveal that CNNs trained with Absum are more robust than those trained with standard regularization against transferred attacks, because the common sensitivity is reduced, and against high-frequency noise. We also reveal that Absum can improve robustness against gradient-based attacks (projected gradient descent) when used with adversarial training.
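
The difference between this loose constraint and standard L1 regularization can be illustrated in a few lines of NumPy. Note this is only a sketch: how the sum is grouped (per filter or per layer) and how the penalty is weighted against the loss follow the paper, and the layout below is an assumption for illustration.

```python
import numpy as np

def absum_penalty(w):
    """Absum-style penalty: the absolute value of the *sum* of the
    parameters, rather than the sum of their absolute values (L1)."""
    return abs(w.sum())

def l1_penalty(w):
    """Standard L1 penalty, for comparison."""
    return np.abs(w).sum()

# A filter whose weights cancel out incurs no Absum penalty,
# while L1 still penalizes it heavily -- this is the sense in
# which Absum's constraint is "loose".
w = np.array([[1.0, -1.0],
              [-2.0, 2.0]])
print(absum_penalty(w))  # 0.0
print(l1_penalty(w))     # 6.0
```

A filter with cancelling positive and negative weights passes low-frequency content through while staying cheap under Absum, which is why the penalty can suppress the Fourier-basis sensitivity without blocking loss minimization the way a tight norm constraint would.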

The objective of this research is to provide specialists in skin cancer with an early, rapid, and non-invasive diagnosis of melanoma from an image of the lesion, so that treatment of the patient can begin sooner. The method contrasts the convolutional neural network architecture proposed by Laura Kocobinski of Boston University against our own architecture, which reduces the depth of the convolution filters in the last two convolutional layers to obtain more significant feature maps. Model performance was measured by validation accuracy, taking the best result obtained, which was confirmed on an additional data set. With this base architecture, accuracy improved from 0.79 to 0.83 over 30 epochs; compared with Kocobinski's AlexNet architecture, we could not surpass its accuracy of 0.90. However, the lower complexity of our network played an important role in the results, achieving a balance and better results without increasing the number of epochs. Our research is very helpful for doctors, since it allows them to quickly identify whether a lesion is melanoma and consequently treat it efficiently.


Author(s):  
Zhengsu Chen ◽  
Jianwei Niu ◽  
Xuefeng Liu ◽  
Shaojie Tang

Convolutional neural networks (CNNs) have achieved remarkable success in image recognition. Although CNNs effectively learn the internal patterns of the input images, these patterns constitute only a small proportion of the useful patterns contained in the inputs. This can be attributed to the fact that CNNs stop learning once the learned patterns are enough to make a correct classification. Network regularization methods such as dropout and SpatialDropout can ease this problem: during training, they randomly drop features. These dropout methods, in essence, change the patterns learned by the network and, in turn, force it to learn other patterns to make the correct classification. However, the above methods have an important drawback: randomly dropping features is generally inefficient and can introduce unnecessary noise. To tackle this problem, we propose SelectScale. Instead of randomly dropping units, SelectScale selects the important features in the network and adjusts them during training. Using SelectScale, we improve the performance of CNNs on CIFAR and ImageNet.
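
The contrast between random channel dropping and importance-based adjustment can be sketched as follows. The abstract does not state SelectScale's actual selection criterion or adjustment rule; the mean-absolute-activation importance measure and the fixed down-scaling used below are purely illustrative assumptions.

```python
import numpy as np

def random_drop(features, p, rng):
    """SpatialDropout-style baseline: drop whole channels at random."""
    mask = rng.random(features.shape[0]) >= p
    return features * mask[:, None, None]

def select_scale(features, k, scale=0.5):
    """Hypothetical SelectScale-like step: rank channels by mean
    absolute activation and down-scale the k most important ones,
    nudging the network to rely on other patterns as well."""
    importance = np.abs(features).mean(axis=(1, 2))
    top = np.argsort(importance)[-k:]   # indices of the k strongest channels
    out = features.copy()
    out[top] *= scale
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))  # (channels, H, W) feature maps
y = select_scale(x, k=2)
print(y.shape)  # (8, 4, 4)
```

Unlike `random_drop`, the selection step is deterministic given the activations, which is the property the abstract highlights: no unnecessary noise is injected, and the adjustment always targets features the network actually uses.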




Author(s):  
M. Straat ◽  
F. Abadi ◽  
Z. Kan ◽  
C. Göpfert ◽  
B. Hammer ◽  
...  

We present a modelling framework for the investigation of supervised learning in non-stationary environments. Specifically, we model two example types of learning systems: prototype-based learning vector quantization (LVQ) for classification and shallow, layered neural networks for regression tasks. We investigate so-called student–teacher scenarios in which the systems are trained from a stream of high-dimensional, labeled data. Properties of the target task are considered non-stationary due to drift processes that occur while training is performed. We study different types of concept drift, which affect the density of the example inputs only, the target rule itself, or both. By applying methods from statistical physics, we develop a modelling framework for the mathematical analysis of the training dynamics in non-stationary environments. Our results show that standard LVQ algorithms are, to a certain extent, already suitable for training in non-stationary environments. However, applying weight decay as an explicit mechanism of forgetting does not improve performance under the considered drift processes. Furthermore, we investigate gradient-based training of layered neural networks with sigmoidal activation functions and compare it with the use of rectified linear units. Our findings show that the sensitivity to concept drift and the effectiveness of weight decay differ significantly between the two types of activation function.
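
The prototype-based systems studied here build on the classical LVQ1 update, which moves the winning prototype toward an example of its own class and away from examples of other classes; under concept drift, this online update is what lets the prototypes track a moving target. A minimal sketch (the learning-rate schedule, dimensionality, and drift model analyzed in the paper are not reproduced here; the values below are illustrative):

```python
import numpy as np

def lvq1_step(prototypes, proto_labels, x, y, eta=0.1):
    """One LVQ1 update on a single labeled example (x, y):
    find the nearest prototype, then move it toward x if the
    labels match and away from x otherwise."""
    dists = np.linalg.norm(prototypes - x, axis=1)
    w = int(np.argmin(dists))                      # winning prototype
    sign = 1.0 if proto_labels[w] == y else -1.0   # attract or repel
    prototypes[w] += sign * eta * (x - prototypes[w])
    return w

protos = np.array([[0.0, 0.0],
                   [5.0, 5.0]])
labels = [0, 1]
winner = lvq1_step(protos, labels, np.array([1.0, 1.0]), 0)
print(winner)  # 0; prototype 0 moves to (0.1, 0.1)
```

Because each step only nudges one prototype by a fraction `eta` of its error, the system retains an implicit, graded memory of past data, which is why an additional explicit forgetting mechanism such as weight decay turns out to add little under the drift processes considered.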


Symmetry ◽  
2022 ◽  
Vol 14 (1) ◽  
pp. 154
Author(s):  
Yuan Bao ◽  
Zhaobin Liu ◽  
Zhongxuan Luo ◽  
Sibo Yang

In this paper, a novel smooth group L1/2 (SGL1/2) regularization method is proposed for pruning hidden nodes of the fully connected layer in convolutional neural networks. Usually, the selection of nodes and weights is based on experience, and the convolution filter is symmetric in the convolutional neural network. The main contribution of SGL1/2 is to drive the weights toward 0 at the group level. We can therefore prune a hidden node if all of its corresponding weights are close to 0. Furthermore, a feasibility analysis of this new method is carried out under some reasonable assumptions, made possible by the smoothness of the penalty function. The numerical results demonstrate the superiority of the SGL1/2 method with respect to sparsity, without damaging classification performance.
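
The group L1/2 idea can be sketched as summing the square roots of the group norms, where each group collects the weights attached to one hidden node, so the whole group is driven toward zero together. The `eps` smoothing below is an illustrative stand-in for the paper's actual smoothing function, which removes the non-differentiability at zero needed for the feasibility analysis; the pruning threshold is likewise an assumption.

```python
import numpy as np

def group_l12_penalty(weight_groups, eps=1e-4):
    """Group L1/2-style penalty: sum of sqrt(||w_g||_2) over groups.
    eps is a simple smoothing term keeping the penalty differentiable
    at 0 (illustrative; the paper's smoothing function may differ)."""
    return sum(np.sqrt(np.linalg.norm(g) + eps) for g in weight_groups)

def prunable_nodes(weight_groups, tol=1e-2):
    """A hidden node is prunable when its whole weight group is
    close to 0 (tol is an illustrative threshold)."""
    return [i for i, g in enumerate(weight_groups)
            if np.linalg.norm(g) < tol]

# Two hidden nodes: the first group has collapsed toward 0,
# the second is still in active use.
groups = [np.array([0.001, -0.002]), np.array([0.8, 0.3])]
print(prunable_nodes(groups))  # [0]
```

Taking the square root of the group norm, rather than the norm itself (group lasso), penalizes small-but-nonzero groups relatively more strongly, which encourages exact group-level sparsity and hence cleaner node pruning.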


2020 ◽  
Vol 34 (10) ◽  
pp. 13943-13944
Author(s):  
Kira Vinogradova ◽  
Alexandr Dibrov ◽  
Gene Myers

Convolutional neural networks have become state-of-the-art in a wide range of image recognition tasks. The interpretation of their predictions, however, is an active area of research. Whereas various interpretation methods have been proposed for image classification, the interpretation of image segmentation remains largely unexplored. To that end, we propose Seg-Grad-CAM, a gradient-based method for interpreting semantic segmentation. Our method is an extension of the widely used Grad-CAM method, applied locally to produce heatmaps showing the relevance of individual pixels for semantic segmentation.
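
The Grad-CAM core that this method extends fits in a few lines: channel weights are the global average of the gradients, and the heatmap is the ReLU of the weighted sum of the activation maps. For segmentation, the gradients are taken with respect to the score summed over a chosen pixel set (e.g. one predicted region) rather than a single class logit. The synthetic arrays below stand in for real feature maps and backpropagated gradients.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM core: weight each channel by the global average of
    its gradient, sum the weighted activation maps, and apply ReLU
    to keep only positively contributing regions."""
    alpha = gradients.mean(axis=(1, 2))             # one weight per channel
    cam = np.tensordot(alpha, activations, axes=1)  # weighted channel sum
    return np.maximum(cam, 0)                       # ReLU

rng = np.random.default_rng(1)
A = rng.standard_normal((16, 8, 8))  # feature maps, shape (C, H, W)
G = rng.standard_normal((16, 8, 8))  # d(region score)/dA, same shape
heat = grad_cam(A, G)
print(heat.shape)  # (8, 8)
```

In practice the heatmap is then upsampled to the input resolution and overlaid on the image; the only segmentation-specific ingredient is the choice of the pixel set whose summed score is differentiated.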

