Subclass Deep Neural Networks: Re-enabling Neglected Classes in Deep Network Training for Multimedia Classification

Author(s):  
Nikolaos Gkalelis ◽  
Vasileios Mezaris
2021 ◽  
Vol 13 (2) ◽  
pp. 36-40
Author(s):  
A. Smorodin

The article investigates a modification of stochastic gradient descent (SGD) based on the previously developed theory of stabilizing cycles in discrete dynamical systems. The relation between cycle stabilization in discrete dynamical systems and the search for extremum points allows new control methods to be applied to accelerate gradient descent as it approaches local minima. Gradient descent is widely used, alongside other iterative methods, for training deep neural networks. We conducted comparative experiments with two gradient-based optimizers, SGD and Adam, on the practical problem of tooth recognition in 2-D panoramic images. Network training showed that the new method outperforms plain SGD and, for the parameters chosen, approaches the capabilities of Adam, which is a “state of the art” method. This demonstrates the practical utility of applying control theory to the training of deep neural networks and the possibility of extending its applicability to the creation of new algorithms in this important field.
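
A minimal sketch, in PyTorch, of the kind of optimizer comparison described above; the model, data loader and hyperparameters are placeholders, and the control-based modification itself is not reproduced since the abstract does not specify its update rule.

```python
import torch
import torch.nn as nn

def train(model, loader, optimizer, loss_fn, epochs=1):
    # Generic training loop shared by both optimizers under comparison.
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss_fn(model(images), labels).backward()
            optimizer.step()

# Illustrative baselines: plain SGD versus Adam on the same classification model.
# model = build_tooth_classifier()            # hypothetical constructor
# sgd  = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
# adam = torch.optim.Adam(model.parameters(), lr=1e-3)
# train(model, train_loader, sgd, nn.CrossEntropyLoss())
```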


2020 ◽  
Vol 34 (04) ◽  
pp. 5784-5791
Author(s):  
Sungho Shin ◽  
Jinhwan Park ◽  
Yoonho Boo ◽  
Wonyong Sung

Quantization of deep neural networks is essential for efficient implementations. Low-precision networks are typically designed to represent their original floating-point counterparts with high fidelity, and several elaborate quantization algorithms have been developed. We propose a novel training scheme for quantized neural networks that reaches flat minima in the loss surface with the aid of quantization noise. The proposed scheme alternates between high- and low-precision training (high-low-high-low), and the learning rate is changed abruptly at each stage for coarse or fine tuning. With this training technique, we show clear performance improvements for convolutional neural networks compared to the previous fine-tuning based quantization scheme, and we achieve state-of-the-art results for recurrent neural network based language modeling with 2-bit weights and activations.
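
A minimal sketch (assumed details, not the authors' implementation) of the high-low-high-low precision schedule with an abrupt learning-rate change at each stage; the uniform fake-quantizer, bit-width and schedule values are illustrative, and straight-through gradient bookkeeping is omitted.

```python
import torch
import torch.nn as nn

def quantize_weights_(model: nn.Module, bits: int) -> None:
    # Uniform symmetric fake quantization applied in place to all weights.
    qmax = 2 ** (bits - 1) - 1
    with torch.no_grad():
        for p in model.parameters():
            scale = p.abs().max().clamp(min=1e-8) / qmax
            p.copy_(torch.round(p / scale).clamp(-qmax - 1, qmax) * scale)

def train_stage(model, loader, loss_fn, lr, bits=None, steps=200):
    # One stage: optionally start from quantized weights, then train with the
    # stage's learning rate; the quantization noise perturbs the loss surface.
    if bits is not None:
        quantize_weights_(model, bits)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for step, (x, y) in enumerate(loader):
        if step >= steps:
            break
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

# Alternating precision with abruptly changed learning rates (illustrative values):
# stages = [(None, 1e-1), (2, 1e-2), (None, 1e-2), (2, 1e-3)]
# for bits, lr in stages:
#     train_stage(model, train_loader, nn.CrossEntropyLoss(), lr, bits)
```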


Author(s):  
Joshua C. Peterson ◽  
Joshua T. Abbott ◽  
Thomas L. Griffiths

Deep neural networks have become increasingly successful at solving classic perception problems (e.g., recognizing objects), often reaching or surpassing human-level accuracy. In this abridged report of Peterson et al. [2016], we examine the relationship between the image representations learned by these networks and those of humans. We find that deep features learned in service of object classification account for a significant amount of the variance in human similarity judgments for a set of animal images. However, these features do not appear to capture some key qualitative aspects of human representations. To close this gap, we present a method for adapting deep features to align with human similarity judgments, resulting in image representations that can potentially be used to extend the scope of psychological experiments and inform human-centric AI.
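
A minimal sketch of one way to adapt deep features to human similarity judgments, consistent with the description above: learn per-dimension weights so that a weighted inner product of image features predicts the human similarity matrix, fit by ridge regression. The exact formulation here is an assumption, not the paper's code.

```python
import numpy as np
from sklearn.linear_model import Ridge

def fit_feature_weights(features, human_sim, alpha=1.0):
    """features: (n_images, d) deep features; human_sim: (n, n) human judgments."""
    n, d = features.shape
    i, j = np.triu_indices(n, k=1)
    # Each image pair contributes one row of elementwise feature products.
    X = features[i] * features[j]           # (n_pairs, d)
    y = human_sim[i, j]                     # (n_pairs,)
    model = Ridge(alpha=alpha, fit_intercept=False).fit(X, y)
    return model.coef_                      # per-dimension weights

def predicted_similarity(features, w):
    # Weighted inner-product similarity between all image pairs.
    return (features * w) @ features.T
```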


Author(s):  
Yasufumi Sakai ◽  
Yutaka Tamiya

Recent advances in deep neural networks have achieved higher accuracy with more complex models, which nevertheless require much longer training times. To reduce training time, methods that train with quantized weights, activations, and gradients have been proposed. Computing neural networks in integer formats improves the energy efficiency of deep learning hardware, so training methods for deep neural networks in fixed-point formats have been proposed. However, the narrow data representation range of the fixed-point format degrades neural network accuracy. In this work, we propose a new fixed-point format named shifted dynamic fixed point (S-DFP) to prevent accuracy degradation when training quantized neural networks. S-DFP changes the data representation range of the dynamic fixed point format by adding a bias to the exponent. We evaluated the effectiveness of S-DFP for quantized neural network training on the ImageNet task using ResNet-34, ResNet-50, ResNet-101 and ResNet-152. For example, the accuracy of quantized ResNet-152 improved from 76.6% with conventional 8-bit DFP to 77.6% with 8-bit S-DFP.
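
A minimal sketch (assumed details) of 8-bit dynamic fixed point quantization with the shift described above: a bias added to the tensor's shared exponent moves its representable range; a bias of zero reproduces conventional DFP.

```python
import torch

def dfp_quantize(x: torch.Tensor, bits: int = 8, exponent_bias: int = 0):
    # Shared exponent chosen per tensor from its maximum magnitude,
    # then shifted by the (S-DFP) exponent bias.
    qmax = 2 ** (bits - 1) - 1
    max_val = x.abs().max().clamp(min=1e-12)
    exponent = torch.ceil(torch.log2(max_val / qmax)).int() + exponent_bias
    scale = 2.0 ** exponent.item()
    return torch.round(x / scale).clamp(-qmax - 1, qmax) * scale

# dfp_quantize(t, bits=8, exponent_bias=0)  -> conventional 8-bit DFP
# dfp_quantize(t, bits=8, exponent_bias=-2) -> shifted range (illustrative value)
```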


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Stephen Whitelam ◽  
Viktor Selin ◽  
Sang-Won Park ◽  
Isaac Tamblyn

We show analytically that training a neural network by conditioned stochastic mutation or neuroevolution of its weights is equivalent, in the limit of small mutations, to gradient descent on the loss function in the presence of Gaussian white noise. Averaged over independent realizations of the learning process, neuroevolution is equivalent to gradient descent on the loss function. We use numerical simulation to show that this correspondence can be observed for finite mutations, for shallow and deep neural networks. Our results provide a connection between two families of neural-network training methods that are usually considered to be fundamentally different.
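
A minimal sketch (illustrative, not the authors' code) of neuroevolution by conditioned stochastic mutation: propose a small Gaussian mutation of all weights and accept it only if the loss does not increase; in the limit of small mutation scale this behaves like noisy gradient descent on the loss. A simple quadratic loss stands in for a network's loss surface.

```python
import numpy as np

def neuroevolution_step(weights, loss_fn, sigma=1e-3, rng=np.random.default_rng()):
    # Conditioned (accept-if-not-worse) Gaussian mutation of the weight vector.
    proposal = weights + sigma * rng.standard_normal(weights.shape)
    if loss_fn(proposal) <= loss_fn(weights):
        return proposal
    return weights

loss = lambda w: float(np.sum(w ** 2))   # placeholder loss surface
w = np.ones(10)
for _ in range(5000):
    w = neuroevolution_step(w, loss)     # drifts toward the minimum at w = 0
```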


Author(s):  
Hengjie Chen ◽  
Zhong Li

By applying fundamental mathematical knowledge, this paper proves that the function [Formula: see text], where [Formula: see text] is an integer no less than [Formula: see text], has the following property: the difference between the function value at the midpoint of any two adjacent equidistant distribution nodes on [Formula: see text] and the mean of the function values at these two nodes is a constant depending only on the number of nodes, if and only if [Formula: see text]. Based on this, we establish an important result about deep neural networks: the function [Formula: see text] can be interpolated by a deep Rectified Linear Unit (ReLU) network of depth [Formula: see text] on the equidistant distribution nodes in the interval [Formula: see text], with approximation error [Formula: see text]. Then, based on this result and the Chebyshev orthogonal polynomials, we construct a deep network and give error estimates for the approximation of polynomials and continuous functions, respectively. In addition, this paper constructs a deep network with local sparse connections, shared weights and activation function [Formula: see text], and discusses its density and complexity.
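
The formula placeholders hide the specific function, so the numerical check below uses f(x) = x² purely as an illustration of the stated midpoint property: on equidistant nodes, f evaluated at the midpoint minus the mean of the two adjacent node values is a constant that depends only on the number of nodes (here -h²/4 for node spacing h).

```python
import numpy as np

def midpoint_deviation(f, n_nodes, a=0.0, b=1.0):
    # Deviation between f at each interval midpoint and the mean of the
    # function values at the two adjacent equidistant nodes.
    nodes = np.linspace(a, b, n_nodes)
    mids = (nodes[:-1] + nodes[1:]) / 2.0
    return f(mids) - (f(nodes[:-1]) + f(nodes[1:])) / 2.0

f = lambda x: x ** 2                       # illustrative choice, not the paper's function
for n in (5, 9, 17):
    h = 1.0 / (n - 1)
    assert np.allclose(midpoint_deviation(f, n), -h ** 2 / 4)  # constant per node count
```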


2020 ◽  
Vol 34 (04) ◽  
pp. 6013-6020
Author(s):  
Kai Tian ◽  
Yi Xu ◽  
Jihong Guan ◽  
Shuigeng Zhou

Despite their powerful representation ability, deep neural networks (DNNs) are prone to over-fitting because of over-parametrization. Existing works have explored various regularization techniques to tackle this problem. Some of them employ soft targets rather than one-hot labels to guide network training (e.g., label smoothing in classification tasks); we call these target-based regularization approaches in this paper. To alleviate the over-fitting problem, we propose a new and general regularization framework that introduces an auxiliary network to dynamically incorporate guided semantic disturbance into the labels. We call it Network as Regularization (NaR for short). During training, the disturbance is constructed as a convex combination of the predictions of the target network and the auxiliary network. The two networks are initialized separately, and the auxiliary network is trained independently of the target network while progressively providing instance-level and class-level semantic information to it. We conduct extensive experiments to validate the effectiveness of the proposed method. Experimental results show that NaR outperforms many state-of-the-art target-based regularization methods, and that other regularization approaches (e.g., mixup) can also benefit from being combined with NaR.
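
A minimal sketch (assumed formulation, not the authors' exact loss) of training with labels softened by a disturbance that is a convex combination of the target network's and the auxiliary network's predictions; the mixing coefficients lam and gamma are illustrative.

```python
import torch
import torch.nn.functional as F

def nar_loss(logits_target, logits_aux, labels, num_classes, lam=0.9, gamma=0.5):
    # Disturbance: convex combination of the two networks' predicted distributions.
    with torch.no_grad():
        disturbance = gamma * F.softmax(logits_target, dim=1) \
                    + (1.0 - gamma) * F.softmax(logits_aux, dim=1)
    one_hot = F.one_hot(labels, num_classes).float()
    soft_targets = lam * one_hot + (1.0 - lam) * disturbance
    # Cross-entropy of the target network against the softened labels.
    return -(soft_targets * F.log_softmax(logits_target, dim=1)).sum(dim=1).mean()
```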


Author(s):  
A.P. Karpenko ◽  
V.A. Ovchinnikov

The study aims to develop an algorithm, and then software, to synthesise noise that can be used to attack deep learning neural networks designed to classify images. We present the results of our analysis of methods for conducting this type of attack. The synthesis of attack noise is stated as a problem of multidimensional constrained optimisation. The main features of the proposed attack noise synthesis algorithm are as follows: we employ the clip function to take the constraints on the noise into account; we use the top-1 and top-5 classification error ratings as criteria of attack noise efficiency; we train our neural networks using backpropagation with the Adam gradient descent algorithm; stochastic gradient descent is employed to solve the optimisation problem indicated above; and neural network training also makes use of the augmentation technique. The software was developed in Python using the PyTorch framework to dynamically differentiate the calculation graph, and runs under Ubuntu 18.04 and CentOS 7. Our IDE was Visual Studio Code. We accelerated the computation via CUDA executed on an NVIDIA Titan XP GPU. The paper presents the results of a broad computational experiment in synthesising non-universal and universal attack noise for eight deep neural networks. We show that the proposed attack algorithm is able to increase the neural network error by a factor of eight.
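
A minimal sketch of the kind of constrained noise synthesis described above: the noise is optimized by stochastic gradient steps to increase the classification loss, while a clip function keeps it within an L-infinity bound. The step count, bound and loss are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def synthesize_attack_noise(model, images, labels, eps=8 / 255, lr=1e-2, steps=40):
    noise = torch.zeros_like(images, requires_grad=True)
    optimizer = torch.optim.SGD([noise], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        # Negative cross-entropy: gradient descent on this maximizes the error.
        loss = -F.cross_entropy(model(images + noise), labels)
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            noise.clamp_(-eps, eps)   # clip keeps the noise within its constraint
    return noise.detach()
```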


Author(s):  
Shipeng Wang ◽  
Jian Sun ◽  
Zongben Xu

Deep neural networks are traditionally trained using human-designed stochastic optimization algorithms such as SGD and Adam. Recently, learning to optimize network parameters has emerged as a promising research topic. However, these learned black-box optimizers sometimes do not fully exploit the experience embodied in human-designed optimizers and therefore have limited generalization ability. In this paper, a new optimizer, dubbed HyperAdam, is proposed that combines the idea of “learning to optimize” with the traditional Adam optimizer. Given a network to train, the parameter update generated by HyperAdam in each iteration is an adaptive combination of multiple updates generated by Adam with varying decay rates. The combination weights and decay rates in HyperAdam are learned adaptively depending on the task. HyperAdam is modeled as a recurrent neural network with an AdamCell, a WeightCell and a StateCell. It is shown to be state-of-the-art for training various networks, such as multilayer perceptrons, CNNs and LSTMs.
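
A minimal sketch of the core idea: form several Adam-style candidate updates with different decay rates and combine them with adaptive weights. In HyperAdam the weights and decay rates come from a learned recurrent network; here they are fixed placeholders, and bias correction is omitted for brevity.

```python
import torch

def combined_adam_update(grad, state, decay_pairs, weights, lr=1e-3, eps=1e-8):
    """state: list of dicts holding running 'm' and 'v' for each (beta1, beta2) pair."""
    updates = []
    for s, (b1, b2) in zip(state, decay_pairs):
        s["m"] = b1 * s["m"] + (1 - b1) * grad
        s["v"] = b2 * s["v"] + (1 - b2) * grad ** 2
        updates.append(s["m"] / (s["v"].sqrt() + eps))
    # Convex combination of the candidate updates (weights would be learned).
    stacked = torch.stack(updates)                            # (k, *grad.shape)
    w = torch.tensor(weights).view(-1, *([1] * grad.dim()))
    return -lr * (w * stacked).sum(dim=0)

# Usage sketch: three decay-rate pairs with equal combination weights.
# decay_pairs = [(0.9, 0.999), (0.8, 0.99), (0.5, 0.9)]
# state = [{"m": torch.zeros_like(p), "v": torch.zeros_like(p)} for _ in decay_pairs]
# p.data += combined_adam_update(p.grad, state, decay_pairs, [1/3, 1/3, 1/3])
```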


Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 4977
Author(s):  
Ji-Won Kang ◽  
Jae-Eun Lee ◽  
Jang-Hwan Choi ◽  
Woosuk Kim ◽  
Jin-Kyum Kim ◽  
...  

This paper proposes a method to embed and extract a watermark in a digital hologram using a deep neural network. The entire watermarking algorithm for digital holograms consists of three sub-networks. For robustness, an attack simulation is inserted inside the deep neural network. By including the attack simulation and holographic reconstruction in the network, the watermarking network can be trained for invisibility and robustness simultaneously. We propose a network training method that uses both the hologram and its reconstruction. After training the proposed network, we analyze its robustness to each attack and, based on these results, perform re-training to further improve robustness. We quantitatively evaluate the robustness against various attacks and show the reliability of the proposed technique.
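
A minimal sketch (placeholder single-layer architectures, not the paper's networks) of the three-part pipeline described above: an embedder, a differentiable attack simulation inserted between embedding and extraction, and an extractor, trained jointly for invisibility and robustness. Holographic reconstruction is omitted, additive noise stands in for the attacks, and the watermark is assumed to be a single-channel map in [0, 1].

```python
import torch
import torch.nn as nn

class Embedder(nn.Module):
    def __init__(self):
        super().__init__()
        # hologram + watermark (2 channels) -> marked hologram (1 channel)
        self.net = nn.Conv2d(2, 1, kernel_size=3, padding=1)
    def forward(self, hologram, watermark):
        return self.net(torch.cat([hologram, watermark], dim=1))

class Extractor(nn.Module):
    def __init__(self):
        super().__init__()
        # attacked hologram -> recovered watermark in [0, 1]
        self.net = nn.Conv2d(1, 1, kernel_size=3, padding=1)
    def forward(self, attacked):
        return torch.sigmoid(self.net(attacked))

def attack_simulation(marked, noise_std=0.05):
    # Differentiable stand-in for the in-network attack simulation.
    return marked + noise_std * torch.randn_like(marked)

def training_loss(embedder, extractor, hologram, watermark, alpha=1.0):
    marked = embedder(hologram, watermark)
    recovered = extractor(attack_simulation(marked))
    invisibility = nn.functional.mse_loss(marked, hologram)                 # stay close to the host
    robustness = nn.functional.binary_cross_entropy(recovered, watermark)   # survive attacks
    return invisibility + alpha * robustness
```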

