Optimizing nonlinear activation function for convolutional neural networks

Author(s): Munender Varshney, Pravendra Singh

1999, Vol 11 (5), pp. 1069-1077
Author(s): Danilo P. Mandic, Jonathon A. Chambers

A relationship is provided between the learning rate η of the learning algorithm and the slope β of the nonlinear activation function for a class of recurrent neural networks (RNNs) trained by the real-time recurrent learning algorithm. It is shown that an arbitrary RNN can be obtained via the referent RNN, with some deterministic rules imposed on its weights and the learning rate. Such relationships reduce the number of degrees of freedom when solving the nonlinear optimization task of finding the optimal RNN parameters.
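The idea of trading the slope β against the learning rate η can be illustrated on a much simpler case than the RTRL-trained RNNs treated in the paper. The sketch below is a toy under stated assumptions: a single logistic neuron trained by plain gradient descent on squared error. For this simplified case, a neuron with slope β, weights w, and learning rate η follows the same trajectory as a "referent" neuron with slope 1, weights βw, and learning rate β²η. The function `train`, the toy samples, and the numeric values are hypothetical and not taken from the abstract.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(weights, slope, lr, samples, epochs=50):
    """Gradient descent on 0.5 * (d - y)^2 for y = logistic(slope * w.x)."""
    w = list(weights)
    for _ in range(epochs):
        for x, d in samples:
            net = slope * sum(wi * xi for wi, xi in zip(w, x))
            y = logistic(net)
            # Negative gradient w.r.t. w is (d - y) * y * (1 - y) * slope * x
            delta = (d - y) * y * (1.0 - y) * slope
            w = [wi + lr * delta * xi for wi, xi in zip(w, x)]
    return w


# Toy data (assumed): three inputs with a constant bias term, binary targets.
samples = [
    ((1.0, 0.0, 1.0), 1.0),
    ((0.0, 1.0, 1.0), 0.0),
    ((1.0, 1.0, 1.0), 1.0),
]
beta, eta = 2.5, 0.1
w0 = [0.3, -0.2, 0.1]

# General neuron: slope beta, weights w, learning rate eta.
w_general = train(w0, slope=beta, lr=eta, samples=samples)

# Referent neuron: slope 1, weights beta * w, learning rate beta^2 * eta.
w_referent = train([beta * wi for wi in w0], slope=1.0,
                   lr=beta ** 2 * eta, samples=samples)

# The referent weights should stay equal to beta times the general weights.
max_gap = max(abs(beta * a - b) for a, b in zip(w_general, w_referent))
print(max_gap)
```

Because the two trajectories coincide up to floating-point rounding, β does not need to be tuned separately in this toy setting: fixing it to 1 and adjusting the weights and learning rate covers the same family of models.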


2020, Vol 34 (04), pp. 6030-6037
Author(s): MohamadAli Torkamani, Shiv Shankar, Amirmohammad Rooshenas, Phillip Wallis

Most deep neural networks use simple, fixed activation functions, such as sigmoids or rectified linear units, regardless of domain or network structure. We introduce differential equation units (DEUs), an improvement to modern neural networks, which enables each neuron to learn a particular nonlinear activation function from a family of solutions to an ordinary differential equation. Specifically, each neuron may change its functional form during training based on the behavior of the other parts of the network. We show that using neurons with DEU activation functions results in a more compact network capable of achieving comparable, if not superior, performance when compared to much larger networks.


2018, Vol 8 (12), pp. 3851
Author(s): Mario Miscuglio, Armin Mehrabian, Zibo Hu, Shaimaa I. Azzam, Jonathan George, ...

Electronics, 2021, Vol 10 (16), pp. 2004
Author(s): Yuna Han, Byung-Woo Hong

In recent years, convolutional neural networks have been studied in the Fourier domain for a limited environment, where competitive results can be expected for conventional image classification tasks in the spatial domain. We present a novel, efficient Fourier convolutional neural network in which a new activation function is used, the additional shift Fourier transformation process is eliminated, and the number of learnable parameters is reduced. First, the Phase Rectified Linear Unit (PhaseReLU) is proposed, which is equivalent to the Rectified Linear Unit (ReLU) in the spatial domain. Second, in the proposed Fourier network, the shift Fourier transform is removed, since the process is inessential for training. Lastly, we introduce two ways of reducing the number of weight parameters in the Fourier network. The basic method is to use a three-by-three kernel instead of a five-by-five kernel in our proposed Fourier convolutional neural network. The second is to use a random kernel in our efficient Fourier convolutional neural network, with the standard deviation of its Gaussian distribution used as a weight parameter. Because only two scalars are required for each imaginary and real component per channel, very few parameters are needed. In experiments on shallow networks such as LeNet-3 and LeNet-5, our method achieves accuracy competitive with conventional convolutional neural networks while dramatically reducing the number of parameters. Furthermore, our proposed Fourier network with a basic three-by-three kernel achieves higher accuracy than traditional convolutional neural networks in most cases, in both shallow and deep networks. Our experiments indicate that the presented kernel methods have the potential to be applied to all architectures based on convolutional neural networks.
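The effect of the first reduction method, shrinking the kernel from five-by-five to three-by-three, can be checked with simple arithmetic. The sketch below counts the weights of a standard dense convolution layer; the helper `conv_weight_count` and the channel counts (6 input, 16 output) are hypothetical choices for illustration, and biases as well as the paper's random-kernel variant are deliberately left out.

```python
def conv_weight_count(in_channels, out_channels, kernel_size):
    """Weights in a dense 2D convolution layer, excluding biases."""
    return in_channels * out_channels * kernel_size * kernel_size


# Hypothetical layer shape; only the kernel size differs between the two.
five_by_five = conv_weight_count(in_channels=6, out_channels=16, kernel_size=5)
three_by_three = conv_weight_count(in_channels=6, out_channels=16, kernel_size=3)

ratio = three_by_three / five_by_five
print(five_by_five, three_by_three, ratio)
```

For any fixed channel configuration the ratio is 9/25, so the smaller kernel keeps 36% of the layer's weights.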


2019, Vol 44 (3), pp. 303-330
Author(s): Shallu Sharma, Rajesh Mehra

Convolutional neural networks (CNNs) are a contemporary technique for computer vision applications, in which pooling is an integral part of the deep CNN. Pooling provides the ability to learn invariant features and also acts as a regularizer that further reduces overfitting. Additionally, pooling techniques significantly reduce the computational cost and training time of networks, which are equally important to consider. Here, the performance of pooling strategies on different datasets is analyzed and discussed qualitatively. This study presents a detailed review of conventional and recent strategies, which should help apprise readers of the upsides and downsides of each strategy. We have also identified four fundamental factors, namely network architecture, activation function, overlapping, and regularization approaches, which strongly affect the performance of pooling operations. It is believed that this work will help extend the understanding of the significance of CNNs, along with pooling regimes, for solving computer vision problems.
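As a minimal illustration of the pooling operations discussed in this abstract, the sketch below applies max pooling and average pooling to a small 2D array, and shows how a stride smaller than the window size produces overlapping windows. The helper `pool2d`, the sample `image`, and the window and stride values are hypothetical; the review itself covers many more strategies than these two.

```python
def pool2d(image, size, stride, reduce_fn):
    """Slide a size x size window over a 2D list and reduce each window."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [image[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(reduce_fn(window))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    return sum(values) / len(values)


image = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [3, 4, 1, 8],
]

# Non-overlapping windows: stride equals the window size.
max_pooled = pool2d(image, size=2, stride=2, reduce_fn=max)
avg_pooled = pool2d(image, size=2, stride=2, reduce_fn=mean)

# Overlapping windows: stride is smaller than the window size.
overlapping_max = pool2d(image, size=2, stride=1, reduce_fn=max)

print(max_pooled)       # [[6, 4], [7, 9]]
print(avg_pooled)       # [[3.75, 2.25], [4.0, 4.5]]
print(overlapping_max)  # [[6, 6, 4], [7, 9, 9], [7, 9, 9]]
```

Non-overlapping pooling halves each spatial dimension here, while the overlapping variant keeps a 3 x 3 output at the cost of more window evaluations.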

