An effective SteinGLM initialization scheme for training multi-layer feedforward sigmoidal neural networks

2021
Author(s): Zebin Yang, Hengtao Zhang, Agus Sudjianto, Aijun Zhang

2020, Vol 34 (04), pp. 3914-3921
Author(s): Hongyang Gao, Lei Cai, Shuiwang Ji

Rectified linear units (ReLUs) are currently the most popular activation function used in neural networks. Although ReLUs can solve the gradient vanishing problem and accelerate training convergence, they suffer from the dying ReLU problem, in which some neurons are never activated if the weights are not updated properly. In this work, we propose a novel activation function, known as the adaptive convolutional ReLU (ConvReLU), that can better mimic brain neuron activation behaviors and overcome the dying ReLU problem. With our novel parameter sharing scheme, ConvReLUs can be applied to convolution layers, allowing each input neuron to be activated by a different trainable threshold without introducing a large number of extra parameters. We employ a zero initialization scheme in ConvReLU to encourage the trainable thresholds to stay close to zero. Finally, we develop a partial replacement strategy that replaces only the ReLUs in the early layers of the network. This resolves the dying ReLU problem and retains sparse representations for linear classifiers. Experimental results demonstrate that our proposed ConvReLU consistently outperforms ReLU, LeakyReLU, and PReLU. In addition, the partial replacement strategy is shown to be effective not only for our ConvReLU but also for LeakyReLU and PReLU.
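To make the idea concrete, below is a minimal sketch of an activation with zero-initialized, per-channel trainable thresholds. It assumes PyTorch; the class name, the per-channel parameter sharing, and the max(x, t) form are illustrative assumptions rather than the paper's exact ConvReLU formulation.

```python
import torch
import torch.nn as nn


class ConvReLUSketch(nn.Module):
    """Hedged sketch of a ReLU-like activation with trainable thresholds.

    One threshold is shared per input channel (a simple parameter-sharing
    choice) and the activation computes max(x, t). With zero initialization,
    t = 0 and the module behaves exactly like an ordinary ReLU at the start
    of training.
    """

    def __init__(self, num_channels: int):
        super().__init__()
        # Zero initialization keeps the trainable thresholds close to zero.
        self.threshold = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width); broadcast thresholds per channel.
        t = self.threshold.view(1, -1, 1, 1)
        return torch.maximum(x, t)


if __name__ == "__main__":
    act = ConvReLUSketch(num_channels=16)
    out = act(torch.randn(2, 16, 8, 8))
    print(out.shape)  # torch.Size([2, 16, 8, 8])
```

Under a partial replacement strategy as described above, such a module would be swapped in only for the early layers of a network, with plain ReLUs kept in the later layers.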


2020, Vol 17 (1), pp. 418-424
Author(s): Sarfaraz Masoood, M. N. Doja, Pravin Chandra

Weight initialization of sigmoidal feedforward artificial neural networks (SFFANNs) and convolutional neural networks (CNNs) is a known factor that affects the learning ability of the network. Uniform random weight initialization has often been used as the conventional technique due to its simplicity. However, various studies have shown that the random technique may not be the ideal choice of weight initialization for these networks. In this work, we analyze two separate chaotic functions and explore their use as weight initialization methods against the conventional random initialization technique for SFFANNs as well as for CNNs. For the SFFANNs, this analysis was done over eight function approximation problems chosen for experimentation. The mean test error values, along with two-sample t-test results, strongly suggest that the Chebyshev chaotic map based weight initialization technique outperforms the conventional random initialization technique for most of the problems under consideration and hence may be used as an alternative weight initialization technique for SFFANNs. For the CNN experiment, the MNIST dataset was used to analyze the performance of the random and the Chebyshev based initialization schemes. Results strongly support the use of the Chebyshev chaotic map based initialization scheme as an alternative to the conventional random initialization.
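As an illustration of how a chaotic-map initializer of this kind could be implemented, here is a minimal NumPy sketch. The map degree, seed value, and scaling to a symmetric interval are assumptions; the abstract does not specify these details.

```python
import numpy as np


def chebyshev_init(shape, degree=4, seed=0.345, scale=0.5):
    """Hedged sketch of a Chebyshev chaotic-map weight initializer.

    Iterates the Chebyshev map x_{n+1} = cos(degree * arccos(x_n)) on [-1, 1]
    and reshapes the resulting chaotic sequence into a weight tensor scaled to
    [-scale, scale]. The degree, seed, and scaling are illustrative choices,
    not values taken from the paper.
    """
    n = int(np.prod(shape))
    weights = np.empty(n)
    x = seed  # must lie strictly inside (-1, 1)
    for i in range(n):
        x = np.cos(degree * np.arccos(x))
        weights[i] = x
    return scale * weights.reshape(shape)


if __name__ == "__main__":
    w = chebyshev_init((64, 32))
    print(w.shape, w.min(), w.max())
```

The chaotic sequence gives deterministic but highly irregular weights, which is what such a scheme contrasts against plain uniform random initialization.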

