Algorithm Research on Improving Activation Function of Convolutional Neural Networks

Author(s):
Yanhua Guo ◽
Lei Sun ◽
Zhihong Zhang ◽
Hong He


Electronics ◽
2021 ◽
Vol 10 (16) ◽
pp. 2004 ◽
Author(s):
Yuna Han ◽
Byung-Woo Hong

In recent years, convolutional neural networks have been studied in the Fourier domain for limited environments, where competitive results can be expected for conventional image classification tasks in the spatial domain. We present a novel, efficient Fourier convolutional neural network in which a new activation function is used, the additional shift Fourier transformation process is eliminated, and the number of learnable parameters is reduced. First, the Phase Rectified Linear Unit (PhaseReLU) is proposed, which is equivalent to the Rectified Linear Unit (ReLU) in the spatial domain. Second, in the proposed Fourier network, the shift Fourier transform is removed, since the process is inessential for training. Lastly, we introduce two ways of reducing the number of weight parameters in the Fourier network. The basic method is to use a three-by-three kernel instead of a five-by-five one in our proposed Fourier convolutional neural network. In our efficient Fourier convolutional neural network, we use a random kernel whose Gaussian standard deviation serves as the weight parameter. In other words, since only two scalars per channel are required, one for the real and one for the imaginary component, the parameter count is compressed to a very small number. Consequently, in experiments on shallow networks such as LeNet-3 and LeNet-5, our method achieves accuracy competitive with conventional convolutional neural networks while dramatically reducing the number of parameters. Furthermore, our proposed Fourier network using a basic three-by-three kernel mostly achieves higher accuracy than traditional convolutional neural networks in both shallow and deep neural networks. Our experiments indicate that the presented kernel methods have the potential to be applied in any architecture based on convolutional neural networks.
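As a concrete illustration (not taken from the paper, whose closed-form PhaseReLU is not reproduced here), the claimed spatial-domain equivalence can be sketched by defining a Fourier-domain activation as "inverse transform, rectify, transform back": its effect on the spectrum matches applying ReLU directly to the spatial signal.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def fourier_relu(X):
    """Activation applied in the Fourier domain whose effect is
    equivalent to ReLU in the spatial domain: transform back,
    rectify, transform forward. Illustrative stand-in for PhaseReLU,
    whose exact closed form is not given here."""
    x = np.fft.ifft2(X).real      # recover the spatial-domain signal
    return np.fft.fft2(relu(x))   # rectify, then return to the spectrum

# A random real "feature map" and its spectrum
x = np.random.randn(8, 8)
X = np.fft.fft2(x)

# Activating in the Fourier domain then inverting matches
# applying ReLU directly in the spatial domain.
spatial = relu(x)
via_fourier = np.fft.ifft2(fourier_relu(X)).real
assert np.allclose(spatial, via_fourier)
```

This also shows why removing the shift (fftshift) step is plausible: the round trip is exact regardless of how the spectrum is centered.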


2019 ◽  
Vol 44 (3) ◽  
pp. 303-330 ◽  
Author(s):  
Shallu Sharma ◽  
Rajesh Mehra

Abstract Convolutional neural networks (CNNs) are a contemporary technique for computer vision applications, in which pooling is an integral part of the deep CNN. Besides providing the ability to learn invariant features, pooling also acts as a regularizer that further reduces the problem of overfitting. Additionally, pooling techniques significantly reduce the computational cost and training time of networks, which are equally important to consider. Here, the performances of pooling strategies on different datasets are analyzed and discussed qualitatively. This study presents a detailed review of conventional and recent strategies, which should help readers appraise the upsides and downsides of each. We have also identified four fundamental factors, namely network architecture, activation function, overlapping, and regularization approaches, which strongly affect the performance of pooling operations. It is believed that this work will help extend the understanding of the significance of CNNs, along with pooling regimes, for solving computer vision problems.
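To make the two conventional strategies the review covers concrete, here is a minimal, non-overlapping 2D pooling sketch (a toy illustration, not a drop-in replacement for framework pooling layers): max pooling keeps the strongest activation per window, while average pooling returns the smoothed response.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping pooling over an (H, W) feature map.
    Assumes H and W are divisible by `size`."""
    h, w = x.shape
    blocks = x.reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

fmap = np.array([[1., 2., 5., 6.],
                 [3., 4., 7., 8.],
                 [0., 0., 1., 1.],
                 [0., 4., 1., 1.]])

print(pool2d(fmap, mode="max"))   # strongest activation per 2x2 window
print(pool2d(fmap, mode="avg"))   # mean activation per 2x2 window
```

Both halve each spatial dimension, which is where the computational savings the abstract mentions come from; overlapping variants use a stride smaller than the window size.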


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Yao Ying ◽  
Nengbo Zhang ◽  
Ping He ◽  
Silong Peng

The activation function is a basic component of the convolutional neural network (CNN), providing the nonlinear transformation capability the network requires. Many activation functions make the original input compete with different linear or nonlinear mapping terms to obtain different nonlinear transformation capabilities. More recently, in the funnel activation (FReLU), the original input competes with a spatial condition, so FReLU has not only nonlinear transformation capability but also pixelwise modeling capability. We summarize the competition mechanism in activation functions and then propose a novel activation function design template, the competitive activation function (CAF), which promotes competition among different elements. CAF generalizes all activation functions that use competition mechanisms. Following CAF, we propose the parametric funnel rectified exponential unit (PFREU), which promotes competition among linear mapping, nonlinear mapping, and spatial conditions. We conduct experiments on four datasets of different sizes, and the experimental results on three classical convolutional neural networks demonstrate the superiority of our method.
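The competition mechanism in FReLU can be sketched as max(x, T(x)), where T(x) is a spatial condition computed from the pixel's neighborhood. In the actual FReLU, T is a learned depthwise 3x3 convolution; the local mean filter below is a hypothetical stand-in used only to show the shape of the computation.

```python
import numpy as np

def spatial_condition(x):
    """Stand-in spatial condition T(x): a 3x3 local mean with zero
    padding. In FReLU this is a learned depthwise 3x3 convolution."""
    padded = np.pad(x, 1)
    out = np.zeros_like(x)
    h, w = x.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + 3, j:j + 3].mean()
    return out

def funnel_activation(x):
    """FReLU-style competition: each pixel competes with its own
    local spatial context."""
    return np.maximum(x, spatial_condition(x))

x = np.random.randn(6, 6)
y = funnel_activation(x)
assert y.shape == x.shape
assert np.all(y >= x)   # the max can only raise values
```

Plain ReLU is the special case where the competitor is the constant 0; the CAF template generalizes this by letting linear terms, nonlinear terms, and spatial conditions all enter the max.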


Sensors ◽  
2020 ◽  
Vol 20 (14) ◽  
pp. 3837 ◽  
Author(s):  
Jingli Yang ◽  
Shuangyan Yin ◽  
Yongqi Chang ◽  
Tianyu Gao

Aiming at the fault diagnosis of rotating machinery, a novel method based on deep learning theory is presented in this paper. By combining one-dimensional convolutional neural networks (1D-CNN) with self-normalizing neural networks (SNN), the proposed method achieves high fault identification accuracy in a simple and compact architecture. By taking advantage of the self-normalizing property of the SELU activation function, the stability and convergence of the fault diagnosis model are maintained. By introducing the alpha-dropout mechanism twice to regularize the training process, the overfitting problem is mitigated and the generalization capability of the model is further improved. Experimental results on the benchmark dataset show that the proposed method achieves high fault identification accuracy and excellent cross-load fault diagnosis capability.
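The self-normalizing property the abstract relies on comes from SELU's two fixed constants: pushing roughly standard-normal inputs through SELU keeps activations near zero mean and unit variance, so deep stacks stay stable without batch normalization. A minimal sketch:

```python
import numpy as np

# Fixed SELU constants (Klambauer et al.); chosen so that mean 0,
# variance 1 is a stable fixed point of the activation.
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    """SELU: scaled linear for x > 0, scaled exponential below."""
    return SCALE * np.where(x > 0, x, ALPHA * (np.exp(x) - 1.0))

# Empirical check of the self-normalizing behaviour.
rng = np.random.default_rng(0)
z = selu(rng.standard_normal(1_000_000))
print(z.mean(), z.std())   # both stay close to 0 and 1 respectively
```

Alpha-dropout is the matching regularizer: unlike standard dropout, it sets units to SELU's negative saturation value and rescales, so the mean/variance fixed point is preserved during training.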


Author(s):  
R. Nithya

The main objective of this paper is to train an image classifier using convolutional neural networks with the TensorFlow architecture. The paper focuses on systematic approaches to classifying a sample set of images using convolutional neural networks. The CNN model with an activation function classifies the dataset into two categories, much as a human would. The paper also highlights the importance of data augmentation by comparing the resulting accuracies.
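The augmentation the abstract compares against can be illustrated with a minimal pipeline sketch (hypothetical, not the paper's exact setup): random horizontal flips plus a small brightness jitter, producing several label-preserving variants of each training image. Real TensorFlow pipelines typically add rotations, zooms, and crops as well.

```python
import numpy as np

def augment(image, rng):
    """Minimal augmentation sketch for images scaled to [0, 1]:
    random horizontal flip plus a small brightness shift."""
    if rng.random() < 0.5:
        image = image[:, ::-1]          # horizontal flip
    shift = rng.uniform(-0.1, 0.1)      # brightness jitter
    return np.clip(image + shift, 0.0, 1.0)

rng = np.random.default_rng(42)
img = np.linspace(0.0, 1.0, 16).reshape(4, 4)
batch = [augment(img, rng) for _ in range(8)]   # 8 augmented variants

assert all(b.shape == img.shape for b in batch)
assert all(b.min() >= 0.0 and b.max() <= 1.0 for b in batch)
```

The design point is that each transform must leave the class label unchanged, which is why flips and brightness shifts are safe defaults for most two-class image tasks.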


There is an evident paradigm shift in steganalysis techniques with the advent of deep learning networks. As steganalysis is a classification task, it has traditionally been carried out by machine learning classifiers and ensembles of them. With the proliferation of deep learning in many areas, the performance of steganalysis techniques has jumped to a new high through the application of convolutional neural networks. Traditional steganalysis consists of two important steps, feature extraction and classification, whereas deep learning networks learn the features automatically, eliminating the need to extract handcrafted features. Because of this property, CNNs have been highly successful in image recognition and image classification. In addition, feature extraction and classification are combined in deep learning, so classification is more effective because the network learns exactly the features that matter for it. In steganalysis, however, the task is to detect the very subtle and weak noise created by data hidden with steganography techniques. We have designed a deep CNN architecture customized for the steganalysis task, based on the existing residual neural network framework. We have introduced a descriptor that captures inter-pixel dependencies and acts as an indicator of the weight of each feature map, so the classifier can give more weight to effective feature maps instead of treating all feature maps equally. We have also used a gating mechanism: a sigmoid function applied after a nonlinear activation function, sandwiched between two fully connected layers. This enhancement to existing deep residual neural networks gives better results, in terms of detection error rate, than other deep-learning-based steganalysis techniques.
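The gating mechanism described (global descriptor, two fully connected layers with a nonlinearity between them and a sigmoid after, then per-channel reweighting) follows the squeeze-and-excitation pattern. A hedged numpy sketch, with hypothetical random weights standing in for the learned parameters w1 and w2:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_gate(fmaps, w1, w2):
    """Squeeze-and-excitation style gating: average-pool each feature
    map to a descriptor, pass it through two fully connected layers
    (ReLU between, sigmoid after), and reweight the maps by the
    resulting per-channel gates in (0, 1)."""
    desc = fmaps.mean(axis=(1, 2))         # squeeze: (C,) descriptor
    hidden = np.maximum(desc @ w1, 0.0)    # FC + nonlinear activation
    gates = sigmoid(hidden @ w2)           # FC + sigmoid gate, (C,)
    return fmaps * gates[:, None, None]    # excite: reweight each map

rng = np.random.default_rng(0)
c, r = 4, 2                                # channels, reduced width
fmaps = rng.standard_normal((c, 5, 5))
w1 = rng.standard_normal((c, r))           # hypothetical learned weights
w2 = rng.standard_normal((r, c))
gated = channel_gate(fmaps, w1, w2)

assert gated.shape == fmaps.shape
assert np.all(np.abs(gated) <= np.abs(fmaps))   # gates lie in (0, 1)
```

Because the gates are bounded in (0, 1), the mechanism can only attenuate uninformative feature maps, which matches the stated goal of emphasizing the maps that carry the weak stego signal.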

