Weighted sigmoid gate unit for an activation function of deep neural network

2020 ◽  
Vol 135 ◽  
pp. 354-359 ◽  
Author(s):  
Masayuki Tanaka


2021 ◽  
pp. 1063293X2110251
Author(s):  
K Vijayakumar ◽  
Vinod J Kadam ◽  
Sudhir Kumar Sharma

Deep Neural Network (DNN) stands for a multilayered Neural Network (NN) that is capable of progressively learning more abstract and composite representations of the raw input features, with no need for any feature engineering. DNNs are advanced NNs with multiple hidden layers between the input and the final layer. The working principle of such a standard deep classifier is a hierarchy formed by the composition of linear functions and a defined nonlinear Activation Function (AF). It remains unclear why the DNN classifier can function so well, but many studies show that within a DNN, the choice of AF has a notable impact on training dynamics and task performance. In the past few years, many different AFs have been formulated, and the choice of AF remains an area of active study. Hence, in this study, a novel deep feedforward NN model with four AFs has been proposed for breast cancer classification: hidden layer 1: Swish, hidden layer 2: LeakyReLU, hidden layer 3: ReLU, and the final output layer: Sigmoid. The purpose of the study is twofold. Firstly, it is a step toward a more profound understanding of DNNs with layer-wise different AFs. Secondly, it aims to explore better DNN-based systems for building predictive models of breast cancer data with improved accuracy. The benchmark UCI dataset WDBC was used for validation of the framework, evaluated using ten-fold cross-validation (CV) and various performance indicators. Multiple simulations and experimental outcomes show that the proposed solution outperforms DNNs using Sigmoid, ReLU, LeakyReLU, or Swish uniformly across all layers, in terms of several performance metrics. This analysis contributes an expert and precise classification method for clinical breast cancer datasets. 
Furthermore, the model also achieved improved performance compared to many established state-of-the-art algorithms/models.
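As a minimal pure-Python sketch (illustrative only; the paper's layer sizes and training setup are not reproduced here), the four layer-wise activation functions described in the abstract can be written as:

```python
import math

# The four activations assigned layer-wise in the proposed model.
def sigmoid(x):                 # output layer
    return 1.0 / (1.0 + math.exp(-x))

def swish(x):                   # hidden layer 1
    return x * sigmoid(x)

def leaky_relu(x, alpha=0.01):  # hidden layer 2 (alpha: the common default slope)
    return x if x > 0 else alpha * x

def relu(x):                    # hidden layer 3
    return max(0.0, x)
```

Unlike ReLU, Swish is smooth and non-monotonic: it dips slightly below zero for small negative inputs before saturating toward zero, which is one reason layer-wise mixing of these functions behaves differently from using a single AF throughout.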


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Yichen Sun ◽  
Mingli Dong ◽  
Mingxin Yu ◽  
Jiabin Xia ◽  
Xu Zhang ◽  
...  

A photonic artificial intelligence chip based on an optical neural network (ONN) offers low power consumption, low delay, and strong anti-interference ability. The all-optical diffractive deep neural network has recently demonstrated its inference capabilities on the image classification task. However, the physical model has not been miniaturized and integrated, and optical nonlinearity has not been incorporated into the diffraction neural network. By introducing nonlinear characteristics into the network, complex tasks can be completed with high accuracy. In this study, a nonlinear all-optical diffraction deep neural network (N-D2NN) model based on a 10.6 μm wavelength is constructed by combining the ONN and complex-valued neural networks, with a nonlinear activation function introduced into the structure. To be specific, improved variants of the rectified linear unit (ReLU), i.e., Leaky-ReLU, parametric ReLU (PReLU), and randomized ReLU (RReLU), are selected as the activation function of the N-D2NN model. Through numerical simulation, it is shown that the N-D2NN model based on the 10.6 μm wavelength has excellent representation ability, enabling it to perform classification tasks on the MNIST handwritten digit dataset and the Fashion-MNIST dataset. The results show that the N-D2NN model with the RReLU activation function achieves the highest classification accuracy, 97.86% and 89.28% on the two datasets, respectively. These results provide a theoretical basis for the preparation of miniaturized and integrated N-D2NN photonic artificial intelligence chips.
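The three ReLU variants compared here differ only in how they treat negative inputs. A hypothetical pure-Python sketch of the electronic counterparts (the diffractive-optics realization itself is far more involved and not shown):

```python
import random

def leaky_relu(x, slope=0.01):
    # Fixed, small negative slope.
    return x if x > 0 else slope * x

def prelu(x, slope):
    # Same form, but the slope is a learned parameter rather than a constant.
    return x if x > 0 else slope * x

def rrelu(x, lower=1/8, upper=1/3, training=True):
    # Negative slope drawn uniformly at random during training,
    # replaced by its expected value at inference time.
    slope = random.uniform(lower, upper) if training else (lower + upper) / 2
    return x if x > 0 else slope * x
```

The randomized slope in RReLU acts as a regularizer during training, which is consistent with it yielding the best accuracy in the reported simulations.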


Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2226
Author(s):  
Hashim Yasin ◽  
Mazhar Hussain ◽  
Andreas Weber

In this paper, we propose a novel and efficient framework for 3D action recognition using a deep learning architecture. First, we develop a 3D normalized pose space that consists of only 3D normalized poses, which are generated by discarding translation and orientation information. From these poses, we extract joint features and employ them further in a Deep Neural Network (DNN) in order to learn the action model. The architecture of our DNN consists of two hidden layers with the sigmoid activation function and an output layer with the softmax function. Furthermore, we propose a keyframe extraction methodology through which, from a motion sequence of 3D frames, we efficiently extract the keyframes that contribute substantially to the performance of the action. In this way, we eliminate redundant frames and reduce the length of the motion. More precisely, we ultimately summarize the motion sequence, while preserving the original motion semantics. We only consider the remaining essential informative frames in the process of action recognition, and the proposed pipeline is sufficiently fast and robust as a result. Finally, we evaluate our proposed framework extensively on publicly available benchmark Motion Capture (MoCap) datasets, namely HDM05 and CMU. Our experiments reveal that the proposed scheme significantly outperforms other state-of-the-art approaches.
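The softmax output stage described above maps per-class scores to probabilities; a numerically stable pure-Python sketch (the paper's layer widths and class count are not assumed):

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating so large scores
    # cannot overflow math.exp.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Scores for three hypothetical action classes -> class probabilities.
probs = softmax([2.0, 1.0, 0.1])
```

The predicted action is simply the index of the largest probability.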


2018 ◽  
Vol 19 ◽  
pp. 01009
Author(s):  
Stanisław Płaczek ◽  
Aleksander Płaczek

In the article, emphasis is put on the modern artificial neural network structure, which in the literature is known as a deep neural network. The network includes more than one hidden layer and comprises many standard modules with the ReLU nonlinear activation function. A learning algorithm includes two standard steps, forward and backward, and its effectiveness depends on the way the learning error is transported back through all the layers to the first layer. Taking into account the dimensionalities of the matrices and the nonlinear characteristics of the ReLU activation function, the problem is very difficult from a theoretical point of view. To keep the analysis tractable, formal formulas are used to describe the relations between the structure of every layer and its internal input vector. In practical tasks, the internal layer matrices of neural networks with ReLU activation functions include many zero-valued weight coefficients. This phenomenon has a negative impact on the convergence of the learning algorithm. A theoretical analysis could help to build more effective algorithms.
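The convergence issue described above follows directly from ReLU's backward pass: any unit whose pre-activation is non-positive multiplies the transported error by zero. A minimal sketch of that gating (illustrative, not the article's formal derivation):

```python
def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # Derivative of ReLU with respect to its input
    # (the subgradient at exactly 0 is taken as 0 here).
    return 1.0 if x > 0 else 0.0

# Backward step for a single unit: the error arriving from the next
# layer is gated by relu_grad of the pre-activation. When many
# zero-valued weights drive pre-activations non-positive, these units
# transport no error back, which slows convergence.
def backprop_unit(upstream_error, pre_activation):
    return upstream_error * relu_grad(pre_activation)
```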


Author(s):  
Kun Huang ◽  
Bingbing Ni ◽  
Xiaokang Yang

Quantization has shown stunning efficiency on deep neural networks, especially for portable devices with limited resources. Most existing works uncritically extend weight quantization methods to activations. However, we take the view that best performance can be obtained by applying different quantization methods to weights and activations respectively. In this paper, we design a new activation function dubbed CReLU from the quantization perspective and further complement this design with an appropriate initialization method and training procedure. Moreover, we develop a specific quantization strategy in which we formulate the forward and backward approximation of weights with binary values and quantize the activations to low bitwidth using a linear or logarithmic quantizer. We show, for the first time, that our final quantized model with binary weights and ultra-low-bitwidth activations outperforms the previous best models by large margins on ImageNet, as well as achieving nearly a 10.85× theoretical speedup with ResNet-18. Furthermore, ablation experiments and theoretical analysis demonstrate the effectiveness and robustness of CReLU in comparison with other activation functions.
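The weight/activation split can be illustrated with two standard quantizers (a generic sketch under common binary-network conventions; CReLU itself and the paper's exact scheme are not reproduced here):

```python
def binarize_weights(w):
    # Sign binarization with a per-tensor scale alpha = mean(|w|),
    # as used in common binary-weight schemes.
    alpha = sum(abs(v) for v in w) / len(w)
    return [alpha if v >= 0 else -alpha for v in w]

def linear_quantize(x, bits):
    # Uniform quantizer mapping a clipped activation in [0, 1]
    # onto 2**bits - 1 evenly spaced levels.
    levels = (1 << bits) - 1
    x = min(max(x, 0.0), 1.0)
    return round(x * levels) / levels
```

A logarithmic quantizer would instead space the levels as powers of two, trading resolution near 1 for resolution near 0.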


Author(s):  
Indrajeet Kumar ◽  
Abhishek Kumar ◽  
V D Ambeth Kumar ◽  
Ramani Kannan ◽  
Vrince Vimal ◽  
...  

Breast tumors are among the common diseases affecting women around the world. Classifying the various types of breast tumors contributes to treating them more efficiently. However, this classification task is often hindered by dense tissue patterns captured in mammograms. The present study proposes a dense tissue pattern characterization framework using a deep neural network. A total of 322 mammograms belonging to the mini-MIAS dataset and 4880 mammograms from the DDSM dataset have been taken, and an ROI of fixed size 224 × 224 pixels has been extracted from each mammogram. In this work, extensive experimentation has been executed using different combinations of training and testing sets and different activation functions with the AlexNet and ResNet-18 models. Data augmentation has been used to create similar virtual images for proper training of the DL model. After that, the testing set is applied to the trained model to validate the proposed model. During experiments, four different activation functions, 'sigmoid', 'tanh', 'ReLU', and 'LeakyReLU', are used, and the outcome for each function has been reported. It has been found that the activation function 'ReLU' consistently outperforms the others. For each experiment, classification accuracy and the kappa coefficient have been computed. The obtained accuracy and kappa value for the MIAS dataset using the ResNet-18 model are 91.3% and 0.803, respectively. For the DDSM dataset, an accuracy of 92.3% and a kappa coefficient of 0.846 are achieved. After combining the images of both datasets, the achieved accuracy is 91.9%, and the kappa coefficient is 0.839 using the ResNet-18 model. Finally, it has been concluded that the ResNet-18 model and the ReLU activation function yield outstanding performance for the task.
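Since results are reported as accuracy plus Cohen's kappa, the kappa statistic itself can be computed as follows (a generic sketch of the standard formula, not the study's code):

```python
def cohens_kappa(y_true, y_pred):
    # kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    # agreement and p_e the agreement expected by chance, computed
    # from the label marginals of each rater/classifier.
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    labels = set(y_true) | set(y_pred)
    p_e = sum(
        (y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels
    )
    return (p_o - p_e) / (1 - p_e)
```

Unlike raw accuracy, kappa corrects for chance agreement, which matters when the class distribution (dense vs. non-dense tissue) is imbalanced.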


Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 656
Author(s):  
Jingyi Liu ◽  
Shuni Song ◽  
Jiayi Wang ◽  
Maimutimin Balaiti ◽  
Nina Song ◽  
...  

With the improvement of industrial requirements for the quality of cold rolled strips, flatness has become one of the most important indicators for measuring their quality. In this paper, the strip production data of a 1250 mm tandem cold mill in a steel plant is modeled by an improved deep neural network (the improved DNN) to improve the accuracy of strip shape prediction. Firstly, the type of activation function is analyzed, and the monotonicity of the activation function is deemed independent of the convexity of the loss function in the deep network: regardless of whether the activation function is monotonic, the loss function is not strictly convex. Secondly, the non-convex optimization of the loss function, extended from the deep linear network to the deep nonlinear network, is discussed, and the critical point of the deep nonlinear network is identified as the global minimum point. Finally, an improved Swish activation function based on batch normalization is proposed, and its performance is evaluated on the MNIST dataset. The experimental results show that the loss of the improved Swish function is lower than that of other activation functions. The prediction accuracy of a deep neural network (DNN) with the improved Swish function is 0.38% higher than that of a DNN with the regular Swish function. For the DNN with the improved Swish function, the mean square error of the flatness prediction for cold rolled strip is reduced to 65% of that of the regular DNN. The accuracy of the improved DNN meets and exceeds the industrial requirements. The shape prediction of the improved DNN will assist and guide the industrial production process, reducing the scrap rate and industrial cost.
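Regular Swish is x·sigmoid(x). As a hedged guess at what a "Swish based on batch normalization" might look like, one can standardize the pre-activations over a batch before applying Swish (the paper's exact formulation may well differ; the BN scale and shift parameters are omitted for brevity):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def swish(x):
    # Regular Swish: smooth and non-monotonic.
    return x * sigmoid(x)

def bn_swish(batch, eps=1e-5):
    # Assumed variant: standardize pre-activations over the batch,
    # then apply Swish. Learnable gamma/beta are left out here.
    mean = sum(batch) / len(batch)
    var = sum((v - mean) ** 2 for v in batch) / len(batch)
    return [swish((v - mean) / math.sqrt(var + eps)) for v in batch]
```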

