Learning algorithm analysis for deep neural network with ReLu activation functions

2018 ◽  
Vol 19 ◽  
pp. 01009
Author(s):  
Stanisław Płaczek ◽  
Aleksander Płaczek

In this article, emphasis is put on the modern artificial neural network structure known in the literature as a deep neural network. The network includes more than one hidden layer and comprises many standard modules with the ReLU nonlinear activation function. The learning algorithm includes two standard steps, forward and backward, and its effectiveness depends on the way the learning error is transported back through all the layers to the first layer. Taking into account the dimensionalities of all the matrices and the nonlinear characteristics of the ReLU activation function, the problem is very difficult from a theoretical point of view. Under simple assumptions, formal formulas are used to describe the relations between the structure of every layer and its internal input vector. In practical tasks, the internal layer matrices of neural networks with ReLU activation functions contain many zero-valued weight coefficients. This phenomenon has a negative impact on the convergence of the learning algorithm. A theoretical analysis could help to build more effective algorithms.
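An illustrative sketch (not the authors' code) of the forward and backward steps through one ReLU layer helps show how the learning error is transported back: wherever the pre-activation is non-positive, the ReLU derivative is zero and that unit passes no error to the weights or to the previous layer. Layer sizes below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

# Hypothetical layer sizes: 4 inputs, 3 hidden units.
W = rng.normal(size=(3, 4))   # layer weight matrix
b = np.zeros(3)               # bias vector
x = rng.normal(size=4)        # input vector

# Forward step
z = W @ x + b                 # pre-activation
a = relu(z)                   # ReLU output

# Backward step: grad_a is the error arriving from the next layer.
grad_a = rng.normal(size=3)
grad_z = grad_a * (z > 0)     # ReLU derivative is 0 where z <= 0: those units pass no error back
grad_W = np.outer(grad_z, x)  # gradient with respect to this layer's weights
grad_x = W.T @ grad_z         # error transported back to the previous layer

print(grad_z, grad_x)
```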

2021 ◽  
pp. 1063293X2110251
Author(s):  
K Vijayakumar ◽  
Vinod J Kadam ◽  
Sudhir Kumar Sharma

A Deep Neural Network (DNN) is a multilayered Neural Network (NN) capable of progressively learning more abstract and composite representations of the raw input features it receives, with no need for any feature engineering. DNNs are advanced NNs with numerous hidden layers between the input and the final layer. The working principle of such a standard deep classifier is based on a hierarchy formed by the composition of linear functions and a defined nonlinear Activation Function (AF). It remains unclear why the DNN classifier can function so well, but many studies show that, within a DNN, the choice of AF has a notable impact on the dynamics of training and the success of tasks. In the past few years, different AFs have been formulated, and the choice of AF is still an area of active study. Hence, in this study, a novel deep feed-forward NN model with four AFs has been proposed for breast cancer classification: hidden layer 1: Swish, hidden layer 2: LeakyReLU, hidden layer 3: ReLU, and the output layer: Sigmoid. The purpose of the study is twofold. Firstly, this study is a step toward a more profound understanding of DNNs with layer-wise different AFs. Secondly, the research also aims to explore better DNN-based systems for building predictive models for breast cancer data with improved accuracy. Therefore, the benchmark UCI WDBC dataset was used for validation of the framework, evaluated using ten-fold cross-validation and various performance indicators. Multiple simulations and experimental outcomes have shown that the proposed solution performs better than DNNs using only Sigmoid, ReLU, LeakyReLU or Swish activations in terms of different parameters. This analysis contributes to producing an expert and precise clinical classification method for breast cancer data. Furthermore, the model also achieved improved performance compared to many established state-of-the-art algorithms/models.
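A minimal sketch of such a layer-wise activation scheme, written with tf.keras, is given below. The layer widths (64/32/16) are illustrative assumptions, not the authors' configuration; the 30 input features correspond to the WDBC dataset.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(30,)),                       # 30 WDBC features
    tf.keras.layers.Dense(64, activation=tf.nn.swish),        # hidden layer 1: Swish
    tf.keras.layers.Dense(32, activation=tf.nn.leaky_relu),   # hidden layer 2: LeakyReLU
    tf.keras.layers.Dense(16, activation="relu"),              # hidden layer 3: ReLU
    tf.keras.layers.Dense(1, activation="sigmoid"),            # output: Sigmoid (benign/malignant)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```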


Author(s):  
H. T. Do ◽  
V. Raghavan ◽  
G. Yonezawa

In this paper, we present the identification of terrace fields using a feed-forward back-propagation deep neural network, in a pixel-based approach and several cases of object-based approaches. Terrace fields in the Lao Cai area of Vietnam are identified from a 5-metre RapidEye image. The image includes 5 bands: red, green, blue, red-edge and near-infrared. The reference data are a set of terrace and non-terrace points randomly selected from a reference map, separated into three subsets: a training set for the training process, a validation set for generating the optimal parameters of the deep neural network model, and a test set for assessing the accuracy of the classification. Six optimal thresholds (T): 0.06, 0.09, 0.12, 0.14, 0.2 and 0.22 are chosen from the Rate of Change graph and then used to generate six cases of object-based classification. The deep neural network (DNN) model is built with 8 hidden layers; the input units are the 5 RapidEye bands, and the outputs are the terrace and non-terrace classes. Each hidden layer includes 256 units, a deliberately large number, to avoid under-fitting. The activation function is the rectifier (ReLU). Dropout and two regularization parameters are applied to avoid overfitting. Seven terrace maps are generated. The classification results show that the DNN is able to identify terrace fields effectively in both the pixel-based and object-based approaches. Pixel-based classification is the most accurate approach, achieving 90% accuracy. The accuracies of the object-based approaches are 88.5%, 87.3%, 86.7%, 86.6%, 85% and 85.3%, corresponding to the segmentation thresholds.
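A rough tf.keras sketch of the network described above: 5 input bands, 8 hidden layers of 256 ReLU units, dropout and L1/L2 regularization, and a two-class output. The dropout rate and the two regularization strengths are illustrative assumptions, not the paper's values.

```python
import tensorflow as tf

reg = tf.keras.regularizers.l1_l2(l1=1e-5, l2=1e-4)   # stand-in for the two regularization parameters

model = tf.keras.Sequential()
model.add(tf.keras.layers.Input(shape=(5,)))            # red, green, blue, red-edge, near-infrared
for _ in range(8):                                      # 8 hidden layers of 256 rectifier units
    model.add(tf.keras.layers.Dense(256, activation="relu", kernel_regularizer=reg))
    model.add(tf.keras.layers.Dropout(0.2))             # assumed dropout rate
model.add(tf.keras.layers.Dense(2, activation="softmax"))   # terrace / non-terrace
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```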


2021 ◽  
Vol 10 (1) ◽  
pp. 21
Author(s):  
Omar Nassef ◽  
Toktam Mahmoodi ◽  
Foivos Michelinakis ◽  
Kashif Mahmood ◽  
Ahmed Elmokashfi

This paper presents a data-driven framework for performance optimisation of Narrow-Band IoT user equipment. The proposed framework is an edge micro-service that suggests one-time configurations to user equipment communicating with a base station. Suggested configurations are delivered by a Configuration Advocate to improve energy consumption, delay, throughput or a combination of these metrics, depending on the user-end device and the application. Reinforcement learning utilising gradient descent and a genetic algorithm is adopted synchronously with machine learning and deep learning algorithms to predict the environmental states and suggest an optimal configuration. The results highlight the adaptability of the deep neural network in predicting intermediary environmental states; in addition, they show the superior performance of the genetic reinforcement learning algorithm with respect to performance optimisation.
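A hypothetical sketch of the genetic-search idea: evolve candidate device configurations and score each one with a learned reward predictor. The parameter names, value ranges and `predict_reward` function below are assumptions standing in for the framework's models, not part of the paper.

```python
import random

CONFIG_SPACE = {"tx_power": [0, 5, 10, 15, 20, 23],      # illustrative parameter values
                "psm_timer": [60, 120, 300, 600],
                "edrx_cycle": [20.48, 40.96, 81.92]}

def predict_reward(cfg):
    # Placeholder: a trained model would estimate energy / delay / throughput here.
    return -cfg["tx_power"] - 0.01 * cfg["psm_timer"] + cfg["edrx_cycle"]

def random_config():
    return {k: random.choice(v) for k, v in CONFIG_SPACE.items()}

def mutate(cfg):
    key = random.choice(list(CONFIG_SPACE))
    return {**cfg, key: random.choice(CONFIG_SPACE[key])}

def crossover(a, b):
    return {k: random.choice([a[k], b[k]]) for k in CONFIG_SPACE}

population = [random_config() for _ in range(20)]
for _ in range(30):                                      # generations
    population.sort(key=predict_reward, reverse=True)
    parents = population[:10]                            # selection: keep the fittest half
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(10)]
    population = parents + children

print(max(population, key=predict_reward))               # suggested configuration
```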


Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 949
Author(s):  
Jiangyi Wang ◽  
Min Liu ◽  
Xinwu Zeng ◽  
Xiaoqiang Hua

Convolutional neural networks perform powerfully in many visual tasks because of their hierarchical structures and strong feature extraction capabilities. The symmetric positive definite (SPD) matrix has attracted attention in visual classification because of its excellent ability to learn proper statistical representations and to distinguish samples carrying different information. In this paper, a deep neural network signal detection method based on spectral convolution features is proposed. In this method, local features extracted from a convolutional neural network are used to construct the SPD matrix, and a deep learning algorithm for the SPD matrix is used to detect target signals. Feature maps extracted by two kinds of convolutional neural network models are applied in this study. With this method, signal detection becomes a binary classification problem on the signal samples. To demonstrate the validity and superiority of this method, simulated and semi-physical simulated data sets are used. The results show that, under low signal-to-clutter ratio (SCR), compared with the spectral signal detection method based on a deep neural network, this method can obtain a gain of 0.5–2 dB on both simulated and semi-physical simulated data sets.
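One common way to build an SPD matrix from CNN local features, sketched below under assumed shapes, is to take the channel-wise covariance of the feature maps and add a small ridge so the matrix is strictly positive definite; the construction used in the paper may differ in detail.

```python
import numpy as np

def spd_from_feature_maps(features, eps=1e-4):
    """features: array of shape (C, H, W) taken from a convolutional layer."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)                # each column is one spatial location
    f = f - f.mean(axis=1, keepdims=True)         # centre the local features
    cov = (f @ f.T) / (h * w - 1)                 # C x C sample covariance matrix
    return cov + eps * np.eye(c)                  # regularise so the matrix is strictly SPD

maps = np.random.default_rng(0).normal(size=(64, 16, 16))
spd = spd_from_feature_maps(maps)
print(np.all(np.linalg.eigvalsh(spd) > 0))        # True: all eigenvalues positive
```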


2021 ◽  
Vol 26 (jai2021.26(1)) ◽  
pp. 32-41
Author(s):  
Bodyanskiy Y ◽  
Antonenko T
Modern approaches in deep neural networks have a number of issues related to the learning process and computational costs. This article considers an architecture grounded in an alternative approach to the basic unit of the neural network. This approach achieves optimization in the calculations and gives rise to an alternative way to solve the problems of the vanishing and exploding gradient. The main subject of the article is the use of a deep stacked neo-fuzzy system, which employs a generalized neo-fuzzy neuron to optimize the learning process. This approach is non-standard from a theoretical point of view, so the paper presents the necessary mathematical calculations and describes all the intricacies of using this architecture from a practical point of view. From a theoretical standpoint, the network learning process is fully described, and all calculations necessary for applying the backpropagation algorithm to network training are derived. A feature of the network is the rapid calculation of the derivatives of the neurons' activation functions, achieved through the use of fuzzy membership functions. The paper shows that the derivative of such a function is a constant, which supports the claim of an increased optimization rate compared with neural networks whose neurons use more common activation functions (ReLU, sigmoid). The paper highlights the main points that can be improved in further theoretical developments on this topic; in general, these issues are related to the calculation of the activation function. The proposed methods cope with these points and allow approximation using the network, and the authors already have theoretical justifications for further improving the speed and approximation properties of the network. The results of comparing the proposed network with standard neural network architectures are shown.
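For intuition, here is a sketch of a basic (non-generalized) neo-fuzzy neuron: each input passes through triangular membership functions whose weighted sum forms a nonlinear synapse, and the neuron output is the sum over all synapses; the gradient with respect to each weight is simply the stored membership value. The grid size and input range are assumptions, and this is not the authors' implementation.

```python
import numpy as np

def triangular_memberships(x, centres):
    """Complementary triangular membership values of scalar x on a uniform grid (x within the grid range)."""
    step = centres[1] - centres[0]
    mu = np.maximum(0.0, 1.0 - np.abs(x - centres) / step)
    return mu / mu.sum()                          # neighbouring memberships sum to 1

class NeoFuzzyNeuron:
    def __init__(self, n_inputs, n_rules=5, lo=-1.0, hi=1.0):
        self.centres = np.linspace(lo, hi, n_rules)
        self.w = np.zeros((n_inputs, n_rules))    # one weight per membership function

    def forward(self, x):
        self.mu = np.stack([triangular_memberships(xi, self.centres) for xi in x])
        return float(np.sum(self.w * self.mu))    # sum of all nonlinear synapses

    def backward(self, grad_y, lr=0.1):
        # dy/dw_ij = mu_ij(x_i): the gradient is just the stored membership values.
        self.w -= lr * grad_y * self.mu

neuron = NeoFuzzyNeuron(n_inputs=3)
y = neuron.forward(np.array([0.2, -0.5, 0.9]))
neuron.backward(grad_y=(y - 1.0))                 # simple squared-error update toward target 1.0
```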


2019 ◽  
Vol 2 (1) ◽  
pp. 1
Author(s):  
Hijratul Aini ◽  
Haviluddin Haviluddin

Crude palm oil (CPO) production data at PT. Perkebunan Nusantara (PTPN) XIII from January 2015 to January 2018 were analysed. This paper aims to predict CPO production using an intelligent algorithm called the Backpropagation Neural Network (BPNN). The accuracy of the prediction algorithm was measured by the mean square error (MSE). The experiment showed that the best hidden layer architecture (HLA) is 5-10-11-12-13-1 with the trainlm learning function (LF), logsig and purelin activation functions (AF), and a learning rate (LR) of 0.5. This architecture achieves good accuracy, with an MSE of 0.0643. The results showed that this model can predict CPO production in 2019.
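A rough Python/Keras analogue of the reported 5-10-11-12-13-1 architecture is sketched below, treating logsig as sigmoid and purelin as a linear output. The original work used MATLAB's trainlm (Levenberg-Marquardt), which Keras does not provide, so plain SGD with the reported learning rate of 0.5 is used here as an approximation.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(5,)),                   # 5 input features
    tf.keras.layers.Dense(10, activation="sigmoid"),     # logsig hidden layers
    tf.keras.layers.Dense(11, activation="sigmoid"),
    tf.keras.layers.Dense(12, activation="sigmoid"),
    tf.keras.layers.Dense(13, activation="sigmoid"),
    tf.keras.layers.Dense(1, activation="linear"),       # purelin output
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.5), loss="mse")
```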


2016 ◽  
Vol 36 (2) ◽  
pp. 172-178 ◽  
Author(s):  
Liang Chen ◽  
Leitao Cui ◽  
Rong Huang ◽  
Zhengyun Ren

Purpose: This paper aims to present a bio-inspired neural network that improves the information processing capability of existing artificial neural networks.

Design/methodology/approach: In the network, the authors introduce a property often found in biological neural systems, hysteresis, as the neuron activation function, and a bionic algorithm, the extreme learning machine (ELM), as the learning scheme. The authors give the gradient descent procedure to optimize the parameters of the hysteretic function and develop an algorithm to select the ELM parameters online, including the number of hidden-layer nodes and the hidden-layer parameters. The algorithm combines the idea of cross-validation with the random assignment of the original ELM. Finally, the authors demonstrate the advantages of the hysteretic ELM neural network by applying it to automatic license plate recognition.

Findings: Experiments on automatic license plate recognition show that the bio-inspired learning system has better classification accuracy and generalization capability with consideration to efficiency.

Originality/value: Compared with the conventional sigmoid function, hysteresis as the activation function has two advantages: the neuron's output depends not only on its input but also on derivative information, which provides the neuron with memory; and the hysteretic function can switch between its two segments, thus avoiding the neuron falling into local minima and giving a quicker learning rate. The improved ELM algorithm to some extent makes up for the declining performance caused by the original ELM's complete randomness, at the cost of being a little slower than before.
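A minimal sketch of the basic ELM training scheme referenced above: hidden-layer weights are assigned at random and only the output weights are solved in closed form. A plain sigmoid is used here for simplicity; the paper's hysteretic activation and its online parameter selection are not reproduced.

```python
import numpy as np

def elm_fit(X, y, n_hidden=50, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))    # random input-to-hidden weights
    b = rng.normal(size=n_hidden)                  # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))         # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y                   # output weights via pseudo-inverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Toy usage on random data
X = np.random.default_rng(1).normal(size=(100, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
W, b, beta = elm_fit(X, y)
print(elm_predict(X[:5], W, b, beta))
```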

