STUDY OF ACTIVATION FUNCTIONS IN NEURAL NETWORKS FOR MINIMIZING THE LOSSES IN THE POWER FLOW SYSTEM

Background: In making the deep neural network, activation functions play an important role. But the choice of activation functions also affects the network in term of optimization and to retrieve the better results. Several activation functions have been introduced in machine learning for many practical applications. But which activation function should use at hidden layer of deep neural networks was not identified. Objective: The primary objective of this analysis was to describe which activation function must be used at hidden layers for deep neural networks to solve complex non-linear problems. Methods: The configuration for this comparative model was used by using the datasets of 2 classes (Cat/Dog). The number of Convolutional layer used in this network was 3 and the pooling layer was also introduced after each layer of CNN layer. The total of the dataset was divided into the two parts. The first 8000 images were mainly used for training the network and the next 2000 images were used for testing the network. Results: The experimental comparison was done by analyzing the network by taking different activation functions on each layer of CNN network. The validation error and accuracy on Cat/Dog dataset were analyzed using activation functions (ReLU, Tanh, Selu, PRelu, Elu) at number of hidden layers. Overall the Relu gave best performance with the validation loss at 25th Epoch 0.3912 and validation accuracy at 25th Epoch 0.8320. Conclusion: It is found that a CNN model with ReLU hidden layers (3 hidden layers here) gives best results and improve overall performance better in term of accuracy and speed. These advantages of ReLU in CNN at number of hidden layers are helpful to effectively and fast retrieval of images from the databases.

Download Full-text

Application of supervised learning artificial neural networks [CPNN, BPNN] for solving power flow problem

IET-UK International Conference on Information and Communication Technology in Electrical Sciences (ICTES 2007) ◽

10.1049/ic:20070603 ◽

2007 ◽

Cited By ~ 5

Author(s):

A. Rathinam ◽

S. Padmini ◽

V. Ravikumar

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Supervised Learning ◽

Power Flow ◽

Flow Problem ◽

Artificial Neural ◽

Power Flow Problem

Download Full-text

Optical Analysis of the UPQC using PI Controller in Power flow System

2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS) ◽

10.1109/icaccs51430.2021.9441901 ◽

2021 ◽

Author(s):

Mamidala Vijay Karthik ◽

M.Kalyan Chakravarthi ◽

Lis M Yapanto ◽

D Selvapandian ◽

R. Harish ◽

...

Keyword(s):

Power Flow ◽

Flow System ◽

Pi Controller ◽

Optical Analysis

Download Full-text

Hardware implementation of radial-basis neural networks with Gaussian activation functions on FPGA

Neural Computing and Applications ◽

10.1007/s00521-021-05706-3 ◽

2021 ◽

Author(s):

Volodymyr Shymkovych ◽

Sergii Telenyk ◽

Petro Kravets

Keyword(s):

Neural Networks ◽

Hardware Implementation ◽

Gaussian Function ◽

Activation Function ◽

Rbf Neural Networks ◽

Activation Functions ◽

Rbf Network ◽

Combination Scheme ◽

Radial Basis ◽

Hidden Layer

AbstractThis article introduces a method for realizing the Gaussian activation function of radial-basis (RBF) neural networks with their hardware implementation on field-programmable gaits area (FPGAs). The results of modeling of the Gaussian function on FPGA chips of different families have been presented. RBF neural networks of various topologies have been synthesized and investigated. The hardware component implemented by this algorithm is an RBF neural network with four neurons of the latent layer and one neuron with a sigmoid activation function on an FPGA using 16-bit numbers with a fixed point, which took 1193 logic matrix gate (LUTs—LookUpTable). Each hidden layer neuron of the RBF network is designed on an FPGA as a separate computing unit. The speed as a total delay of the combination scheme of the block RBF network was 101.579 ns. The implementation of the Gaussian activation functions of the hidden layer of the RBF network occupies 106 LUTs, and the speed of the Gaussian activation functions is 29.33 ns. The absolute error is ± 0.005. The Spartan 3 family of chips for modeling has been used to get these results. Modeling on chips of other series has been also introduced in the article. RBF neural networks of various topologies have been synthesized and investigated. Hardware implementation of RBF neural networks with such speed allows them to be used in real-time control systems for high-speed objects.

Download Full-text

Trigonometric Inference Providing Learning in Deep Neural Networks

Applied Sciences ◽

10.3390/app11156704 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6704

Author(s):

Jingyong Cai ◽

Masashi Takemoto ◽

Yuming Qiu ◽

Hironori Nakajo

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

Activation Function ◽

Trigonometric Approximation ◽

Model Parameters ◽

Training Algorithms ◽

Activation Functions ◽

Classical Training ◽

Sum Formula

Despite being heavily used in the training of deep neural networks (DNNs), multipliers are resource-intensive and insufficient in many different scenarios. Previous discoveries have revealed the superiority when activation functions, such as the sigmoid, are calculated by shift-and-add operations, although they fail to remove multiplications in training altogether. In this paper, we propose an innovative approach that can convert all multiplications in the forward and backward inferences of DNNs into shift-and-add operations. Because the model parameters and backpropagated errors of a large DNN model are typically clustered around zero, these values can be approximated by their sine values. Multiplications between the weights and error signals are transferred to multiplications of their sine values, which are replaceable with simpler operations with the help of the product to sum formula. In addition, a rectified sine activation function is utilized for further converting layer inputs into sine values. In this way, the original multiplication-intensive operations can be computed through simple add-and-shift operations. This trigonometric approximation method provides an efficient training and inference alternative for devices with insufficient hardware multipliers. Experimental results demonstrate that this method is able to obtain a performance close to that of classical training algorithms. The approach we propose sheds new light on future hardware customization research for machine learning.

Download Full-text

On extended dissipativity analysis for neural networks with time-varying delay and general activation functions

Advances in Difference Equations ◽

10.1186/s13662-016-0769-7 ◽

2016 ◽

Vol 2016 (1) ◽

Cited By ~ 4

Author(s):

Xin Wang ◽

Kun She ◽

Shouming Zhong ◽

Jun Cheng

Keyword(s):

Neural Networks ◽

Time Varying ◽

Activation Functions ◽

Time Varying Delay ◽

General Activation ◽

Dissipativity Analysis ◽

Varying Delay

Download Full-text

Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems

IEEE Transactions on Neural Networks ◽

10.1109/72.392253 ◽

1995 ◽

Vol 6 (4) ◽

pp. 911-917 ◽

Cited By ~ 235

Author(s):

Tianping Chen ◽

Hong Chen

Keyword(s):

Neural Networks ◽

Dynamical Systems ◽

Universal Approximation ◽

Activation Functions ◽

Nonlinear Operators

Download Full-text

Improved Effort and Cost Estimation Model Using Artificial Neural Networks and Taguchi Method with Different Activation Functions

Entropy ◽

10.3390/e23070854 ◽

2021 ◽

Vol 23 (7) ◽

pp. 854

Author(s):

Nevena Rankovic ◽

Dragica Rankovic ◽

Mirjana Ivanovic ◽

Ljubomir Lazic

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Cost Estimation ◽

Time Estimation ◽

Effort Estimation ◽

Activation Functions ◽

Estimation Model ◽

Wide Range ◽

Software Product ◽

Artificial Neural

Software estimation involves meeting a huge number of different requirements, such as resource allocation, cost estimation, effort estimation, time estimation, and the changing demands of software product customers. Numerous estimation models try to solve these problems. In our experiment, a clustering method of input values to mitigate the heterogeneous nature of selected projects was used. Additionally, homogeneity of the data was achieved with the fuzzification method, and we proposed two different activation functions inside a hidden layer, during the construction of artificial neural networks (ANNs). In this research, we present an experiment that uses two different architectures of ANNs, based on Taguchi’s orthogonal vector plans, to satisfy the set conditions, with additional methods and criteria for validation of the proposed model, in this approach. The aim of this paper is the comparative analysis of the obtained results of mean magnitude relative error (MMRE) values. At the same time, our goal is also to find a relatively simple architecture that minimizes the error value while covering a wide range of different software projects. For this purpose, six different datasets are divided into four chosen clusters. The obtained results show that the estimation of diverse projects by dividing them into clusters can contribute to an efficient, reliable, and accurate software product assessment. The contribution of this paper is in the discovered solution that enables the execution of a small number of iterations, which reduces the execution time and achieves the minimum error.

Download Full-text

ACCELERATING TRAINING OF FEEDFORWARD NEURAL NETWORKS

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213094000170 ◽

1994 ◽

Vol 03 (03) ◽

pp. 339-348

Author(s):

CARL G. LOONEY

Keyword(s):

Neural Networks ◽

Local Minimum ◽

Random Search ◽

Feedforward Neural Networks ◽

Conjugate Gradients ◽

Activation Functions ◽

Problematic Behavior ◽

Methods And Techniques ◽

Adaptive Step ◽

Quality Learning

We review methods and techniques for training feedforward neural networks that avoid problematic behavior, accelerate the convergence, and verify the training. Adaptive step gain, bipolar activation functions, and conjugate gradients are powerful stabilizers. Random search techniques circumvent the local minimum trap and avoid specialization due to overtraining. Testing assures quality learning.

Download Full-text