Deep neural network based on generalized neo-fuzzy neurons and its learning based on backpropagation

2021
Vol 26 (jai2021.26(1))
pp. 32-41
Author(s):
Bodyanskiy Y.
Antonenko T.

Modern deep neural networks face a number of issues related to the learning process and computational cost. This article considers an architecture grounded on an alternative approach to the basic unit of the neural network. This approach reduces the amount of computation and offers an alternative way to address the vanishing and exploding gradient problems. The core of the article is a deep stacked neo-fuzzy system that uses generalized neo-fuzzy neurons to optimize the learning process. Since this approach is non-standard from a theoretical point of view, the paper presents the necessary mathematical derivations and describes the practical intricacies of using this architecture. The network learning process is fully disclosed, and all calculations required to train the network with the backpropagation algorithm are derived. A feature of the network is the fast computation of the derivatives of the neuron activation functions, achieved through the use of fuzzy membership functions. The paper shows that the derivative of such a function is a constant, which supports the claim of a higher optimization rate compared with networks built from neurons with more common activation functions (ReLU, sigmoid). The paper also highlights the main points that can be improved in further theoretical work on this topic; in general, these concern the calculation of the activation function. The proposed methods address these points and allow approximation with the network, and the authors already have theoretical justifications for further improving its speed and approximation properties. The results of a comparison of the proposed network with standard neural network architectures are shown.
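The key computational point above, that a neo-fuzzy neuron's output is linear in its tunable weights and that triangular membership functions have piecewise-constant derivatives, can be illustrated with a minimal sketch of a plain (non-generalized) neo-fuzzy neuron. The layer sizes, membership-function centers, and weights below are illustrative assumptions, not the authors' configuration.

```python
import numpy as np

def triangular_mf(x, centers):
    """Evaluate triangular membership functions with the given centers.

    Neighbouring triangles overlap so memberships on each interval sum to 1;
    outside [centers[0], centers[-1]] the input is clipped to the boundary.
    """
    mu = np.zeros(len(centers))
    x = np.clip(x, centers[0], centers[-1])
    # locate the interval [c_k, c_{k+1}] containing x
    k = min(np.searchsorted(centers, x, side="right") - 1, len(centers) - 2)
    left, right = centers[k], centers[k + 1]
    mu[k] = (right - x) / (right - left)
    mu[k + 1] = (x - left) / (right - left)
    return mu

def neo_fuzzy_neuron(x_vec, weights, centers):
    """y = sum_i sum_j w[i, j] * mu_j(x_i): linear in the weights, so
    dy/dw[i, j] = mu_j(x_i) and dy/dx_i is piecewise constant."""
    return sum(weights[i] @ triangular_mf(x_i, centers)
               for i, x_i in enumerate(x_vec))

# toy usage: 2 inputs, 5 membership functions per input
rng = np.random.default_rng(0)
centers = np.linspace(-1.0, 1.0, 5)
W = rng.normal(size=(2, 5))
print(neo_fuzzy_neuron([0.3, -0.7], W, centers))
```

Because each input activates only two neighbouring membership functions, both the forward pass and the weight gradients reduce to a handful of multiplications, which is the source of the claimed speed-up over sigmoid- or ReLU-based neurons.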

Author(s):  
Wang Haoxiang
Smys S

Recently, deep neural networks (DNN) have demonstrated strong performance in the pattern recognition paradigm. Research on DNN covers network depth, filters, and training and testing datasets. Deep neural networks also provide solutions for nonlinear partial differential equations (PDE). This research article considers networks with many candidate activation functions per neuron: the function used at each node is selected so as to minimize the classification error. This is the motivation for an adaptive activation function for deep neural networks, in which the activation function is adapted at every neuron of the network to reduce the classification error during training. The article discusses a scaling factor for the activation function that provides better optimization of the process under dynamic changes of the procedure. The proposed adaptive activation function has better learning capability than a fixed activation function in any neural network. The article compares the convergence rate, early training behaviour, and accuracy against existing methods. In addition, this work provides deeper insight into the learning process of various neural networks. The learning process is tested on solutions spanning various frequency bands, and both forward and inverse problems for the parameters of the governing equation are identified. The proposed method has a very simple architecture, and its efficiency, robustness, and accuracy are high when nonlinear functions are considered. The overall classification performance is improved in the resulting networks, which have been trained with common datasets. The proposed work is compared with recent findings in neuroscience research and shows better performance.
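As a rough sketch of a trainable, per-neuron activation with a scaling factor, the snippet below uses tanh as the base function, a fixed scaling factor n, and one slope parameter per neuron; these specifics are assumptions for illustration rather than the article's exact formulation.

```python
import numpy as np

def adaptive_tanh(x, a, n=10.0):
    """Adaptive activation sigma(n * a * x): the slope `a` is a trainable
    parameter (here one per neuron) and `n` is a fixed scaling factor."""
    return np.tanh(n * a * x)

def grad_wrt_slope(x, a, n=10.0):
    """d sigma / d a = n * x * (1 - tanh(n a x)^2); this gradient updates `a`
    alongside the ordinary weight gradients during backpropagation."""
    t = np.tanh(n * a * x)
    return n * x * (1.0 - t ** 2)

# toy usage: per-neuron slopes for a layer of width 4
x = np.array([0.2, -1.3, 0.5, 0.0])
a = np.ones(4)                      # initialised so the effective slope starts at n
y = adaptive_tanh(x, a)
a -= 0.01 * grad_wrt_slope(x, a)    # one illustrative step with a unit upstream gradient
```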


2022
pp. 202-226
Author(s):  
Leema N.
Khanna H. Nehemiah
Elgin Christo V. R.
Kannan A.

Artificial neural networks (ANN) are widely used for classification, and the training algorithm commonly used is the backpropagation (BP) algorithm. The major bottleneck in backpropagation neural network training is fixing appropriate values for the network parameters: the initial weights, biases, activation function, number of hidden layers, number of neurons per hidden layer, number of training epochs, learning rate, minimum error, and momentum term for the classification task. The objective of this work is to investigate the performance of 12 different BP algorithms and the impact of variations in network parameter values on neural network training. The algorithms were evaluated with different training and testing samples taken from three benchmark clinical datasets, namely the Pima Indian Diabetes (PID), Hepatitis, and Wisconsin Breast Cancer (WBC) datasets obtained from the University of California Irvine (UCI) machine learning repository.
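A compact way to picture such a study is a grid over the varied network parameters, as in the hedged sketch below. It uses scikit-learn's MLPClassifier and a synthetic stand-in for the clinical data; the chapter itself works with the UCI files and twelve specific BP variants, which are not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a clinical dataset such as PID (8 features, binary label);
# in practice the UCI files would be loaded instead.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# A small grid over the network parameters the chapter varies:
# hidden-layer size, learning rate, and momentum term.
configs = [
    {"hidden_layer_sizes": (h,), "learning_rate_init": lr, "momentum": m}
    for h in (5, 10, 20) for lr in (0.01, 0.1) for m in (0.5, 0.9)
]
for cfg in configs:
    clf = MLPClassifier(solver="sgd", activation="logistic",
                        max_iter=500, random_state=0, **cfg).fit(X_tr, y_tr)
    print(cfg, "test accuracy: %.3f" % clf.score(X_te, y_te))
```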


2022
Vol 6 (POPL)
pp. 1-29
Author(s):  
Zi Wang
Aws Albarghouthi
Gautam Prakriya
Somesh Jha

To verify safety and robustness of neural networks, researchers have successfully applied abstract interpretation, primarily using the interval abstract domain. In this paper, we study the theoretical power and limits of the interval domain for neural-network verification. First, we introduce the interval universal approximation (IUA) theorem. IUA shows that neural networks not only can approximate any continuous function f (universal approximation), as we have known for decades, but that we can also find a neural network, using any well-behaved activation function, whose interval bounds are an arbitrarily close approximation of the set semantics of f (the result of applying f to a set of inputs). We call this notion of approximation interval approximation. Our theorem generalizes the recent result of Baader et al. from ReLUs to a rich class of activation functions that we call squashable functions. Additionally, the IUA theorem implies that we can always construct provably robust neural networks under the ℓ∞-norm using almost any practical activation function. Second, we study the computational complexity of constructing neural networks that are amenable to precise interval analysis. This is a crucial question, as our constructive proof of IUA is exponential in the size of the approximation domain. We boil this question down to the problem of approximating the range of a neural network with squashable activation functions. We show that the range approximation problem (RA) is a Δ2-intermediate problem, which is strictly harder than NP-complete problems, assuming coNP ⊄ NP. As a result, IUA is an inherently hard problem: no matter what abstract domain or computational tools we use to achieve interval approximation, there is no efficient construction of such a universal approximator. This implies that it is hard to construct a provably robust network, even if we have a robust network to start with.
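For readers unfamiliar with the interval abstract domain, the following sketch shows how interval bounds are propagated through a small feed-forward ReLU network; the toy weights and input box are illustrative and unrelated to the paper's constructions of universal interval approximators.

```python
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate an axis-aligned box [lo, hi] through x -> W x + b exactly."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def interval_relu(lo, hi):
    """Monotone activations map interval endpoints to interval endpoints."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

def interval_forward(layers, lo, hi):
    """Sound (over-approximate) output bounds for a feed-forward ReLU network."""
    for W, b in layers:
        lo, hi = interval_relu(*interval_affine(lo, hi, W, b))
    return lo, hi

# toy network: 2 -> 3 -> 2, input box centred at (0.5, -0.5) with radius 0.1
rng = np.random.default_rng(1)
layers = [(rng.normal(size=(3, 2)), np.zeros(3)),
          (rng.normal(size=(2, 3)), np.zeros(2))]
centre, eps = np.array([0.5, -0.5]), 0.1
print(interval_forward(layers, centre - eps, centre + eps))
```

The bounds are sound but generally loose, which is exactly the gap the IUA theorem addresses: it asks when a network's interval bounds can be made arbitrarily close to the true set semantics.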


2019
Author(s):
Vladimír Kunc
Jiří Kléma

Motivation: Gene expression profiling was made cheaper by the NIH LINCS program, which profiles only ~1,000 selected landmark genes and uses them to reconstruct the whole profile. The D–GEX method employs neural networks to infer the whole profile. However, the original D–GEX can be further significantly improved.
Results: We have analyzed the D–GEX method and determined that the inference can be improved using a logistic sigmoid activation function instead of the hyperbolic tangent. Moreover, we propose a novel transformative adaptive activation function that improves the gene expression inference even further and generalizes several existing adaptive activation functions. Our improved neural network achieves an average mean absolute error of 0.1340, a significant improvement over our reimplementation of the original D–GEX, which achieves an average mean absolute error of 0.1637.
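One common way to write such a transformative adaptive activation is an affine transformation applied both inside and outside a base function; the exact parameterisation below (four trainable parameters around a logistic sigmoid) is an assumption for illustration, not necessarily the paper's definition.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def transformative_adaptive(x, alpha=1.0, beta=1.0, gamma=0.0, delta=0.0, g=logistic):
    """One possible transformative adaptive activation: the base function g is
    rescaled and shifted by four trainable parameters,
        f(x) = alpha * g(beta * x + gamma) + delta.
    With (alpha, beta, gamma, delta) = (1, 1, 0, 0) it reduces to plain g, so
    several fixed and adaptive activations appear as special cases."""
    return alpha * g(beta * x + gamma) + delta

x = np.linspace(-3, 3, 7)
print(transformative_adaptive(x))                          # plain logistic sigmoid
print(transformative_adaptive(x, alpha=2.0, delta=-1.0))   # rescaled to (-1, 1), tanh-like shape
```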


Author(s):  
M. G. Epitropakis
V. P. Plagianakos
Michael N. Vrahatis

This chapter aims to further explore the capabilities of the class of Higher Order Neural Networks, and especially the Pi-Sigma Neural Networks. The performance of Pi-Sigma networks is evaluated on several well-known neural network training benchmarks. In the experiments reported here, Distributed Evolutionary Algorithms are used for Pi-Sigma neural network training; more specifically, distributed versions of the Differential Evolution and Particle Swarm Optimization algorithms have been employed. To this end, each processor of a distributed computing environment is assigned a subpopulation of potential solutions. The subpopulations evolve independently in parallel, and occasional migration is allowed to facilitate cooperation between them. The novelty of the proposed approach is that it trains Pi-Sigma networks with threshold activation functions, while the weights and biases are confined to a narrow band of integers (constrained to the range [-32, 32]). Thus, the trained Pi-Sigma neural networks can be represented using only 6 bits. Such networks are better suited for hardware implementation than real-weight ones, a property that is very important in real-life applications. Experimental results suggest that the proposed training process is fast, stable and reliable, and that the distributed trained Pi-Sigma networks exhibit good generalization capabilities.
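As a reminder of the architecture being trained, the sketch below implements the forward pass of a single Pi-Sigma unit with a threshold output activation and integer weights confined to [-32, 32]; the order, input size, and random weights are illustrative, and the distributed DE/PSO training itself is not shown.

```python
import numpy as np

def threshold(x):
    """Hard-limit (threshold) activation used at the output unit."""
    return np.where(x >= 0.0, 1.0, 0.0)

def pi_sigma_forward(x, W, b):
    """Pi-Sigma unit: a product of K linear 'sigma' terms passed through the
    output activation,  y = f( prod_k (w_k . x + b_k) ).  Weights and biases
    are integers confined to [-32, 32], as in the chapter."""
    sums = W @ x + b          # K summing units
    return threshold(np.prod(sums))

# toy usage: 4 inputs, order-3 Pi-Sigma unit with random integer weights
rng = np.random.default_rng(2)
W = rng.integers(-32, 33, size=(3, 4)).astype(float)
b = rng.integers(-32, 33, size=3).astype(float)
print(pi_sigma_forward(np.array([1.0, 0.0, 1.0, 1.0]), W, b))
```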


Author(s):  
Jamilu Adamu

Activation functions are crucial parts of deep learning artificial neural networks. From the biological point of view, a neuron is just a node with many inputs and one output, and a neural network consists of many interconnected neurons: a "simple" device that receives data at the input and provides a response. The function of neurons is to process and transmit information; the neuron is the basic unit of the nervous system. Carly Vandergriendt (2018) stated that the human brain at birth consists of an estimated 100 billion neurons. The ability of a machine to mimic human intelligence is called machine learning. Deep learning artificial neural networks were designed to work like a human brain with the aid of an arbitrary choice of non-linear activation functions. Currently, there is no rule of thumb for the choice of activation functions beyond "try out different things and see what combinations lead to the best performance"; however, the choice of activation functions should not be trial and error. Jamilu (2019) proposed that activation functions should emanate from the AI-ML-purified data set and that their choice should satisfy Jameel's ANNAF Stochastic and/or Deterministic Criterion. The objective of this paper is to propose instances where deep learning artificial neural networks are superintelligent. Using Jameel's ANNAF Stochastic and/or Deterministic Criterion, the paper proposes four classes in which deep learning artificial neural networks are superintelligent, namely Stochastic Superintelligent, Deterministic Superintelligent, and Stochastic-Deterministic 1st and 2nd Levels Superintelligence. A Normal Probabilistic-Deterministic case is also proposed.


Author(s):  
Kun Huang
Bingbing Ni
Xiaokang Yang

Quantization has shown stunning efficiency on deep neural networks, especially for portable devices with limited resources. Most existing works uncritically extend weight quantization methods to activations. However, we take the view that the best performance is obtained by applying different quantization methods to weights and activations respectively. In this paper, we design a new activation function, dubbed CReLU, from the quantization perspective and complement this design with an appropriate initialization method and training procedure. Moreover, we develop a specific quantization strategy in which we formulate the forward and backward approximation of weights with binary values and quantize the activations to low bitwidth using a linear or logarithmic quantizer. We show, for the first time, that our final quantized model with binary weights and ultra-low-bitwidth activations outperforms the previous best models by large margins on ImageNet, while achieving nearly a 10.85× theoretical speedup with ResNet-18. Furthermore, ablation experiments and theoretical analysis demonstrate the effectiveness and robustness of CReLU in comparison with other activation functions.
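The paper's CReLU is read here as a clipped ReLU paired with a uniform (linear) activation quantizer and binarized weights; the clipping bound, bit width, and scaling scheme in the sketch below are assumptions for illustration rather than the authors' exact design.

```python
import numpy as np

def clipped_relu(x, alpha=1.0):
    """Assumed form of the activation: a ReLU clipped at a bound alpha,
    which keeps activations in [0, alpha] so they quantize well."""
    return np.clip(x, 0.0, alpha)

def linear_quantize(x, bits=2, alpha=1.0):
    """Uniform (linear) quantizer mapping [0, alpha] onto 2**bits levels."""
    levels = 2 ** bits - 1
    return np.round(np.clip(x, 0.0, alpha) / alpha * levels) / levels * alpha

def binarize_weights(w):
    """Binary weight approximation: sign(w) scaled by the mean magnitude."""
    return np.sign(w) * np.mean(np.abs(w))

x = np.array([-0.4, 0.1, 0.37, 0.9, 1.8])
print(linear_quantize(clipped_relu(x), bits=2))      # 2-bit activations
print(binarize_weights(np.array([0.3, -0.7, 0.05]))) # binary weights
```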


2010
Vol 2010
pp. 1-20
Author(s):
Florin Leon
Mihai Horia Zaharia

A hybrid model for time series forecasting is proposed. It is a stacked neural network containing two multilayer perceptrons: a standard one with bipolar sigmoid activation functions, and another with an exponential activation function in the output layer. As shown by the case studies, the proposed stacked hybrid neural model performs well on a variety of benchmark time series. The combination weights of the two stack components that lead to optimal performance are also studied.
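A minimal sketch of such a stack is given below; the hidden-layer sizes, the convex combination of the two components, and the random weights are assumptions for illustration, with only the bipolar-sigmoid and exponential-output choices taken from the abstract.

```python
import numpy as np

def bipolar_sigmoid(x):
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

def mlp(x, W1, b1, W2, b2, out_act):
    """One hidden layer with bipolar sigmoid units; `out_act` sets the output
    activation (identity for the plain MLP, exp for the hybrid component)."""
    h = bipolar_sigmoid(W1 @ x + b1)
    return out_act(W2 @ h + b2)

def stacked_forecast(x, params_a, params_b, c=0.5):
    """Combine the two stack components with a weight c in [0, 1]; c stands in
    for the combination weights the paper tunes for optimal performance."""
    y_a = mlp(x, *params_a, out_act=lambda z: z)   # linear-output MLP
    y_b = mlp(x, *params_b, out_act=np.exp)        # exponential-output MLP
    return c * y_a + (1.0 - c) * y_b

# toy usage on a window of 4 lagged observations
rng = np.random.default_rng(3)
p = lambda: (rng.normal(size=(6, 4)), np.zeros(6), rng.normal(size=(1, 6)), np.zeros(1))
print(stacked_forecast(rng.normal(size=4), p(), p()))
```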


Filomat
2020
Vol 34 (15)
pp. 5009-5018
Author(s):
Lei Ding
Lin Xiao
Kaiqing Zhou
Yonghong Lan
Yongsheng Zhang

Compared to the linear activation function, a suitable nonlinear activation function can accelerate the convergence speed. Based on this finding, we propose in this paper two modified Zhang neural network (ZNN) models using different nonlinear activation functions to tackle complex-valued systems of linear equations (CVSLE). To this end, we first propose a novel neural network, called the NRNN-SBP model, by introducing the sign-bi-power activation function. We then propose another novel neural network, called the NRNN-IRN model, by introducing a tunable activation function. Finally, simulation results demonstrate that the convergence speed of NRNN-SBP and NRNN-IRN is faster than that of the FTRNN model. These results also reveal that different nonlinear activation functions have different effects on the convergence rate for different CVSLE problems.
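The sign-bi-power activation referred to above is widely used in finite-time ZNN models; the sketch below shows one standard form applied elementwise to a complex-valued error, with the exponent r and the exact NRNN-SBP formulation treated as assumptions.

```python
import numpy as np

def sign_bi_power(e, r=0.5):
    """Sign-bi-power activation in a common finite-time ZNN form:
        psi(e) = |e|**r * sgn(e) + |e|**(1/r) * sgn(e),  with 0 < r < 1.
    For a complex error, sgn(e) = e / |e|, so the function reshapes the error's
    magnitude while preserving its phase."""
    mag = np.abs(e)
    safe = np.where(mag == 0, 1.0, mag)   # avoid division by zero; output is 0 there anyway
    unit = e / safe
    return (mag ** r + mag ** (1.0 / r)) * unit

def linear_act(e):
    """Baseline linear activation for comparison."""
    return e

# small errors are amplified and large errors are driven harder than linearly,
# which is the mechanism behind the faster convergence claims
e = np.array([0.04 + 0.03j, 1.5 - 2.0j])
print(linear_act(e))
print(sign_bi_power(e))
```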

