ReLU Network with Bounded Width Is a Universal Approximator in View of an Approximate Identity

2021 ◽  
Vol 11 (1) ◽  
pp. 427
Author(s):  
Sunghwan Moon

Deep neural networks have shown very successful performance in a wide range of tasks, but the theory of why they work so well is still at an early stage. Recently, the expressive power of neural networks, important for understanding deep learning, has received considerable attention. Classic results, provided by Cybenko, Barron, etc., state that a network with a single hidden layer and a suitable activation function is a universal approximator. A few years ago, researchers started to study how width affects the expressiveness of neural networks, i.e., to seek a universal approximation theorem for a deep neural network with a Rectified Linear Unit (ReLU) activation function and bounded width. Here, we show how any continuous function on a compact set of $\mathbb{R}^{n_{in}}$, $n_{in} \in \mathbb{N}$, can be approximated by a ReLU network having hidden layers with at most $n_{in} + 5$ nodes in view of an approximate identity.
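As a rough illustration of why ReLU units can approximate continuous functions, the sketch below builds localized "hat" bumps from three ReLUs each and sums them to interpolate a one-dimensional target. This is the classic wide, single-hidden-layer picture, not the bounded-width construction of the paper; the names `hat` and `relu_interpolant`, the sine target, and the grid sizes are all assumptions made for the example.

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit, applied elementwise."""
    return np.maximum(x, 0.0)

def hat(x, center, h):
    """Piecewise-linear bump of height 1 and half-width h, built from three ReLUs."""
    return (relu(x - center + h) - 2.0 * relu(x - center) + relu(x - center - h)) / h

def relu_interpolant(f, a, b, n):
    """Approximate f on [a, b] by a sum of n + 1 hat functions (linear interpolation)."""
    nodes = np.linspace(a, b, n + 1)
    h = (b - a) / n
    values = f(nodes)

    def approx(x):
        x = np.asarray(x, dtype=float)
        return sum(v * hat(x, c, h) for v, c in zip(values, nodes))

    return approx

def target(x):
    """Hypothetical continuous target on the compact set [0, 1]."""
    return np.sin(2.0 * np.pi * x)

xs = np.linspace(0.0, 1.0, 1001)
# Maximum error on the grid for an increasing number of bumps.
errors = {
    n: float(np.max(np.abs(relu_interpolant(target, 0.0, 1.0, n)(xs) - target(xs))))
    for n in (8, 32, 128)
}
```

The error in `errors` shrinks as more bumps are used, which is the sense in which such sums are dense in the continuous functions on a compact interval. Here the width grows with the accuracy, whereas the result above keeps the width fixed and lets the depth grow.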

2019 ◽  
Vol 12 (3) ◽  
pp. 156-161 ◽  
Author(s):  
Aman Dureja ◽  
Payal Pahwa

Background: Activation functions play an important role in building deep neural networks. The choice of activation function also affects the network in terms of optimization and the quality of the results. Several activation functions have been introduced in machine learning for many practical applications, but which activation function should be used at the hidden layers of deep neural networks has not been identified. Objective: The primary objective of this analysis was to describe which activation function should be used at the hidden layers of deep neural networks to solve complex non-linear problems. Methods: The comparative model was configured using a two-class (Cat/Dog) dataset. The network used 3 convolutional layers, with a pooling layer after each convolutional layer. The dataset was divided into two parts: the first 8000 images were used for training the network and the next 2000 images for testing it. Results: The experimental comparison was done by analyzing the network with different activation functions on each layer of the CNN. The validation error and accuracy on the Cat/Dog dataset were analyzed using the activation functions ReLU, Tanh, SELU, PReLU, and ELU at the hidden layers. Overall, ReLU gave the best performance, with a validation loss of 0.3912 and a validation accuracy of 0.8320 at the 25th epoch. Conclusion: A CNN model with ReLU hidden layers (3 hidden layers here) gives the best results and improves overall performance in terms of accuracy and speed. These advantages of ReLU at the hidden layers of a CNN are helpful for effective and fast retrieval of images from databases.
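For reference, the five activation functions compared in this abstract can be written in a few lines of NumPy. This is only a sketch of their standard definitions, not the authors' code: the PReLU slope is a learned parameter in practice and is fixed here at an assumed value of 0.25, and the SELU constants are the usual published ones.

```python
import numpy as np

def relu(x):
    """max(0, x)."""
    return np.maximum(x, 0.0)

def tanh(x):
    """Hyperbolic tangent, with outputs in (-1, 1)."""
    return np.tanh(x)

def elu(x, alpha=1.0):
    """x for x > 0, alpha * (exp(x) - 1) otherwise."""
    # np.minimum keeps expm1 from overflowing on large positive inputs.
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0.0)))

def selu(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    """Scaled ELU with the standard self-normalizing constants."""
    return scale * elu(x, alpha)

def prelu(x, slope=0.25):
    """x for x > 0, slope * x otherwise; slope is assumed fixed in this sketch."""
    return np.where(x > 0, x, slope * x)

sample = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
outputs = {f.__name__: f(sample) for f in (relu, tanh, elu, selu, prelu)}
```

ReLU, ELU, and PReLU agree for positive inputs and differ only in how they treat negative ones, which is where the gradient behaviour of the hidden layers diverges.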


Author(s):  
Volodymyr Shymkovych ◽  
Sergii Telenyk ◽  
Petro Kravets

This article introduces a method for realizing the Gaussian activation function of radial-basis function (RBF) neural networks in hardware on field-programmable gate arrays (FPGAs). The results of modeling the Gaussian function on FPGA chips of different families are presented. RBF neural networks of various topologies have been synthesized and investigated. The hardware component implemented by this algorithm is an RBF neural network with four neurons in the hidden layer and one neuron with a sigmoid activation function, realized on an FPGA using 16-bit fixed-point numbers, which took 1193 look-up tables (LUTs). Each hidden layer neuron of the RBF network is designed on the FPGA as a separate computing unit. The speed, measured as the total delay of the combinational circuit of the RBF network block, was 101.579 ns. The implementation of the Gaussian activation functions of the hidden layer occupies 106 LUTs, and the delay of the Gaussian activation functions is 29.33 ns. The absolute error is ± 0.005. These results were obtained by modeling on the Spartan 3 family of chips; modeling on chips of other series is also presented in the article. Hardware implementation of RBF neural networks with such speed allows them to be used in real-time control systems for high-speed objects.
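The abstract does not say how the Gaussian is realized in logic, so the following is only a generic software sketch of one common approach: a precomputed look-up table addressed by a truncated 16-bit fixed-point input. The Q4.12 format, the table size of 1024 entries, the input range, and every name below are assumptions for illustration, not details taken from the article.

```python
import math

FRAC_BITS = 12                 # assumed Q4.12 layout inside a 16-bit signed word
ONE = 1 << FRAC_BITS           # fixed-point representation of 1.0
INDEX_SHIFT = 4                # drop 4 low bits, so the table step is 2**-8
X_MAX = 4 * ONE                # table covers 0 <= |x| < 4; beyond that exp(-x^2) ~ 0

# Precomputed samples of exp(-x^2), stored as fixed-point integers.
TABLE = [
    round(math.exp(-(((i << INDEX_SHIFT) / ONE) ** 2)) * ONE)
    for i in range(X_MAX >> INDEX_SHIFT)
]

def to_fixed(x):
    """Convert a float to the assumed Q4.12 integer representation."""
    return int(round(x * ONE))

def gaussian_fixed(x_fixed):
    """Table-based exp(-x^2) on fixed-point integers, using the symmetry in x."""
    magnitude = abs(x_fixed)
    if magnitude >= X_MAX:
        return 0
    return TABLE[magnitude >> INDEX_SHIFT]

def gaussian(x):
    """Float wrapper around the fixed-point table look-up."""
    return gaussian_fixed(to_fixed(x)) / ONE

# Largest deviation from the exact Gaussian over a dense sweep of inputs.
max_abs_error = max(
    abs(gaussian(k / 1000.0) - math.exp(-((k / 1000.0) ** 2)))
    for k in range(-5000, 5001)
)
```

With these assumed sizes the sweep stays inside the ± 0.005 band quoted above, but that is a property of the chosen table step, not evidence about the authors' design. A hardware version would trade table depth against LUT usage and delay.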


2022 ◽  
pp. 202-226
Author(s):  
Leema N. ◽  
Khanna H. Nehemiah ◽  
Elgin Christo V. R. ◽  
Kannan A.

Artificial neural networks (ANN) are widely used for classification, and the training algorithm commonly used is the backpropagation (BP) algorithm. The major bottleneck faced in the backpropagation neural network training is in fixing the appropriate values for network parameters. The network parameters are initial weights, biases, activation function, number of hidden layers and the number of neurons per hidden layer, number of training epochs, learning rate, minimum error, and momentum term for the classification task. The objective of this work is to investigate the performance of 12 different BP algorithms with the impact of variations in network parameter values for the neural network training. The algorithms were evaluated with different training and testing samples taken from the three benchmark clinical datasets, namely, Pima Indian Diabetes (PID), Hepatitis, and Wisconsin Breast Cancer (WBC) dataset obtained from the University of California Irvine (UCI) machine learning repository.


Agriculture ◽  
2020 ◽  
Vol 10 (11) ◽  
pp. 567
Author(s):  
Jolanta Wawrzyniak

Artificial neural networks (ANNs) constitute a promising modeling approach that may be used in control systems for postharvest preservation and storage processes. The study investigated the ability of multilayer perceptron and radial-basis function ANNs to predict fungal population levels in bulk stored rapeseeds with various temperatures (T = 12–30 °C) and water activity in seeds (aw = 0.75–0.90). The neural network model input included aw, temperature, and time, whilst the fungal population level was the model output. During the model construction, networks with a different number of hidden layer neurons and different configurations of activation functions in neurons of the hidden and output layers were examined. The best architecture was the multilayer perceptron ANN, in which the hyperbolic tangent function acted as an activation function in the hidden layer neurons, while the linear function was the activation function in the output layer neuron. The developed structure exhibits high prediction accuracy and high generalization capability. The model provided in the research may be readily incorporated into control systems for postharvest rapeseed preservation and storage as a support tool, which based on easily measurable on-line parameters can estimate the risk of fungal development and thus mycotoxin accumulation.
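The architecture singled out in this abstract, a hyperbolic tangent hidden layer followed by a linear output neuron, has a simple forward pass, sketched below. The hidden layer size, the random weights, and the sample input values are placeholders invented for the example; the trained parameters of the published model are not given in the abstract, and real inputs would normally be scaled before use.

```python
import numpy as np

def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer perceptron: tanh hidden units, linear output unit."""
    hidden = np.tanh(inputs @ w_hidden + b_hidden)
    return hidden @ w_out + b_out

rng = np.random.default_rng(0)
n_inputs = 3    # water activity, temperature, time (as listed in the abstract)
n_hidden = 6    # hypothetical; the abstract does not state the chosen size

# Random placeholder parameters standing in for trained weights.
w_hidden = rng.normal(size=(n_inputs, n_hidden))
b_hidden = rng.normal(size=n_hidden)
w_out = rng.normal(size=(n_hidden, 1))
b_out = rng.normal(size=1)

# Two made-up input rows: [aw, temperature, time].
batch = np.array([[0.80, 18.0, 10.0],
                  [0.90, 30.0, 25.0]])
prediction = mlp_forward(batch, w_hidden, b_hidden, w_out, b_out)
```

Because the output neuron is linear, the prediction is not squashed into a fixed interval, which suits a regression target such as a population level.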


1996 ◽  
Vol 8 (1) ◽  
pp. 164-177 ◽  
Author(s):  
H. N. Mhaskar

We prove that neural networks with a single hidden layer are capable of providing an optimal order of approximation for functions assumed to possess a given number of derivatives, if the activation function evaluated by each principal element satisfies certain technical conditions. Under these conditions, it is also possible to construct networks that provide a geometric order of approximation for analytic target functions. The permissible activation functions include the squashing function $(1 + e^{-x})^{-1}$ as well as a variety of radial basis functions. Our proofs are constructive. The weights and thresholds of our networks are chosen independently of the target function; we give explicit formulas for the coefficients as simple, continuous, linear functionals of the target function.


Author(s):  
M. HARLY ◽  
I. N. SUTANTRA ◽  
H. P. MAURIDHI

Fixed order neural networks (FONN), such as the high order neural network (HONN), whose architecture is built from a zero-order activation function and zero-order joint weights, adjust only the number of weights and their values. As a result, such a network produces only a fixed-order model or control level. This limitation gives earlier architectures only a finite ability to adapt to the uncertain character of a real-world plant, such as driving dynamics and its desired control performance. This paper introduces a new concept of the neural network neuron, in which a discrete z-function is used to build the neuron activation. Instead of zero-order joint weight matrices, a discrete z-function weight matrix is provided to represent an uncertain or undetermined real-world plant and a desired adaptive control system whose order may be changing. Instead of a bias, an initial condition value is used. A neural network using the new neurons is called a Varied Order Neural Network (VONN). In the optimization process, the order, coefficients, and initial value of the node activation function are updated using GA, while the joint weights are updated using both back propagation (combined LSE-Gauss-Newton) and NPSO. To estimate the number of hidden layers, constructive back propagation (CBP) was also applied. A thorough simulation was conducted to compare the control performance of FONN and MONN. To control vehicle stability, the vehicle was equipped with an electronic stability program (ESP), electronic four-wheel steering (4-EWS), and active suspension (AS). Data sets of 2000, 4000, 6000, and 8000 samples from TODS, together with a hidden layer, 3 input nodes, and 3 output nodes, were used to train and test the networks for both the uncertainty model and its adaptive control system. The simulation results show that stability parameters such as yaw rate error, vehicle side slip error, and rolling angle error yield better control performance, in the form of a smaller performance index, using FDNN than using MONN.


2007 ◽  
Vol 17 (05) ◽  
pp. 419-424 ◽  
Author(s):  
JINLING LONG ◽  
WEI WU ◽  
DONG NAN

This paper studies the Lp approximation capabilities of sum-of-product (SOPNN) and sigma-pi-sigma (SPSNN) neural networks. It is proved that the set of functions that are generated by the SOPNN with its activation function in [Formula: see text] is dense in [Formula: see text] for any compact set [Formula: see text], if and only if the activation function is not a polynomial almost everywhere. It is also shown that if the activation function of the SPSNN is in [Formula: see text], then the functions generated by the SPSNN are dense in [Formula: see text] if and only if the activation function is not a constant (a.e.).


Acta Numerica ◽  
1994 ◽  
Vol 3 ◽  
pp. 145-202 ◽  
Author(s):  
S.W. Ellacott

This article starts with a brief introduction to neural networks for those unfamiliar with the basic concepts, together with a very brief overview of mathematical approaches to the subject. This is followed by a more detailed look at three areas of research which are of particular interest to numerical analysts. The first area is approximation theory. If $K$ is a compact set in $\mathbb{R}^n$, for some $n$, then it is proved that a semilinear feedforward network with one hidden layer can uniformly approximate any continuous function in $C(K)$ to any required accuracy. A discussion of known results and open questions on the degree of approximation is included. We also consider the relevance of radial basis functions to neural networks. The second area considered is that of learning algorithms. A detailed analysis of one popular algorithm (the delta rule) will be given, indicating why one implementation leads to a stable numerical process, whereas an initially attractive variant (essentially a form of steepest descent) does not. Similar considerations apply to the backpropagation algorithm. The effect of filtering and other preprocessing of the input data will also be discussed systematically. Finally some applications of neural networks to numerical computation are considered.
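The delta rule mentioned in this abstract can be sketched for a single linear unit as follows. This is the textbook per-sample (Widrow-Hoff) update under assumed settings; it is not claimed to be the specific implementation or variant analyzed in the article, and the learning rate, epoch count, and synthetic data are arbitrary choices for the example.

```python
import numpy as np

def delta_rule(inputs, targets, learning_rate=0.05, epochs=200):
    """Train a single linear unit with the per-sample delta rule."""
    weights = np.zeros(inputs.shape[1])
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            error = t - weights @ x              # delta: target minus current output
            weights += learning_rate * error * x  # move weights along the input pattern
    return weights

# Synthetic, noise-free data generated from known weights (assumed for the demo).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_weights = np.array([1.5, -2.0, 0.5])
y = X @ true_weights

learned_weights = delta_rule(X, y)
```

Each update corrects the weights using one pattern at a time, so stability depends on the learning rate relative to the size of the input patterns, which is the kind of question the numerical analysis described above addresses.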


Author(s):  
Chandu Nereeksha

Today, heart disease is a major cause of the increasing mortality rate, especially considering the current health situation. Improving health conditions using the latest technology contributes enormously to the healthcare industry. One such improvement is the use of machine learning in detecting heart disease. Machine learning has seen wide-ranging advances in neural networks (NN). Artificial neural networks are inspired by the working of the neural networks inside the human brain. Our study aims to use different algorithms and technologies to predict heart disease at an early stage. Different data mining algorithms, namely Decision Tree, K-means clustering, Back-propagation, and Random Forest, are used. The system classifies data into different stages such as normal, mild, or extreme.

