Training Pi-Sigma Network by Online Gradient Algorithm with Penalty for Small Weight Update

2007 ◽  
Vol 19 (12) ◽  
pp. 3356-3368 ◽  
Author(s):  
Yan Xiong ◽  
Wei Wu ◽  
Xidai Kang ◽  
Chao Zhang

A pi-sigma network is a class of feedforward neural networks with product units in the output layer. The online gradient algorithm is the simplest and most often used training method for feedforward neural networks. A problem arises, however, when the online gradient algorithm is applied to pi-sigma networks: the update increment of the weights may become very small, especially early in training, resulting in very slow convergence. To overcome this difficulty, we introduce an adaptive penalty term into the error function, so as to increase the magnitude of the update increment of the weights when it is too small. This strategy brings about faster convergence, as shown by the numerical experiments carried out in this letter.
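
To make the difficulty concrete, the sketch below (Python with NumPy; all names and the penalty form are illustrative assumptions, not the letter's exact formulation) shows a pi-sigma forward pass and one online gradient step, with a schematic adaptive term that inflates the increment when it becomes too small.

import numpy as np

# Minimal pi-sigma network: K summing units feeding one product unit.
def forward(W, x):
    s = W @ x               # summing-layer outputs s_k = w_k . x
    return np.prod(s), s    # product unit: y = prod_k s_k

def online_step(W, x, t, lr=0.01, eps=1e-3, lam=0.1):
    y, s = forward(W, x)
    err = y - t                          # squared-error derivative factor
    grad = np.empty_like(W)
    for k in range(W.shape[0]):
        # dy/dw_k = (prod_{j != k} s_j) * x
        grad[k] = err * np.prod(np.delete(s, k)) * x
    # schematic stand-in for the adaptive penalty: boost the step
    # when the raw increment is very small
    norm = np.linalg.norm(grad)
    if norm < eps:
        grad *= 1.0 + lam / (norm + eps)
    return W - lr * grad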

Author(s):  
Nikolay Anatolievich Vershkov ◽  
Mikhail Grigoryevich Babenko ◽  
Viktor Andreevich Kuchukov ◽  
Natalia Nikolaevna Kuchukova

The article deals with the recognition of handwritten digits by feedforward neural networks (perceptrons) using a correlation indicator. The proposed method is based on a mathematical model of the neural network as an oscillatory system, analogous to an information transmission system. The article draws on the authors' earlier theoretical work on finding the global extremum of the error function in artificial neural networks. The handwritten digit image is treated as a one-dimensional discrete input signal combining a "perfect digit writing" with noise that describes the deviation of the input realization from the perfect writing. The ideal observer criterion (Kotelnikov criterion), which is widely used in information transmission systems and describes the probability of correctly recognizing the input signal, is used to form the loss function. The article presents a comparative analysis, on experimentally obtained sequences, of learning convergence under the correlation indicator and under the CrossEntropyLoss function widely used in classification tasks, both with and without an optimizer. Based on the experiments carried out, it is concluded that the proposed correlation indicator has a two- to threefold advantage.
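
As a rough illustration of a correlation indicator (a hypothetical Python/NumPy sketch; the templates, names, and decision rule are assumptions, not the article's exact construction), one can score a flattened digit signal against "perfect writing" templates by normalized cross-correlation and decide in ideal-observer fashion:

import numpy as np

def correlation_scores(x, templates):
    # x: (n,) flattened digit signal; templates: (10, n) "perfect" digits
    xc = x - x.mean()
    scores = []
    for t in templates:
        tc = t - t.mean()
        # normalized cross-correlation, in [-1, 1]
        scores.append(xc @ tc / (np.linalg.norm(xc) * np.linalg.norm(tc) + 1e-12))
    return np.array(scores)

def classify(x, templates):
    # ideal-observer style decision: the most correlated template wins
    return int(np.argmax(correlation_scores(x, templates)))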


2010 ◽  
Vol 2010 ◽  
pp. 1-27
Author(s):  
Huisheng Zhang ◽  
Dongpo Xu ◽  
Zhiping Wang

The online gradient method has been widely used in training neural networks. In this paper we consider an online split-complex gradient algorithm for complex-valued neural networks, with an adaptive learning rate chosen during the training procedure. Under certain conditions, by first showing the monotonicity of the error function, it is proved that the gradient of the error function tends to zero and the weight sequence tends to a fixed point. A numerical example is given to support the theoretical findings.
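
A minimal sketch of one split-complex gradient step for a single complex-valued neuron (Python/NumPy; the full network and the paper's adaptive learning-rate rule are simplified to a fixed rate here) treats the real and imaginary parts of the weights as separate real variables:

import numpy as np

def sig(u):
    return 1.0 / (1.0 + np.exp(-u))

def split_step(w, x, d, lr=0.1):
    # w, x: complex vectors; d: complex target
    u = np.dot(w, x)                       # net input (no conjugation)
    y = sig(u.real) + 1j * sig(u.imag)     # split-complex activation
    e = y - d
    gr = sig(u.real) * (1.0 - sig(u.real))   # sigmoid derivatives
    gi = sig(u.imag) * (1.0 - sig(u.imag))
    # gradient of 0.5*|e|^2 w.r.t. the real and imaginary weight parts
    dwr = e.real * gr * x.real + e.imag * gi * x.imag
    dwi = -e.real * gr * x.imag + e.imag * gi * x.real
    return w - lr * (dwr + 1j * dwi)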


2009 ◽  
Vol 2009 ◽  
pp. 1-22 ◽  
Author(s):  
C. D. Tilakaratne ◽  
M. A. Mammadov ◽  
S. A. Morris

The aim of this paper is to present modified neural network algorithms for predicting whether it is best to buy, hold, or sell shares (trading signals) of stock market indices. The most commonly used classification techniques are unsuccessful at predicting trading signals when the distribution of the actual trading signals among these three classes is imbalanced. The modified network algorithms are based on the structure of feedforward neural networks and a modified Ordinary Least Squares (OLS) error function. When modifying the OLS function, the study accounts both for an adjustment to the contribution of the historical data used for training the networks and for penalisation of incorrectly classified trading signals. A global optimization algorithm was employed to train these networks, which were then used to predict the trading signals of the Australian All Ordinary Index. The algorithms with the modified error functions introduced by this study produced better predictions.
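
A schematic of such a modified OLS error (a hedged Python/NumPy sketch; the recency weighting, the signal encoding, and the penalty factor are placeholders, not the paper's exact function) weights recent observations more heavily and penalises misclassified signals:

import numpy as np

def modified_ols(pred, actual, decay=0.99, penalty=2.0):
    # pred, actual: arrays of trading signals encoded as
    # sell = -1, hold = 0, buy = +1 (a placeholder encoding)
    n = len(pred)
    recency = decay ** np.arange(n - 1, -1, -1)   # newer data weigh more
    sq = (pred - actual) ** 2
    misclassified = np.rint(pred) != actual       # hypothetical mismatch test
    return np.sum(recency * sq * np.where(misclassified, penalty, 1.0))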


2020 ◽  
Vol 9 (1) ◽  
pp. 41-49
Author(s):  
Johanes Roisa Prabowo ◽  
Rukun Santoso ◽  
Hasbi Yasin

Housing is one aspect of social welfare that must be met, because a house is a basic human need alongside clothing and food. Whether a house provides good shelter can be judged from the structure and facilities of the building. This research aims to classify house conditions as livable or not livable. The method used is artificial neural networks (ANN), an information-processing system whose characteristics resemble those of biological neural networks. The optimization method used in this research is the conjugate gradient algorithm. The data come from the Survei Sosial Ekonomi Nasional (Susenas) March 2018 Kor Keterangan Perumahan for Cilacap Regency, divided into training and testing sets; the proportion giving the highest average accuracy is 90% training data and 10% testing data. The best architecture is a model consisting of 8 neurons in the input layer, 10 neurons in the hidden layer, and 1 neuron in the output layer, with a bipolar sigmoid activation function in the hidden layer and a binary sigmoid in the output layer. The results of the analysis show that the ANN works very well for classifying house conditions in Cilacap Regency, with an average accuracy of 98.96% in the training stage and 97.58% in the testing stage.
Keywords: House, Classification, Artificial Neural Networks, Conjugate Gradient
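
The reported architecture can be sketched as follows (Python with NumPy and SciPy; the dummy data and all variable names are placeholders): an 8-10-1 network with a bipolar sigmoid (tanh-shaped) hidden layer and a binary (logistic) sigmoid output, trained by SciPy's conjugate-gradient optimizer.

import numpy as np
from scipy.optimize import minimize

def unpack(theta):
    W1 = theta[:80].reshape(10, 8)      # hidden-layer weights (8 -> 10)
    b1 = theta[80:90]
    W2 = theta[90:100].reshape(1, 10)   # output-layer weights (10 -> 1)
    b2 = theta[100:]
    return W1, b1, W2, b2

def forward(theta, X):
    W1, b1, W2, b2 = unpack(theta)
    h = np.tanh(X @ W1.T + b1)                     # bipolar sigmoid
    return 1.0 / (1.0 + np.exp(-(h @ W2.T + b2)))  # binary sigmoid

def loss(theta, X, y):
    return np.mean((forward(theta, X).ravel() - y) ** 2)

# placeholders for the 8 housing indicators and livability labels
X = np.random.rand(100, 8)
y = np.random.randint(0, 2, 100).astype(float)
res = minimize(loss, 0.1 * np.random.randn(101), args=(X, y), method="CG")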


2012 ◽  
Vol 2012 ◽  
pp. 1-25 ◽  
Author(s):  
Dakun Yang ◽  
Wei Wu

In many applications, it is natural to use interval data to describe various kinds of uncertainty. This paper is concerned with an interval neural network with one hidden layer. As our numerical experiments indicate, the original interval neural network may cause the weights to oscillate during the learning procedure. In this paper, a smoothing interval neural network is proposed to prevent this weight oscillation. Here, by smoothing we mean that, in a neighborhood of the origin, the absolute values of the hidden-layer and output-layer weights are replaced by a smooth function of the weights. The convergence of a gradient algorithm for training the smoothing interval neural network is proved. Supporting numerical experiments are provided.
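
One common choice of such a smooth replacement (a Huber-style polynomial sketched below in Python/NumPy; the paper's specific function may differ) caps |w| with a quadratic in a neighborhood of the origin:

import numpy as np

def smooth_abs(w, eps=0.1):
    # |w| outside [-eps, eps]; inside, a quadratic that matches the
    # value and slope of |w| at +-eps, so the result is C^1 everywhere
    w = np.asarray(w, dtype=float)
    inner = w**2 / (2.0 * eps) + eps / 2.0
    return np.where(np.abs(w) <= eps, inner, np.abs(w))

At w = ±eps the quadratic equals eps with slope ±1, so no kink remains at the origin to drive the oscillation.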


1992 ◽  
Vol 4 (4) ◽  
pp. 473-493 ◽  
Author(s):  
Steven J. Nowlan ◽  
Geoffrey E. Hinton

One way of simplifying neural networks so they generalize better is to add an extra term to the error function that will penalize complexity. Simple versions of this approach include penalizing the sum of the squares of the weights or penalizing the number of nonzero weights. We propose a more complicated penalty term in which the distribution of weight values is modeled as a mixture of multiple gaussians. A set of weights is simple if the weights have high probability density under the mixture model. This can be achieved by clustering the weights into subsets with the weights in each cluster having very similar values. Since we do not know the appropriate means or variances of the clusters in advance, we allow the parameters of the mixture model to adapt at the same time as the network learns. Simulations on two different problems demonstrate that this complexity term is more effective than previous complexity terms.
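
The complexity term described here is the negative log probability of the weights under the adaptive mixture; a compact sketch (Python/NumPy; parameter shapes and the numerical guard are illustrative) is:

import numpy as np

def complexity_penalty(w, pi, mu, sigma):
    # w: (n,) network weights; pi, mu, sigma: (k,) mixture parameters,
    # all of which adapt during training alongside the weights
    w = np.asarray(w, dtype=float)[:, None]
    dens = pi * np.exp(-0.5 * ((w - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    # negative log likelihood of the weights under the mixture
    return -np.sum(np.log(dens.sum(axis=1) + 1e-12))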

