Training Pi-Sigma Network by Online Gradient Algorithm with Penalty for Small Weight Update

2007 ◽  
Vol 19 (12) ◽  
pp. 3356-3368 ◽  
Author(s):  
Yan Xiong ◽  
Wei Wu ◽  
Xidai Kang ◽  
Chao Zhang

A pi-sigma network is a class of feedforward neural networks with product units in the output layer. The online gradient algorithm is the simplest and most often used training method for feedforward neural networks. A problem arises, however, when the online gradient algorithm is applied to pi-sigma networks: the update increment of the weights may become very small, especially early in training, resulting in very slow convergence. To overcome this difficulty, we introduce an adaptive penalty term into the error function, so as to increase the magnitude of the update increment of the weights when it is too small. This strategy brings about faster convergence, as shown by the numerical experiments carried out in this letter.
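
To make the difficulty concrete, the sketch below (Python with NumPy; all names and the penalty form are illustrative assumptions, not the letter's exact formulation) shows a pi-sigma forward pass and one online gradient step, with a schematic adaptive term that inflates the increment when it becomes too small.

import numpy as np

# Minimal pi-sigma network: K summing units feeding one product unit.
def forward(W, x):
    s = W @ x               # summing-layer outputs s_k = w_k . x
    return np.prod(s), s    # product unit: y = prod_k s_k

def online_step(W, x, t, lr=0.01, eps=1e-3, lam=0.1):
    y, s = forward(W, x)
    err = y - t                          # squared-error derivative factor
    grad = np.empty_like(W)
    for k in range(W.shape[0]):
        # dy/dw_k = (prod_{j != k} s_j) * x
        grad[k] = err * np.prod(np.delete(s, k)) * x
    # schematic stand-in for the adaptive penalty: boost the step
    # when the raw increment is very small
    norm = np.linalg.norm(grad)
    if norm < eps:
        grad *= 1.0 + lam / (norm + eps)
    return W - lr * grad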

Author(s):  
Nikolay Anatolievich Vershkov ◽  
Mikhail Grigoryevich Babenko ◽  
Viktor Andreevich Kuchukov ◽  
Natalia Nikolaevna Kuchukova

The article deals with the recognition of handwritten digits by feedforward neural networks (perceptrons) using a correlation indicator. The proposed method is based on a mathematical model of the neural network as an oscillatory system, analogous to an information transmission system. The article draws on the authors' earlier theoretical work on finding the global extremum of the error function in artificial neural networks. The handwritten digit image is treated as a one-dimensional discrete input signal combining a "perfect digit writing" with noise that describes the deviation of the input realization from the perfect writing. The ideal observer criterion (Kotelnikov criterion), which is widely used in information transmission systems and describes the probability of correctly recognizing the input signal, is used to form the loss function. The article presents a comparative analysis, on experimentally obtained sequences, of learning convergence under the correlation indicator and under the CrossEntropyLoss function widely used in classification tasks, both with and without an optimizer. Based on the experiments carried out, it is concluded that the proposed correlation indicator has a two- to threefold advantage.
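
As a rough illustration of a correlation indicator (a hypothetical Python/NumPy sketch; the templates, names, and decision rule are assumptions, not the article's exact construction), one can score a flattened digit signal against "perfect writing" templates by normalized cross-correlation and decide in ideal-observer fashion:

import numpy as np

def correlation_scores(x, templates):
    # x: (n,) flattened digit signal; templates: (10, n) "perfect" digits
    xc = x - x.mean()
    scores = []
    for t in templates:
        tc = t - t.mean()
        # normalized cross-correlation, in [-1, 1]
        scores.append(xc @ tc / (np.linalg.norm(xc) * np.linalg.norm(tc) + 1e-12))
    return np.array(scores)

def classify(x, templates):
    # ideal-observer style decision: the most correlated template wins
    return int(np.argmax(correlation_scores(x, templates)))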


2010 ◽  
Vol 2010 ◽  
pp. 1-27
Author(s):  
Huisheng Zhang ◽  
Dongpo Xu ◽  
Zhiping Wang

The online gradient method has been widely used in training neural networks. In this paper we consider an online split-complex gradient algorithm for complex-valued neural networks, with an adaptive learning rate chosen during the training procedure. Under certain conditions, by first showing the monotonicity of the error function, it is proved that the gradient of the error function tends to zero and the weight sequence tends to a fixed point. A numerical example is given to support the theoretical findings.
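
A minimal sketch of one split-complex gradient step for a single complex-valued neuron (Python/NumPy; the full network and the paper's adaptive learning-rate rule are simplified to a fixed rate here) treats the real and imaginary parts of the weights as separate real variables:

import numpy as np

def sig(u):
    return 1.0 / (1.0 + np.exp(-u))

def split_step(w, x, d, lr=0.1):
    # w, x: complex vectors; d: complex target
    u = np.dot(w, x)                       # net input (no conjugation)
    y = sig(u.real) + 1j * sig(u.imag)     # split-complex activation
    e = y - d
    gr = sig(u.real) * (1.0 - sig(u.real))   # sigmoid derivatives
    gi = sig(u.imag) * (1.0 - sig(u.imag))
    # gradient of 0.5*|e|^2 w.r.t. the real and imaginary weight parts
    dwr = e.real * gr * x.real + e.imag * gi * x.imag
    dwi = -e.real * gr * x.imag + e.imag * gi * x.real
    return w - lr * (dwr + 1j * dwi)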


2009 ◽  
Vol 2009 ◽  
pp. 1-22 ◽  
Author(s):  
C. D. Tilakaratne ◽  
M. A. Mammadov ◽  
S. A. Morris

The aim of this paper is to present modified neural network algorithms for predicting whether it is best to buy, hold, or sell shares (trading signals) of stock market indices. The most commonly used classification techniques are unsuccessful at predicting trading signals when the distribution of the actual trading signals among these three classes is imbalanced. The modified network algorithms are based on the structure of feedforward neural networks and a modified Ordinary Least Squares (OLS) error function. When modifying the OLS function, the study accounts both for an adjustment to the contribution of the historical data used for training the networks and for penalisation of incorrectly classified trading signals. A global optimization algorithm was employed to train these networks, which were then used to predict the trading signals of the Australian All Ordinary Index. The algorithms with the modified error functions introduced by this study produced better predictions.
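
A schematic of such a modified OLS error (a hedged Python/NumPy sketch; the recency weighting, the signal encoding, and the penalty factor are placeholders, not the paper's exact function) weights recent observations more heavily and penalises misclassified signals:

import numpy as np

def modified_ols(pred, actual, decay=0.99, penalty=2.0):
    # pred, actual: arrays of trading signals encoded as
    # sell = -1, hold = 0, buy = +1 (a placeholder encoding)
    n = len(pred)
    recency = decay ** np.arange(n - 1, -1, -1)   # newer data weigh more
    sq = (pred - actual) ** 2
    misclassified = np.rint(pred) != actual       # hypothetical mismatch test
    return np.sum(recency * sq * np.where(misclassified, penalty, 1.0))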


2020 ◽  
Vol 9 (1) ◽  
pp. 41-49
Author(s):  
Johanes Roisa Prabowo ◽  
Rukun Santoso ◽  
Hasbi Yasin

Housing is one aspect of social welfare that must be met, because a house is a basic human need alongside clothing and food. Whether a house provides good shelter can be judged from the structure and facilities of the building. This research aims to classify house conditions as livable or not livable. The method used is artificial neural networks (ANN), an information-processing system whose characteristics resemble those of biological neural networks. The optimization method used in this research is the conjugate gradient algorithm. The data come from the Survei Sosial Ekonomi Nasional (Susenas) March 2018 Kor Keterangan Perumahan for Cilacap Regency, divided into training and testing sets; the proportion giving the highest average accuracy is 90% training data and 10% testing data. The best architecture is a model consisting of 8 neurons in the input layer, 10 neurons in the hidden layer, and 1 neuron in the output layer, with a bipolar sigmoid activation function in the hidden layer and a binary sigmoid in the output layer. The results of the analysis show that the ANN works very well for classifying house conditions in Cilacap Regency, with an average accuracy of 98.96% in the training stage and 97.58% in the testing stage.
Keywords: House, Classification, Artificial Neural Networks, Conjugate Gradient
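
The reported architecture can be sketched as follows (Python with NumPy and SciPy; the dummy data and all variable names are placeholders): an 8-10-1 network with a bipolar sigmoid (tanh-shaped) hidden layer and a binary (logistic) sigmoid output, trained by SciPy's conjugate-gradient optimizer.

import numpy as np
from scipy.optimize import minimize

def unpack(theta):
    W1 = theta[:80].reshape(10, 8)      # hidden-layer weights (8 -> 10)
    b1 = theta[80:90]
    W2 = theta[90:100].reshape(1, 10)   # output-layer weights (10 -> 1)
    b2 = theta[100:]
    return W1, b1, W2, b2

def forward(theta, X):
    W1, b1, W2, b2 = unpack(theta)
    h = np.tanh(X @ W1.T + b1)                     # bipolar sigmoid
    return 1.0 / (1.0 + np.exp(-(h @ W2.T + b2)))  # binary sigmoid

def loss(theta, X, y):
    return np.mean((forward(theta, X).ravel() - y) ** 2)

# placeholders for the 8 housing indicators and livability labels
X = np.random.rand(100, 8)
y = np.random.randint(0, 2, 100).astype(float)
res = minimize(loss, 0.1 * np.random.randn(101), args=(X, y), method="CG")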


2012 ◽  
Vol 2012 ◽  
pp. 1-25 ◽  
Author(s):  
Dakun Yang ◽  
Wei Wu

In many applications, it is natural to use interval data to describe various kinds of uncertainty. This paper is concerned with an interval neural network with one hidden layer. As our numerical experiments indicate, the original interval neural network may cause the weights to oscillate during the learning procedure. In this paper, a smoothing interval neural network is proposed to prevent this weight oscillation. Here, by smoothing we mean that, in a neighborhood of the origin, the absolute values of the hidden-layer and output-layer weights are replaced by a smooth function of the weights. The convergence of a gradient algorithm for training the smoothing interval neural network is proved. Supporting numerical experiments are provided.
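
One common choice of such a smooth replacement (a Huber-style polynomial sketched below in Python/NumPy; the paper's specific function may differ) caps |w| with a quadratic in a neighborhood of the origin:

import numpy as np

def smooth_abs(w, eps=0.1):
    # |w| outside [-eps, eps]; inside, a quadratic that matches the
    # value and slope of |w| at +-eps, so the result is C^1 everywhere
    w = np.asarray(w, dtype=float)
    inner = w**2 / (2.0 * eps) + eps / 2.0
    return np.where(np.abs(w) <= eps, inner, np.abs(w))

At w = ±eps the quadratic equals eps with slope ±1, so no kink remains at the origin to drive the oscillation.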


1992 ◽  
Vol 4 (4) ◽  
pp. 473-493 ◽  
Author(s):  
Steven J. Nowlan ◽  
Geoffrey E. Hinton

One way of simplifying neural networks so they generalize better is to add an extra term to the error function that will penalize complexity. Simple versions of this approach include penalizing the sum of the squares of the weights or penalizing the number of nonzero weights. We propose a more complicated penalty term in which the distribution of weight values is modeled as a mixture of multiple gaussians. A set of weights is simple if the weights have high probability density under the mixture model. This can be achieved by clustering the weights into subsets with the weights in each cluster having very similar values. Since we do not know the appropriate means or variances of the clusters in advance, we allow the parameters of the mixture model to adapt at the same time as the network learns. Simulations on two different problems demonstrate that this complexity term is more effective than previous complexity terms.
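
The complexity term described here is the negative log probability of the weights under the adaptive mixture; a compact sketch (Python/NumPy; parameter shapes and the numerical guard are illustrative) is:

import numpy as np

def complexity_penalty(w, pi, mu, sigma):
    # w: (n,) network weights; pi, mu, sigma: (k,) mixture parameters,
    # all of which adapt during training alongside the weights
    w = np.asarray(w, dtype=float)[:, None]
    dens = pi * np.exp(-0.5 * ((w - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    # negative log likelihood of the weights under the mixture
    return -np.sum(np.log(dens.sum(axis=1) + 1e-12))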

