The Upstart Algorithm: A Method for Constructing and Training Feedforward Neural Networks

1990 ◽  
Vol 2 (2) ◽  
pp. 198-209 ◽  
Author(s):  
Marcus Frean

A general method for building and training multilayer perceptrons composed of linear threshold units is proposed. A simple recursive rule is used to build the structure of the network by adding units as they are needed, while a modified perceptron algorithm is used to learn the connection strengths. Convergence to zero errors is guaranteed for any Boolean classification on patterns of binary variables. Simulations suggest that this method is efficient in terms of the number of units constructed, and that the networks it builds can generalize to patterns not in the training set.
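To make the construction concrete, here is a minimal Python sketch of the two ingredients the abstract describes: a linear threshold unit trained by an error-driven perceptron-style rule, and a recursive "upstart" step that spawns daughter units to correct the parent's wrongly-on and wrongly-off patterns. The update rule, the dictionary node structure, and the max_depth guard are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def train_threshold_unit(X, y, epochs=100, lr=1.0):
    """Perceptron-style training of one linear threshold unit.
    X: (n_samples, n_features) binary patterns; y: 0/1 targets.
    A stand-in for the paper's modified perceptron rule."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            out = 1 if w @ x + b > 0 else 0
            if out != t:                      # update only on errors
                w += lr * (t - out) * x
                b += lr * (t - out)
    return w, b

def upstart(X, y, depth=0, max_depth=5):
    """Recursive sketch: train a unit, then build 'daughter' units
    targeted at the parent's wrongly-on / wrongly-off patterns."""
    w, b = train_threshold_unit(X, y)
    pred = (X @ w + b > 0).astype(int)
    wrongly_on = (pred == 1) & (y == 0)
    wrongly_off = (pred == 0) & (y == 1)
    node = {"w": w, "b": b, "on": None, "off": None}
    if depth < max_depth and wrongly_on.any():
        # daughter target: fire exactly on the parent's wrongly-on patterns
        node["on"] = upstart(X, wrongly_on.astype(int), depth + 1, max_depth)
    if depth < max_depth and wrongly_off.any():
        node["off"] = upstart(X, wrongly_off.astype(int), depth + 1, max_depth)
    return node
```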

Electronics ◽  
2021 ◽  
Vol 10 (22) ◽  
pp. 2761
Author(s):  
Vaios Ampelakiotis ◽  
Isidoros Perikos ◽  
Ioannis Hatzilygeroudis ◽  
George Tsihrintzis

In this paper, we present a handwritten character recognition (HCR) system that aims to recognize first-order logic handwritten formulas and create editable text files of the recognized formulas. Dense feedforward neural networks (NNs) are utilized, and their performance is examined under various training conditions and methods. More specifically, after testing three training algorithms (backpropagation, resilient propagation and stochastic gradient descent), we created and trained an NN with the stochastic gradient descent algorithm optimized by the Adam update rule, which proved to be the best, using a training set of 16,750 handwritten image samples of 28 × 28 pixels each and a test set of 7947 samples. The final accuracy achieved is 90.13%. The general methodology followed consists of two stages: image processing, and NN design and training. Finally, an application has been created that implements the methodology and automatically recognizes handwritten logic formulas. An interesting feature of the application is that it allows users to create new, user-oriented training sets and parameter settings, and thus new NN models.
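As a rough illustration of the training setup described (a dense feedforward network on 28 × 28 images, trained by mini-batch stochastic gradient descent with Adam updates), the following PyTorch sketch may help; the hidden-layer width and the class count n_classes are placeholders, since the paper's exact architecture is not given in the abstract.

```python
import torch
from torch import nn

# Hypothetical sizes: 28x28 inputs flattened to 784; n_classes is a
# placeholder for the number of logic symbols being recognized.
n_classes = 36

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 256),
    nn.ReLU(),
    nn.Linear(256, n_classes),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # SGD with Adam updates

def train_epoch(loader):
    """One pass of mini-batch stochastic gradient descent with Adam."""
    for images, labels in loader:        # images: (batch, 1, 28, 28)
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```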


1997 ◽  
Vol 9 (1) ◽  
pp. 1-42 ◽  
Author(s):  
Sepp Hochreiter ◽  
Jürgen Schmidhuber

We present a new algorithm for finding low-complexity neural networks with high generalization capability. The algorithm searches for a “flat” minimum of the error function: a large connected region in weight space where the error remains approximately constant. An MDL-based, Bayesian argument suggests that flat minima correspond to “simple” networks and low expected overfitting. The argument is based on a Gibbs algorithm variant and a novel way of splitting generalization error into underfitting and overfitting error. Unlike many previous approaches, ours does not require Gaussian assumptions and does not depend on a “good” weight prior. Instead, we have a prior over input-output functions, thus taking into account net architecture and training set. Although our algorithm requires the computation of second-order derivatives, it has backpropagation's order of complexity. It automatically and effectively prunes units, weights, and input lines. Various experiments with feedforward and recurrent nets are described. In an application to stock market prediction, flat minimum search outperforms conventional backprop, weight decay, and “optimal brain surgeon/optimal brain damage.”
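The paper's flat-minimum search has its own second-order objective; as a loose proxy for the underlying idea of preferring weight regions where the output is insensitive to weight perturbations, one can add a sensitivity penalty to the data-fit loss, as in this sketch. The penalty form and lambda_flat are illustrative assumptions, not the paper's algorithm.

```python
import torch

def flatness_penalty(model, x):
    """Crude flatness proxy (not the paper's exact term): penalize the
    squared gradient of the summed output w.r.t. the weights, i.e. how
    sharply the output reacts to small weight perturbations."""
    out = model(x).sum()
    grads = torch.autograd.grad(out, model.parameters(), create_graph=True)
    return sum((g ** 2).sum() for g in grads)

def total_loss(model, x, y, loss_fn, lambda_flat=1e-4):
    # data-fit error plus the flatness term; lambda_flat is a
    # hypothetical hyperparameter, not taken from the paper
    return loss_fn(model(x), y) + lambda_flat * flatness_penalty(model, x)
```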


2020 ◽  
pp. 105971231989648 ◽  
Author(s):  
David Windridge ◽  
Henrik Svensson ◽  
Serge Thill

We consider the benefits of dream mechanisms – that is, the ability to simulate new experiences based on past ones – in a machine learning context. Specifically, we are interested in learning for artificial agents that act in the world, and operationalize “dreaming” as a mechanism by which such an agent can use its own model of the learning environment to generate new hypotheses and training data. We first show that it is not necessarily a given that such a data-hallucination process is useful, since it can easily lead to a training set dominated by spurious imagined data until an ill-defined convergence point is reached. We then analyse a notably successful implementation of a machine learning-based dreaming mechanism by Ha and Schmidhuber (Ha, D., & Schmidhuber, J. (2018). World models. arXiv e-prints, arXiv:1803.10122). On that basis, we then develop a general framework by which an agent can generate simulated data to learn from in a manner that is beneficial to the agent. This, we argue, then forms a general method for an operationalized dream-like mechanism. We finish by demonstrating the general conditions under which such mechanisms can be useful in machine learning, wherein the implicit simulator inference and extrapolation involved in dreaming act without reinforcing inference error even when inference is incomplete.
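The interface below is a hypothetical sketch of such an operationalized dreaming loop: an agent learns from a mixture of real episodes and rollouts hallucinated by its own world model, with periodic re-grounding on real data to avoid the spurious-data failure mode discussed above. All object and method names (agent.update, world_model.rollout, and so on) are assumptions for illustration, not the framework from the paper.

```python
import random

def dream_training(agent, world_model, real_episodes, dream_ratio=0.5, steps=1000):
    """Sketch of a dream-like mechanism: mix real experience with
    trajectories generated by the agent's own model of the environment."""
    for _ in range(steps):
        if random.random() < dream_ratio:
            # "dream": roll out the learned model from a remembered start state
            start = random.choice(real_episodes).initial_state
            episode = world_model.rollout(start, policy=agent.policy)
        else:
            episode = random.choice(real_episodes)
        agent.update(episode)
        # guard against the failure mode noted above: periodically re-ground
        # the world model on real data so dreamed data cannot dominate
        world_model.fit(real_episodes)
```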


2000 ◽  
Vol 12 (4) ◽  
pp. 811-829 ◽  
Author(s):  
Eric Hartman

Inaccurate input-output gains (partial derivatives of outputs with respect to inputs) are common in neural network models when input variables are correlated or when data are incomplete or inaccurate. Accurate gains are essential for optimization, control, and other purposes. We develop and explore a method for training feedforward neural networks subject to inequality or equality-bound constraints on the gains of the learned mapping. Gain constraints are implemented as penalty terms added to the objective function, and training is done using gradient descent. Adaptive and robust procedures are devised for balancing the relative strengths of the various terms in the objective function, which is essential when the constraints are inconsistent with the data. The approach has the virtue that the model domain of validity can be extended via extrapolation training, which can dramatically improve generalization. The algorithm is demonstrated here on artificial and real-world problems with very good results and has been advantageously applied to dozens of models currently in commercial use.
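A minimal sketch of the penalty-term idea, assuming a single-output network and fixed (rather than adaptively balanced) penalty weights: the gains dy/dx are obtained by automatic differentiation, and bound violations are squared and added to the data-fit objective. The bound tensors and lam are illustrative parameters, not the paper's adaptive and robust balancing procedures.

```python
import torch

def gain_penalty(model, x, lower, upper):
    """Penalty on gains dy/dx that violate per-input bounds.
    Assumes a single-output model so the summed output's gradient
    gives the per-sample gain vector."""
    x = x.clone().requires_grad_(True)
    y = model(x).sum()
    (gain,) = torch.autograd.grad(y, x, create_graph=True)  # dy/dx per sample
    over = torch.relu(gain - upper)    # violation above the upper bound
    under = torch.relu(lower - gain)   # violation below the lower bound
    return (over ** 2 + under ** 2).mean()

def objective(model, x, y_true, loss_fn, lower, upper, lam=1.0):
    # data-fit term plus gain penalty; the paper balances these
    # adaptively, here lam is a fixed hypothetical weight
    return loss_fn(model(x), y_true) + lam * gain_penalty(model, x, lower, upper)
```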


MAUSAM ◽  
2022 ◽  
Vol 53 (2) ◽  
pp. 225-232
Author(s):  
PANKAJ JAIN ◽  
ASHOK KUMAR ◽  
PARVINDER MAINI ◽  
S. V. SINGH

Feedforward neural networks are used for daily precipitation forecasting at several test stations across India. Six years of European Centre for Medium-Range Weather Forecasts (ECMWF) data are used, with the training set consisting of the four years of data from 1985-1988 and the validation set consisting of the data from 1989-1990. Neural networks are used to develop a concurrent relationship between precipitation and other atmospheric variables. No attempt is made to select optimal variables for this study; the inputs are chosen to be the same as those obtained earlier at the National Centre for Medium Range Weather Forecasting (NCMRWF) in developing a linear regression model. The neural networks are found to yield results at least as good as linear regression, and in several cases yield a 10-20% improvement. This is encouraging, since the variable selection has so far been optimized for linear regression.
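A schematic of the comparison protocol, under the assumption of standard library models (the paper's actual network configuration is not specified in the abstract): both the network and the linear regression baseline receive the same predictor variables, mirroring the study's reuse of inputs selected for the regression model.

```python
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

def compare(X_train, y_train, X_val, y_val):
    """Fit both models on identical atmospheric predictors and score
    them on the held-out years. Layer size is illustrative."""
    lin = LinearRegression().fit(X_train, y_train)
    net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000).fit(X_train, y_train)
    return {
        "linear_mse": mean_squared_error(y_val, lin.predict(X_val)),
        "network_mse": mean_squared_error(y_val, net.predict(X_val)),
    }
```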


2017 ◽  
Vol 10 (1) ◽  
pp. 01-10
Author(s):  
Kostantin Nikolic

This paper presents the application of stochastic search algorithms to the training of artificial neural networks. The methodology was developed primarily to train complex recurrent neural networks, since training recurrent networks is known to be harder than training feedforward networks. The recurrent network is simulated to propagate signals from input to output, and the training process is realized as a stochastic search in parameter space. The performance of this type of algorithm is superior to that of most gradient-based training algorithms. The efficiency of these algorithms is demonstrated by training networks built from units characterized by long short-term memory (LSTM). The presented methodology is effective and relatively simple.
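A minimal sketch of stochastic search applied to network training, assuming a greedy accept-if-better variant: the parameter vector is perturbed at random, the recurrent network is simulated from input to output to score the candidate, and improvements are kept. The step size sigma and the acceptance rule are illustrative choices, not the paper's specific algorithm.

```python
import numpy as np

def stochastic_search(loss_fn, theta, sigma=0.1, iters=10_000, rng=None):
    """Random-search training loop: perturb the flat parameter vector,
    keep the perturbation only if the rollout loss improves.
    loss_fn simulates the network from input to output and returns
    a scalar error; no gradients are needed."""
    rng = rng or np.random.default_rng(0)
    best = loss_fn(theta)
    for _ in range(iters):
        candidate = theta + sigma * rng.standard_normal(theta.shape)
        c_loss = loss_fn(candidate)
        if c_loss < best:          # accept only improvements (greedy variant)
            theta, best = candidate, c_loss
    return theta, best
```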


2020 ◽  
Vol 9 ◽  
pp. 266
Author(s):  
E. Mavrommatis ◽  
S. Athanassopoulos ◽  
A. Dakos ◽  
K. A. Gernoth ◽  
J. W. Clark

Multilayer feedforward neural networks are used to create global models of atomic masses and lifetimes of nuclear states, with the goal of effective prediction of the properties of nuclides outside the region of stability. Innovations in coding and training schemes are used to improve the extrapolation capability of models of the mass table. Studies of nuclear lifetimes have focused on ground states that decay 100% via the β⁻ mode. Results are described which demonstrate that, in predictive acuity, statistical approaches to global modeling based on neural networks are potentially competitive with the best phenomenological models based on the traditional methods of theoretical physics.
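For orientation, a hedged sketch of what a neural-network global mass model can look like: a small feedforward network regressing a nuclide property from an encoding of proton and neutron numbers. The input features, layer sizes, and activation are assumptions for illustration; the paper's coding-scheme innovations are not reproduced here.

```python
import torch
from torch import nn

# Hypothetical encoding: proton and neutron numbers (Z, N) plus parity
# flags as inputs; the regression target would be, e.g., the mass excess.
def encode(Z, N):
    return torch.tensor([Z, N, Z % 2, N % 2], dtype=torch.float32)

mass_model = nn.Sequential(
    nn.Linear(4, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),            # predicted nuclide property
)
```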

