The Upstart Algorithm: A Method for Constructing and Training Feedforward Neural Networks

1990 ◽  
Vol 2 (2) ◽  
pp. 198-209 ◽  
Author(s):  
Marcus Frean

A general method for building and training multilayer perceptrons composed of linear threshold units is proposed. A simple recursive rule is used to build the structure of the network by adding units as they are needed, while a modified perceptron algorithm is used to learn the connection strengths. Convergence to zero errors is guaranteed for any Boolean classification on patterns of binary variables. Simulations suggest that this method is efficient in terms of the number of units constructed, and that the networks it builds can generalize to patterns not in the training set.
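To make the construction concrete, here is a minimal Python sketch of the two ingredients the abstract describes: a linear threshold unit trained by an error-driven perceptron-style rule, and a recursive "upstart" step that spawns daughter units to correct the parent's wrongly-on and wrongly-off patterns. The update rule, the dictionary node structure, and the max_depth guard are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def train_threshold_unit(X, y, epochs=100, lr=1.0):
    """Perceptron-style training of one linear threshold unit.
    X: (n_samples, n_features) binary patterns; y: 0/1 targets.
    A stand-in for the paper's modified perceptron rule."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            out = 1 if w @ x + b > 0 else 0
            if out != t:                      # update only on errors
                w += lr * (t - out) * x
                b += lr * (t - out)
    return w, b

def upstart(X, y, depth=0, max_depth=5):
    """Recursive sketch: train a unit, then build 'daughter' units
    targeted at the parent's wrongly-on / wrongly-off patterns."""
    w, b = train_threshold_unit(X, y)
    pred = (X @ w + b > 0).astype(int)
    wrongly_on = (pred == 1) & (y == 0)
    wrongly_off = (pred == 0) & (y == 1)
    node = {"w": w, "b": b, "on": None, "off": None}
    if depth < max_depth and wrongly_on.any():
        # daughter target: fire exactly on the parent's wrongly-on patterns
        node["on"] = upstart(X, wrongly_on.astype(int), depth + 1, max_depth)
    if depth < max_depth and wrongly_off.any():
        node["off"] = upstart(X, wrongly_off.astype(int), depth + 1, max_depth)
    return node
```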

Electronics ◽  
2021 ◽  
Vol 10 (22) ◽  
pp. 2761
Author(s):  
Vaios Ampelakiotis ◽  
Isidoros Perikos ◽  
Ioannis Hatzilygeroudis ◽  
George Tsihrintzis

In this paper, we present a handwritten character recognition (HCR) system that aims to recognize first-order logic handwritten formulas and create editable text files of the recognized formulas. Dense feedforward neural networks (NNs) are utilized, and their performance is examined under various training conditions and methods. More specifically, after testing three training algorithms (backpropagation, resilient propagation and stochastic gradient descent), we created and trained an NN with the stochastic gradient descent algorithm optimized by the Adam update rule, which proved to be the best, using a training set of 16,750 handwritten image samples of 28 × 28 pixels each and a test set of 7947 samples. The final accuracy achieved is 90.13%. The general methodology followed consists of two stages: image processing, and NN design and training. Finally, an application has been created that implements the methodology and automatically recognizes handwritten logic formulas. An interesting feature of the application is that it allows users to create new, user-oriented training sets and parameter settings, and thus new NN models.
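As a rough illustration of the training setup described (a dense feedforward network on 28 × 28 images, trained by mini-batch stochastic gradient descent with Adam updates), the following PyTorch sketch may help; the hidden-layer width and the class count n_classes are placeholders, since the paper's exact architecture is not given in the abstract.

```python
import torch
from torch import nn

# Hypothetical sizes: 28x28 inputs flattened to 784; n_classes is a
# placeholder for the number of logic symbols being recognized.
n_classes = 36

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 256),
    nn.ReLU(),
    nn.Linear(256, n_classes),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # SGD with Adam updates

def train_epoch(loader):
    """One pass of mini-batch stochastic gradient descent with Adam."""
    for images, labels in loader:        # images: (batch, 1, 28, 28)
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```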


1997 ◽  
Vol 9 (1) ◽  
pp. 1-42 ◽  
Author(s):  
Sepp Hochreiter ◽  
Jürgen Schmidhuber

We present a new algorithm for finding low-complexity neural networks with high generalization capability. The algorithm searches for a “flat” minimum of the error function: a large connected region in weight space where the error remains approximately constant. An MDL-based, Bayesian argument suggests that flat minima correspond to “simple” networks and low expected overfitting. The argument is based on a Gibbs algorithm variant and a novel way of splitting generalization error into underfitting and overfitting error. Unlike many previous approaches, ours does not require Gaussian assumptions and does not depend on a “good” weight prior. Instead, we have a prior over input-output functions, thus taking into account net architecture and training set. Although our algorithm requires the computation of second-order derivatives, it has backpropagation's order of complexity. It automatically and effectively prunes units, weights, and input lines. Various experiments with feedforward and recurrent nets are described. In an application to stock market prediction, flat minimum search outperforms conventional backprop, weight decay, and “optimal brain surgeon/optimal brain damage.”
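The paper's flat-minimum search has its own second-order objective; as a loose proxy for the underlying idea of preferring weight regions where the output is insensitive to weight perturbations, one can add a sensitivity penalty to the data-fit loss, as in this sketch. The penalty form and lambda_flat are illustrative assumptions, not the paper's algorithm.

```python
import torch

def flatness_penalty(model, x):
    """Crude flatness proxy (not the paper's exact term): penalize the
    squared gradient of the summed output w.r.t. the weights, i.e. how
    sharply the output reacts to small weight perturbations."""
    out = model(x).sum()
    grads = torch.autograd.grad(out, model.parameters(), create_graph=True)
    return sum((g ** 2).sum() for g in grads)

def total_loss(model, x, y, loss_fn, lambda_flat=1e-4):
    # data-fit error plus the flatness term; lambda_flat is a
    # hypothetical hyperparameter, not taken from the paper
    return loss_fn(model(x), y) + lambda_flat * flatness_penalty(model, x)
```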


2020 ◽  
pp. 105971231989648 ◽  
Author(s):  
David Windridge ◽  
Henrik Svensson ◽  
Serge Thill

We consider the benefits of dream mechanisms – that is, the ability to simulate new experiences based on past ones – in a machine learning context. Specifically, we are interested in learning for artificial agents that act in the world, and operationalize “dreaming” as a mechanism by which such an agent can use its own model of the learning environment to generate new hypotheses and training data. We first show that it is not necessarily a given that such a data-hallucination process is useful, since it can easily lead to a training set dominated by spurious imagined data until an ill-defined convergence point is reached. We then analyse a notably successful implementation of a machine learning-based dreaming mechanism by Ha and Schmidhuber (Ha, D., & Schmidhuber, J. (2018). World models. arXiv e-prints, arXiv:1803.10122). On that basis, we then develop a general framework by which an agent can generate simulated data to learn from in a manner that is beneficial to the agent. This, we argue, then forms a general method for an operationalized dream-like mechanism. We finish by demonstrating the general conditions under which such mechanisms can be useful in machine learning, wherein the implicit simulator inference and extrapolation involved in dreaming act without reinforcing inference error even when inference is incomplete.
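The interface below is a hypothetical sketch of such an operationalized dreaming loop: an agent learns from a mixture of real episodes and rollouts hallucinated by its own world model, with periodic re-grounding on real data to avoid the spurious-data failure mode discussed above. All object and method names (agent.update, world_model.rollout, and so on) are assumptions for illustration, not the framework from the paper.

```python
import random

def dream_training(agent, world_model, real_episodes, dream_ratio=0.5, steps=1000):
    """Sketch of a dream-like mechanism: mix real experience with
    trajectories generated by the agent's own model of the environment."""
    for _ in range(steps):
        if random.random() < dream_ratio:
            # "dream": roll out the learned model from a remembered start state
            start = random.choice(real_episodes).initial_state
            episode = world_model.rollout(start, policy=agent.policy)
        else:
            episode = random.choice(real_episodes)
        agent.update(episode)
        # guard against the failure mode noted above: periodically re-ground
        # the world model on real data so dreamed data cannot dominate
        world_model.fit(real_episodes)
```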


2000 ◽  
Vol 12 (4) ◽  
pp. 811-829 ◽  
Author(s):  
Eric Hartman

Inaccurate input-output gains (partial derivatives of outputs with respect to inputs) are common in neural network models when input variables are correlated or when data are incomplete or inaccurate. Accurate gains are essential for optimization, control, and other purposes. We develop and explore a method for training feedforward neural networks subject to inequality or equality-bound constraints on the gains of the learned mapping. Gain constraints are implemented as penalty terms added to the objective function, and training is done using gradient descent. Adaptive and robust procedures are devised for balancing the relative strengths of the various terms in the objective function, which is essential when the constraints are inconsistent with the data. The approach has the virtue that the model domain of validity can be extended via extrapolation training, which can dramatically improve generalization. The algorithm is demonstrated here on artificial and real-world problems with very good results and has been advantageously applied to dozens of models currently in commercial use.
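A minimal sketch of the penalty-term idea, assuming a single-output network and fixed (rather than adaptively balanced) penalty weights: the gains dy/dx are obtained by automatic differentiation, and bound violations are squared and added to the data-fit objective. The bound tensors and lam are illustrative parameters, not the paper's adaptive and robust balancing procedures.

```python
import torch

def gain_penalty(model, x, lower, upper):
    """Penalty on gains dy/dx that violate per-input bounds.
    Assumes a single-output model so the summed output's gradient
    gives the per-sample gain vector."""
    x = x.clone().requires_grad_(True)
    y = model(x).sum()
    (gain,) = torch.autograd.grad(y, x, create_graph=True)  # dy/dx per sample
    over = torch.relu(gain - upper)    # violation above the upper bound
    under = torch.relu(lower - gain)   # violation below the lower bound
    return (over ** 2 + under ** 2).mean()

def objective(model, x, y_true, loss_fn, lower, upper, lam=1.0):
    # data-fit term plus gain penalty; the paper balances these
    # adaptively, here lam is a fixed hypothetical weight
    return loss_fn(model(x), y_true) + lam * gain_penalty(model, x, lower, upper)
```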


MAUSAM ◽  
2022 ◽  
Vol 53 (2) ◽  
pp. 225-232
Author(s):  
PANKAJ JAIN ◽  
ASHOK KUMAR ◽  
PARVINDER MAINI ◽  
S. V. SINGH

Feedforward neural networks are used for daily precipitation forecasting at several test stations across India. Six years of European Centre for Medium-Range Weather Forecasts (ECMWF) data are used, with the training set consisting of the four years of data from 1985-1988 and the validation set consisting of the data from 1989-1990. Neural networks are used to develop a concurrent relationship between precipitation and other atmospheric variables. No attempt is made to select optimal variables for this study; the inputs are chosen to be the same as those obtained earlier at the National Centre for Medium Range Weather Forecasting (NCMRWF) in developing a linear regression model. The neural networks are found to yield results at least as good as linear regression, and in several cases yield a 10-20% improvement. This is encouraging, since the variable selection has so far been optimized for linear regression.
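A schematic of the comparison protocol, under the assumption of standard library models (the paper's actual network configuration is not specified in the abstract): both the network and the linear regression baseline receive the same predictor variables, mirroring the study's reuse of inputs selected for the regression model.

```python
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

def compare(X_train, y_train, X_val, y_val):
    """Fit both models on identical atmospheric predictors and score
    them on the held-out years. Layer size is illustrative."""
    lin = LinearRegression().fit(X_train, y_train)
    net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000).fit(X_train, y_train)
    return {
        "linear_mse": mean_squared_error(y_val, lin.predict(X_val)),
        "network_mse": mean_squared_error(y_val, net.predict(X_val)),
    }
```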


2017 ◽  
Vol 10 (1) ◽  
pp. 01-10
Author(s):  
Kostantin Nikolic

This paper presents the application of stochastic search algorithms to the training of artificial neural networks. The methodology was developed primarily to train complex recurrent neural networks, since training recurrent networks is known to be harder than training feedforward networks. The recurrent network is simulated to propagate signals from input to output, and the training process is realized as a stochastic search in parameter space. The performance of this type of algorithm is superior to that of most gradient-based training algorithms. The efficiency of these algorithms is demonstrated by training networks built from units characterized by long short-term memory (LSTM). The presented methodology is effective and relatively simple.
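A minimal sketch of stochastic search applied to network training, assuming a greedy accept-if-better variant: the parameter vector is perturbed at random, the recurrent network is simulated from input to output to score the candidate, and improvements are kept. The step size sigma and the acceptance rule are illustrative choices, not the paper's specific algorithm.

```python
import numpy as np

def stochastic_search(loss_fn, theta, sigma=0.1, iters=10_000, rng=None):
    """Random-search training loop: perturb the flat parameter vector,
    keep the perturbation only if the rollout loss improves.
    loss_fn simulates the network from input to output and returns
    a scalar error; no gradients are needed."""
    rng = rng or np.random.default_rng(0)
    best = loss_fn(theta)
    for _ in range(iters):
        candidate = theta + sigma * rng.standard_normal(theta.shape)
        c_loss = loss_fn(candidate)
        if c_loss < best:          # accept only improvements (greedy variant)
            theta, best = candidate, c_loss
    return theta, best
```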


2020 ◽  
Vol 9 ◽  
pp. 266
Author(s):  
E. Mavrommatis ◽  
S. Athanassopoulos ◽  
A. Dakos ◽  
K. A. Gernoth ◽  
J. W. Clark

Multilayer feedforward neural networks are used to create global models of atomic masses and lifetimes of nuclear states, with the goal of effective prediction of the properties of nuclides outside the region of stability. Innovations in coding and training schemes are used to improve the extrapolation capability of models of the mass table. Studies of nuclear lifetimes have focused on ground states that decay 100% via the β⁻ mode. Results are described which demonstrate that, in predictive acuity, statistical approaches to global modeling based on neural networks are potentially competitive with the best phenomenological models based on the traditional methods of theoretical physics.
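For orientation, a hedged sketch of what a neural-network global mass model can look like: a small feedforward network regressing a nuclide property from an encoding of proton and neutron numbers. The input features, layer sizes, and activation are assumptions for illustration; the paper's coding-scheme innovations are not reproduced here.

```python
import torch
from torch import nn

# Hypothetical encoding: proton and neutron numbers (Z, N) plus parity
# flags as inputs; the regression target would be, e.g., the mass excess.
def encode(Z, N):
    return torch.tensor([Z, N, Z % 2, N % 2], dtype=torch.float32)

mass_model = nn.Sequential(
    nn.Linear(4, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),            # predicted nuclide property
)
```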

