Two adaptive stepsize rules for gradient descent and their application to the training of feedforward artificial neural networks

Complex biological systems in nature comprise cells that act collectively to solve sophisticated tasks. Synthetic biological systems, in contrast, are designed for specific tasks, following computational principles including logic gates and analog design. Yet such approaches cannot be easily adapted for multiple tasks in biological contexts. Alternatively, artificial neural networks, composed of flexible interactions for computation, support adaptive designs and are adopted for diverse applications. Here, motivated by the structural similarity between artificial neural networks and cellular networks, we implement neural-like computing in bacterial consortia for recognizing patterns. Specifically, receiver bacteria collectively interact with sender bacteria for decision-making through quorum sensing. Input patterns formed by chemical inducers activate senders to produce signaling molecules at varying levels. These levels, which act as weights, are programmed by tuning the sender promoter strength. Furthermore, a gradient-descent-based algorithm that enables weight optimization was developed. Weights were experimentally examined for recognizing 3 × 3-bit patterns.

Global descent replaces gradient descent to avoid local minima problem in learning with artificial neural networks

IEEE International Conference on Neural Networks ◽

10.1109/icnn.1993.298667 ◽

2002 ◽

Cited By ~ 22

Author(s):

B.C. Cetin ◽

J.W. Burdick ◽

J. Barhen

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Gradient Descent ◽

Local Minima ◽

Artificial Neural

Deep Convolutional Spiking Neural Networks for Image Classification

10.18122/td.1782.boisestate ◽

2021 ◽

Author(s):

Ruthvik Vaila

Keyword(s):

Neural Network ◽

Neural Networks ◽

Artificial Neural Networks ◽

Gradient Descent ◽

Stochastic Gradient ◽

Spiking Neural Networks ◽

Stochastic Gradient Descent ◽

Data Set ◽

Learning Capabilities ◽

Artificial Neural

Spiking neural networks are biologically plausible counterparts of artificial neural networks. Artificial neural networks are usually trained with stochastic gradient descent (SGD), whereas spiking neural networks are trained with bioinspired spike-timing-dependent plasticity (STDP). Spiking networks could potentially help in reducing power usage owing to their binary activations. In this work, we use unsupervised STDP in the feature extraction layers of a neural network with instantaneous neurons to extract meaningful features. The extracted binary feature vectors are then classified using classification layers containing neurons with binary activations. Gradient descent (backpropagation) is used only on the output layer to perform training for classification. Surrogate gradients are proposed to perform backpropagation with binary gradients. The accuracies obtained for MNIST and the balanced EMNIST data set compare favorably with other approaches. The effect of the stochastic gradient descent (SGD) approximations on the learning capabilities of our network is also explored. We also studied catastrophic forgetting and its effect on spiking neural networks (SNNs). For the experiments regarding catastrophic forgetting, in the classification sections of the network we use a modified synaptic intelligence, which we refer to as the cost per synapse metric, as a regularizer to immunize the network against catastrophic forgetting in a Single-Incremental-Task (SIT) scenario. In the catastrophic forgetting experiments, we use the MNIST and EMNIST handwritten digits data sets, divided into five and ten incremental subtasks respectively. We also examine the behavior of the spiking neural network and empirically study the effect of various hyperparameters on its learning capabilities using the software tool SPYKEFLOW that we developed. We employ the MNIST, EMNIST and NMNIST data sets to produce our results.
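
The surrogate-gradient idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single output neuron with a Heaviside (binary) activation, a squared-error loss, and a fast-sigmoid-shaped surrogate derivative. The function names, the toy OR data set, and the hyperparameters are illustrative assumptions; none of them come from the thesis or from SPYKEFLOW.

```python
def heaviside(z):
    """Binary activation used in the forward pass."""
    return 1.0 if z > 0.0 else 0.0


def surrogate_derivative(z):
    """Stand-in for the step function's derivative, which is zero almost everywhere.

    Assumption: the derivative of a fast sigmoid, 1 / (1 + |z|)^2.
    """
    return 1.0 / (1.0 + abs(z)) ** 2


def train_binary_neuron(samples, targets, lr=0.5, epochs=50):
    """Train one binary-output neuron with per-sample gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            y = heaviside(z)
            # Chain rule for 0.5 * (y - t)^2, with dy/dz replaced by the surrogate.
            delta = (y - t) * surrogate_derivative(z)
            weights = [w - lr * delta * xi for w, xi in zip(weights, x)]
            bias -= lr * delta
    return weights, bias


def predict(weights, bias, x):
    """Forward pass only: weighted sum followed by the binary activation."""
    return heaviside(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Toy binary feature vectors (logical OR), standing in for extracted features.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
targets = [0.0, 1.0, 1.0, 1.0]
weights, bias = train_binary_neuron(samples, targets)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions)
```

The forward pass stays strictly binary; only the backward pass uses the smooth stand-in, which is what lets an error signal reach the weights at all.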

Performance comparison of gradient descent and Genetic Algorithm based Artificial Neural Networks training

2010 10th International Conference on Intelligent Systems Design and Applications ◽

10.1109/isda.2010.5687199 ◽

2010 ◽

Cited By ~ 3

Author(s):

Fadzil Ahmad ◽

Nor Ashidi Mat Isa ◽

Muhammad Khusairi Osman ◽

Zakaria Hussain

Keyword(s):

Genetic Algorithm ◽

Neural Networks ◽

Artificial Neural Networks ◽

Gradient Descent ◽

Performance Comparison ◽

Artificial Neural

Estimation of Granger causality through Artificial Neural Networks: applications to physiological systems and chaotic electronic oscillators

PeerJ Computer Science ◽

10.7717/peerj-cs.429 ◽

2021 ◽

Vol 7 ◽

pp. e429

Author(s):

Yuri Antonacci ◽

Ludovico Minati ◽

Luca Faes ◽

Riccardo Pernice ◽

Giandomenico Nollo ◽

...

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Granger Causality ◽

Gradient Descent ◽

Penalized Regression ◽

Var Model ◽

Regression Problem ◽

Artificial Neural ◽

Electronic Oscillators ◽

Physiological Systems

One of the most challenging problems in the study of complex dynamical systems is to find the statistical interdependencies among the system components. Granger causality (GC) is one of the most widely employed approaches, based on modeling the system dynamics with a linear vector autoregressive (VAR) model and on evaluating the information flow between two processes in terms of prediction error variances. In its most advanced setting, GC analysis is performed through a state-space (SS) representation of the VAR model, which allows both conditional and unconditional forms of GC to be computed by solving only one regression problem. While this problem is typically solved through Ordinary Least Squares (OLS) estimation, a viable alternative is to use Artificial Neural Networks (ANNs) implemented in a simple structure with one input and one output layer and trained such that the weight matrix corresponds to the matrix of VAR parameters. In this work, we introduce an ANN combined with SS models for the computation of GC. The ANN is trained through the Stochastic Gradient Descent L1 (SGD-L1) algorithm, and a cumulative penalty inspired by penalized regression is applied to the network weights to encourage sparsity. Simulating networks of coupled Gaussian systems, we show how the combination of ANNs and SGD-L1 mitigates the strong reduction in accuracy of OLS identification when the ratio between the number of time series points and the number of VAR parameters is low. We also report how the performance of GC estimation is influenced by the number of gradient descent iterations and by the learning rate used for training the ANN. We recommend specific combinations of these parameters to optimize the performance of GC estimation.
Then, the performance of ANN and OLS is compared in terms of GC magnitude and statistical significance to highlight the potential of the new approach to reconstruct causal coupling strength and network topology even in challenging conditions of data paucity. The results highlight the importance of properly selecting the regularization parameter, which determines the degree of sparsity in the estimated network. Furthermore, we apply the two approaches to real data scenarios, to study the physiological network of brain and peripheral interactions in humans under different conditions of rest and mental stress, and the effects of the newly emerged concept of remote synchronization on the information exchanged in a ring of electronic oscillators. The results highlight how ANNs provide a mesoscopic description of the information exchanged in networks of multiple interacting physiological systems, preserving the most active causal interactions between cardiovascular, respiratory and brain systems. Moreover, ANNs can reconstruct the flow of directed information in a ring of oscillators whose statistical properties can be related to those of physiological networks.
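
The idea of a single linear layer whose weight matrix plays the role of the VAR parameters, trained by SGD with an L1 penalty, can be sketched as follows. This is a simplified illustration under stated assumptions. It uses a VAR(1) model with two variables and a plain soft-thresholding (proximal) step after each update, which is a simpler substitute for the cumulative penalty described in the abstract. The state-space step and the GC computation are not shown. All names, the coefficient matrix, and the hyperparameters are hypothetical.

```python
import random


def simulate_var1(coeffs, n_steps, rng):
    """Simulate x_t = A x_{t-1} + e_t with unit-variance Gaussian noise."""
    dim = len(coeffs)
    state = [0.0] * dim
    series = [state]
    for _ in range(n_steps):
        state = [
            sum(coeffs[i][j] * state[j] for j in range(dim)) + rng.gauss(0.0, 1.0)
            for i in range(dim)
        ]
        series.append(state)
    return series


def soft_threshold(value, amount):
    """Shrink value toward zero by amount (proximal step for an L1 penalty)."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_var1_sgd_l1(series, lr=0.002, l1=0.01, epochs=4):
    """Estimate A with per-sample SGD on squared error plus L1 shrinkage."""
    dim = len(series[0])
    weights = [[0.0] * dim for _ in range(dim)]
    for _ in range(epochs):
        for prev, curr in zip(series[:-1], series[1:]):
            for i in range(dim):
                pred = sum(weights[i][j] * prev[j] for j in range(dim))
                err = pred - curr[i]
                for j in range(dim):
                    stepped = weights[i][j] - lr * err * prev[j]
                    weights[i][j] = soft_threshold(stepped, lr * l1)
    return weights


rng = random.Random(0)
true_coeffs = [[0.5, 0.0], [0.4, 0.5]]  # variable 0 drives variable 1, not vice versa
series = simulate_var1(true_coeffs, 5000, rng)
estimated = fit_var1_sgd_l1(series)
for row in estimated:
    print([round(value, 2) for value in row])
```

With enough data the estimated matrix should sit close to `true_coeffs`, and the entry for the absent coupling should stay near zero. Raising `l1` shrinks all entries more strongly, which is the trade-off the abstract refers to when it discusses selecting the regularization parameter.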

Dual Gradient Descent Algorithm on Two-Layered Feed-Forward Artificial Neural Networks

New Trends in Applied Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-540-73325-6_69 ◽

2007 ◽

pp. 696-704

Author(s):

Bumghi Choi ◽

Ju-Hong Lee ◽

Tae-Su Park

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Gradient Descent ◽

Feed Forward ◽

Descent Algorithm ◽

Gradient Descent Algorithm ◽

Artificial Neural ◽

Dual Gradient

A comparison of gradient ascent, gradient descent and genetic-algorithm-based artificial neural networks for the binary classification problem

Expert Systems ◽

10.1111/j.1468-0394.2007.00421.x ◽

2007 ◽

Vol 24 (2) ◽

pp. 65-86 ◽

Cited By ~ 11

Author(s):

Parag C. Pendharkar

Keyword(s):

Genetic Algorithm ◽

Neural Networks ◽

Artificial Neural Networks ◽

Gradient Descent ◽

Binary Classification ◽

Classification Problem ◽

Gradient Ascent ◽

Binary Classification Problem ◽

Artificial Neural

A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions

Journal of Complexity ◽

10.1016/j.jco.2022.101646 ◽

2022 ◽

pp. 101646

Author(s):

Patrick Cheridito ◽

Arnulf Jentzen ◽

Adrian Riekert ◽

Florian Rossmannek

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Gradient Descent ◽

Artificial Neural ◽

Proof Of Convergence

Forecasting Air Quality Index Using an Ensemble of Artificial Neural Networks and Regression Models

Journal of Intelligent Systems ◽

10.1515/jisys-2017-0277 ◽

2017 ◽

Vol 28 (5) ◽

pp. 893-903 ◽

Cited By ~ 1

Author(s):

S. Sankar Ganesh ◽

Pachaiyappan Arulmozhivarman ◽

Rao Tatavarti

Keyword(s):

Neural Network ◽

Neural Networks ◽

Artificial Neural Networks ◽

Air Quality ◽

Regression Models ◽

Quality Index ◽

Gradient Descent ◽

Air Quality Index ◽

Support Vector ◽

Artificial Neural

Air is the most essential constituent for the sustenance of life on earth. The air we inhale has a tremendous impact on our health and well-being. Hence, it is always advisable to monitor the quality of air in our environment. To forecast the air quality index (AQI), artificial neural networks (ANNs) trained with conjugate gradient descent (CGD), such as multilayer perceptron (MLP), cascade forward neural network, Elman neural network, radial basis function (RBF) neural network, and nonlinear autoregressive model with exogenous input (NARX), are implemented along with regression models such as multiple linear regression (MLR) trained with the batch gradient descent (BGD), stochastic gradient descent (SGD), mini-BGD (MBGD) and CGD algorithms, and support vector regression (SVR). In these models, the AQI is the dependent variable and the concentrations of NO2, CO, O3, PM2.5, SO2, and PM10 for the years 2010–2016 in Houston and Los Angeles are the independent variables. For the final forecast, several ensemble models of individual neural network predictors and individual regression predictors are presented. The proposed approach performs with the highest efficiency in forecasting the air quality index.
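
The difference between the BGD, SGD and MBGD variants named in this abstract comes down to how many samples feed each gradient step. The sketch below shows this for a one-variable linear regression. It is an illustration on noise-free toy data, not the paper's AQI models, and the function name, data, learning rate, and epoch count are assumptions.

```python
import random


def fit_linear_regression(xs, ys, batch_size, lr=0.1, epochs=2000, seed=0):
    """Fit y = w * x + b by gradient descent on half the mean squared error.

    batch_size == len(xs) gives batch gradient descent (BGD),
    batch_size == 1 gives stochastic gradient descent (SGD),
    anything in between gives mini-batch gradient descent (MBGD).
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            errors = [(w * xs[i] + b) - ys[i] for i in batch]
            grad_w = sum(e * xs[i] for e, i in zip(errors, batch)) / len(batch)
            grad_b = sum(errors) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Noise-free toy data generated from y = 2x + 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2.0 * x + 1.0 for x in xs]

results = {
    "BGD": fit_linear_regression(xs, ys, batch_size=len(xs)),
    "SGD": fit_linear_regression(xs, ys, batch_size=1),
    "MBGD": fit_linear_regression(xs, ys, batch_size=2),
}
for name, (w, b) in results.items():
    print(f"{name}: w={w:.3f}, b={b:.3f}")
```

On noise-free data all three variants approach the same slope and intercept. On noisy data, smaller batches give cheaper but noisier steps, which is the usual reason for comparing them.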
