Cost Functions for Two-Layer Neural Networks

1995 ◽  
pp. 167-176
Author(s):  
Anne-Johan Annema
2019 ◽  
Author(s):  
Takuya Isomura ◽  
Karl Friston

Abstract: This work considers a class of biologically plausible cost functions for neural networks, where the same cost function is minimised by both neural activity and plasticity. We show that such cost functions can be cast as a variational bound on model evidence under an implicit generative model. Using generative models based on Markov decision processes (MDP), we show, analytically, that neural activity and plasticity perform Bayesian inference and learning, respectively, by maximising model evidence. Using mathematical and numerical analyses, we then confirm that biologically plausible cost functions—used in neural networks—correspond to variational free energy under some prior beliefs about the prevalence of latent states that generate inputs. These prior beliefs are determined by particular constants (i.e., thresholds) that define the cost function. This means that the Bayes optimal encoding of latent or hidden states is achieved when, and only when, the network’s implicit priors match the process that generates the inputs. Our results suggest that when a neural network minimises its cost function, it is implicitly minimising variational free energy under optimal or sub-optimal prior beliefs. This insight is potentially important because it suggests that any free parameter of a neural network’s cost function can itself be optimised—by minimisation with respect to variational free energy.
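For orientation, a minimal sketch of the bound the abstract refers to: for a generative model p(o, s) over observations o and latent states s, and an approximate posterior q(s), variational free energy upper-bounds negative log model evidence. The notation below is generic and does not reproduce the paper's MDP-specific parameterisation.

```latex
% Variational free energy F as an upper bound on negative log evidence
% (generic form; the MDP-specific factorisation used in the paper is not shown).
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o, s)\right] \\
     &= \underbrace{D_{\mathrm{KL}}\!\left[q(s)\,\|\,p(s \mid o)\right]}_{\ge 0}
        \;-\; \ln p(o)
     \;\;\ge\;\; -\ln p(o).
\end{aligned}
```

Minimising F with respect to q(s) (neural activity) drives the posterior towards p(s | o), while minimising it with respect to model parameters (plasticity) increases model evidence; the abstract's "constants (i.e., thresholds)" enter through the prior terms inside p(o, s).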


2021 ◽  
Author(s):  
M. F. Bouzon

Artificial Neural Networks (ANNs) are a popular machine learning and artificial intelligence technique, first proposed in the 1950s. Among their greatest challenges is the training of parameters such as weights, activation-function parameters, and constants, as well as hyperparameters such as network architecture and the number of neurons per layer. Among the best-known algorithms for parametric optimization of networks are Adam and backpropagation (BP), applied mainly in popular architectures such as MLP, RNN, LSTM, Feed-forward Neural Network (FNN), and RBFNN, among many others. Recently, the great success of deep neural networks (Deep Learning), as well as of fully connected networks, has run into problems of training time and reliance on specialized hardware. These challenges have given new impetus to the use of optimization algorithms for training such networks, and more recently to nature-inspired (NI) algorithms. This strategy, although not a recent one, has not yet received much attention from researchers and still requires more experimental testing and evaluation, mainly because a much wider range of NI algorithms has appeared recently. The elements that need attention, especially for the most recent NI algorithms, relate mainly to convergence time and to the use of different cost functions. This master’s dissertation therefore performs tests, comparisons, and studies of NI algorithms applied to the training of neural networks. Both traditional and recent NI algorithms were tested from several perspectives, including convergence time and cost functions, aspects that have so far received little attention in previous work. The results showed that NI algorithms trained traditional ANNs to classification quality comparable to popular algorithms such as Adam and BPMA, while surpassing them in convergence time by 20% to 70%, depending on the network and the parameters involved. This indicates that using NI algorithms, especially the most recent ones, to train neural networks is a promising approach that can improve the time and quality of results in current and future machine learning and artificial intelligence applications.
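As an illustration of the general strategy evaluated here, the sketch below treats the flattened weights of a small MLP as the search space of a basic particle swarm optimizer, one common nature-inspired algorithm. The toy data, network size, and PSO settings are illustrative assumptions, not the dissertation's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-class data (illustrative only).
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

HIDDEN = 8
DIM = 4 * HIDDEN + HIDDEN + HIDDEN + 1   # W1, b1, W2, b2 flattened

def mlp_loss(w, X, y):
    """Mean squared error of a 4-8-1 MLP parameterised by the flat vector w."""
    W1 = w[:4 * HIDDEN].reshape(4, HIDDEN)
    b1 = w[4 * HIDDEN:5 * HIDDEN]
    W2 = w[5 * HIDDEN:6 * HIDDEN]
    b2 = w[-1]
    h = np.tanh(X @ W1 + b1)
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return np.mean((out - y) ** 2)

# Basic particle swarm optimisation over the flattened weight vector.
n_particles, iters = 30, 200
pos = rng.normal(scale=0.5, size=(n_particles, DIM))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_val = np.array([mlp_loss(p, X, y) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()

for _ in range(iters):
    r1, r2 = rng.random((2, n_particles, 1))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.array([mlp_loss(p, X, y) for p in pos])
    improved = vals < pbest_val
    pbest[improved] = pos[improved]
    pbest_val[improved] = vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print("best training loss:", pbest_val.min())
```

A gradient-based baseline such as Adam would minimise the same loss via backpropagation; the comparison of interest in the dissertation is convergence time and final classification quality.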


2017 ◽  
Author(s):  
H. Steven Scholte ◽  
Max M. Losch ◽  
Kandan Ramakrishnan ◽  
Edward H.F. de Haan ◽  
Sander M. Bohte

Abstract: Vision research has been shaped by the seminal insight that we can understand the higher-tier visual cortex from the perspective of multiple functional pathways with different goals. In this paper, we try to give a computational account of the functional organization of this system by reasoning from the perspective of multi-task deep neural networks. Machine learning has shown that tasks become easier to solve when they are decomposed into subtasks with their own cost function. We hypothesize that the visual system optimizes multiple cost functions of unrelated tasks, and that this causes the emergence of a ventral pathway dedicated to vision for perception and a dorsal pathway dedicated to vision for action. To evaluate the functional organization in multi-task deep neural networks, we propose a method that measures the contribution of a unit towards each task, applying it to two networks that have been trained on either two related or two unrelated tasks, using an identical stimulus set. Results show that the network trained on the unrelated tasks shows a decreasing degree of feature-representation sharing towards higher-tier layers, while the network trained on related tasks uniformly shows a high degree of sharing. We conjecture that the method we propose can be used to analyze the anatomical and functional organization of the visual system and beyond. We predict that the degree to which tasks are related is a good descriptor of the degree to which they share downstream cortical units.
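One way to make a per-unit, per-task contribution measure concrete is ablation: zero out a hidden unit and record the resulting drop in performance on each task. The sketch below uses a placeholder two-task network with a shared hidden layer and per-task readouts; it is an illustrative stand-in, not the paper's actual measure or architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

class TwoTaskNet:
    """Placeholder multi-task network: shared hidden layer, one readout per task.
    (Illustrative stand-in, not the architecture used in the paper.)"""
    def __init__(self, n_in=10, n_units=16, n_classes=3, tasks=("task_a", "task_b")):
        self.W = rng.normal(scale=0.3, size=(n_in, n_units))
        self.readout = {t: rng.normal(scale=0.3, size=(n_units, n_classes)) for t in tasks}

    def hidden(self, X):
        return np.maximum(X @ self.W, 0.0)          # shared ReLU features

def task_accuracy(model, X, labels, task, ablate_unit=None):
    """Accuracy on one task, optionally with a single hidden unit zeroed out."""
    h = model.hidden(X)
    if ablate_unit is not None:
        h = h.copy()
        h[:, ablate_unit] = 0.0
    logits = h @ model.readout[task]
    return np.mean(logits.argmax(axis=1) == labels[task])

def unit_contributions(model, X, labels, tasks, n_units):
    """Contribution of each unit to each task = drop in accuracy when it is ablated.
    Units contributing to both tasks are 'shared'; to only one, 'dedicated'."""
    base = {t: task_accuracy(model, X, labels, t) for t in tasks}
    contrib = np.zeros((n_units, len(tasks)))
    for u in range(n_units):
        for j, t in enumerate(tasks):
            contrib[u, j] = base[t] - task_accuracy(model, X, labels, t, ablate_unit=u)
    return contrib

# Random data just to exercise the measure; in practice X and labels would come
# from the identical stimulus set on which the two networks were trained.
X = rng.normal(size=(100, 10))
labels = {"task_a": rng.integers(0, 3, size=100), "task_b": rng.integers(0, 3, size=100)}
model = TwoTaskNet()
print(unit_contributions(model, X, labels, ("task_a", "task_b"), n_units=16).round(3))
```

A layer-wise sharing index could then be the fraction of units whose contribution is non-negligible for more than one task, which is the kind of quantity the abstract compares across higher-tier layers.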


Quantum ◽  
2021 ◽  
Vol 5 ◽  
pp. 558
Author(s):  
Andrew Arrasmith ◽  
M. Cerezo ◽  
Piotr Czarnik ◽  
Lukasz Cincio ◽  
Patrick J. Coles

Barren plateau landscapes correspond to gradients that vanish exponentially in the number of qubits. Such landscapes have been demonstrated for variational quantum algorithms and quantum neural networks with either deep circuits or global cost functions. For obvious reasons, it is expected that gradient-based optimizers will be significantly affected by barren plateaus. However, whether or not gradient-free optimizers are impacted is a topic of debate, with some arguing that gradient-free approaches are unaffected by barren plateaus. Here we show that, indeed, gradient-free optimizers do not solve the barren plateau problem. Our main result proves that cost function differences, which are the basis for making decisions in a gradient-free optimization, are exponentially suppressed in a barren plateau. Hence, without exponential precision, gradient-free optimizers will not make progress in the optimization. We numerically confirm this by training in a barren plateau with several gradient-free optimizers (Nelder-Mead, Powell, and COBYLA algorithms), and show that the number of shots required in the optimization grows exponentially with the number of qubits.
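A toy numerical illustration of the main claim, assuming a cost whose variation shrinks like 2^-n with the number of qubits n: resolving a cost difference of that size above binomial shot noise (which scales as 1/sqrt(shots)) requires a number of shots that grows exponentially in n. The model and constants below are illustrative, not the paper's circuits.

```python
import numpy as np

rng = np.random.default_rng(1)

def true_cost(theta, n_qubits):
    """Toy cost whose variation shrinks like 2**-n_qubits, mimicking a barren
    plateau (illustrative model only; not the paper's circuits)."""
    return 0.5 + 0.5 * 2.0 ** (-n_qubits) * np.cos(theta)

def estimate_cost(theta, n_qubits, shots):
    """Finite-shot estimate: binomial sampling around the true expectation value."""
    return rng.binomial(shots, true_cost(theta, n_qubits)) / shots

for n in (2, 6, 10):
    gap = abs(true_cost(0.0, n) - true_cost(np.pi, n))   # best-case cost difference
    shots_needed = int(np.ceil(1.0 / gap ** 2))          # so 1/sqrt(shots) ~ gap
    est_diff = estimate_cost(0.0, n, shots_needed) - estimate_cost(np.pi, n, shots_needed)
    print(f"{n:2d} qubits: true gap {gap:.2e}, ~{shots_needed} shots, "
          f"estimated gap {est_diff:+.2e}")
```

In an actual variational circuit, a gradient-free optimizer such as Nelder-Mead or COBYLA must resolve cost differences of exactly this order to decide on its next move, which is why the required shot count scales exponentially in the plateau regime.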


Cortex ◽  
2018 ◽  
Vol 98 ◽  
pp. 249-261 ◽  
Author(s):  
H. Steven Scholte ◽  
Max M. Losch ◽  
Kandan Ramakrishnan ◽  
Edward H.F. de Haan ◽  
Sander M. Bohte
