Reward-Modulated Hebbian Learning of Decision Making

2010 ◽  
Vol 22 (6) ◽  
pp. 1399-1444 ◽  
Author(s):  
Michael Pfeiffer ◽  
Bernhard Nessler ◽  
Rodney J. Douglas ◽  
Wolfgang Maass

We introduce a framework for decision making in which the learning of decision making is reduced to its simplest and biologically most plausible form: Hebbian learning on a linear neuron. We cast our Bayesian-Hebb learning rule as reinforcement learning in which certain decisions are rewarded and prove that each synaptic weight will on average converge exponentially fast to the log-odds of receiving a reward when its pre- and postsynaptic neurons are active. In our simple architecture, a particular action is selected from the set of candidate actions by a winner-take-all operation. The global reward assigned to this action then modulates the update of each synapse. Apart from this global reward signal, our reward-modulated Bayesian Hebb rule is a pure Hebb update that depends only on the coactivation of the pre- and postsynaptic neurons, not on the weighted sum of all presynaptic inputs to the postsynaptic neuron as in the perceptron learning rule or the Rescorla-Wagner rule. This simple approach to action-selection learning requires that information about sensory inputs be presented to the Bayesian decision stage in a suitably preprocessed form resulting from other adaptive processes (acting on a larger timescale) that detect salient dependencies among input features. Hence our proposed framework for fast learning of decisions also provides interesting new hypotheses regarding neural codes and computational goals of cortical areas that provide input to the final decision stage.
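The convergence claim can be illustrated with a minimal sketch. The abstract does not state the update rule, so the form used here is an assumption chosen only because its fixed point is the log-odds of reward: the weight moves up by `eta * (1 + exp(-w))` on rewarded trials and down by `eta * (1 + exp(w))` on unrewarded ones. The names `bayesian_hebb_step` and `simulate`, the reward probability, and the learning rate are all hypothetical, and the sketch covers a single synapse whose pre- and postsynaptic neurons are co-active on every trial, without the winner-take-all stage.

```python
import math
import random


def bayesian_hebb_step(w, reward, eta=0.005):
    """One update for a synapse whose pre- and postsynaptic neurons were both
    active on this trial. The rule form is an assumption, not quoted from the
    paper; its expected change is zero exactly at w = log(p / (1 - p))."""
    if reward:
        return w + eta * (1.0 + math.exp(-w))
    return w - eta * (1.0 + math.exp(w))


def simulate(p_reward=0.8, trials=40000, eta=0.005, seed=0):
    """Run the rule on Bernoulli rewards and return the mean weight over the
    second half of the run, which smooths out trial-to-trial fluctuations."""
    rng = random.Random(seed)
    w = 0.0
    tail = []
    for t in range(trials):
        reward = rng.random() < p_reward
        w = bayesian_hebb_step(w, reward, eta)
        if t >= trials // 2:
            tail.append(w)
    return sum(tail) / len(tail)


w_avg = simulate()
target = math.log(0.8 / 0.2)  # log-odds of reward for p = 0.8
print(f"mean weight {w_avg:.3f}, log-odds {target:.3f}")
```

Setting the expected update to zero gives p(1 + e^{-w}) = (1 - p)(1 + e^{w}), which simplifies to w = log(p / (1 - p)), so under this assumed rule the averaged weight should settle close to the log-odds.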

2021 ◽  
pp. 1-33
Author(s):  
Kevin Berlemont ◽  
Jean-Pierre Nadal

Abstract: In experiments on perceptual decision making, individuals learn a categorization task through trial-and-error protocols. We explore the capacity of a decision-making attractor network to learn a categorization task through reward-based, Hebbian-type modifications of the weights incoming from the stimulus encoding layer. For the latter, we assume a standard layer of a large number of stimulus-specific neurons. Within the general framework of Hebbian learning, we have hypothesized that the learning rate is modulated by the reward at each trial. Surprisingly, we find that when the coding layer has been optimized in view of the categorization task, such reward-modulated Hebbian learning (RMHL) fails to extract efficiently the category membership. In previous work, we showed that the attractor neural networks' nonlinear dynamics accounts for behavioral confidence in sequences of decision trials. Taking advantage of these findings, we propose that learning is controlled by confidence, as computed from the neural activity of the decision-making attractor network. Here we show that this confidence-controlled, reward-based Hebbian learning efficiently extracts categorical information from the optimized coding layer. The proposed learning rule is local and, in contrast to RMHL, does not require storing the average rewards obtained on previous trials. In addition, we find that the confidence-controlled learning rule achieves near-optimal performance. In accordance with this result, we show that the learning rule approximates a gradient descent method on a reward-maximizing cost function.


2021 ◽  
Author(s):  
Siwei Qiu

Abstract: Primates and rodents are able to continually acquire, adapt, and transfer knowledge and skills, which leads to goal-directed behavior throughout their lifespan. When context switches slowly, animals learn via slow processes; when context switches rapidly, they learn via fast processes. We build a biologically realistic model with modules similar to a distributed computing system. Specifically, we emphasize the role of thalamocortical learning on a slow time scale between the prefrontal cortex (PFC) and medial dorsal thalamus (MD). Previous work [1] has already shown experimental evidence supporting classification of cell ensembles in the medial dorsal thalamus, where each class encodes a different context. However, the mechanism by which such classification is learned is not clear. In this work, we show that such learning can be self-organizing in the manner of an automaton (a distributed computing system), via a combination of Hebbian learning and homeostatic synaptic scaling. We show that in the simple case of two contexts, the network with hierarchical structure can do context-based decision making and smooth switching between different contexts. Our learning rule creates synaptic competition [2] between the thalamic cells to create winner-take-all activity. Our theory shows that the capacity of such a learning process depends on the total number of task-related hidden variables, and such a capacity is limited by system size N. We also theoretically derive the effective functional connectivity as a function of an order parameter dependent on the thalamo-cortical coupling structure.

Significance Statement: Animals need to adapt to dynamically changing environments and make decisions based on changing contexts. Here we propose a combination of neural circuit structure with learning mechanisms to account for such behaviors. Specifically, we built a reservoir computing network improved by a Hebbian learning rule together with a synaptic scaling learning mechanism between the prefrontal cortex and the medial-dorsal (MD) thalamus. This model shows that MD thalamus is crucial in such context-based decision making. We also use dynamical mean field theory to predict the effective neural circuit. Furthermore, theoretical analysis provides a prediction that the capacity of such a network increases with the network size and the total number of task-related latent variables.
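How Hebbian growth combined with synaptic scaling can produce competition is easiest to see in a toy setting. The sketch below is not the model from the abstract: it assumes a plain Hebbian increment followed by multiplicative rescaling of each postsynaptic unit's incoming weights to a fixed sum, and the function name `hebb_with_scaling`, the activity values, and the learning rate are all hypothetical.

```python
def hebb_with_scaling(weights, pre, post, eta=0.1, target_sum=1.0):
    """Hebbian step followed by multiplicative synaptic scaling.

    weights[i][j] is the synapse from presynaptic unit j onto postsynaptic
    unit i. Activities are assumed non-negative and weights positive, so
    each row total stays positive and the rescaling is well defined.
    """
    updated = []
    for i, row in enumerate(weights):
        # Hebbian growth: proportional to coactivation of pre and post.
        grown = [w + eta * post[i] * pre[j] for j, w in enumerate(row)]
        # Homeostatic scaling: rescale the row back to a fixed total.
        total = sum(grown)
        updated.append([target_sum * w / total for w in grown])
    return updated


weights = [[0.5, 0.5], [0.5, 0.5]]
for _ in range(50):
    # Only presynaptic unit 0 is active; postsynaptic unit 0 responds more.
    weights = hebb_with_scaling(weights, pre=[1.0, 0.0], post=[1.0, 0.2])

for row in weights:
    print([round(w, 3) for w in row])
```

Because the total incoming weight is held fixed, any synapse that grows through coactivation does so at the expense of the others onto the same unit, which is the sense of competition this sketch is meant to show.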


2002 ◽  
Vol 12 (02) ◽  
pp. 83-93 ◽  
Author(s):  
BURKHARD LENZE ◽  
JÖRG RADDATZ

In this paper, we will take a further look at a generalized perceptron-like learning rule which uses dilation and translation parameters in order to enhance the recall performance of higher order Hopfield neural networks without significantly increasing their complexity. We will practically study the influence of these parameters on the perceptron learning and recall process, using a generalized version of the Hebbian learning rule for initialization. Our analysis will be based on a pattern recognition problem with random patterns. We will see that in case of a highly correlated set of patterns, there can be gained some improvements concerning the learning and recall performance. On the other hand, we will show that the dilation and translation parameters have to be chosen carefully for a positive result.


2020 ◽  
Author(s):  
Kevin Berlemont ◽  
Jean-Pierre Nadal

Abstract: In experiments on perceptual decision-making, individuals learn a categorization task through trial-and-error protocols. We explore the capacity of a decision-making attractor network to learn a categorization task through reward-based, Hebbian-type modifications of the weights incoming from the stimulus encoding layer. For the latter, we assume a standard layer of a large number of stimulus-specific neurons. Within the general framework of Hebbian learning, authors have hypothesized that the learning rate is modulated by the reward at each trial. Surprisingly, we find that, when the coding layer has been optimized in view of the categorization task, such reward-modulated Hebbian learning (RMHL) fails to extract efficiently the category membership. In previous work, we showed that the attractor neural networks' nonlinear dynamics accounts for behavioral confidence in sequences of decision trials. Taking advantage of these findings, we propose that learning is controlled by confidence, as computed from the neural activity of the decision-making attractor network. Here we show that this confidence-controlled, reward-based Hebbian learning efficiently extracts categorical information from the optimized coding layer. The proposed learning rule is local and, in contrast to RMHL, does not require storing the average rewards obtained on previous trials. In addition, we find that the confidence-controlled learning rule achieves near-optimal performance.


2018 ◽  
Vol 41 ◽  
Author(s):  
David Danks

Abstract: The target article uses a mathematical framework derived from Bayesian decision making to demonstrate suboptimal decision making but then attributes psychological reality to the framework components. Rahnev & Denison's (R&D) positive proposal thus risks ignoring plausible psychological theories that could implement complex perceptual decision making. We must be careful not to slide from success with an analytical tool to the reality of the tool components.


2019 ◽  
Author(s):  
Tayana Soukup ◽  
Ged Murtagh ◽  
Ben W Lamb ◽  
James Green ◽  
Nick Sevdalis

Background: Multidisciplinary teams (MDTs) are a standard cancer care policy in many countries worldwide. Despite an increase over the past decade in research on MDTs and their care planning meetings, the implementation of MDT-driven decision-making (fidelity) remains unstudied. We report a feasibility evaluation of a novel method for assessing cancer MDT decision-making fidelity. We used an observational protocol to assess (1) the degree to which MDTs adhere to the stages of group decision-making as per the ‘Orientation-Discussion-Decision-Implementation’ framework, and (2) the degree of multidisciplinarity underpinning individual case reviews in the meetings. Methods: This is a prospective observational study. Breast, colorectal and gynaecological cancer MDTs in the Greater London and Derbyshire (United Kingdom) areas were video recorded over 12-weekly meetings encompassing 822 case reviews. Data were coded and analysed using frequency counts. Results: Eight interaction formats during case reviews were identified. Case reviews were not always multidisciplinary: only 8% of overall reviews involved all five clinical disciplines present, and 38% included four of five. The majority of case reviews (i.e. 54%) took place between two (25%) or three (29%) disciplines only. Surgeons (83%) and oncologists (8%) most consistently engaged in all stages of decision-making. While all patients put forward for MDT review were actually reviewed, in a small percentage of reviews the team either bypassed the orientation (case presentation) and went straight into discussing the patient (4%) or did not articulate the final decision to the entire team (8%). Conclusions: Assessing fidelity of MDT decision-making at the point of their weekly meetings is feasible. We found that despite being a set policy, case reviews are not entirely MDT-driven. We discuss implications in relation to the current eco-political climate, and the quality and safety of care. Our findings are in line with the current national initiatives in the UK on streamlining MDT meetings, and could help decide how to re-organise them to be most efficient.


2009 ◽  
Vol 20 (9) ◽  
pp. 2574-2586 ◽  
Author(s):  
Yu-Xing SUN ◽  
Song-Hua HUANG ◽  
Li-Jun CHEN ◽  
Li XIE

2005 ◽  
Vol 165 (3) ◽  
pp. 403
Author(s):  
Uehara ◽  
Yokomizo ◽  
Iwasa

2020 ◽  
Vol 32 (2) ◽  
pp. 159-184 ◽  
Author(s):  
Satoko Fujiwara ◽  
Tim Jensen

Abstract: Donald Wiebe claims that the IAHR leadership (already before an Extended Executive Committee (EEC) meeting in Delphi) had decided to water down the academic standards of the IAHR with a proposal to change its name to “International Association for the Study of Religions.” His criticism, we argue, is based on a series of misunderstandings as regards: 1) the difference between the consultative body (EEC) and the decision-making body (EC), 2) the difference between the preliminary points of view of individuals and final proposals by the EC, 3) personal conversations, 4) the link between the proposal to change the name and the wish to tighten up the academic profile of the IAHR. Moreover, if the final decision-making bodies, the International Committee and the General Assembly, adopt the proposal, the new name as little as the old can make the IAHR more or less scientific. Tightening up the academic, scientific profile of the IAHR takes more than a change of name.


Author(s):  
Michael de Oliveira ◽  
Luis Soares Barbosa
