Action-inhibition hierarchies: Using a simple gastropod model to investigate serotonergic and dopaminergic control of action selection and reinforcement learning

2011 ◽  
Vol 26 (S2) ◽  
pp. 905-905
Author(s):  
S. Hodgkinson ◽  
J. Steyer ◽  
M. Jandl ◽  
W.P. Kaschka

Introduction: Basal ganglia (BG) activity plays an important role in action selection and reinforcement learning. Inputs to and from other areas of the brain are modulated by a number of neurotransmitter pathways in the BG. Disturbances in the normal function of the BG may play a role in the aetiology of psychiatric disorders such as schizophrenia and bipolar disorder.
Aims: To develop a simple animal model for evaluating interactions between glutamatergic, dopaminergic, serotonergic and GABAergic neurones in the modulation of action selection and reinforcement learning.
Objectives: To characterise the effects of changing dopaminergic and serotonergic activity on action selection and reinforcement learning in an animal model.
Methods: The food-seeking/consummation (FSC) activity of the gastropod Planorbis corneus was suppressed by operant conditioning using a repeated unconditioned stimulus-punishment regime. The effects of elevated serotonin or dopamine levels (administered into the cerebral, pedal and buccal ganglia) on operantly conditioned FSC activity were assessed.
Results: Operantly conditioned behaviour was reversed by elevated ganglionic serotonin levels, but snails showed no food-consummation motor activity in the absence of food. In contrast, elevated ganglionic dopamine levels in conditioned snails elicited food-consummation motor movements in the absence of food, but not orientation towards a food source.
Conclusions: The modulation of FSC activity elicited by reinforcement learning is subject to hierarchical control in gastropods. Serotonergic activity is responsible for establishing the general activity level, whilst dopaminergic activity appears to play a more localised and subordinate 'command' role.
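The proposed hierarchy can be caricatured as a two-level gate. The following is a minimal toy sketch, purely illustrative and not the authors' model: a serotonergic gain sets the general activity level, while a subordinate dopaminergic 'command' can trigger the consummatory motor pattern locally. All names and thresholds are hypothetical.

```python
# Toy sketch (illustrative only, not the authors' model): a serotonergic
# gain gates overall food-seeking/consummation (FSC) activity, while a
# subordinate dopaminergic 'command' can trigger consummatory movements
# locally, even without food. All names and thresholds are hypothetical.

def fsc_response(serotonin_gain, dopamine_command, food_present, threshold=0.5):
    """Return which FSC components a snail expresses under given signals."""
    gated_on = serotonin_gain > threshold          # global activity level
    orientation = gated_on and food_present        # requires the global gate
    consummation = (gated_on and food_present) or (dopamine_command > threshold)
    return {"orients_to_food": orientation, "consummatory_movements": consummation}

# Elevated serotonin restores behaviour only in the presence of food;
# elevated dopamine elicits consummatory movements without orientation.
print(fsc_response(0.9, 0.1, food_present=True))   # both components expressed
print(fsc_response(0.1, 0.9, food_present=False))  # consummation only
```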

1965 ◽  
Vol 16 (3) ◽  
pp. 693-696 ◽  
Author(s):  
L. Glenn Collins

In two experiments involving 40 albino rats and two dosage levels of morphine sulfate, it was found that relatively high analgesic dosages of morphine significantly depressed general activity level in the revolving drum. There was also a significant interaction between drug effect and hunger drive. In the case of moderate analgesic doses (7 mg/kg), no systematic effect of morphine on activity-wheel performance was noted.


Author(s):  
Daxue Liu ◽  
Jun Wu ◽  
Xin Xu

Multi-agent reinforcement learning (MARL) provides a useful and flexible framework for multi-agent coordination in uncertain dynamic environments. However, the generalization ability and scalability of algorithms to large problem sizes, already problematic in single-agent RL, are an even more formidable obstacle in MARL applications. In this paper, a new MARL method based on ordinal action selection and approximate policy iteration, called OAPI (Ordinal Approximate Policy Iteration), is presented to address the scalability issue of MARL algorithms in common-interest Markov games. In OAPI, an ordinal action selection and learning strategy is integrated with distributed approximate policy iteration, not only to simplify the policy space and eliminate conflicts in multi-agent coordination, but also to approximate near-optimal policies for Markov games with large state spaces. Based on the policy space simplified by ordinal action selection, the OAPI algorithm implements distributed approximate policy iteration using online least-squares policy iteration (LSPI). This yields multi-agent coordination with good convergence properties and reduced computational complexity. Simulation results on a coordinated multi-robot navigation task illustrate the feasibility and effectiveness of the proposed approach.
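To make the policy-evaluation core that LSPI-based methods build on concrete, here is a minimal LSTD-Q step in Python. This is a generic sketch under stated assumptions, not the paper's OAPI implementation: `phi`, `policy`, and the toy samples are hypothetical, and the ordinal aspect (each agent restricting its search to a small ranked action subset) appears only in the comments.

```python
import numpy as np

def lstdq(samples, phi, policy, gamma=0.95, reg=1e-3):
    """One LSTD-Q policy-evaluation step: solve A w = b for linear Q weights.

    samples: list of (s, a, r, s_next) transitions
    phi(s, a): feature vector for a state-action pair
    policy(s): action the current (e.g. ordinally restricted) policy picks
    """
    k = len(phi(*samples[0][:2]))
    A = reg * np.eye(k)           # regularized so the solve stays well-posed
    b = np.zeros(k)
    for s, a, r, s_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, policy(s_next))
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.solve(A, b)  # weights of Q(s, a) = w . phi(s, a)

# Hypothetical toy problem: 2 states x 2 actions, one-hot features.
def phi(s, a):
    f = np.zeros(4)
    f[2 * s + a] = 1.0
    return f

policy = lambda s: 0  # evaluate 'always take the top-ranked action'
samples = [(0, 0, 1.0, 1), (1, 0, 0.0, 0), (0, 1, 0.5, 1)]
print(lstdq(samples, phi, policy))
```

In full LSPI, this evaluation step alternates with greedy improvement over the (here, ordinally restricted) action set until the weights stabilize.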


2020 ◽  
Vol 30 (6) ◽  
pp. 3573-3589 ◽  
Author(s):  
Rick A Adams ◽  
Michael Moutoussis ◽  
Matthew M Nour ◽  
Tarik Dahoun ◽  
Declan Lewis ◽  
...  

Abstract: Choosing actions that result in advantageous outcomes is a fundamental function of nervous systems. All computational decision-making models contain a mechanism that controls the variability of (or confidence in) action selection, but its neural implementation is unclear, especially in humans. We investigated this mechanism using two influential decision-making frameworks: active inference (AI) and reinforcement learning (RL). In AI, the precision (inverse variance) of beliefs about policies controls action selection variability, similar to decision 'noise' parameters in RL, and is thought to be encoded by striatal dopamine signaling. We tested this hypothesis by administering a 'go/no-go' task to 75 healthy participants and measuring striatal dopamine 2/3 receptor (D2/3R) availability in a subset (n = 25) using [11C]-(+)-PHNO positron emission tomography. In behavioral model comparison, RL performed best across the whole group, but AI performed best in participants performing above chance. Limbic striatal D2/3R availability had linear relationships with AI policy precision (P = 0.029) and with RL irreducible decision 'noise' (P = 0.020), and this relationship with D2/3R availability was confirmed with a 'decision stochasticity' factor that aggregated across both models (P = 0.0006). These findings are consistent with occupancy of inhibitory striatal D2/3Rs decreasing the variability of action selection in humans.
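The shared mechanism is easy to state formally: in both frameworks a single precision (inverse-temperature) parameter scales how deterministically values map to choices, and irreducible decision 'noise' can be modeled as a uniform lapse. The sketch below is a generic illustration of that idea, not the paper's fitted models; the function name and parameter values are assumptions.

```python
import numpy as np

def softmax_policy(action_values, precision, lapse=0.0):
    """Map action values to choice probabilities.

    precision (inverse temperature): high -> near-deterministic choice,
    low -> near-random. lapse mixes in uniform, irreducible decision noise.
    """
    z = precision * (action_values - np.max(action_values))  # stabilize exp
    p = np.exp(z) / np.sum(np.exp(z))
    return (1.0 - lapse) * p + lapse / len(action_values)

# Two actions (e.g. 'go' vs 'no-go') with a modest value difference:
values = np.array([0.6, 0.4])
for beta in (0.5, 2.0, 8.0):
    print(beta, np.round(softmax_policy(values, beta, lapse=0.05), 3))
```

As precision rises from 0.5 to 8.0, the choice distribution moves from near-uniform towards deterministic selection of the higher-valued action, which is exactly the variability that the striatal D2/3R findings bear on.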


2011 ◽  
Vol E94-D (2) ◽  
pp. 255-263 ◽  
Author(s):  
Ali AKRAMIZADEH ◽  
Ahmad AFSHAR ◽  
Mohammad Bagher MENHAJ ◽  
Samira JAFARI

2006 ◽  
Vol 04 (06) ◽  
pp. 1071-1083 ◽  
Author(s):  
C. L. CHEN ◽  
D. Y. DONG ◽  
Z. H. CHEN

This paper proposes a novel action selection method based on quantum computation and reinforcement learning (RL). Inspired by the advantages of quantum computation, the state/action in an RL system is represented as a quantum superposition state. The probability of each action eigenvalue is given by its probability amplitude, which is updated according to rewards, and action selection is carried out by observing the quantum state in accordance with the collapse postulate of quantum measurement. The results of simulated experiments show that quantum computation can be effectively applied to action selection and decision making by speeding up learning. The method also achieves a good tradeoff between exploration and exploitation for RL by exploiting the probabilistic character of quantum theory.
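A classical simulation conveys the core idea: keep an amplitude per action, select by 'measuring' (sampling with probabilities equal to the squared amplitudes), and reinforce the chosen action's amplitude in proportion to reward. This is a hedged toy sketch of the amplitude mechanism, not the paper's exact update rule; the exponential reinforcement factor `k` and the bandit task are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_action(amplitudes):
    """'Measure' the superposition: probabilities are squared amplitudes."""
    probs = np.abs(amplitudes) ** 2
    probs /= probs.sum()
    return rng.choice(len(amplitudes), p=probs)

def reinforce(amplitudes, action, reward, k=0.1):
    """Amplify the chosen action's amplitude in proportion to reward, then
    renormalize so the squared amplitudes remain a probability distribution."""
    amplitudes = amplitudes.copy()
    amplitudes[action] *= np.exp(k * reward)
    return amplitudes / np.linalg.norm(amplitudes)

# Toy bandit: action 1 pays off 80% of the time, others never.
amps = np.ones(4) / 2.0  # uniform superposition over 4 actions
for t in range(500):
    a = select_action(amps)
    r = 1.0 if (a == 1 and rng.random() < 0.8) else 0.0
    amps = reinforce(amps, a, r)
print(np.round(np.abs(amps) ** 2, 3))  # probability mass concentrates on action 1
```

Early on, the near-uniform amplitudes give broad exploration; as rewards accumulate, measurement increasingly collapses onto the best action, which is the exploration-exploitation tradeoff the abstract describes.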

