The control of tonic pain by active relief learning

eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Suyi Zhang ◽  
Hiroaki Mano ◽  
Michael Lee ◽  
Wako Yoshida ◽  
Mitsuo Kawato ◽  
...  

Tonic pain after injury characterises a behavioural state that prioritises recovery. Although it generally suppresses cognition and attention, tonic pain needs to allow effective relief learning to reduce the cause of the pain. Here, we describe a central learning circuit that supports learning of relief and concurrently suppresses the level of ongoing pain. We used computational modelling of behavioural, physiological and neuroimaging data in two experiments in which subjects learned to terminate tonic pain in static and dynamic escape-learning paradigms. In both studies, we show that active relief-seeking involves a reinforcement learning process manifested by error signals observed in the dorsal putamen. Critically, this system uses an uncertainty (‘associability’) signal detected in pregenual anterior cingulate cortex that both controls the relief learning rate and endogenously and parametrically modulates the level of tonic pain. The results define a self-organising learning circuit that reduces ongoing pain when learning about potential relief.
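
The learning rule described here is a Pearce-Hall-style hybrid model, in which an associability term tracks recent surprise and scales the learning rate. A minimal sketch in Python, assuming that general form; the function, constants, and update order are illustrative rather than the paper's fitted model:

```python
def pearce_hall_update(V, alpha, relief, kappa=1.0, eta=0.3):
    """One trial of an associability-gated relief-learning update (illustrative).

    V      : current relief expectation for the chosen action
    alpha  : associability (uncertainty), the quantity linked to pgACC here
    relief : observed outcome (1 = pain terminated, 0 = not)
    kappa  : fixed scaling of the effective learning rate (assumed constant)
    eta    : decay rate governing how fast associability tracks surprise
    """
    delta = relief - V                                 # prediction error (dorsal putamen signal)
    V_new = V + kappa * alpha * delta                  # associability gates the learning rate
    alpha_new = (1 - eta) * alpha + eta * abs(delta)   # associability follows |error|
    return V_new, alpha_new

# As escapes become reliably successful, error and associability both shrink,
# which on the paper's account would also lower the level of ongoing pain.
V, alpha = 0.0, 1.0
for trial in range(5):
    V, alpha = pearce_hall_update(V, alpha, relief=1.0)
    print(f"trial {trial + 1}: V={V:.3f}, associability={alpha:.3f}")
```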


2019 ◽  
Author(s):  
Timothy J. Meeker ◽  
Anne-Christine Schmid ◽  
Michael L. Keaser ◽  
Shariq A. Khan ◽  
Rao P. Gullapalli ◽  
...  

Neural mechanisms of ongoing nociceptive processing in the human brain remain largely obscured by the dual challenge of accessing neural dynamics and safely applying sustained painful stimuli. Recently, pain-related neural processing has been measured using fMRI resting-state functional connectivity (FC) in chronic pain patients. However, ongoing pain-related processing in normally pain-free humans remains incompletely understood, so differences between chronic pain patients and controls may be due to comorbidities of chronic pain rather than to ongoing pain itself. Decreased FC among regions of the descending pain modulation network (DPMN) is associated with the presence and severity of chronic pain disorders. We aimed to determine whether prolonged tonic pain would disrupt the DPMN. High-concentration (10%) topical capsaicin combined with a warm thermode applied to the leg provided a flexible, prolonged tonic pain model for studying the FC of brain networks in otherwise healthy, pain-free subjects in two separate cohorts (n=18; n=32). We contrasted seed-based FC during prolonged tonic pain with a pain-free passive task. In the seed-based FC analysis, prolonged tonic pain enhanced FC between the anterior middle cingulate cortex (aMCC) and the somatosensory leg representation, as well as between the pregenual anterior cingulate cortex (pACC), the right mediodorsal thalamus, and the posterior parietal cortex bilaterally. Further, in the seed-driven PAG network, positive FC with the left DLPFC became negative during prolonged tonic pain. These data suggest that some altered DPMN FC findings in chronic pain could partially be explained by the presence of ongoing pain.
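
At its core, the seed-based FC analysis reported here correlates a seed region's mean BOLD time series with every other voxel's time series, then contrasts the resulting maps across conditions. A minimal sketch with synthetic data; the study's real pipeline (preprocessing, nuisance regression, group-level statistics) is assumed and omitted:

```python
import numpy as np

def seed_fc_map(seed_ts, voxel_ts):
    """Pearson correlation between a seed time series and each voxel.

    seed_ts  : (T,) mean BOLD time series from the seed ROI
    voxel_ts : (T, V) time series for V voxels
    returns  : (V,) seed-based FC map
    """
    seed = (seed_ts - seed_ts.mean()) / seed_ts.std()
    vox = (voxel_ts - voxel_ts.mean(axis=0)) / voxel_ts.std(axis=0)
    return (seed[:, None] * vox).mean(axis=0)

# Contrast tonic-pain vs pain-free runs; Fisher z makes maps comparable.
rng = np.random.default_rng(0)
fc_pain = np.arctanh(seed_fc_map(rng.standard_normal(200), rng.standard_normal((200, 5))))
fc_free = np.arctanh(seed_fc_map(rng.standard_normal(200), rng.standard_normal((200, 5))))
delta_fc = fc_pain - fc_free   # positive where tonic pain enhances FC
print(delta_fc)
```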


2021 ◽  
Vol 35 (2) ◽  
Author(s):  
Nicolas Bougie ◽  
Ryutaro Ichise

Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impracticable due to lack of state coverage or distribution mismatch, i.e., when the learner's goal deviates from the demonstrated behaviors. Besides, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work, we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations to query the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing the sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the MuJoCo domain. Experimental results show that our method outperforms prior imitation learning approaches in most of the tasks in terms of exploration efficiency and average scores.
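
Two ingredients of the method lend themselves to a compact illustration: querying the demonstrator only where the learner is uncertain, and sampling goals in proportion to expert-policy disagreement. A hypothetical sketch; the paper's actual uncertainty measure and sampling scheme may differ:

```python
import numpy as np

rng = np.random.default_rng(0)

def disagreement(expert_actions, policy_actions):
    """Per-goal distance between expert and current-policy actions."""
    return np.linalg.norm(expert_actions - policy_actions, axis=1)

def sample_goal(goal_ids, expert_actions, policy_actions, temperature=1.0):
    """Prioritise sampling of goals where expert/policy disagreement is largest."""
    logits = disagreement(expert_actions, policy_actions) / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return goal_ids[rng.choice(len(goal_ids), p=probs)]

def should_query_expert(ensemble_actions, threshold=0.2):
    """Ask for a demonstration only in uncertain regions, here proxied by
    disagreement within an ensemble of policy heads for the same state."""
    return ensemble_actions.std(axis=0).mean() > threshold

goal_ids = np.arange(4)
expert = rng.standard_normal((4, 2))                 # expert action per goal
policy = expert + 0.5 * rng.standard_normal((4, 2))  # current policy's actions
print("next training goal:", sample_goal(goal_ids, expert, policy))
print("query expert here?", should_query_expert(rng.standard_normal((5, 2))))
```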


2015 ◽  
Vol 25 (3) ◽  
pp. 471-482 ◽  
Author(s):  
Bartłomiej Śnieżyński

In this paper, we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be a good alternative: this type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action to execute. Experimental results show that this model can be successfully applied for strategy generation even if rewards are delayed. We compare the efficiency of the proposed model and reinforcement learning using the farmer-pest domain and configurations of various complexity. In complex environments, supervised learning can improve the performance of agents much faster than reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge can also be analyzed by humans, which allows tracking of the learning process.
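
The core idea reduces to treating strategy learning as supervised learning over (state, action) examples. A minimal sketch using scikit-learn's decision tree, chosen because a tree is human-readable, matching the knowledge-analysis point above; the features, actions, and farmer-pest specifics are hypothetical:

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: observed states paired with the action that
# eventually led to a (possibly delayed) positive reward.
states = [[0, 1], [1, 0], [1, 1], [0, 0]]    # toy feature vectors
actions = ["flee", "attack", "attack", "wait"]

clf = DecisionTreeClassifier().fit(states, actions)

def choose_action(state):
    """The classifier stands in for a value function: state -> action."""
    return clf.predict([state])[0]

print(choose_action([1, 1]))   # -> "attack"
```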


2021 ◽  
Vol 8 ◽  
Author(s):  
Huan Zhao ◽  
Junhua Zhao ◽  
Ting Shu ◽  
Zibin Pan

Buildings account for a large proportion of total energy consumption in many countries, and almost half of that consumption is caused by Heating, Ventilation, and Air-Conditioning (HVAC) systems. Model predictive control of HVAC is a complex task because both the system and its environment, such as temperature and electricity price, are dynamic. Deep reinforcement learning (DRL) is a model-free method that uses a trial-and-error mechanism to learn the optimal policy. However, learning efficiency and learning cost are the main obstacles to applying DRL in practice. To overcome this problem, a hybrid-model-based DRL method is proposed for the HVAC control problem. First, a specific MDP is defined that considers the energy cost, temperature violation, and action violation. Then, the hybrid-model-based DRL method is proposed, which uses both a knowledge-driven model and a data-driven model throughout the learning process. Finally, a protection mechanism and reward-adjustment methods are used to further reduce the learning cost. The proposed method is tested in a simulation environment using Australian Energy Market Operator (AEMO) electricity price data and New South Wales temperature data. Simulation results show that 1) the DRL method reduces energy cost while keeping the temperature satisfactory compared to the short-term MPC method, and 2) the proposed method improves learning efficiency and reduces learning cost compared to the model-free method.
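
The abstract names three reward terms, energy cost, temperature violation, and action violation, which suggests a reward of roughly the following shape. A sketch with assumed weights and comfort band; the paper's actual formulation and coefficients are not given here:

```python
def hvac_reward(power_kw, price, temp, action_delta,
                temp_lo=21.0, temp_hi=25.0,
                w_cost=1.0, w_temp=10.0, w_action=0.1):
    """Hypothetical per-step reward for the HVAC MDP described above.

    power_kw     : HVAC electrical power drawn this step (kW)
    price        : electricity price this step ($/kWh)
    temp         : resulting indoor temperature (deg C)
    action_delta : change in control action, penalised to avoid abrupt moves
    """
    energy_cost = power_kw * price
    temp_violation = max(0.0, temp_lo - temp) + max(0.0, temp - temp_hi)
    action_violation = abs(action_delta)
    return -(w_cost * energy_cost + w_temp * temp_violation
             + w_action * action_violation)

# Example: a warm room just above the comfort band is penalised
print(hvac_reward(power_kw=3.0, price=0.25, temp=26.2, action_delta=0.5))
```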


2020 ◽  
Author(s):  
Felipe Leno Da Silva ◽  
Anna Helena Reali Costa

Reinforcement Learning (RL) is a powerful tool that has been used to solve increasingly complex tasks. RL operates through repeated interactions of the learning agent with the environment, via trial and error. However, this learning process is extremely slow, requiring many interactions. In this thesis, we leverage previous knowledge to accelerate learning in multiagent RL problems. We propose knowledge reuse both from previous tasks and from other agents, and introduce several flexible methods that enable each of these two types of knowledge reuse. This thesis adds important steps towards more flexible and broadly applicable multiagent transfer learning methods.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Tatiana Lau ◽  
Samuel J Gershman ◽  
Mina Cikara

Humans form social coalitions in every society, yet we know little about how we learn and represent social group boundaries. Here we derive predictions from a computational model of latent structure learning to move beyond explicit category labels and interpersonal, or dyadic, similarity as the sole inputs to social group representations. Using a model-based analysis of functional neuroimaging data, we find that separate areas correlate with dyadic similarity and latent structure learning. Trial-by-trial estimates of ‘allyship’ based on dyadic similarity between participants and each agent recruited medial prefrontal cortex/pregenual anterior cingulate (pgACC). Latent social group structure-based allyship estimates, in contrast, recruited right anterior insula (rAI). Variability in the brain signal from rAI improved prediction of variability in ally-choice behavior, whereas variability from the pgACC did not. These results provide novel insights into the psychological and neural mechanisms by which people learn to distinguish ‘us’ from ‘them.’
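
The dyadic-similarity regressor that the model-based analysis contrasts with latent structure learning can be illustrated simply: trial-by-trial allyship is proxied by the similarity between a participant's and an agent's preference profiles. A toy sketch with hypothetical profiles; the paper's model and stimuli are more elaborate:

```python
import numpy as np

def dyadic_allyship(participant, agent):
    """Allyship proxy: cosine similarity between two preference vectors."""
    return participant @ agent / (np.linalg.norm(participant) * np.linalg.norm(agent))

# Hypothetical three-issue preference profiles
me = np.array([1.0, 0.2, 0.9])
agent = np.array([0.8, 0.1, 1.0])
print(f"dyadic allyship estimate: {dyadic_allyship(me, agent):.3f}")
```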


Author(s):  
Peng Zhang ◽  
Jianye Hao ◽  
Weixun Wang ◽  
Hongyao Tang ◽  
Yi Ma ◽  
...  

Reinforcement learning agents usually learn from scratch, which requires a large number of interactions with the environment. This is quite different from the learning process of humans: when faced with a new task, humans naturally draw on common sense and prior knowledge to derive an initial policy and to guide the learning process afterwards. Although the prior knowledge may not be fully applicable to the new task, learning is significantly sped up, since the initial policy ensures a quick start and intermediate guidance avoids unnecessary exploration. Taking this inspiration, we propose the knowledge-guided policy network (KoGuN), a novel framework that combines suboptimal human prior knowledge with reinforcement learning. Our framework consists of a fuzzy rule controller that represents human knowledge and a refine module that fine-tunes the suboptimal prior knowledge. The framework is end-to-end and can be combined with existing policy-based reinforcement learning algorithms. We conduct experiments on several control tasks. The empirical results show that our approach, which combines suboptimal human knowledge and RL, significantly improves the learning efficiency of flat RL algorithms, even with very low-performance human prior knowledge.
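
The shape of the idea, a fuzzy-rule prior whose action preferences are blended with a trainable policy, can be sketched as follows. This is not the paper's architecture; the rule, the blending scheme, and the fixed refine weight are all assumptions for illustration:

```python
import numpy as np

def fuzzy_prior(state):
    """Toy fuzzy rule for a CartPole-like task (hypothetical rule).

    Rule: 'if the pole leans right, push right', with a soft membership.
    Returns unnormalised preferences over [push_left, push_right].
    """
    pole_angle = state[2]
    lean_right = 1.0 / (1.0 + np.exp(-10.0 * pole_angle))   # soft membership
    return np.array([1.0 - lean_right, lean_right])

def kogun_style_policy(state, network_logits, refine_weight=0.5):
    """Blend suboptimal prior preferences with the trainable policy's logits.

    refine_weight stands in for what a learned refine module would provide;
    here it is a fixed constant.
    """
    logits = network_logits + refine_weight * np.log(fuzzy_prior(state) + 1e-8)
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

state = np.array([0.0, 0.0, 0.15, 0.0])   # pole leaning right
print(kogun_style_policy(state, network_logits=np.zeros(2)))
```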

