Spatial preferences account for inter-animal variability during the continual learning of a dynamic cognitive task

2019. Author(s): David B. Kastner, Eric A. Miller, Zhounan Yang, Demetris K. Roumis, Daniel F. Liu, ...

Individual animals perform tasks in different ways, yet the nature and origin of that variability are poorly understood. In the context of spatial memory tasks, variability is often interpreted as resulting from differences in memory ability, but the validity of this interpretation is seldom tested, since we lack a systematic approach for identifying and understanding the factors that make one animal’s behavior different from another’s. Here we identify such factors in the context of spatial alternation in rats, a task often described as relying solely on memory of past choices. We combine hypothesis-driven behavioral design with reinforcement learning modeling to identify spatial preferences that, when combined with memory, support learning of a spatial alternation task. Identifying these preferences allows us to capture differences among animals, including differences in overall learning ability. Our results show that understanding the complexity of behavior requires a quantitative account of the preferences of each animal.
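As a rough illustration of how a fixed spatial preference can be combined with value learning in a model of this kind, the sketch below shows a toy agent whose choices mix learned values with an innate per-arm bias. This is a hypothetical example, not the authors' fitted model; the arm count, the `spatial_bias` values, and the alternation reward rule are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_arms = 3                                  # hypothetical 3-arm alternation track
spatial_bias = np.array([0.5, 0.0, -0.5])   # assumed innate spatial preference per arm
q = np.zeros(n_arms)                        # learned action values
alpha, beta = 0.1, 3.0                      # learning rate and softmax inverse temperature
prev_arm = None

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for trial in range(500):
    # choice probability combines learned value with the fixed spatial preference
    p = softmax(beta * (q + spatial_bias))
    arm = rng.choice(n_arms, p=p)

    # reward only when the animal switches away from its previous choice
    reward = 1.0 if (prev_arm is not None and arm != prev_arm) else 0.0

    # standard delta-rule update of the chosen arm's value
    q[arm] += alpha * (reward - q[arm])
    prev_arm = arm

print("final values:", q)
```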

Symmetry, 2021, Vol 13 (3), pp. 471. Author(s): Jai Hoon Park, Kang Hoon Lee

Designing novel robots that can cope with a specific task is a challenging problem because of the enormous design space, which involves both morphological structures and control mechanisms. To this end, we present a computational method for automating the design of modular robots. Our method employs a genetic algorithm to evolve robotic structures as an outer optimization, and it applies a reinforcement learning algorithm to each candidate structure to train its behavior and evaluate its potential learning ability as an inner optimization. Compared to evolving both structure and behavior simultaneously, the size of the design space is reduced significantly, because only the robotic structure is evolved while behavioral optimization is delegated to a separate training algorithm. Mutual dependence between evolution and learning is achieved by treating the mean cumulative reward that a candidate structure obtains during reinforcement learning as its fitness in the genetic algorithm. Our method therefore searches for prospective robotic structures that can potentially lead to near-optimal behaviors if trained sufficiently. We demonstrate the usefulness of our method through several effective design results that were automatically generated in experiments with an actual modular robotics kit.
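A minimal sketch of this outer/inner structure is shown below. It is not the authors' implementation: the genome encoding, module types, population sizes, and the stubbed `train_and_evaluate` reward are placeholders, with the inner RL training replaced by a toy score so the loop runs end to end.

```python
import random

random.seed(0)

N_MODULES = 6        # hypothetical genome length: one gene per module slot
MODULE_TYPES = 4     # hypothetical number of available module types

def random_structure():
    return [random.randrange(MODULE_TYPES) for _ in range(N_MODULES)]

def train_and_evaluate(structure, episodes=20):
    """Inner optimization (stub): train a controller for this structure with RL
    and return its mean cumulative reward. A toy stand-in reward is used here."""
    returns = []
    for _ in range(episodes):
        # placeholder for an actual RL training episode on the candidate robot
        returns.append(sum(structure) + random.gauss(0, 1))
    return sum(returns) / len(returns)   # mean cumulative reward = GA fitness

def mutate(structure, rate=0.2):
    return [random.randrange(MODULE_TYPES) if random.random() < rate else g
            for g in structure]

def crossover(a, b):
    cut = random.randrange(1, N_MODULES)
    return a[:cut] + b[cut:]

# Outer optimization: evolve structures, using RL performance as fitness.
population = [random_structure() for _ in range(10)]
for generation in range(5):
    ranked = sorted(population, key=train_and_evaluate, reverse=True)
    parents = ranked[:4]
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(len(population) - len(parents))]
    population = parents + children

best = max(population, key=train_and_evaluate)
print("best structure:", best)
```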


2009, Vol 102 (1), pp. 556-567. Author(s): Muneyoshi Takahashi, Johan Lauwereyns, Yoshio Sakurai, Minoru Tsukada

The classical notion of hippocampal CA1 “place cells,” whose activity tracks physical locations, has undergone substantial revision in recent years. Here, we provide further evidence of an abstract spatial code in hippocampal CA1, which relies on memory and adds complexity to the basic “place cell.” Using a nose-poking paradigm with four male Wistar rats, we specifically concentrated on activity during fixation, when the rat was immobile and waiting for the next task event in a memory-guided spatial alternation task. The rat had to alternate between choosing the right and left holes on a trial-by-trial basis, without any sensory cue, relying only on an internal representation of the sequence of trials. Twelve tetrodes were chronically implanted for single-unit recording in the right CA1 of each rat. We focused on 76 single neurons that showed significant activation during the fixation period compared with baseline activity between trials. Among these 76 fixation neurons, we observed 38 neurons that systematically changed their fixation activity as a function of the alternation sequence. That is, even though these rats were immobile during the fixation period, the neurons fired differently for trials in which the next spatial choice should be left (i.e., RIGHT-TO-LEFT trials) compared with trials in which the next spatial choice should be right (i.e., LEFT-TO-RIGHT trials), or vice versa. Our results imply that these neurons maintain a sequential code of the required spatial response during the alternation task and thus provide abstract information, derived from memory, that can be used for efficient navigation.


2009, Vol 56 (4), pp. 382-390. Author(s): Victor C. Wang, Steven L. Neese, Donna L. Korol, Susan L. Schantz

2016, Vol 28 (10), pp. 1539-1552. Author(s): Björn C. Schiffler, Rita Almeida, Mathias Granqvist, Sara L. Bengtsson

Negative feedback after an action in a cognitive task can lead to devaluing that action on future trials as well as to more cautious responding when encountering that same choice again. These phenomena have been explored in the past by reinforcement learning theories and cognitive control accounts, respectively. Yet, how cognitive control interacts with value updating to give rise to adequate adaptations under uncertainty is less clear. In this fMRI study, we investigated cognitive control-based behavioral adjustments during a probabilistic reinforcement learning task and studied their influence on performance in a later test phase in which the learned value of items was tested. We provide support for the idea that functionally relevant and memory-reliant behavioral adjustments in the form of post-error slowing during reinforcement learning are associated with test performance. Adjusting response speed after negative feedback was correlated with BOLD activity in the right inferior frontal gyrus and bilateral middle occipital cortex at the time the feedback was received. Bilateral middle occipital cortex activity overlapped partly with activity reflecting feedback deviance from expectations, as measured by the unsigned prediction error. These results suggest that cognitive control and feature-processing cortical regions interact to implement feedback-congruent adaptations beneficial to learning.
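For reference, the unsigned prediction error used as a feedback-deviance regressor can be illustrated with a simple delta-rule learner. The sketch below is a toy example, not the study's fitted model; the two items, their reward probabilities, and the learning rate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

p_reward = {"A": 0.8, "B": 0.2}   # hypothetical reward probabilities per item
value = {"A": 0.5, "B": 0.5}      # initial expected values
alpha = 0.1                        # learning rate

unsigned_pes = []
for t in range(200):
    item = rng.choice(["A", "B"])
    reward = float(rng.random() < p_reward[item])

    delta = reward - value[item]       # signed prediction error
    unsigned_pes.append(abs(delta))    # "feedback deviance from expectations"
    value[item] += alpha * delta       # delta-rule value update

print("mean unsigned prediction error:", np.mean(unsigned_pes))
```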


2021. Author(s): Tiantian Zhang, Xueqian Wang, Bin Liang, Bo Yuan

The powerful learning ability of deep neural networks enables reinforcement learning (RL) agents to learn competent control policies directly from high-dimensional and continuous environments. In theory, to achieve stable performance, neural networks assume i.i.d. inputs, which unfortunately does not hold in the general RL paradigm, where the training data is temporally correlated and non-stationary. This issue may lead to the phenomenon of "catastrophic interference" (a.k.a. "catastrophic forgetting") and a collapse in performance, as later training is likely to overwrite and interfere with previously learned good policies. In this paper, we introduce the concept of "context" into single-task RL and develop a novel scheme, termed Context Division and Knowledge Distillation (CDaKD) driven RL, to divide all states experienced during training into a series of contexts. Its motivation is to mitigate the aforementioned challenge of catastrophic interference in deep RL, thereby improving the stability and plasticity of RL models. At the heart of CDaKD is a value function, parameterized by a neural network feature extractor shared across all contexts, and a set of output heads, each specializing in an individual context. In CDaKD, we exploit online clustering to achieve context division, and interference is further alleviated by a knowledge distillation regularization term on the output layers for learned contexts. In addition, to effectively obtain the context division in high-dimensional state spaces (e.g., image inputs), we perform clustering in the lower-dimensional representation space of a randomly initialized convolutional encoder, which is fixed throughout training. Our results show that, with various replay memory capacities, CDaKD can consistently improve the performance of existing RL algorithms on classic OpenAI Gym tasks and the more complex high-dimensional Atari tasks, incurring only moderate computational overhead.
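The sketch below is one illustrative PyTorch reading of the architecture described above (a shared feature extractor with per-context output heads, plus a distillation term that keeps previously learned heads close to a frozen snapshot). It is not the authors' code: the online clustering that assigns states to contexts is omitted, and all dimensions, temperatures, and names are made up.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadQNetwork(nn.Module):
    """Shared feature extractor with one Q-value output head per context."""
    def __init__(self, obs_dim, n_actions, n_contexts, hidden=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, n_actions) for _ in range(n_contexts)]
        )

    def forward(self, obs, context_id):
        return self.heads[context_id](self.features(obs))

def distillation_loss(student_q, teacher_q, temperature=2.0):
    """Knowledge-distillation regularizer: keep the output head for an old
    context close to its previously learned outputs while training a new one."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_q / t, dim=-1),
        F.softmax(teacher_q / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

# Example usage with made-up dimensions.
net = MultiHeadQNetwork(obs_dim=8, n_actions=4, n_contexts=3)
teacher = copy.deepcopy(net).eval()            # frozen snapshot before new-context training
for p in teacher.parameters():
    p.requires_grad_(False)

obs = torch.randn(32, 8)
q_new = net(obs, context_id=2)                 # head for the current context (trained by RL loss)
with torch.no_grad():
    teacher_q = teacher(obs, context_id=0)     # old context's remembered outputs
reg = distillation_loss(net(obs, context_id=0), teacher_q)
```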


2021, Vol 2021, pp. 1-12. Author(s): Pengfei Ma, Zunqian Zhang, Jiahao Wang, Wei Zhang, Jiajia Liu, ...

In recent years, artificial intelligence supported by big data has come to depend increasingly on deep reinforcement learning. However, the application of deep reinforcement learning is limited by its reliance on prior knowledge and model selection, which reduces the efficiency and accuracy of prediction and prevents systems from learning and predicting autonomously. Metalearning emerged in response to these limitations. By learning metaknowledge, a system can form the ability to judge and select an appropriate model on its own and to adjust its parameters independently for further optimization. Metalearning is a novel way to address big-data problems within current neural network models, and it fits the development trend of artificial intelligence. This article first briefly introduces the development and basic theory of metalearning and discusses the differences between metalearning and machine learning as well as the research directions of metalearning for big data. Then, four typical applications of metalearning in the field of artificial intelligence are summarized: few-shot learning, robot learning, unsupervised learning, and intelligent medicine. Next, the challenges of metalearning and possible solutions are analyzed. Finally, the article offers a systematic summary and assesses the future development prospects of this field.
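As one concrete instance of the inner/outer-loop idea behind metalearning, the sketch below uses a Reptile-style first-order update on a toy family of regression tasks: the inner loop adapts a copy of the model to a single task, and the outer loop moves the meta-parameters toward the adapted weights. Reptile is not discussed in the article; the task family, architecture, and hyperparameters are purely illustrative.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

def sample_task():
    """Toy task family: regress y = a * x with a random slope per task."""
    a = torch.empty(1).uniform_(-2.0, 2.0)
    def batch(n=16):
        x = torch.empty(n, 1).uniform_(-1.0, 1.0)
        return x, a * x
    return batch

model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
meta_lr, inner_lr, inner_steps = 0.1, 0.01, 5
loss_fn = nn.MSELoss()

for meta_iter in range(200):
    task = sample_task()
    fast = copy.deepcopy(model)                       # task-specific copy
    opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
    for _ in range(inner_steps):                      # inner loop: adapt to one task
        x, y = task()
        opt.zero_grad()
        loss_fn(fast(x), y).backward()
        opt.step()
    with torch.no_grad():                             # outer loop: Reptile meta-update
        for p, fp in zip(model.parameters(), fast.parameters()):
            p += meta_lr * (fp - p)
```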

