The evolving view of replay and its functions in wake and sleep

2020, Vol 1 (1)
Author(s): Graham Findlay, Giulio Tononi, Chiara Cirelli

The term hippocampal replay originally referred to the temporally compressed reinstantiation, during rest, of sequential neural activity observed during prior active wake. Since its description in the 1990s, hippocampal replay has often been viewed as the key mechanism by which a memory trace is repeatedly rehearsed at high speed during sleep and gradually transferred to neocortical circuits. However, the methods used to measure the occurrence of replay remain debated, and it is now clear that the underlying neural events are considerably more complicated than the traditional narratives suggested. "Replay-like" activity occurs during wake, can play out in reverse order, may represent trajectories never taken by the animal, and may serve functions beyond memory consolidation, from learning values and solving the credit assignment problem to decision-making and planning. Still, we know little about the role of replay in cognition and about the extent to which it differs between wake and sleep. This may soon change, however, because decades-long efforts to explain replay in terms of reinforcement learning (RL) have started to yield testable predictions and possible explanations for a diverse set of observations. Here, we (1) survey the diverse features of replay, focusing especially on the latest findings; (2) discuss recent attempts at unifying disparate experimental results and putatively different cognitive functions under the banner of RL; (3) discuss methodological issues and theoretical biases that impede progress or may warrant a partial re-evaluation of the current literature; and (4) highlight areas of considerable uncertainty and promising avenues of inquiry.
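One concrete way to see how RL frames replay is the Dyna family of algorithms, in which an agent interleaves online learning with offline rehearsal of stored transitions; replaying a trajectory in reverse order propagates value backward from the reward in a single sweep, one proposed computational rationale for reverse replay. The sketch below is a minimal illustration under our own assumptions (a toy linear track, random behavior, arbitrary hyperparameters), not a model taken from the paper.

```python
import random
from collections import defaultdict

# Minimal Dyna-style sketch on a 5-state linear track (illustrative
# assumptions: reward only at the rightmost state, deterministic moves).
N_STATES, ALPHA, GAMMA = 5, 0.5, 0.9
Q = defaultdict(float)          # Q[(state, action)]
memory = []                     # stored transitions: (s, a, r, s_next)

def step(s, a):                 # a = +1 (right) or -1 (left)
    s_next = max(0, min(N_STATES - 1, s + a))
    return s_next, 1.0 if s_next == N_STATES - 1 else 0.0

def td_update(s, a, r, s_next):
    target = r + GAMMA * max(Q[(s_next, b)] for b in (-1, 1))
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])

for episode in range(20):
    s = 0
    while s != N_STATES - 1:            # "awake" behavior, online learning
        a = random.choice((-1, 1))
        s_next, r = step(s, a)
        td_update(s, a, r, s_next)
        memory.append((s, a, r, s_next))
        s = s_next
    # "Replay" phase: reverse-ordered reinstatement of the trajectory,
    # which propagates value backward from the reward in one sweep.
    for s, a, r, s_next in reversed(memory):
        td_update(s, a, r, s_next)
    memory.clear()

print({k: round(v, 2) for k, v in Q.items() if v > 0})
```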

1969, Vol 24 (2), pp. 580-582
Author(s): Thomas L. Bennett

Adey and his associates have asserted that theta electrical activity recorded from the hippocampus during learning and performance reflects the role of this structure in information processing, decision making, and memory consolidation. This notion was recently questioned by Douglas (1967), who concluded that the tasks employed by Adey and his associates to assess theta activity were tasks that, according to the lesion literature, do not require hippocampal functioning to be learned. The present paper questions Douglas's assertion by describing studies in the lesion literature which demonstrate that the tasks used by Adey and his co-workers may actually require hippocampal functioning to be learned.


2021, Vol 14 (1), p. 17
Author(s): Shuailong Li, Wei Zhang, Yuquan Leng, Xiaohui Wang

Environmental information plays an important role in deep reinforcement learning (DRL), yet many algorithms pay little attention to it. It matters even more in multi-agent reinforcement learning, where each agent must make decisions that take the other agents in the environment into account. To demonstrate its importance, we added environmental information to several algorithms and evaluated them on a challenging set of StarCraft II micromanagement tasks. Compared with the original algorithms, ours achieved a smaller standard deviation (except for the VDN algorithm), indicating better stability, and a higher average score (except for VDN and COMA), showing that our work significantly outperforms existing multi-agent RL methods.
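The general idea of injecting environmental (global-state) information can be illustrated with a QMIX-style mixing network, in which the weights that combine per-agent Q-values are generated from the global state rather than from agents' local observations alone. Everything below (shapes, layer sizes, the toy inputs) is an assumption for illustration; the paper's own architecture may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StateConditionedMixer(nn.Module):
    """Mixes per-agent Q-values into a joint Q-value, with mixing weights
    generated from the global environment state (QMIX-style sketch)."""
    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        # Hypernetworks: the global state decides how agent Qs are combined.
        self.w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.b1 = nn.Linear(state_dim, embed_dim)
        self.w2 = nn.Linear(state_dim, embed_dim)
        self.b2 = nn.Linear(state_dim, 1)
        self.n_agents, self.embed_dim = n_agents, embed_dim

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents); state: (batch, state_dim)
        batch = agent_qs.size(0)
        # abs() keeps mixing weights non-negative (monotonic mixing).
        w1 = self.w1(state).abs().view(batch, self.n_agents, self.embed_dim)
        b1 = self.b1(state).view(batch, 1, self.embed_dim)
        hidden = F.elu(torch.bmm(agent_qs.unsqueeze(1), w1) + b1)
        w2 = self.w2(state).abs().view(batch, self.embed_dim, 1)
        b2 = self.b2(state).view(batch, 1, 1)
        return (torch.bmm(hidden, w2) + b2).squeeze(-1).squeeze(-1)  # (batch,)

# Toy usage: 3 agents, 10-dimensional global state.
mixer = StateConditionedMixer(n_agents=3, state_dim=10)
q_tot = mixer(torch.randn(4, 3), torch.randn(4, 10))
print(q_tot.shape)  # torch.Size([4])
```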


2020
Author(s): Milena Rmus, Samuel McDougle, Anne Collins

Reinforcement learning (RL) models have advanced our understanding of how animals learn and make decisions, and how the brain supports some aspects of learning. However, the neural computations that are explained by RL algorithms fall short of explaining many sophisticated aspects of human decision making, including the generalization of learned information, one-shot learning, and the synthesis of task information in complex environments. Instead, these aspects of instrumental behavior are assumed to be supported by the brain’s executive functions (EF). We review recent findings that highlight the importance of EF in learning. Specifically, we advance the theory that EF sets the stage for canonical RL computations in the brain, providing inputs that broaden their flexibility and applicability. Our theory has important implications for how to interpret RL computations in the brain and behavior.
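A minimal sketch of this division of labor, under our own assumptions: an EF-like module curates the state representation (here, attending to the task-relevant feature of a compound stimulus), and a canonical delta-rule learner then operates over that curated state, generalizing across the features EF filtered out. The task, stimuli, and parameters are hypothetical.

```python
import random
from collections import defaultdict

ALPHA, EPSILON = 0.3, 0.1
Q = defaultdict(float)

def ef_state(stimulus: dict, relevant: str) -> str:
    """EF as attention/working memory: keep only the task-relevant feature."""
    return stimulus[relevant]

stimuli = [{"color": c, "shape": s} for c in ("red", "blue")
                                    for s in ("circle", "square")]

for trial in range(500):
    stim = random.choice(stimuli)
    state = ef_state(stim, relevant="color")      # EF sets the stage for RL
    if random.random() < EPSILON:
        action = random.choice(("go", "nogo"))
    else:
        action = max(("go", "nogo"), key=lambda a: Q[(state, a)])
    # Ground truth (our toy rule): "go" pays off for red, "nogo" for blue.
    reward = 1.0 if (stim["color"] == "red") == (action == "go") else 0.0
    Q[(state, action)] += ALPHA * (reward - Q[(state, action)])

# EF collapsed four stimuli into two states, so learning generalizes
# across shapes instead of being relearned per color-shape pair.
print(sorted(Q.items()))
```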


2019
Author(s): Ji Won Bang, Dobromir Rahnev

Previously learned information is known to be reactivated during periods of quiet wakefulness, and such awake reactivation is considered a key mechanism for memory consolidation. We recently demonstrated that feature-specific awake reactivation occurs in early visual cortex immediately after extensive visual training on a novel task. To understand the exact role of awake reactivation, here we investigated whether such reactivation depends specifically on task novelty. Subjects completed a brief visual task that was either novel or extensively trained on previous days. Replicating our previous results, we found that awake reactivation occurs for the novel task even after a brief learning period. Surprisingly, however, brief exposure to the extensively trained task led to "awake suppression," such that neural activity immediately after the exposure diverged from the pattern for the trained task. Further, subjects who showed greater performance improvement showed stronger awake suppression. These results suggest that the brain engages in different post-task processing depending on prior visual training.
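A standard way to quantify such effects is multivoxel pattern similarity: correlate activity patterns recorded during post-task rest with a template pattern for the task, and ask whether similarity rises above baseline (reactivation) or falls below it (suppression). The sketch below uses simulated data and assumed array shapes; it is not the authors' analysis pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_voxels, n_rest_timepoints = 200, 50

# Template: mean multivoxel pattern evoked by the trained task (simulated).
task_template = rng.standard_normal(n_voxels)

# Simulated rest data that weakly *anti*-correlates with the template,
# mimicking the "awake suppression" pattern described in the abstract.
rest = (-0.3 * task_template[:, None]
        + rng.standard_normal((n_voxels, n_rest_timepoints)))

def pattern_similarity(template, data):
    """Pearson r between the template and each rest timepoint (column)."""
    t = (template - template.mean()) / template.std()
    d = (data - data.mean(axis=0)) / data.std(axis=0)
    return (t @ d) / len(template)

r = pattern_similarity(task_template, rest)
print(f"mean similarity = {r.mean():+.3f}")  # below zero => suppression-like
```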


2016, Vol 113 (24), pp. 6797-6802
Author(s): Samuel D. McDougle, Matthew J. Boggess, Matthew J. Crossley, Darius Parvin, Richard B. Ivry, ...

When a person fails to obtain an expected reward from an object in the environment, they face a credit assignment problem: Did the absence of reward reflect an extrinsic property of the environment or an intrinsic error in motor execution? To explore this problem, we modified a popular decision-making task used in studies of reinforcement learning, the two-armed bandit task. We compared a version in which choices were indicated by key presses, the standard response in such tasks, to a version in which the choices were indicated by reaching movements, which affords execution failures. In the key press condition, participants exhibited a strong risk aversion bias; strikingly, this bias reversed in the reaching condition. This result can be explained by a reinforcement model wherein movement errors influence decision-making, either by gating reward prediction errors or by modifying an implicit representation of motor competence. Two further experiments support the gating hypothesis. First, we used a condition in which we provided visual cues indicative of movement errors but informed the participants that trial outcomes were independent of their actual movements. The main result was replicated, indicating that the gating process is independent of participants’ explicit sense of control. Second, individuals with cerebellar degeneration failed to modulate their behavior between the key press and reach conditions, providing converging evidence of an implicit influence of movement error signals on reinforcement learning. These results provide a mechanistically tractable solution to the credit assignment problem.
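The gating account lends itself to a compact model: standard delta-rule value updating in a two-armed bandit, except that on execution-error trials the reward prediction error is attenuated before it updates the chosen option's value. The payoffs, error rates, and gating parameter below are illustrative assumptions, not the paper's fitted model.

```python
import random

ALPHA, GATE = 0.2, 0.1          # learning rate; fraction of an execution-
                                # error trial's RPE let through the gate
values = [0.5, 0.5]             # estimated value of each bandit arm

def trial(arm: int, p_reward: float, p_exec_error: float):
    """One reach toward an arm: the movement itself can miss."""
    exec_error = random.random() < p_exec_error
    reward = 0.0 if exec_error else float(random.random() < p_reward)
    return reward, exec_error

for t in range(1000):
    arm = 0 if values[0] >= values[1] else 1
    if random.random() < 0.1:                 # epsilon-greedy exploration
        arm = random.choice((0, 1))
    # Arm 0: safe key-press-like option; arm 1: risky reach that can miss.
    reward, exec_error = trial(arm, p_reward=(0.7, 0.9)[arm],
                               p_exec_error=(0.0, 0.3)[arm])
    rpe = reward - values[arm]
    # Gating: a missed movement says little about the arm's worth,
    # so its prediction error barely updates the value estimate.
    values[arm] += ALPHA * (GATE if exec_error else 1.0) * rpe

print([round(v, 2) for v in values])
```

Because missed reaches barely penalize the risky arm's value estimate, the model ends up favoring it, in line with the reversal of risk aversion reported for the reaching condition.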


2017, Vol 29 (11), pp. 2861-2886
Author(s): Alex T. Piet, Jeffrey C. Erlich, Charles D. Kopec, Carlos D. Brody

Two-node attractor networks are flexible models of neural activity during decision making. Depending on the network configuration, these networks can model distinct aspects of decisions, including evidence integration, evidence categorization, and decision memory. Here, we use attractor networks to model recent causal perturbations of the frontal orienting fields (FOF) in rat cortex during a perceptual decision-making task (Erlich, Brunton, Duan, Hanks, & Brody, 2015). We focus on a striking feature of the perturbation results: pharmacological silencing of the FOF resulted in a stimulus-independent bias. We fit several models to test whether integration, categorization, or decision memory could account for this bias and found that only the memory configuration does so. This memory model naturally accounts for optogenetic perturbations of the FOF in the same task and correctly predicts a memory-duration-dependent deficit caused by silencing the FOF in a different task. Our results provide mechanistic support for a "postcategorization" memory role of the FOF in upcoming choices.
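The model class itself is compact enough to write down directly: two units with self-excitation and mutual inhibition, integrated with Euler steps. In the memory configuration, a transient input pushes the network into one of two stable states that persists after the input is removed. The parameters below are our own choices tuned for bistability, not the paper's fitted values.

```python
import numpy as np

# Two-node attractor network: self-excitation + mutual inhibition.
TAU, W_SELF, W_INHIB, DT = 0.1, 6.0, 6.0, 0.001

def f(x):                       # sigmoidal firing-rate nonlinearity
    return 1.0 / (1.0 + np.exp(-(x - 3.0)))

r = np.array([0.1, 0.1])        # firing rates of the two nodes
for step in range(5000):
    t = step * DT
    # Brief input favoring node 0; after it ends, the network must
    # *remember* the choice (the "decision memory" configuration).
    inp = np.array([1.5, 0.0]) if t < 0.5 else np.zeros(2)
    drive = W_SELF * f(r) - W_INHIB * f(r[::-1]) + inp
    r = r + DT / TAU * (-r + drive)

print(np.round(r, 2))  # node 0 stays high after the input is gone
```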


2017, Vol 29 (2), pp. 368-393
Author(s): Nils Kurzawa, Christopher Summerfield, Rafal Bogacz

Much experimental evidence suggests that during decision making, neural circuits accumulate evidence supporting alternative options. A computational model that describes this accumulation well assumes that, for choices between two options, the brain integrates the log ratios of the likelihoods of the sensory inputs given each option. Several models have been proposed for how neural circuits can learn these log-likelihood ratios from experience, but all of them introduced novel, specially dedicated synaptic plasticity rules. Here we show that for a wide class of tasks, the log-likelihood ratios are approximately linearly proportional to the expected rewards for selecting actions. Therefore, a simple model based on standard reinforcement learning rules is able to estimate the log-likelihood ratios from experience and, on each trial, accumulate the log-likelihood ratios associated with the presented stimuli while selecting an action. Simulations of the model replicate experimental data on both behavior and neural activity in tasks requiring the accumulation of probabilistic cues. Our results suggest that there is no need for the brain to support dedicated plasticity rules, as the standard mechanisms proposed to describe reinforcement learning can enable neural circuits to perform efficient probabilistic inference.
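The core claim is easy to check numerically: in a probabilistic cue task, train action values with a standard delta rule and compare them with the true log-likelihood ratios of the cues. Near p = 0.5, log(p/(1-p)) ≈ 4(p - 0.5), so the expected reward 2p - 1 is approximately proportional to the LLR. The task statistics and learning rate below are illustrative assumptions.

```python
import math
import random

# Probabilistic cue task: each cue i predicts outcome A with probability p_i.
p_cue = [0.9, 0.7, 0.6, 0.4]
ALPHA = 0.02
# v[i] = learned value of choosing "A" after seeing cue i (delta rule).
v = [0.0] * len(p_cue)

for trial in range(20000):
    i = random.randrange(len(p_cue))
    outcome_A = random.random() < p_cue[i]
    reward = 1.0 if outcome_A else -1.0        # +1 if "A" was correct
    v[i] += ALPHA * (reward - v[i])            # standard RL update

# On a multi-cue trial, the decision variable would be the sum of the
# learned values of the presented cues, mirroring LLR accumulation.
for i, p in enumerate(p_cue):
    llr = math.log(p / (1 - p))                # true log-likelihood ratio
    print(f"cue {i}: learned value {v[i]:+.2f}  vs  LLR {llr:+.2f}")
```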

