Atypical Reinforcement Learning in Developmental Dyslexia

Author(s):  
Atheer Odah Massarwe ◽  
Noyli Nissan ◽  
Yafit Gabay

Abstract Objectives: According to the Procedural Deficit Hypothesis, abnormalities in corticostriatal pathways could account for the language-related deficits observed in developmental dyslexia. The same neural network has also been implicated in the ability to learn contingencies based on trial and error (i.e., reinforcement learning [RL]). On this basis, the present study tested the assumption that dyslexic individuals would be impaired in RL compared with neurotypicals in two different tasks. Methods: In a probabilistic selection task, participants were required to learn reinforcement contingencies based on probabilistic feedback. In an implicit transitive inference task, participants were also required to base their decisions on reinforcement histories, but feedback was deterministic and stimulus pairs were partially overlapping, such that participants were required to learn hierarchical relations. Results: Across tasks, results revealed that although the ability to learn from positive/negative feedback did not differ between the two groups, the learning of reinforcement contingencies was poorer in the dyslexia group compared with the neurotypicals group. Furthermore, in novel test pairs where previously learned information was presented in new combinations, dyslexic individuals performed similarly to neurotypicals. Conclusions: Taken together, these results suggest that learning of reinforcement contingencies occurs less robustly in individuals with developmental dyslexia. Inferences for the neuro-cognitive mechanisms of developmental dyslexia are discussed.

2019 ◽  
Vol 1 (2) ◽  
pp. 74-84
Author(s):  
Evan Kusuma Susanto ◽  
Yosi Kristian

Asynchronous Advantage Actor-Critic (A3C) is a deep reinforcement learning algorithm developed by Google DeepMind. It can be used to build an artificial intelligence architecture that masters many different kinds of games through trial and error, learning from the game's screen display and the score obtained from its actions, without human intervention. An A3C network consists of a Convolutional Neural Network (CNN) at the front, a Long Short-Term Memory network (LSTM) in the middle, and an Actor-Critic network at the back. The CNN summarizes the screen output image by extracting the important features present on the screen. The LSTM serves to remember previous game states. The Actor-Critic network determines the best action to take when faced with a given situation. In the experiments conducted, the method proved fairly effective and was able to beat novice players in the 5 games used for testing.


2007 ◽  
Vol 362 (1479) ◽  
pp. 383-401 ◽  
Author(s):  
Francesco Mannella ◽  
Gianluca Baldassarre

Previous experiments have shown that when domestic chicks (Gallus gallus) are first trained to locate food elements hidden at the centre of a closed square arena and then are tested in a square arena of double the size, they search for food both at its centre and at a distance from walls similar to the distance of the centre from the walls experienced during training. This paper presents a computational model that successfully reproduces these behaviours. The model is based on a neural-network implementation of the reinforcement-learning actor–critic architecture (in this architecture the ‘critic’ learns to evaluate perceived states in terms of predicted future rewards, while the ‘actor’ learns to increase the probability of selecting the actions that lead to higher evaluations). The analysis of the model suggests which type of information and cognitive mechanisms might underlie chicks' behaviours: (i) the tendency to explore the area at a specific distance from walls might be based on the processing of the height of walls' horizontal edges, (ii) the capacity to generalize the search at the centre of square arenas independently of their size might be based on the processing of the relative position of walls' vertical edges on the horizontal plane (equalization of walls' width), and (iii) the whole behaviour exhibited in the large square arena can be reproduced by assuming the existence of an attention process that, at any given time, focuses chicks' internal processing on one of the two previously discussed information sources. The model also produces testable predictions regarding the generalization capabilities that real chicks should exhibit if trained in circular arenas of varying size. The paper also highlights the model's potential to address other experiments on animal navigation and analyses its strengths and weaknesses in comparison to other models.
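
The actor–critic division of labour described in this abstract can be illustrated with a minimal tabular sketch. This is not the paper's neural-network model of chick navigation: the five-state corridor, the reward at the right end, the learning rates, and the name `train_actor_critic` are all assumptions chosen for illustration. The critic keeps a value estimate per state, and the actor nudges its action preferences in the direction of the critic's temporal-difference error.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(n_states=5, episodes=2000, gamma=0.9,
                       alpha_critic=0.1, alpha_actor=0.1, seed=0):
    """Tabular actor-critic on a hypothetical corridor (illustrative only).

    States 0 and n_states-1 are terminal; reaching the right end pays 1.
    """
    rng = random.Random(seed)
    values = [0.0] * n_states                      # critic: V(s)
    prefs = [[0.0, 0.0] for _ in range(n_states)]  # actor: [left, right]
    for _ in range(episodes):
        state = n_states // 2
        for _ in range(100):                       # cap on episode length
            probs = softmax(prefs[state])
            action = 0 if rng.random() < probs[0] else 1
            next_state = state - 1 if action == 0 else state + 1
            done = next_state in (0, n_states - 1)
            reward = 1.0 if next_state == n_states - 1 else 0.0
            # Critic: TD error between predicted and observed return.
            target = reward if done else reward + gamma * values[next_state]
            td_error = target - values[state]
            values[state] += alpha_critic * td_error
            # Actor: raise the probability of actions with positive TD error.
            for a in range(2):
                grad = (1.0 if a == action else 0.0) - probs[a]
                prefs[state][a] += alpha_actor * td_error * grad
            if done:
                break
            state = next_state
    p_right = [softmax(p)[1] for p in prefs]
    return values, p_right


if __name__ == "__main__":
    values, p_right = train_actor_critic()
    print("state values:", [round(v, 2) for v in values])
    print("P(right):", [round(p, 2) for p in p_right])
```

In this toy setting the critic's values should grow towards the rewarded end, and the actor should come to prefer moving right from the start state.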


2021 ◽  
Vol 2 (1) ◽  
pp. 1-25
Author(s):  
Yongsen Ma ◽  
Sheheryar Arshad ◽  
Swetha Muniraju ◽  
Eric Torkildson ◽  
Enrico Rantala ◽  
...  

In recent years, Channel State Information (CSI) measured by WiFi has been widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of CSI data. The state machine learns temporal dependency information from past classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations/orientations, and multiple persons. The proposed design has 97% average accuracy when testing devices and persons are not seen during training. The proposed design is also evaluated on two public datasets, with accuracies of 80% and 83%. The proposed design needs very little human effort for ground truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.


2021 ◽  
Vol 11 (7) ◽  
pp. 3257
Author(s):  
Chen-Huan Pi ◽  
Wei-Yuan Ye ◽  
Stone Cheng

In this paper, a novel control strategy is presented for reinforcement learning with disturbance compensation to solve the problem of quadrotor positioning under external disturbance. The proposed control scheme applies a trained neural-network-based reinforcement learning agent to control the quadrotor, and its output is directly mapped to four actuators in an end-to-end manner. The proposed control scheme constructs a disturbance observer to estimate the external forces exerted on the three axes of the quadrotor, such as wind gusts in an outdoor environment. By introducing an interference compensator into the neural network control agent, the tracking accuracy and robustness were significantly increased in indoor and outdoor experiments. The experimental results indicate that the proposed control strategy is highly robust to external disturbances. In the experiments, compensation improved control accuracy and reduced positioning error by 75%. To the best of our knowledge, this study is the first to achieve quadrotor positioning control through low-level reinforcement learning by using a global positioning system in an outdoor environment.
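
The idea of pairing a controller with a disturbance observer can be sketched in one dimension. The following is a simplified illustration under stated assumptions, not the authors' method: a PD law stands in for the trained reinforcement learning agent, the quadrotor is reduced to a point mass on one axis, the disturbance is a constant force, and all names and gains (`simulate`, `kp`, `kd`, `beta`) are hypothetical. The observer low-pass filters the gap between the measured force and the commanded force, and the compensator subtracts that estimate from the command.

```python
def simulate(compensate, steps=3000, dt=0.01, mass=1.0, kp=4.0, kd=4.0,
             disturbance=1.0, beta=0.1, target=1.0):
    """1-D point mass under a constant external force (illustrative only).

    Returns the final absolute position error and the disturbance estimate.
    """
    x, v, d_hat = 0.0, 0.0, 0.0
    for _ in range(steps):
        # Stand-in for the learned policy: a plain PD position controller.
        u_policy = kp * (target - x) - kd * v
        # Compensator: cancel the estimated external force.
        u = u_policy - d_hat if compensate else u_policy
        # Plant: the true external force acts alongside the command.
        accel = (u + disturbance) / mass
        # Observer: force not explained by the command, low-pass filtered.
        residual = mass * accel - u
        d_hat += beta * (residual - d_hat)
        # Semi-implicit Euler integration.
        v += accel * dt
        x += v * dt
    return abs(target - x), d_hat


if __name__ == "__main__":
    err_off, _ = simulate(compensate=False)
    err_on, d_hat = simulate(compensate=True)
    print(f"error without compensation: {err_off:.4f}")
    print(f"error with compensation:    {err_on:.4f} (d_hat = {d_hat:.3f})")
```

Without compensation the PD stand-in settles with an offset of `disturbance / kp`; with the observer feeding the compensator, the offset vanishes in this idealized, noise-free setting. The 75% error reduction reported in the abstract comes from the authors' own experiments and is not reproduced by this toy model.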


2021 ◽  
Vol 36 ◽  
Author(s):  
Sergio Valcarcel Macua ◽  
Ian Davies ◽  
Aleksi Tukiainen ◽  
Enrique Munoz de Cote

Abstract We propose a fully distributed actor-critic architecture, named diffusion-distributed-actor-critic (Diff-DAC), with application to multitask reinforcement learning (MRL). During the learning process, agents communicate their value and policy parameters to their neighbours, diffusing the information across a network of agents with no need for a central station. Each agent can only access data from its local task, but aims to learn a common policy that performs well for the whole set of tasks. The architecture is scalable, since the computational and communication cost per agent depends on the number of neighbours rather than the overall number of agents. We derive Diff-DAC from duality theory and provide novel insights into the actor-critic framework, showing that it is actually an instance of the dual-ascent method. We prove almost sure convergence of Diff-DAC to a common policy under general assumptions that hold even for deep neural network approximations. For more restrictive assumptions, we also prove that this common policy is a stationary point of an approximation of the original problem. Numerical results on multitask extensions of common continuous control benchmarks demonstrate that Diff-DAC stabilises learning and has a regularising effect that induces higher performance and better generalisation properties than previous architectures.
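
The diffusion of parameters across neighbours, with no central station, can be illustrated with a generic adapt-then-combine scheme. This is not Diff-DAC itself: each agent here holds a single scalar parameter and a toy quadratic loss in place of value and policy networks, and the ring topology, uniform combination weights, step size, and the name `diffuse` are assumptions made for the sketch. Each agent takes a gradient step on its own local task, then averages the result with its two ring neighbours.

```python
def diffuse(targets, step_size=0.05, iterations=500):
    """Adapt-then-combine diffusion on a ring of agents (illustrative only).

    Agent i minimises 0.5 * (w - targets[i])**2 using local data only,
    and exchanges its parameter with its left and right neighbours.
    """
    n = len(targets)
    params = [0.0] * n
    for _ in range(iterations):
        # Adapt: local gradient step on each agent's own task.
        adapted = [w - step_size * (w - t) for w, t in zip(params, targets)]
        # Combine: average with the two ring neighbours (weights 1/3 each).
        params = [
            (adapted[(i - 1) % n] + adapted[i] + adapted[(i + 1) % n]) / 3.0
            for i in range(n)
        ]
    return params


if __name__ == "__main__":
    final = diffuse([1.0, 2.0, 3.0, 4.0])
    print([round(w, 3) for w in final])
```

Although no agent ever sees another agent's target, all parameters end up close to the minimiser of the summed losses (the mean of the targets), and the per-agent cost depends only on the number of neighbours, which mirrors the scalability argument in the abstract.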

