Reinforcement Learning
Recently Published Documents


TOTAL DOCUMENTS

2
(FIVE YEARS 1)

H-INDEX

1
(FIVE YEARS 0)

2021 ◽  
pp. 016-025
Author(s):  
A.Y. Doroshenko ◽  
I.Z. Achour ◽  

Reinforcement learning is a field of machine learning concerned with how software agents should take actions in an environment so as to maximize a cumulative reward. This paper proposes a new application of reinforcement learning techniques, in the form of neuroevolution of augmenting topologies (NEAT), to control-automation problems, illustrated on modeled control problems of technical systems. Key application components include the OpenAI Gym toolkit for developing and comparing reinforcement learning algorithms, SharpNEAT, a full-fledged open-source implementation of the NEAT genetic algorithm, and intermediate software that orchestrates these components. The neuroevolution-of-augmenting-topologies algorithm is shown to find efficient neural networks on a simple standard continuous-control problem from OpenAI Gym.
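As a rough illustration of how the components described above fit together, the following Python sketch evaluates NEAT genomes on a Gym continuous-control episode, using the neat-python package as a stand-in for SharpNEAT (the paper's own orchestration software is not shown); the environment name, episode length, action rescaling, and config file name are assumptions, not details taken from the paper.

# Minimal sketch: neuroevolution of augmenting topologies (NEAT) driving an
# OpenAI Gym continuous-control task. neat-python stands in for the SharpNEAT
# implementation the paper uses; "Pendulum-v1" and "neat_config.ini" are
# assumptions for illustration only.
import gym
import neat

ENV_NAME = "Pendulum-v1"          # assumed simple continuous-control task
EPISODE_STEPS = 200               # assumed episode length

def eval_genomes(genomes, config):
    env = gym.make(ENV_NAME)
    for genome_id, genome in genomes:
        net = neat.nn.FeedForwardNetwork.create(genome, config)
        obs = env.reset()                       # classic (pre-0.26) gym API
        total_reward = 0.0
        for _ in range(EPISODE_STEPS):
            out = net.activate(obs)             # network output in (0, 1)
            action = [4.0 * (out[0] - 0.5)]     # rescale to Pendulum's [-2, 2]
            obs, reward, done, _ = env.step(action)
            total_reward += reward
            if done:
                break
        genome.fitness = total_reward           # fitness = cumulative reward
    env.close()

if __name__ == "__main__":
    # "neat_config.ini" is a hypothetical NEAT configuration file
    # (population size, mutation rates, 3 inputs / 1 output for Pendulum).
    config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                         neat.DefaultSpeciesSet, neat.DefaultStagnation,
                         "neat_config.ini")
    population = neat.Population(config)
    population.add_reporter(neat.StdOutReporter(True))
    winner = population.run(eval_genomes, 50)   # evolve for 50 generations

Each genome's fitness is simply its cumulative episode reward, so the genetic algorithm searches for network topologies and weights that maximize return, which is the essence of the NEAT-based setup the abstract describes.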


2017 ◽  
Author(s):  
Matthias Guggenmos ◽  
Philipp Sterzer

Abstract: It is well established that learning can occur without external feedback, yet normative reinforcement learning theories have difficulty explaining such instances of learning. Recently, we reported on a confidence-based reinforcement learning model for the model case of perceptual learning (Guggenmos, Wilbertz, Hebart, & Sterzer, 2016), according to which the brain capitalizes on internal monitoring processes when no external feedback is available. In the model, internal confidence prediction errors – the difference between current confidence and expected confidence – serve as teaching signals to guide learning. In the present paper, we explore an extension of this idea. The main idea is that the neural information-processing pathways activated for a given sensory stimulus are subject to fluctuations, where some pathway configurations lead to higher confidence than others. Confidence prediction errors strengthen pathway configurations whose fluctuations lead to above-average confidence and weaken those associated with below-average confidence. We show through simulation that the model is capable of self-reinforced perceptual learning and can benefit from exploratory network fluctuations. In addition, by simulating different model parameters, we show that the ideal confidence-based learner should (i) exhibit high degrees of network fluctuation in the initial phase of learning, but reduced fluctuations as learning progresses, (ii) have a high learning rate for network updates for immediate performance gains, but a low learning rate for long-term maximum performance, and (iii) be neither too under- nor too overconfident. In sum, we present a model in which confidence prediction errors strengthen favorable network fluctuations and enable learning in the absence of external feedback. The model can be readily applied to data from real-world perceptual tasks in which observers provide both choice and confidence reports.
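To make the proposed learning rule concrete, the following Python sketch simulates the core mechanism, a confidence prediction error reinforcing whichever fluctuating pathway configuration produced above-average confidence; it is an illustrative toy, not the authors' published model, and the confidence parameterization, learning rates, and noise level are assumptions.

# Toy simulation of confidence-based learning without external feedback:
# pathway weights fluctuate from trial to trial, confidence tracks how well
# the sampled configuration matches the stimulus-driven pattern, and a
# confidence prediction error (confidence minus expected confidence)
# reinforces the sampled configuration. All parameter values are assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_pathways = 8
stimulus = rng.normal(size=n_pathways)      # fixed "ideal" pathway pattern
weights = np.zeros(n_pathways)              # learned pathway weights
expected_conf = 0.0                         # running estimate of confidence

lr_weights = 0.5        # learning rate for network (weight) updates
lr_conf = 0.1           # learning rate for the confidence expectation
noise_sd = 0.5          # size of trial-to-trial network fluctuations

for trial in range(500):
    # 1. Sample a fluctuating pathway configuration around current weights.
    fluctuation = rng.normal(scale=noise_sd, size=n_pathways)
    config = weights + fluctuation

    # 2. Confidence: higher when the sampled configuration aligns with the
    #    stimulus-driven pattern (squashed to (0, 1) for interpretability).
    confidence = 1.0 / (1.0 + np.exp(-config @ stimulus))

    # 3. Confidence prediction error = current minus expected confidence.
    conf_pe = confidence - expected_conf

    # 4. Strengthen the fluctuation that produced above-average confidence,
    #    weaken it when confidence fell below expectation.
    weights += lr_weights * conf_pe * fluctuation
    expected_conf += lr_conf * conf_pe

print("final expected confidence:", round(expected_conf, 3))

Under these assumed settings, the weights drift toward configurations that raise confidence even though no external feedback is ever given, mirroring the self-reinforced learning described above; annealing noise_sd and lr_weights over trials would correspond to recommendations (i) and (ii) in the abstract.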

