AUTONOMOUS LEARNING OF THE SEMANTICS OF INTERNAL SENSORY STATES BASED ON MOTOR EXPLORATION

2007 ◽  
Vol 04 (02) ◽  
pp. 211-243 ◽  
Author(s):  
YOONSUCK CHOE ◽  
HUEI-FANG YANG ◽  
DANIEL CHERN-YEOW ENG

What is available to developmental programs in autonomous mental development, and what should be learned at the very early stages of mental development? Our observation is that sensory and motor primitives are the most basic components present at the beginning, and what developmental agents need to learn from these resources is what their internal sensory states stand for. In this paper, we investigate this question in the context of a simple, biologically motivated visuomotor agent. We observe and acknowledge, as many other researchers do, that action plays a key role in providing content to the sensory state. We propose a simple yet powerful learning criterion, that of invariance, where invariance simply means that the internal state does not change over time. We show that after reinforcement learning based on the invariance criterion, the property of the action sequence generated from an internal sensory state accurately reflects the property of the stimulus that triggered that internal state. In this way, the meaning of the internal sensory state can be firmly grounded in the property of that particular action sequence. We expect the framing of the problem and the proposed solution presented in this paper to shed new light on autonomous understanding in developmental agents such as humanoid robots.
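The invariance criterion lends itself to a compact illustration. Below is a minimal sketch, assuming a toy one-dimensional world and tabular Q-learning (none of which come from the paper itself): the agent is rewarded exactly when its internal sensory state is unchanged after a movement, so the learned action sequences come to reflect the stimulus property that triggered the state.

```python
# A minimal sketch (not the authors' code) of the invariance criterion:
# the agent is rewarded when its internal sensory state stays constant
# across a movement. All names and sizes here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 8, 4           # discretized internal states / motor primitives
Q = np.zeros((n_states, n_actions))  # tabular action values
alpha, gamma, eps = 0.1, 0.9, 0.1

def sense(world, pos):
    """Internal sensory state: the feature under the agent's sensor."""
    return world[pos % len(world)]

world = rng.integers(0, n_states, size=32)  # toy 1-D visual environment
pos = 0
s = sense(world, pos)

for step in range(5000):
    a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
    pos += a - n_actions // 2           # each action moves the sensor
    s_next = sense(world, pos)
    r = 1.0 if s_next == s else 0.0     # invariance criterion: reward an unchanged state
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next
```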

Author(s):  
Heecheol Kim ◽  
Masanori Yamada ◽  
Kosuke Miyoshi ◽  
Tomoharu Iwata ◽  
Hiroshi Yamakawa

2020 ◽  
Author(s):  
Than Le

In this chapter, we argue that competent autonomous vehicles should be able to analyze both structured and unstructured environments and to localize themselves relative to surrounding objects where GPS, RFID, or other similar means cannot give enough information about the location. Reliable SLAM is the most basic prerequisite for any further artificial intelligence tasks of an autonomous mobile robot. The goal of this paper is to simulate a SLAM process on an advanced software development platform. The model represents the system itself, whereas the simulation represents the operation of the system over time, and the software architecture helps us focus our work with the least trivial effort. The platform is an open-source meta-operating system that provides a wealth of tools for robotics-related problems.

Specifically, we address how advanced vehicles can analyze structured and unstructured environments by solving search-based planning, and then we discuss a reinforcement learning-based model for trajectory optimization applied to autonomous systems.
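Search-based planning, one of the two components mentioned, is commonly realized with A* over an occupancy grid. The following is a minimal sketch under that assumption; it is illustrative, not the chapter's actual implementation.

```python
# A minimal sketch of search-based planning: A* on a 2-D occupancy grid
# with a Manhattan-distance heuristic. Grid layout and coordinates are
# illustrative assumptions.
import heapq

def astar(grid, start, goal):
    """Return a shortest 4-connected path from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # admissible heuristic
    frontier = [(h(start), start)]
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, node = heapq.heappop(frontier)
        if node == goal:                        # reconstruct the path
            path = [node]
            while came_from[node] is not None:
                node = came_from[node]
                path.append(node)
            return path[::-1]
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = cost[node] + 1
                if ng < cost.get(nxt, float("inf")):
                    cost[nxt] = ng
                    came_from[nxt] = node
                    heapq.heappush(frontier, (ng + h(nxt), nxt))
    return None

grid = [[0, 0, 0], [1, 1, 0], [0, 0, 0]]        # 1 = obstacle
print(astar(grid, (0, 0), (2, 0)))              # routes around the wall
```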


2021 ◽  
Vol 33 (1) ◽  
pp. 129-156
Author(s):  
Masami Iwamoto ◽  
Daichi Kato

This letter proposes a new idea to improve learning efficiency in reinforcement learning (RL) with the actor-critic method used as a muscle controller for posture stabilization of the human arm. Actor-critic RL (ACRL) is used in simulations to realize posture control in humans or robots via muscle tension control. However, it incurs very high computational costs to acquire a good muscle control policy for desirable postures. For efficient ACRL, we focused on embodiment, which is thought to enable efficient control in artificial intelligence and robotics research. According to the neurophysiology of motion control obtained from experimental studies in animals and humans, the pedunculopontine tegmental nucleus (PPTn) induces muscle tone suppression, and the midbrain locomotor region (MLR) induces muscle tone promotion. The PPTn and MLR modulate the activation levels of mutually antagonizing muscles, such as flexors and extensors, in a process through which control signals are relayed from the substantia nigra reticulata to the brain stem. We therefore hypothesized that the PPTn and MLR control muscle tone, that is, the maximum activation levels of mutually antagonizing muscles, using a different sigmoidal function for each muscle. Based on this hypothesis, we introduced antagonism function models (AFMs) of the PPTn and MLR for individual muscles, incorporating them into the process that determines the activation level of each muscle from the output of the actor in ACRL. ACRL with AFMs, which represent the embodiment of muscle tone, successfully achieved posture stabilization in five joint motions of the right arm of a human adult male under gravity at predetermined target angles in an earlier period of learning than the methods without AFMs. These results suggest that introducing the embodiment of muscle tone can enhance learning efficiency in posture stabilization for humans or humanoid robots.
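To make the AFM idea concrete, here is a minimal sketch in which per-muscle sigmoids cap the activations of a flexor/extensor pair produced by the actor. The gains, shifts, and scalar tone signal are illustrative assumptions, not the paper's parameters.

```python
# A minimal sketch of the antagonism function model (AFM) idea: sigmoidal
# functions set the maximum activation (muscle tone) of a flexor/extensor
# pair, scaling the actor's raw outputs. All parameter values are
# illustrative assumptions.
import numpy as np

def sigmoid(x, gain, shift):
    return 1.0 / (1.0 + np.exp(-gain * (x - shift)))

def afm_activations(actor_out, tone_signal):
    """actor_out: raw [flexor, extensor] commands in [0, 1];
    tone_signal: scalar standing in for PPTn/MLR drive
    (negative = suppression, positive = promotion)."""
    # A different sigmoid per muscle: promotion raises one muscle's ceiling
    # while suppression lowers its antagonist's, and vice versa.
    flexor_max = sigmoid(tone_signal, gain=4.0, shift=0.5)
    extensor_max = sigmoid(-tone_signal, gain=4.0, shift=-0.5)
    caps = np.array([flexor_max, extensor_max])
    return np.clip(actor_out, 0.0, 1.0) * caps   # tone caps each activation level

print(afm_activations(np.array([0.9, 0.9]), tone_signal=0.8))
```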


Author(s):  
Dongshuo Wang ◽  
Bin Zou ◽  
Minjie Xing

Language learners at all levels need a way of recording and organising newly learned vocabulary for consolidation and for future reference. Listing words alphabetically in a vocabulary notebook has been a traditional way of organising this information. However, paper-based notes are limited in terms of space (learners often run out of space for certain categories, while for others the space goes unused) and time (handwritten pages deteriorate over time and cannot easily be updated). Organising vocabulary in more meaningful categories might make it easier to learn. Textbooks, for example, often introduce new vocabulary thematically. Words can also be organised according to their grammatical class or characteristics, their real-world category (e.g. modes of transport, means of communication), their phonological pattern, their etymological elements, or according to when/where they were learnt. This research investigates how a mobile lexical spreadsheet can be used for the consolidation of and reference to new vocabulary. By offering the learner multiple simultaneous ways of organising vocabulary, combining all of the approaches mentioned above, the resource can easily be modified and updated. Importantly, in keeping with autonomous learning theory, the spreadsheet is designed to encourage learners to take more responsibility for their own vocabulary learning and to approach the process more systematically. The resource can be used from any smartphone, tablet or iPad.
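As a rough illustration of the multi-way organisation the spreadsheet affords, the sketch below stores each word with several category fields at once, so the same list can be re-sorted or filtered by theme, word class, phonology, or when/where it was learnt. The field names and entries are invented for illustration and are not from the study.

```python
# A purely illustrative sketch of a lexical spreadsheet: each word carries
# several category fields simultaneously, and the same data supports
# multiple views.
entries = [
    {"word": "ferry",   "theme": "transport",     "pos": "noun",
     "stress": "initial", "learnt": "unit 4"},
    {"word": "email",   "theme": "communication", "pos": "noun/verb",
     "stress": "initial", "learnt": "unit 2"},
    {"word": "commute", "theme": "transport",     "pos": "verb",
     "stress": "final",   "learnt": "unit 4"},
]

# The same data, viewed two different ways:
by_theme = [e["word"] for e in entries if e["theme"] == "transport"]
by_pos = sorted(entries, key=lambda e: e["pos"])
print(by_theme)                          # ['ferry', 'commute']
print([e["word"] for e in by_pos])       # sorted by grammatical class
```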


2010 ◽  
Vol 16 (1) ◽  
pp. 21-37 ◽  
Author(s):  
Chris Marriott ◽  
James Parker ◽  
Jörg Denzinger

We study the effects of an imitation mechanism on a population of animats capable of individual ontogenetic learning. An urge to imitate others augments a network-based reinforcement learning strategy used in the control system of the animats. We test populations of animats with imitation against populations without for their ability to find, and maintain over generations, successful foraging behavior in an environment containing three necessary resources: food, water, and shelter. We conclude that even simple imitation mechanisms are effective at increasing the frequency of success when measured over time and over populations of animats.
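One way to picture the imitation mechanism is as a policy that occasionally copies an observed neighbour's action instead of acting on its own learned values. The sketch below is a loose, assumption-laden analogue of this setup, not the paper's network-based controller.

```python
# A minimal sketch of individual reinforcement learning augmented with an
# "urge to imitate": with some probability the animat copies a nearby
# conspecific's action. Mechanism and parameters are illustrative
# assumptions, not the paper's model.
import random

class Animat:
    def __init__(self, n_actions=4, imitate_p=0.2):
        self.n_actions = n_actions
        self.imitate_p = imitate_p
        self.q = [0.0] * n_actions            # stand-in for the network policy

    def act(self, neighbor_action=None):
        # Urge to imitate: sometimes copy an observed neighbour's action.
        if neighbor_action is not None and random.random() < self.imitate_p:
            return neighbor_action
        if random.random() < 0.1:             # epsilon-greedy exploration
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=self.q.__getitem__)

    def learn(self, action, reward, alpha=0.1):
        # Simple value update from individual (ontogenetic) experience.
        self.q[action] += alpha * (reward - self.q[action])
```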


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
A. Hamann ◽  
V. Dunjko ◽  
S. Wölk

In recent years, quantum-enhanced machine learning has emerged as a particularly fruitful application of quantum algorithms, covering aspects of supervised, unsupervised and reinforcement learning. Reinforcement learning offers numerous options for how quantum theory can be applied and is arguably the least explored from a quantum perspective. Here, an agent explores an environment and tries to find a behavior optimizing some figure of merit. Some of the first approaches investigated settings where this exploration can be sped up by considering quantum analogs of classical environments, which can then be queried in superposition. If the environments have a strict periodic structure in time (i.e. are strictly episodic), such environments can be effectively converted to conventional oracles encountered in quantum information. In general environments, however, we obtain scenarios that generalize standard oracle tasks. In this work, we consider one such generalization, where the environment is not strictly episodic; this is mapped to an oracle identification setting with a changing oracle. We analyze this case and show that standard amplitude-amplification techniques can, with minor modifications, still be applied to achieve quadratic speed-ups. In addition, we prove that an algorithm based on Grover iterations is optimal for oracle identification even if the oracle changes over time in such a way that the "rewarded space" is monotonically increasing. This result constitutes one of the first generalizations of quantum-accessible reinforcement learning.
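For readers unfamiliar with amplitude amplification, the following numpy sketch simulates Grover iterations against a fixed marked ("rewarded") set on a classical state vector. The marked states and system size are arbitrary, and unlike the setting analyzed in the paper, the oracle here does not change between queries.

```python
# A minimal classical simulation of amplitude amplification (Grover
# iterations), illustrating the quadratic speed-up the abstract builds on.
# The marked set is an illustrative stand-in for the "rewarded space".
import numpy as np

n_qubits = 6
N = 2 ** n_qubits
marked = {5, 17}                       # "rewarded" basis states (illustrative)

state = np.ones(N) / np.sqrt(N)        # uniform superposition
oracle = np.ones(N)
oracle[list(marked)] = -1.0            # oracle phase-flips marked states

# Near-optimal iteration count: ~ (pi/4) * sqrt(N / |marked|)
k = int(round(np.pi / 4 * np.sqrt(N / len(marked))))
for _ in range(k):
    state *= oracle                    # one oracle call
    mean = state.mean()
    state = 2 * mean - state           # inversion about the mean (diffusion)

p_success = sum(state[i] ** 2 for i in marked)
print(f"{k} Grover iterations, success probability = {p_success:.3f}")
```

With 64 basis states and 2 marked states, 4 iterations suffice to push the success probability near 1, versus roughly 32 expected classical queries.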


2021 ◽  
Author(s):  
Annik Yalnizyan-Carson ◽  
Blake A Richards

Forgetting is a normal process in healthy brains, and evidence suggests that the mammalian brain forgets more than is required by the limitations of mnemonic capacity. Episodic memories, in particular, are liable to be forgotten over time. Researchers have hypothesized that forgetting episodic memories over time may be beneficial for decision making. Reinforcement learning offers a normative framework in which to test such hypotheses. Here, we show that a reinforcement learning agent that uses an episodic memory cache to find rewards in maze environments can forget a large percentage of older memories without any performance impairment, if it utilizes mnemonic representations that contain structural information about space. Moreover, we show that some forgetting can actually provide a performance benefit compared to agents with unbounded memories. Our analyses of the agents show that forgetting reduces the influence of outdated information, and of states that are not frequently visited, on the policies produced by the episodic control system. These results support the hypothesis that some degree of forgetting can be beneficial for decision making, which can help to explain why the brain forgets more than is required by capacity limitations.
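A toy version of an episodic control cache with forgetting can be sketched as a bounded store that evicts its oldest entries once capacity is reached. The capacity and recency-based eviction rule below are illustrative assumptions rather than the paper's exact mechanism.

```python
# A minimal sketch of episodic control with forgetting: a bounded cache maps
# states to the best return seen so far, and the oldest entries are evicted
# when capacity is exceeded.
from collections import OrderedDict

class EpisodicCache:
    def __init__(self, capacity=100):
        self.capacity = capacity
        self.store = OrderedDict()             # state -> best observed return

    def write(self, state, value):
        if state in self.store:
            self.store[state] = max(self.store[state], value)
            self.store.move_to_end(state)      # refresh recency
        else:
            if len(self.store) >= self.capacity:
                self.store.popitem(last=False) # forget the oldest memory
            self.store[state] = value

    def read(self, state, default=0.0):
        return self.store.get(state, default)

cache = EpisodicCache(capacity=2)
cache.write("s1", 1.0); cache.write("s2", 0.5); cache.write("s3", 2.0)
print(list(cache.store))                       # ['s2', 's3'] -- 's1' was forgotten
```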



