OODA-RL: A REINFORCEMENT LEARNING FRAMEWORK FOR ARTIFICIAL GENERAL INTELLIGENCE TO SOLVE OPEN WORLD NOVELTY

2021 ◽  
Author(s):  
Pamul Yadav ◽  
Taewoo Kim ◽  
Ho Suk ◽  
Junyong Lee ◽  
Hyeonseong Jeong ◽  
...  

<p>Faster adaptability to open-world novelties by intelligent agents is a necessary factor in achieving the goal of creating Artificial General Intelligence (AGI). The current RL framework does not consider unseen changes (novelties) in the environment. Therefore, in this paper, we propose OODA-RL, a Reinforcement Learning based framework that can be used to develop robust RL algorithms capable of handling both known environments and adaptation to unseen ones. OODA-RL expands the definition of the internal composition of the agent, as compared to the abstract definition in the classical RL framework, allowing RL researchers to incorporate novelty adaptation techniques as an add-on feature to existing SoTA as well as yet-to-be-developed RL algorithms.</p>
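The paper itself does not publish code here; as a purely hypothetical illustration of the idea, an Observe–Orient–Decide–Act loop wrapped around a standard RL policy might look like the sketch below. `NoveltyDetector`, `Policy`, and `ooda_step` are invented names for this example, not the authors' interfaces.

```python
# Hypothetical sketch of an OODA-style agent loop around an RL policy.
# All classes and the adaptation rule are illustrative assumptions,
# not the paper's actual implementation.

class NoveltyDetector:
    """Flags observations never encountered before."""
    def __init__(self):
        self.seen = set()

    def is_novel(self, obs):
        novel = obs not in self.seen
        self.seen.add(obs)
        return novel

class Policy:
    """Trivial tabular policy: one stored action per known observation."""
    def __init__(self, default_action=0):
        self.table = {}
        self.default_action = default_action

    def act(self, obs):
        return self.table.get(obs, self.default_action)

    def adapt(self, obs, action):
        # Placeholder for a novelty-adaptation update.
        self.table[obs] = action

def ooda_step(obs, detector, policy):
    # Observe: take the raw observation.
    # Orient: compare it against past experience for novelty.
    novel = detector.is_novel(obs)
    # Decide: adapt the policy first if the observation is novel.
    if novel:
        policy.adapt(obs, action=1)
    # Act: emit the (possibly updated) policy's action.
    return policy.act(obs), novel
```

The point of the sketch is structural: novelty handling sits inside the agent's loop as an add-on stage, leaving the underlying policy interchangeable.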


2020 ◽  
Author(s):  
Andy E Williams

INTRODUCTION: With advances in big data techniques having already led to search results and advertising customized to the individual user, the concept of an online education designed solely for an individual, or of online news, entertainment media, or any other virtual service designed uniquely for each individual, no longer seems far-fetched. However, designing services that maximize user outcomes, as opposed to services that maximize outcomes for the corporation owning them, requires modeling user processes and the outcomes they target.

OBJECTIVES: To explore the use of Human-Centric Functional Modeling (HCFM) to define functional state spaces within which human processes are well-defined paths and products and services solve specific navigation problems, so that by considering all of a given individual's desired paths through a given state space, it is possible to automate the customization of those products and services for that individual or for groups of individuals.

METHODS: An analysis is performed to assess how and whether intelligent agents based on some subset of the functionality required for Artificial General Intelligence (AGI) might be used to optimize for the individual user, and a further analysis is performed to determine whether, and if so how, General Collective Intelligence (GCI) might be used to optimize across all users.

RESULTS: AGI and GCI create the possibility of individualizing products and services, even shared services such as the Internet or news services, so that every individual sees a different version.

CONCLUSION: The conceptual example of customizing a news media website for two users of opposite political persuasions suggests that, while such customization might massively increase storage and processing overhead, within a network of cooperating services in which this customization reliably creates value it represents a potentially significant opportunity.
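The abstract's framing of services as navigation problems in a functional state space can be illustrated with a toy directed graph, where a user process is a path and customization means choosing the path that serves a given individual. The states, edges, and search procedure below are invented for illustration; HCFM defines these spaces far more generally.

```python
# Toy illustration: a functional state space as a directed graph.
# States and edges are assumptions made up for this example.

from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search: fewest functional steps from start to goal."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

# Each edge is one functional step a service could provide for the user.
state_space = {
    "uninformed": ["headline_seen"],
    "headline_seen": ["article_read", "video_watched"],
    "article_read": ["informed"],
    "video_watched": ["informed"],
}

# Customization = selecting the path matching this user's preferences.
path = shortest_path(state_space, "uninformed", "informed")
```

Two users with different preferences would simply be routed along different edges (reading versus watching) toward the same functional outcome.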


2020 ◽  
Vol 11 (1) ◽  
pp. 70-85
Author(s):  
Samuel Allen Alexander

After generalizing the Archimedean property of the real numbers in such a way as to make it adaptable to non-numeric structures, we demonstrate that the real numbers cannot be used to accurately measure non-Archimedean structures. We argue that, since an agent with Artificial General Intelligence (AGI) should have no problem engaging in tasks that inherently involve non-Archimedean rewards, and since traditional reinforcement learning rewards are real numbers, traditional reinforcement learning probably will not lead to AGI. We indicate two possible ways traditional reinforcement learning could be altered to remove this roadblock.
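A standard concrete instance of a non-Archimedean reward structure (chosen here for illustration; the paper argues the general case) is lexicographic preference: one objective strictly dominates another, so no finite amount of the minor objective ever outweighs one unit of the major one. Python tuples compare lexicographically, which makes the failure of the Archimedean property directly visible:

```python
# Lexicographic rewards as a simple non-Archimedean structure.
# reward = (safety, speed): safety strictly dominates, so no amount
# of speed compensates for a safety loss. Python compares tuples
# lexicographically, so the ordering works out of the box.

safe_slow = (1, 0)        # safe, no gain on the minor objective
unsafe_fast = (0, 10**9)  # unsafe, enormous minor-objective gain

# No real-valued scaling of the minor term changes the ordering:
assert safe_slow > unsafe_fast

# The Archimedean property fails: n copies of the minor unit (0, 1)
# never accumulate past one major unit (1, 0), for any n.
def n_minor_units(n):
    return (0, n)

assert all(n_minor_units(n) < (1, 0) for n in (1, 10**6, 10**18))
```

No single real number can encode such a reward: any real-valued translation of the minor unit would, summed often enough, eventually exceed the major unit, which is exactly what the ordering above forbids.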


2011 ◽  
Vol 2011 ◽  
pp. 1-12 ◽  
Author(s):  
Karim El-Laithy ◽  
Martin Bogdan

An integration of the Hebbian and reinforcement learning (RL) rules is presented for dynamic synapses. The proposed framework permits the Hebbian rule to update the hidden synaptic model parameters regulating the synaptic response, rather than the synaptic weights. This is performed using both the value and the sign of the temporal difference in the reward signal after each trial. Applying this framework, a spiking network with spike-timing-dependent synapses is tested on learning the exclusive-OR computation on a temporally coded basis. Reward values are calculated from the distance between the network's output spike train and a reference target train. Results show that the network is able to capture the required dynamics and that the proposed framework indeed yields an integrated version of Hebbian learning and RL. The proposed framework is tractable and less computationally expensive. It is applicable to a wide class of synaptic models and is not restricted to the neural representation used here. This generality, along with the reported results, supports adopting the introduced approach to benefit from biologically plausible synaptic models across a wide range of signal-processing applications.
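The update described, in which a hidden synaptic parameter (not the weight) is adjusted using both the sign and the value of the trial-to-trial reward difference, might be sketched as follows. The parameter name, learning rate, and the specific Hebbian coincidence term are illustrative assumptions, not the authors' model.

```python
# Illustrative sketch of the reported rule: a hidden synaptic parameter
# is nudged along a Hebbian coincidence term, gated by both the sign
# and the magnitude of the reward change after each trial. The exact
# functional form here is an assumption for illustration.

def update_synaptic_param(theta, pre_activity, post_activity,
                          reward, prev_reward, lr=0.1):
    delta_r = reward - prev_reward          # temporal difference in reward
    hebbian = pre_activity * post_activity  # pre/post coincidence term
    sign = 1.0 if delta_r >= 0 else -1.0
    # Sign and magnitude are applied separately to mirror the abstract's
    # "both the value and the sign" phrasing.
    return theta + lr * sign * abs(delta_r) * hebbian
```

Under this sketch, an improving reward reinforces coincident pre/post activity, while a worsening reward reverses the adjustment; the synaptic weight itself is left to the synapse's own dynamics.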

