Behavior is Reward-oriented

Author(s):
Martin V. Butz, Esther F. Kutter

Delving further into development, adaptation, and learning, this chapter considers the potential of reward-oriented optimization of behavior. Reinforcement learning (RL) is motivated by the Rescorla–Wagner model from psychology and behaviorism. Next, a detailed introduction to RL in artificial systems is provided, showing when and how RL works but also discussing its current shortcomings and challenges. In conclusion, the chapter emphasizes that behavioral optimization and reward-based behavioral adaptation can be accomplished well with RL. However, to solve more challenging planning problems and to enable flexible, goal-oriented behavior, hierarchically and modularly structured models of the environment are necessary. Such models also enable the pursuit of abstract reasoning and of thoughts that are fully detached from the current environmental state. The challenge remains how such models may actually be learned and structured.
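The Rescorla–Wagner rule that motivates RL can be sketched in a few lines. This is a minimal illustration, not the chapter's own code; the variable names and learning rate are illustrative.

```python
# Hedged sketch: the Rescorla-Wagner update that motivates RL's value learning.
# alpha is the learning rate, lam the reward asymptote, v the current
# associative strength (prediction); all names here are illustrative.

def rescorla_wagner_update(v, lam, alpha=0.1):
    """One trial of the Rescorla-Wagner rule: dV = alpha * (lambda - V)."""
    return v + alpha * (lam - v)

# Repeated pairings drive the prediction toward the reward asymptote,
# with updates shrinking as the prediction error vanishes.
v = 0.0
for _ in range(100):
    v = rescorla_wagner_update(v, lam=1.0)
```

The same prediction-error-driven update reappears, with discounting and bootstrapping added, in temporal-difference RL methods.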

2021, Vol. 01
Author(s):
Ying Li, Chubing Guo, Jianshe Wu, Xin Zhang, Jian Gao, ...

Background: Unmanned systems are widely used in many fields, and many algorithms have been proposed to solve path planning problems. Each algorithm has its own advantages and defects and cannot adapt to every kind of requirement, so an appropriate path planning method is needed for each application. Objective: To select an appropriate algorithm quickly for a given application, which could improve the efficiency of path planning for unmanned systems. Methods: This paper proposes to represent and quantify the features of algorithms based on the physical indicators of their results. In addition, an algorithmic collaborative scheme is developed to search for the appropriate algorithm according to the requirements of the application. As an illustration of the scheme, four algorithms, namely the A-star (A*) algorithm, reinforcement learning, the genetic algorithm, and the ant colony optimization algorithm, are represented by their features. Results: In different simulations, the algorithmic collaborative scheme can select an appropriate algorithm for a given application based on this representation, and the selected algorithm can plan a feasible and effective path. Conclusion: An algorithmic collaborative scheme is proposed, based on the representation of algorithms and the requirements of the application. The simulation results demonstrate the feasibility of the scheme and of the algorithm representation.
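The selection idea can be sketched as a weighted match between quantified algorithm features and application requirements. The feature scores, indicator names, and weights below are hypothetical placeholders, not the paper's measured values.

```python
# Hedged sketch of an algorithmic collaborative scheme like the one described:
# each planner is represented by quantified feature scores (illustrative
# 0-1 indicators), and the scheme picks the planner whose feature vector
# best matches the application's requirement weights.
# All scores and weights below are hypothetical placeholders.

ALGORITHM_FEATURES = {
    "A*":                     {"path_length": 0.9, "runtime": 0.8, "smoothness": 0.4},
    "reinforcement_learning": {"path_length": 0.6, "runtime": 0.3, "smoothness": 0.7},
    "genetic_algorithm":      {"path_length": 0.7, "runtime": 0.4, "smoothness": 0.6},
    "ant_colony":             {"path_length": 0.7, "runtime": 0.5, "smoothness": 0.5},
}

def select_algorithm(requirements, features=ALGORITHM_FEATURES):
    """Return the algorithm whose features best match the weighted requirements."""
    def score(feats):
        return sum(requirements.get(k, 0.0) * v for k, v in feats.items())
    return max(features, key=lambda name: score(features[name]))

# An application that prioritizes short runtime would select A* under
# these placeholder scores.
best = select_algorithm({"runtime": 1.0})
```

The real scheme quantifies features from physical indicators of planned paths; here the table is fixed by hand purely to show the selection step.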


2018, Vol. 48 (12), pp. 4889-4904
Author(s):
Xingyu Zhao, Shifei Ding, Yuexuan An, Weikuan Jia

Author(s):  
Dieqiao Feng ◽  
Carla Gomes ◽  
Bart Selman

Despite significant progress in general AI planning, certain domains remain out of reach of current AI planning systems. Sokoban is a PSPACE-complete planning task and represents one of the hardest domains for current AI planners. Even domain-specific specialized search methods fail quickly due to the exponential search complexity on hard instances. Our approach, based on deep reinforcement learning augmented with a curriculum-driven method, is the first to solve hard instances within one day of training, while other modern solvers cannot solve these instances within any reasonable time limit. In contrast to prior efforts, which use carefully handcrafted pruning techniques, our approach automatically uncovers domain structure. Our results reveal that deep RL provides a promising framework for solving previously unsolved AI planning problems, provided a proper training curriculum can be devised.
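The curriculum idea can be sketched as ordering training instances from easy to hard before the learner sees them. The difficulty metric and training callback below are illustrative stand-ins, not the paper's method.

```python
# Hedged sketch of curriculum-driven training in the spirit of the approach
# described: planning instances are sorted from easy to hard before training,
# so the learner masters simple cases before attempting harder ones.
# The difficulty metric and train_step callback are illustrative stand-ins.

def curriculum_order(instances, difficulty):
    """Sort planning instances by increasing estimated difficulty."""
    return sorted(instances, key=difficulty)

def train_with_curriculum(instances, difficulty, train_step):
    """Run one training pass over the instances, easiest first."""
    return [train_step(inst) for inst in curriculum_order(instances, difficulty)]

# For Sokoban-like tasks, difficulty could be estimated by box count or
# maze size; here a string's length stands in for instance difficulty.
ordered = curriculum_order(["hardest", "easy", "mid"], difficulty=len)
```

The point of the ordering is that early, easy instances give the RL agent dense successes to bootstrap from before the exponentially harder cases arrive.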


Author(s):  
László Orgován ◽  
Tamás Bécsi ◽  
Szilárd Aradi

Autonomous vehicles, or self-driving cars, are prevalent nowadays; many vehicle manufacturers and other tech companies are trying to develop them. One major goal of self-driving algorithms is to perform manoeuvres safely, even when some anomaly arises. Artificial intelligence and machine learning methods are used to solve these kinds of complex issues. One such motion planning problem occurs when the tires lose their grip on the road, a situation an autonomous vehicle should be able to handle. The paper therefore presents an autonomous drifting algorithm using reinforcement learning. The algorithm is based on a model-free learning algorithm, Twin Delayed Deep Deterministic Policy Gradient (TD3). The model is trained on six different tracks in CARLA, a simulator developed specifically for autonomous driving systems.
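TD3's core mechanics can be sketched without neural networks: twin critics, target-policy smoothing, and a clipped double-Q target. Plain callables stand in for the networks here; this is a generic illustration of TD3, not the paper's drifting agent.

```python
# Hedged sketch of TD3's key target computation: add clipped noise to the
# target action (target-policy smoothing), evaluate both target critics,
# and bootstrap from the smaller value to curb overestimation.
# Plain functions stand in for the actor/critic networks.
import random

def td3_target(reward, next_state, q1_target, q2_target, policy_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, done=False):
    """Compute the clipped double-Q learning target used by TD3."""
    if done:
        return reward
    # Target-policy smoothing: perturb the target action with clipped noise.
    noise = max(-noise_clip, min(noise_clip, random.gauss(0.0, noise_std)))
    action = policy_target(next_state) + noise
    # Clipped double-Q: the smaller of the two critics bounds the target.
    q_min = min(q1_target(next_state, action), q2_target(next_state, action))
    return reward + gamma * q_min
```

In the full algorithm this target trains both critics, while the actor is updated less frequently (the "delayed" part) against one critic only.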


Author(s):  
Jun Hao Alvin Ng ◽  
Ronald P. A. Petrick

The soundness and optimality of a plan depend on the correctness of the domain model. Specifying complete domain models can be difficult when the interactions between an agent and its environment are complex. We propose a model-based reinforcement learning (MBRL) approach to solve planning problems with unknown models. The model is learned incrementally over episodes using only experiences from the current episode, which suits non-stationary environments. We introduce the novel concept of reliability as an intrinsic motivation for MBRL, along with a method to learn from failure that prevents repeated instances of similar failures. Our motivation is to improve the learning efficiency and goal-directedness of MBRL. We evaluate our work with experimental results for three planning domains.
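One simple way to realize a reliability-style intrinsic motivation is to track how often the learned model's predictions match observed transitions and fold that into the reward. The running-success-rate estimate and the additive bonus below are illustrative stand-ins for the paper's formulation.

```python
# Hedged sketch of reward shaping with a reliability-style intrinsic bonus,
# loosely inspired by the idea described: the agent is rewarded for acting
# where its learned model has proven reliable. The reliability estimate
# (a running success rate of model predictions) is an illustrative stand-in.

class ReliabilityTracker:
    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, predicted_state, observed_state):
        """Record whether the learned model predicted the transition correctly."""
        self.total += 1
        if predicted_state == observed_state:
            self.correct += 1

    def reliability(self):
        """Fraction of transitions predicted correctly (1.0 if untested)."""
        return self.correct / self.total if self.total else 1.0

def shaped_reward(extrinsic, tracker, bonus_weight=0.1):
    """Add an intrinsic bonus proportional to current model reliability."""
    return extrinsic + bonus_weight * tracker.reliability()
```

A per-action or per-state-region tracker, rather than this single global counter, would let the agent steer specifically toward well-modeled parts of the environment.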

