DIFFERENT LEARNING METHODOLOGIES FOR VISION-BASED NAVIGATION BEHAVIORS

Author(s):  
G. CICIRELLI ◽  
T. D'ORAZIO ◽  
A. DISTANTE

In this work, the complex behavior of localizing a mobile vehicle with respect to a door in the environment and then reaching the door has been developed. The robot uses visual information to detect and recognize the door and to determine its own position with respect to it. This complex task has been divided into two separate behaviors: door recognition and door reaching. A supervised methodology based on learning by components has been applied for recognizing the door. Learning by components allows the door to be recognized even in difficult situations such as partial occlusions and, in addition, makes recognition independent of viewpoint variations and scale changes. An unsupervised methodology based on reinforcement learning has instead been used for the door-reaching behavior. The image of the door provides information about the relative position of the vehicle with respect to the door. The Q-learning algorithm is then used to generate the optimal state-action associations. The problem of defining the state and action sets has been addressed with the aim of producing smooth paths, reducing the effects of visual errors during real navigation, and keeping the computational cost of the learning phase low. A novel way to obtain a continuous action set has been introduced: a fuzzy model is used to evaluate the system state. Experimental results in a real environment show both the robustness of the door-recognition behavior and the generality of the door-reaching behavior.
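A minimal sketch of the door-reaching core described above: tabular Q-learning over a discretized visual state, plus a fuzzy blend of discrete steering angles to obtain a continuous command. The state bins, angles, and parameters below are illustrative assumptions, not the authors' exact design.

```python
import numpy as np

N_STATES = 12                                         # assumed (door offset, width) bins
ACTIONS = np.array([-30.0, -10.0, 0.0, 10.0, 30.0])   # assumed steering angles (degrees)
ALPHA, GAMMA = 0.1, 0.95

Q = np.zeros((N_STATES, len(ACTIONS)))

def q_update(s, a_idx, r, s_next):
    """Standard one-step Q-learning backup."""
    Q[s, a_idx] += ALPHA * (r + GAMMA * Q[s_next].max() - Q[s, a_idx])

def fuzzy_action(memberships):
    """Blend discrete actions into a continuous steering command.

    `memberships` holds the fuzzy degrees (length N_STATES, summing to 1) with which
    the current visual reading of the door belongs to each discretized state; the
    returned command is the membership-weighted average of each state's greedy action.
    """
    greedy = ACTIONS[Q.argmax(axis=1)]                # one greedy angle per state
    return float(np.asarray(memberships) @ greedy)
```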

Sensors ◽  
2018 ◽  
Vol 18 (11) ◽  
pp. 3606 ◽  
Author(s):  
Wanli Xue ◽  
Zhiyong Feng ◽  
Chao Xu ◽  
Zhaopeng Meng ◽  
Chengwei Zhang

Although tracking research has achieved excellent performance from a mathematical point of view, it is still worthwhile to analyze tracking problems from multiple perspectives. This motivation not only promotes the independence of tracking research but also increases the flexibility of practical applications. This paper presents a tracking framework based on reinforcement learning in a multi-dimensional state–action space, termed multi-angle analysis collaboration tracking (MACT). MACT comprises a basic tracking framework and a strategic framework that assists it. In particular, the strategic framework is extensible and currently includes a feature selection strategy (FSS) and a movement trend strategy (MTS). These strategies are abstracted from a multi-angle analysis of tracking problems (the observer's attention and the object's motion). The content of the analysis corresponds to specific actions in the multi-dimensional action space. Concretely, the tracker, regarded as an agent, is trained with the Q-learning algorithm and an ε-greedy exploration strategy, where a customized reward function encourages robust object tracking. Extensive comparative evaluations on the OTB50 benchmark demonstrate the effectiveness of the strategies and the improvement in speed and accuracy of the MACT tracker.
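As an illustration of the Q-learning core described for MACT, the sketch below pairs an ε-greedy policy over a small two-dimensional action space (a feature-selection choice combined with a movement choice) with an overlap-based reward. The concrete action sets and reward thresholds are assumptions for illustration, not the paper's definitions.

```python
import random
from collections import defaultdict

FEATURE_ACTIONS = ["hog", "color", "deep"]               # FSS choices (assumed)
MOVE_ACTIONS = ["stay", "left", "right", "up", "down"]    # MTS choices (assumed)
ACTIONS = [(f, m) for f in FEATURE_ACTIONS for m in MOVE_ACTIONS]

ALPHA, GAMMA, EPSILON = 0.2, 0.9, 0.1
Q = defaultdict(float)                                    # (state, action) -> value

def select_action(state):
    """Epsilon-greedy selection over the multi-dimensional action space."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def reward(iou):
    """Customized reward: favor tight overlap, punish losing the target (assumed thresholds)."""
    if iou > 0.7:
        return 1.0
    if iou > 0.3:
        return 0.1
    return -1.0

def update(state, action, r, next_state):
    """One-step Q-learning backup."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])
```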


2020 ◽  
Author(s):  
Josias G. Batista ◽  
Felipe J. S. Vasconcelos ◽  
Kaio M. Ramos ◽  
Darielson A. Souza ◽  
José L. N. Silva

Industrial robots have grown over the years, making production systems more and more efficient and requiring efficient trajectory-generation algorithms that optimize and, if possible, generate collision-free trajectories without interrupting the production process. This work presents the use of Reinforcement Learning (RL), based on the Q-learning algorithm, for trajectory generation of a robotic manipulator, together with a comparison of its use with and without constraints on the manipulator kinematics, in order to generate collision-free trajectories. Simulation results are presented regarding the efficiency of the algorithm and its use in trajectory generation; a comparison of the computational cost of applying the constraints is also presented.
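A rough sketch of how Q-learning can generate a manipulator trajectory on a discretized joint-space grid, with a limit check that masks moves violating joint bounds and an obstacle map that penalizes collisions. Grid size, obstacle layout, and limits are illustrative assumptions rather than the setup used in the paper.

```python
import numpy as np

GRID = 20                                       # discretization per joint (2-DOF assumed)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # unit moves in joint space
ALPHA, GAMMA, EPS, EPISODES = 0.1, 0.95, 0.2, 2000
obstacle = np.zeros((GRID, GRID), dtype=bool)
obstacle[8:12, 5:15] = True                     # assumed obstacle region
goal = (18, 18)
Q = np.zeros((GRID, GRID, len(ACTIONS)))

def within_limits(q1, q2):
    """Joint-limit check (stands in for the kinematic constraints compared in the paper)."""
    return 0 <= q1 < GRID and 0 <= q2 < GRID

def step(state, a_idx):
    dq1, dq2 = ACTIONS[a_idx]
    nxt = (state[0] + dq1, state[1] + dq2)
    if not within_limits(*nxt) or obstacle[nxt]:
        return state, -1.0                      # limit violation / collision penalized
    return nxt, (10.0 if nxt == goal else -0.01)

for _ in range(EPISODES):
    s = (0, 0)
    for _ in range(500):                        # cap episode length
        a = np.random.randint(len(ACTIONS)) if np.random.rand() < EPS else int(Q[s].argmax())
        s2, r = step(s, a)
        Q[s][a] += ALPHA * (r + GAMMA * Q[s2].max() - Q[s][a])
        s = s2
        if s == goal:
            break
```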


2000 ◽  
Vol 14 (2) ◽  
pp. 243-258 ◽  
Author(s):  
V. S. Borkar

A simulation-based algorithm for learning good policies for a discrete-time stochastic control process with unknown transition law is analyzed when the state and action spaces are compact subsets of Euclidean spaces. This extends the Q-learning scheme of discrete state/action problems along the lines of Baker [4]. Almost sure convergence is proved under suitable conditions.
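As a toy illustration of the setting (not of Borkar's construction, which interpolates along the lines of Baker), the sketch below simply discretizes compact state and action intervals into fine grids and runs ordinary Q-learning against a simulated transition law that is treated as unknown.

```python
import numpy as np

S_GRID = np.linspace(0.0, 1.0, 51)    # compact state space [0, 1], discretized
A_GRID = np.linspace(-1.0, 1.0, 21)   # compact action space [-1, 1], discretized
ALPHA, GAMMA = 0.1, 0.9
Q = np.zeros((len(S_GRID), len(A_GRID)))

def nearest(grid, x):
    """Index of the grid point closest to x."""
    return int(np.abs(grid - x).argmin())

def simulate(s, a):
    """Transition law accessed only through simulation (assumed dynamics for illustration)."""
    s_next = np.clip(s + 0.1 * a + 0.05 * np.random.randn(), 0.0, 1.0)
    reward = -abs(s_next - 0.5)         # drive the state toward 0.5
    return s_next, reward

s = 0.0
for _ in range(20000):
    ai = np.random.randint(len(A_GRID))            # explorative action
    s_next, r = simulate(s, A_GRID[ai])
    si, sj = nearest(S_GRID, s), nearest(S_GRID, s_next)
    Q[si, ai] += ALPHA * (r + GAMMA * Q[sj].max() - Q[si, ai])
    s = s_next
```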


2020 ◽  
Vol 17 (2) ◽  
pp. 647-664
Author(s):  
Yangyang Ge ◽  
Fei Zhu ◽  
Wei Huang ◽  
Peiyao Zhao ◽  
Quan Liu

Multi-agent systems have broad applications in the real world, yet their safety is rarely considered. Reinforcement learning is one of the most important methods for solving multi-agent problems. At present, progress has been made in applying multi-agent reinforcement learning to robot systems, man-machine matches, automation, and so on. However, in these areas an agent may fall into unsafe states, where it may find it difficult to bypass obstacles, to receive information from other agents, and so on. Ensuring the safety of a multi-agent system is of great importance in areas where an agent may fall into dangerous, irreversible states and cause great damage. To address the safety problem, this paper introduces a Multi-Agent Cooperation Q-Learning Algorithm based on a Constrained Markov Game. In this method, safety constraints are added to the action set, and each agent, when interacting with the environment to search for optimal values, is restricted by the safety rules so as to obtain an optimal policy that satisfies the safety requirements. Since traditional multi-agent reinforcement learning algorithms are no longer suitable for the proposed model, a new solution is introduced for calculating the global optimal state-action function that satisfies the safety constraints. The Lagrange multiplier method is used to determine the optimal action that can be performed in the current state, on the premise of linearized constraint functions and under the condition that both the state-action function and the constraint function are differentiable; this not only improves the efficiency and accuracy of the algorithm but also guarantees a globally optimal solution. Experiments verify the effectiveness of the algorithm.
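A hedged, single-agent sketch of the constrained idea: alongside the reward value a constraint-cost value is learned, actions maximize the Lagrangian trade-off, and the multiplier is adjusted toward a cost budget by dual ascent. The paper's multi-agent game-theoretic solution is more involved; the names and the budget below are illustrative assumptions.

```python
import numpy as np

N_STATES, N_ACTIONS = 25, 4
ALPHA, GAMMA, ETA, BUDGET = 0.1, 0.95, 0.01, 0.2   # ETA: multiplier step, BUDGET: allowed cost
Q_r = np.zeros((N_STATES, N_ACTIONS))               # value of reward
Q_c = np.zeros((N_STATES, N_ACTIONS))               # value of (safety) cost
lam = 0.0                                           # Lagrange multiplier

def safe_greedy(state):
    """Pick the action maximizing the Lagrangian trade-off Q_r - lam * Q_c."""
    return int((Q_r[state] - lam * Q_c[state]).argmax())

def update(state, action, reward, cost, next_state):
    """Q-learning backups for reward and cost, plus dual ascent on the multiplier."""
    global lam
    a_next = safe_greedy(next_state)
    Q_r[state, action] += ALPHA * (reward + GAMMA * Q_r[next_state, a_next] - Q_r[state, action])
    Q_c[state, action] += ALPHA * (cost + GAMMA * Q_c[next_state, a_next] - Q_c[state, action])
    # raise lambda when the expected cost exceeds the budget, otherwise relax it
    lam = max(0.0, lam + ETA * (Q_c[state, action] - BUDGET))
```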


2012 ◽  
Vol 433-440 ◽  
pp. 721-726
Author(s):  
Soh Chin Yun ◽  
S. Parasuraman ◽  
Velappa Ganapathy ◽  
Halim Kusuma Joe

This research focuses on the integration of a multi-layer Artificial Neural Network (ANN) and Q-learning to perform online learning control. In the first learning phase, the agent explores the unknown surroundings and gathers state-action information through the unsupervised Q-learning algorithm. The second training phase involves the ANN, which uses the state-action information gathered in the earlier phase as training samples. During the final application of the controller, Q-learning is used as the primary navigation tool, whereas the trained neural network is employed when approximation is needed. A MATLAB simulation was developed to verify the algorithm, and it was validated in real time using the Team AmigoBot™ robot. The results obtained from both the simulation and the real-world experiments are discussed.
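The two-phase scheme can be sketched as follows: phase one fills a Q-table by exploration, and phase two fits a multi-layer network to the gathered state-action values so it can approximate Q for states the table has not covered. The environment dynamics, network size, and encoding are assumptions, and scikit-learn stands in for the MATLAB ANN used by the authors.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

N_STATES, N_ACTIONS = 100, 4
Q = np.zeros((N_STATES, N_ACTIONS))
visited = np.zeros(N_STATES, dtype=bool)

# --- Phase 1: unsupervised Q-learning exploration (placeholder dynamics/reward) ---
def q_learning_phase(episodes=200, alpha=0.2, gamma=0.9, eps=0.3):
    for _ in range(episodes):
        s = np.random.randint(N_STATES)
        for _ in range(50):
            a = np.random.randint(N_ACTIONS) if np.random.rand() < eps else int(Q[s].argmax())
            s2 = np.random.randint(N_STATES)            # placeholder transition
            r = 1.0 if s2 == N_STATES - 1 else -0.01    # placeholder reward
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            visited[s] = True
            s = s2

# --- Phase 2: supervised ANN training on the gathered state-action values ---
q_learning_phase()
X = np.eye(N_STATES)[visited]                           # one-hot encoding of visited states
y = Q[visited]
net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000).fit(X, y)

def act(state):
    """Use the Q-table when the state was explored, the ANN approximation otherwise."""
    if visited[state]:
        return int(Q[state].argmax())
    return int(net.predict(np.eye(N_STATES)[[state]]).argmax())
```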


Author(s):  
Mohd Rashdan Abdul Kadir ◽  
Ali Selamat ◽  
Ondrej Krejcar

Normative multi-agent research offers an alternative viewpoint on the design of adaptive autonomous agent architectures. Norms specify standards of behavior, such as which actions or states should be achieved or avoided. Norm synthesis is the process of generating useful normative rules. This study proposes a model for extracting normative rules from implicit learning, namely the Q-learning algorithm, into an explicit norm representation by implementing Dynamic Deontics and a Hierarchical Knowledge Base (HKB) to synthesize useful normative rules in the form of weighted state-action pairs with deontic modality. OpenAI Gym is used to simulate the agent environment. The proposed model is able to generate both obligative and prohibitive norms, as well as deliberate over and execute those norms. Results show that the generated norms are best used as prior knowledge to guide agent behavior and perform poorly if not complemented by another agent coordination mechanism. Performance increases when both obligation and prohibition norms are used, and in general, norms speed up the reachability of the optimal policy.
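As a small illustration of synthesizing norms from implicit learning, the sketch below promotes state-action pairs whose learned Q-values sit well above or below the per-state average to weighted obligation or prohibition norms. The thresholding and weighting are assumptions and do not reproduce the Dynamic Deontics or HKB machinery of the study.

```python
import numpy as np

def synthesize_norms(Q, margin=0.5):
    """Return lists of (state, action, weight) obligations and prohibitions."""
    obligations, prohibitions = [], []
    for s, values in enumerate(Q):
        mean, spread = values.mean(), values.std() + 1e-8
        for a, v in enumerate(values):
            z = (v - mean) / spread               # how exceptional this action is in state s
            if z > margin:
                obligations.append((s, a, float(z)))      # "ought to do a in s"
            elif z < -margin:
                prohibitions.append((s, a, float(-z)))    # "ought not to do a in s"
    return obligations, prohibitions

# Example with a toy 3-state, 3-action value table
Q = np.array([[1.0, 0.2, 0.1],
              [0.0, 0.9, 0.1],
              [0.3, 0.3, -0.8]])
obligations, prohibitions = synthesize_norms(Q)
```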


Author(s):  
S. Chornozhuk

Introduction. The spatial folding of protein structures is an important and current problem in computational biology. Considering the mathematical model of the task, it is easily concluded that finding an optimal protein conformation in a three-dimensional grid is an NP-hard problem. Therefore, reinforcement learning techniques such as Q-learning can be used to solve it. The article proposes a new geometric "state-action" space representation which differs significantly from all alternative representations used for this problem. The purpose of the article is to analyze existing representations of the state and action spaces for the Q-learning algorithm applied to the protein structure folding problem, reveal their advantages and disadvantages, and propose the new geometric "state-action" representation. The goal is then to compare the existing and proposed approaches, draw conclusions, and describe possible directions of further research.

Result. The proposed algorithm is compared with others on the basis of 10 known chains of length 48 first proposed in [16]. For each of the chains, the Q-learning algorithm with the proposed "state-action" representation outperformed the same Q-learning algorithm with the alternative existing representations, both in terms of the average and the minimal energy values of the resulting conformations. Moreover, many of the existing representations were designed for 2D protein structure prediction; during the experiments, both the existing and the proposed representations were slightly modified or extended to solve the problem in 3D, which is a more computationally demanding task.

Conclusion. The quality of the Q-learning algorithm with the proposed geometric "state-action" space representation has been experimentally confirmed. Consequently, further research is promising. Several steps of possible future work, such as combining the proposed approach with deep learning techniques, have already been suggested.

Keywords: spatial protein structure, combinatorial optimization, relative coding, machine learning, Q-learning, Bellman equation, state space, action space, basis in 3D space.
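A compact sketch of Q-learning for lattice protein folding in this spirit: residues of an HP chain are placed one by one on a 3D cubic grid, each action picks a relative direction for the next residue, and the terminal reward is the number of non-consecutive H-H contacts (the negative of the energy). The state encoding used here, a short window of recent moves, is a simplification and not the proposed geometric representation.

```python
import random

MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
ALPHA, GAMMA, EPS = 0.2, 0.98, 0.2

def energy(coords, chain):
    """Energy = minus the number of H-H contacts between non-consecutive residues."""
    pos = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, c in enumerate(coords):
        if chain[i] != 'H':
            continue
        for d in MOVES:
            j = pos.get((c[0] + d[0], c[1] + d[1], c[2] + d[2]))
            if j is not None and chain[j] == 'H' and abs(i - j) > 1:
                contacts += 1
    return -(contacts // 2)                       # each contact counted twice

def fold(chain, episodes=5000):
    Q, best = {}, 0
    for _ in range(episodes):
        coords, state, trace, ok = [(0, 0, 0)], (), [], True
        for _ in chain[1:]:
            q = Q.setdefault(state, [0.0] * len(MOVES))
            a = random.randrange(len(MOVES)) if random.random() < EPS else q.index(max(q))
            d = MOVES[a]
            nxt = (coords[-1][0] + d[0], coords[-1][1] + d[1], coords[-1][2] + d[2])
            if nxt in coords:                     # self-collision: abandon the episode
                ok = False
                break
            coords.append(nxt)
            new_state = (state + (a,))[-3:]       # short window of recent moves
            trace.append((state, a, new_state))
            state = new_state
        if not ok:
            continue
        contacts = -energy(coords, chain)         # terminal reward = H-H contacts
        best = max(best, contacts)
        for i, (s, a, s2) in enumerate(trace):    # one-step Q-learning backups
            r = contacts if i == len(trace) - 1 else 0.0
            q2 = Q.setdefault(s2, [0.0] * len(MOVES))
            Q[s][a] += ALPHA * (r + GAMMA * max(q2) - Q[s][a])
    return -best                                  # best (lowest) energy found

print(fold("HPHPPHHPHPPHPHHPPHPH"))
```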


2009 ◽  
Vol 28 (12) ◽  
pp. 3268-3270
Author(s):  
Chao WANG ◽  
Jing GUO ◽  
Zhen-qiang BAO
