Autonomous Hexapod Robot With Artificial Vision and Remote Control by Myo-Electric Gestures

Author(s):  
Valentina Franzoni

The robot gAItano is an intelligent hexapod robot, able to move in an environment of unknown size and to perform some autonomous actions. It uses the RoboRealm software to filter and recognize color blobs in its artificial-vision stream, activates a script (VBScript in our case; C or Python scripts are also supported) to compute decisions based on perception, and sends the output to the actuators using the PIP protocol. gAItano is thus a rational computerized agent: autonomous, or semi-autonomous when remote controlled; reactive; and model-based (e.g., the line). The environment in which gAItano moves is partially observable, stochastic, semi-episodic, static (or semi-dynamic in case of human intervention), continuous in both perceptions and actions, and multi-agent, since human intervention can be collaborative (e.g., when the human moves a block or the robot to improve its performance) or competitive (e.g., when the human moves a block or the robot to inhibit its performance).
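A minimal sketch of the perception-decision-action loop described above, assuming a blob representation and command names that are purely illustrative (they are not the actual RoboRealm or gAItano interfaces):

```python
# Hypothetical sketch of a color-blob-driven decision step.
# The Blob fields and command strings are assumptions for illustration only.

from dataclasses import dataclass

@dataclass
class Blob:
    color: str      # detected blob color, e.g. "red"
    x: int          # horizontal center in image coordinates
    area: int       # blob size in pixels

def decide(blobs, image_width=320):
    """Map the perceived color blobs to a motion command."""
    if not blobs:
        return "search"                           # nothing seen: rotate and look around
    target = max(blobs, key=lambda b: b.area)     # focus on the largest blob
    center = image_width / 2
    if target.x < center - 40:
        return "turn_left"
    if target.x > center + 40:
        return "turn_right"
    return "walk_forward"                         # blob roughly centered: approach it

# Example: a red blob left of center triggers a left turn.
print(decide([Blob("red", 90, 1500)]))            # -> "turn_left"
```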

Author(s):  
Yanlin Han ◽  
Piotr Gmytrasiewicz

This paper introduces the IPOMDP-net, a neural network architecture for multi-agent planning under partial observability. It embeds an interactive partially observable Markov decision process (I-POMDP) model and a QMDP planning algorithm that solves the model in a neural network architecture. The IPOMDP-net is fully differentiable and allows for end-to-end training. In the learning phase, we train an IPOMDP-net on various fixed and randomly generated environments in a reinforcement learning setting, assuming observable reinforcements and unknown (randomly initialized) model functions. In the planning phase, we test the trained network on new, unseen variants of the environments under the planning setting, using the trained model to plan without reinforcements. Empirical results show that our model-based IPOMDP-net outperforms other state-of-the-art model-free networks and generalizes better to larger, unseen environments. Our approach provides a general neural computing architecture for multi-agent planning using I-POMDPs. It suggests that, in a multi-agent setting, having a model of other agents benefits our decision-making, resulting in a policy of higher quality and better generalizability.
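For orientation, the QMDP approximation that the IPOMDP-net embeds can be sketched in plain NumPy (rather than as a differentiable network layer). The transition tensor, reward matrix, and belief below are toy inputs, not data from the paper:

```python
# Minimal QMDP sketch: value-iterate the underlying MDP, then score actions
# against the current belief. Inputs are illustrative toy values.

import numpy as np

def qmdp_q_values(T, R, gamma=0.95, iters=100):
    """Value iteration on the underlying MDP: T[a, s, s'], R[s, a] -> Q[s, a]."""
    n_actions, n_states, _ = T.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        V = Q.max(axis=1)                              # greedy state values
        Q = R + gamma * np.einsum("ask,k->sa", T, V)   # one Bellman backup
    return Q

def qmdp_action(belief, Q):
    """QMDP ignores future information gathering: pick argmax_a E_b[Q(s, a)]."""
    return int(np.argmax(belief @ Q))

# Toy 2-state, 2-action example.
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0], [0.0, 1.0]])
Q = qmdp_q_values(T, R)
print(qmdp_action(np.array([0.7, 0.3]), Q))
```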


Author(s):  
Yu. V. Dubenko

This paper is devoted to the problem of collective artificial intelligence, i.e., intelligent agents solving problems in external environments. These environments may be fully or partially observable, deterministic or stochastic, static or dynamic, and discrete or continuous. The paper identifies the problems of collective interaction that arise when intelligent agents solve a class of tasks requiring the coordination of a group of agents, e.g., exploring the territory of a complex infrastructure facility. It notes that reinforcement learning in multi-agent systems is poorly covered in the literature, especially in Russian-language publications. The article analyzes reinforcement learning, describes hierarchical reinforcement learning, and presents the basic methods for implementing it. The concept of a macro-action performed by agents integrated into groups is introduced. The main problems of collective interaction among intelligent agents are identified: calculating individual rewards for each agent; coordinating agents; applying macro-actions by agents integrated into groups; and exchanging the experience generated by different agents while solving a collective problem. The multi-agent reinforcement learning model is described in detail, along with the difficulties of building this approach on existing solutions. The basic problems of multi-agent reinforcement learning are formulated in the conclusion.
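As a concrete illustration of how a macro-action can enter a reinforcement learning update, the following is a generic SMDP-style tabular Q-learning sketch (an assumption for exposition, not the method proposed in the article):

```python
# Illustrative sketch: a macro-action is treated as one temporally extended
# action, and its accumulated discounted reward updates a single Q-value.

from collections import defaultdict

Q = defaultdict(float)          # Q[(state, action_or_macro)]
alpha, gamma = 0.1, 0.99

def smdp_q_update(state, macro, rewards, next_state, next_actions):
    """rewards: per-step rewards collected while executing the macro-action."""
    G = sum(gamma**k * r for k, r in enumerate(rewards))   # discounted return of the macro
    tau = len(rewards)                                      # macro duration in steps
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = G + gamma**tau * best_next
    Q[(state, macro)] += alpha * (td_target - Q[(state, macro)])

# Example: a 3-step "go_to_door" macro executed from state "room_a".
smdp_q_update("room_a", "go_to_door", [0.0, 0.0, 1.0], "corridor", ["go_left", "go_right"])
print(Q[("room_a", "go_to_door")])
```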


Author(s):  
Madison Clark-Turner ◽  
Christopher Amato

The decentralized partially observable Markov decision process (Dec-POMDP) is a powerful model for representing multi-agent problems with decentralized behavior. Unfortunately, current Dec-POMDP solution methods cannot solve problems with continuous observations, which are common in many real-world domains. To that end, we present a framework for representing and generating Dec-POMDP policies that explicitly include continuous observations. We apply our algorithm to a novel tagging problem and an extended version of a common benchmark, where it generates policies that meet or exceed the values of equivalent discretized domains without the need for finding an adequate discretization.
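To illustrate the general idea of a policy that consumes continuous observations, the sketch below uses a finite-state controller whose node transitions are driven by a real-valued observation. The threshold-based routing is a stand-in for whatever observation classifier a solver might learn; it is not the authors' algorithm:

```python
# Hedged illustration: a two-node controller for one agent in a tagging-style
# task, routing on a continuous observation instead of a discrete symbol.

from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ControllerNode:
    action: str
    # Maps a continuous observation (here a float) to the next node's id.
    route: Callable[[float], int] = lambda obs: 0

# Listen until the (continuous) signal strength clearly indicates the target, then tag.
controller: Dict[int, ControllerNode] = {
    0: ControllerNode("listen", route=lambda obs: 1 if obs > 0.8 else 0),
    1: ControllerNode("tag",    route=lambda obs: 0),
}

node, trace = 0, [0.3, 0.5, 0.9, 0.2]
for obs in trace:
    print(controller[node].action)
    node = controller[node].route(obs)
```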


2005 ◽  
Vol 24 ◽  
pp. 49-79 ◽  
Author(s):  
P. J. Gmytrasiewicz ◽  
P. Doshi

This paper extends the framework of partially observable Markov decision processes (POMDPs) to multi-agent settings by incorporating the notion of agent models into the state space. Agents maintain beliefs over physical states of the environment and over models of other agents, and they use Bayesian updates to maintain their beliefs over time. The solutions map belief states to actions. Models of other agents may include their belief states and are related to agent types considered in games of incomplete information. We express the agents' autonomy by postulating that their models are not directly manipulable or observable by other agents. We show that important properties of POMDPs, such as convergence of value iteration, the rate of convergence, and piece-wise linearity and convexity of the value functions carry over to our framework. Our approach complements a more traditional approach to interactive settings which uses Nash equilibria as a solution paradigm. We seek to avoid some of the drawbacks of equilibria which may be non-unique and do not capture off-equilibrium behaviors. We do so at the cost of having to represent, process and continuously revise models of other agents. Since the agent's beliefs may be arbitrarily nested, the optimal solutions to decision making problems are only asymptotically computable. However, approximate belief updates and approximately optimal plans are computable. We illustrate our framework using a simple application domain, and we show examples of belief updates and value functions.
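For reference, the standard single-agent POMDP belief update that this framework generalizes can be written as below (notation is illustrative). In the I-POMDP setting the physical state is replaced by an interactive state, i.e., a physical state paired with a model of the other agent, and the update additionally marginalizes over the other agent's possible actions as predicted by that model:

```latex
% Standard POMDP Bayesian belief update (the interactive version extends it).
b'(s') \;=\; \eta \, O(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b(s),
\qquad
\eta \;=\; \Big( \sum_{s' \in S} O(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b(s) \Big)^{-1}
```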


2018 ◽  
Vol 26 (6) ◽  
pp. 285-307
Author(s):  
Giordano BS Ferreira ◽  
Matthias Scheutz

Accidents happen in nature, from simple incidents like bumping into obstacles, to erroneously arriving at the wrong location, to mating with an unintended partner. Whether accidents are problematic for an animal depends on their context, frequency, and severity. In this article, we investigate the question of how accidents affect the task performance of agents in an agent-based simulation model for a wide class of tasks called “multi-agent territory exploration” tasks (MATE). In MATE tasks, agents have to visit particular locations of varying quality in partially observable environments within a fixed time window. As such, agents have to balance the quality of the location with how much energy they are willing to expend reaching it. Arriving at the wrong location by accident typically reduces task performance. We model agents based on two location selection strategies that are hypothesized to be widely used in nature: best-of-n and min-threshold. Our results show that the two strategies lead to different accident rates and thus overall different levels of performance based on the degree of competition among agents, as well as the quality, density, visibility, and distribution of target locations in the environment. We also show that in some cases, individual accidents can be advantageous for both the individual and the whole group.
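The two selection strategies can be sketched as follows; the net-value scoring and parameters are assumptions for illustration, not the cost model used in the article's simulation:

```python
# Illustrative sketches of the best-of-n and min-threshold location selection strategies.

def best_of_n(locations, n, cost):
    """Sample the first n visible locations and pick the best quality-minus-cost."""
    candidates = locations[:n]
    return max(candidates, key=lambda loc: loc["quality"] - cost(loc), default=None)

def min_threshold(locations, threshold, cost):
    """Commit to the first location whose net value clears a fixed threshold."""
    for loc in locations:
        if loc["quality"] - cost(loc) >= threshold:
            return loc
    return None

# Toy example: quality in [0, 1], travel cost proportional to distance.
locs = [{"quality": 0.4, "dist": 2.0},
        {"quality": 0.9, "dist": 6.0},
        {"quality": 0.7, "dist": 1.0}]
cost = lambda loc: 0.05 * loc["dist"]
print(best_of_n(locs, n=3, cost=cost))       # highest net value among the first 3
print(min_threshold(locs, 0.5, cost))        # first location clearing the threshold
```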

