Towards Realizing Intelligent Coordinated Controllers for Multi-USV Systems Using Abstract Training Environments

2021 ◽  
Vol 9 (6) ◽  
pp. 560
Author(s):  
Sulemana Nantogma ◽  
Keyu Pan ◽  
Weilong Song ◽  
Renwei Luo ◽  
Yang Xu

Unmanned autonomous vehicles for various civilian and military applications have become a particularly interesting research area. Despite their many potential applications, a related technological challenge is realizing realistic coordinated autonomous control and decision making in complex multi-agent environments. Machine learning approaches have largely been employed in simplified simulations to acquire intelligent control systems in multi-agent settings. However, the complexity of the physical environment, unrealistic assumptions, and the lack of abstract physical environments derail the transition from simulation to real systems. This work presents a modular framework for the automated data acquisition, training, and evaluation of multiple unmanned surface vehicle controllers that facilitates prior knowledge integration and human-guided learning in a closed loop. To realize this, we first present a digital maritime environment of multiple unmanned surface vehicles that abstracts the real-world dynamics of our application domain. Then, a behavior-driven, artificial immune-inspired fuzzy classifier system approach that optimizes agents' behaviors and action selection in a multi-agent environment is presented. Evaluation scenarios of different combat missions demonstrate the performance of the system. Simulation results show that the resulting controllers achieved average winning rates between 52% and 98% in all test cases, indicating the effectiveness of the proposed approach and its feasibility for realizing adaptive controllers for efficient cooperative decision making in multiple unmanned systems. We believe that this system can facilitate the simulation, data acquisition, training, and evaluation of practical cooperative unmanned vehicle controllers in a closed loop.
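The core mechanism this abstract names, an immune-inspired classifier system that selects and reinforces behaviors, can be illustrated with a minimal sketch. All names and the toy USV state below are hypothetical, and the real system uses fuzzy rule matching rather than the crisp predicates shown here:

```python
# Hypothetical sketch of an immune-inspired classifier system:
# a population of condition->action rules ("antibodies") whose
# fitness ("affinity") is reinforced by mission reward.

class Rule:
    def __init__(self, condition, action):
        self.condition = condition  # predicate over the observed state
        self.action = action
        self.affinity = 1.0

def select_action(rules, state):
    """Pick the matching rule with the highest affinity."""
    matching = [r for r in rules if r.condition(state)]
    if not matching:
        return None
    return max(matching, key=lambda r: r.affinity)

def reinforce(rule, reward, rate=0.1):
    """Clonal-selection-style update: reward raises affinity."""
    rule.affinity += rate * reward

# Toy USV state: distance to the nearest threat (illustrative only).
rules = [
    Rule(lambda s: s["dist"] < 10, "evade"),
    Rule(lambda s: s["dist"] >= 10, "pursue"),
]

rule = select_action(rules, {"dist": 5})
reinforce(rule, reward=1.0)
print(rule.action, rule.affinity)  # evade 1.1
```

In the closed loop the paper describes, this reinforcement step would be driven by simulated mission outcomes, with human guidance shaping the rule population.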

2001 ◽  
Vol 136 (1-4) ◽  
pp. 215-239 ◽  
Author(s):  
Andrea Bonarini ◽  
Vito Trianni

2021 ◽  
Author(s):  
Arthur Campbell

An important task for organizations is establishing truthful communication between parties with differing interests. This task is made particularly challenging when the accuracy of the information is observed poorly or not at all. In these settings, incentive contracts based on the accuracy of information will not be very effective. This paper considers an alternative mechanism that provides incentives for truthful communication without requiring any signal of the accuracy of the information communicated. Rather, an expert sacrifices future participation in decision-making to influence the current period's decision in favour of their preferred project. This mechanism captures a notion often described as 'political capital', whereby an individual achieves their own preferred decision in the current period at the expense of being able to exert influence over future decisions ('spending political capital'). When the first-best is not possible in this setting, I show that experts hold more influence than under the first-best and that, in a multi-agent extension, a finite team size is optimal. Together, these results suggest that a small number of individuals hold excessive influence in organizations.
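The trade-off at the heart of "spending political capital" can be stated as a one-line decision rule. This is a purely illustrative sketch, not the paper's formal model; the function name and numbers are hypothetical:

```python
# Hypothetical one-shot sketch of the trade-off described above:
# the expert forces their preferred project today ("spending
# political capital") only when the current gain outweighs the
# forfeited value of participating in future decisions.

def spends_capital(current_gain, future_influence_value):
    """True if the expert spends capital on the current decision."""
    return current_gain > future_influence_value

print(spends_capital(1.0, 0.8))  # strong preference today -> True (spend)
print(spends_capital(0.5, 0.8))  # future influence worth more -> False (save)
```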


J ◽  
2021 ◽  
Vol 4 (2) ◽  
pp. 147-153
Author(s):  
Paula Morella ◽  
María Pilar Lambán ◽  
Jesús Antonio Royo ◽  
Juan Carlos Sánchez

Among the new technology trends that have emerged with Industry 4.0, Cyber-Physical Systems (CPS) and the Internet of Things (IoT) are crucial for real-time data acquisition. This data acquisition, together with its transformation into valuable information, is indispensable for the development of real-time indicators. Moreover, real-time indicators provide companies with a competitive advantage, since they improve calculation and speed up decision-making and failure detection. Our research highlights the advantages of real-time data acquisition for supply chains, developing indicators that would be impossible to achieve with traditional systems, improving the accuracy of existing ones, and enhancing real-time decision-making. Moreover, it brings out the importance of integrating Industry 4.0 technologies, in this case CPS and IoT, and establishes the main points of a future research agenda for this topic.
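A real-time indicator of the kind described, one that updates on every streamed reading instead of waiting for a batch report, can be sketched as a rolling window over IoT sensor values. The class name, window size, and cycle-time readings below are illustrative assumptions:

```python
from collections import deque

# Hypothetical sketch: a rolling real-time indicator computed from
# streamed CPS/IoT readings, e.g. average cycle time over the last
# N parts, available immediately after each reading arrives.

class RollingIndicator:
    def __init__(self, window=3):
        self.values = deque(maxlen=window)  # old readings drop off

    def update(self, reading):
        """Ingest one streamed reading and return the fresh KPI."""
        self.values.append(reading)
        return self.current()

    def current(self):
        return sum(self.values) / len(self.values)

kpi = RollingIndicator(window=3)
for cycle_time in [10.0, 12.0, 11.0, 9.0]:
    latest = kpi.update(cycle_time)
print(latest)  # mean of the last 3 readings: (12 + 11 + 9) / 3
```

Because the indicator is recomputed on arrival of each reading, failure detection (e.g. a cycle-time spike) can trigger as the data is produced rather than after end-of-shift aggregation.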


2021 ◽  
Vol 35 (2) ◽  
Author(s):  
Nicolas Bougie ◽  
Ryutaro Ichise

Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to the many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impracticable due to lack of state coverage or distribution mismatch, i.e., when the learner's goal deviates from the demonstrated behaviors. Besides, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations, querying the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing the sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the MuJoCo domain. Experimental results show that our method outperforms prior imitation learning approaches on most of the tasks in terms of exploration efficiency and average scores.
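The prioritization idea, query the demonstrator where expert and policy disagree most, reduces to ranking candidate goals by a disagreement score. The sketch below is a toy with scalar goals and linear "policies"; the real method uses learned neural policies and a query budget, and all names here are hypothetical:

```python
# Hypothetical sketch of active goal-driven demonstrations:
# the learner queries the expert only for the goals where the
# expert/policy disagreement (a proxy for uncertainty) is largest.

def disagreement(policy, expert, goal):
    """Distance between the policy's and the expert's action for a goal."""
    return abs(policy(goal) - expert(goal))

def prioritize_goals(goals, policy, expert, budget=2):
    """Spend the query budget on the most uncertain goals."""
    ranked = sorted(goals,
                    key=lambda g: disagreement(policy, expert, g),
                    reverse=True)
    return ranked[:budget]

expert = lambda g: g * 2.0   # demonstrator's action for goal g (toy)
policy = lambda g: g * 1.0   # current, undertrained policy (toy)
goals = [0.5, 3.0, 1.0, 2.0]

queried = prioritize_goals(goals, policy, expert)
print(queried)  # the two goals with the largest disagreement: [3.0, 2.0]
```

Goals on which the policy already matches the expert are never queried, which is how a very small demonstration set can be spent where it accelerates learning most.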


Symmetry ◽  
2020 ◽  
Vol 12 (4) ◽  
pp. 631
Author(s):  
Chunyang Hu

In this paper, deep reinforcement learning (DRL) and knowledge transfer are used to achieve effective control of the learning agent for confrontation in multi-agent systems. Firstly, a multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm with parameter sharing is proposed to achieve multi-agent confrontation decision-making. During training, the information of the other agents is introduced into the critic network to improve the confrontation strategy. The parameter sharing mechanism reduces the cost of experience storage. In the DDPG algorithm, we use four neural networks to generate real-time actions and Q-value functions, respectively, and use a momentum mechanism to optimize the training process and accelerate the convergence rate of the neural networks. Secondly, this paper introduces an auxiliary controller using a policy-based reinforcement learning (RL) method to provide assistant decision-making for the game agent. In addition, an effective reward function is used to help agents balance the losses of the enemy side and our side. Furthermore, this paper also uses a knowledge transfer method to extend the learning model to more complex scenes and improve the generalization of the proposed confrontation model. Two confrontation decision-making experiments are designed to verify the effectiveness of the proposed method. In a small-scale task scenario, the trained agent successfully learns to fight against competitors and achieves a good winning rate. For large-scale confrontation scenarios, the knowledge transfer method gradually improves the decision-making level of the learning agent.
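The two structural ideas in this abstract, one actor shared by all agents (parameter sharing) and a critic that conditions on every agent's observations and actions, can be sketched without any deep learning machinery. The classes below are hypothetical stand-ins: the real method uses neural networks trained with DDPG, whereas here the "actor" is a single scalar weight and the "critic" a fixed toy function:

```python
# Hypothetical sketch of the two ideas from the abstract: a shared
# actor (parameter sharing across agents) and a centralized critic
# that sees ALL agents' observations and actions.

class SharedActor:
    """One parameter set used by every agent (parameter sharing)."""
    def __init__(self, w=0.5):
        self.w = w

    def act(self, obs):
        return self.w * obs

class CentralizedCritic:
    """Q-value conditioned on the joint observations and actions."""
    def q_value(self, all_obs, all_actions):
        # Toy Q: penalize squared gaps between joint obs and actions.
        return -sum((o - a) ** 2 for o, a in zip(all_obs, all_actions))

actor = SharedActor(w=0.5)          # every agent reuses these weights
critic = CentralizedCritic()

observations = [1.0, 2.0, 3.0]      # one observation per agent
actions = [actor.act(o) for o in observations]
q = critic.q_value(observations, actions)
print(actions, q)  # [0.5, 1.0, 1.5] -3.5
```

Because the actor is shared, experience from any agent updates the same parameters, which is what reduces the cost of experience storage; the centralized critic is what lets each agent's update account for the others' behavior.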

