Multi-Agent Reinforcement Learning Based on K-Means Clustering in Multi-Robot Cooperative Systems

2011 ◽  
Vol 216 ◽  
pp. 75-80
Author(s):  
Chang An Liu ◽  
Fei Liu ◽  
Chun Yang Liu ◽  
Hua Wu

To address the curse of dimensionality in multi-agent reinforcement learning, this paper presents a learning method based on k-means clustering. In this method, the environmental state is represented by key state factors. State-space explosion is avoided by grouping states into clusters using k-means. Learning is sped up by assigning each new state to an existing cluster and reusing that cluster's strategy. Experimental results on multi-robot cooperation show that, compared with traditional Q-learning, the scheme improves both the team's learning ability and its cooperation efficiency.
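
The idea of learning over clusters rather than raw states can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`kmeans`, `nearest_cluster`, `q_update`), the squared Euclidean distance, and the values of `alpha` and `gamma` are all assumptions made for the example.

```python
import random

def nearest_cluster(state, centroids):
    """Return the index of the centroid closest to `state` (squared Euclidean)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda i: dist2(state, centroids[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means over tuples of state factors; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest_cluster(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def q_update(q, cluster, action, reward, next_cluster, alpha=0.1, gamma=0.9):
    """Standard Q-learning update, indexed by cluster instead of raw state."""
    best_next = max(q[next_cluster])
    q[cluster][action] += alpha * (reward + gamma * best_next - q[cluster][action])
```

With this layout the Q-table has one row per cluster, so its size depends on `k` rather than on the number of distinct raw states, and a newly observed state reuses the values already learned for its nearest cluster.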

2015 ◽  
Vol 787 ◽  
pp. 843-847
Author(s):  
Leo Raju ◽  
R.S. Milton ◽  
S. Sakthiyanandan

In this paper, two solar photovoltaic (PV) systems are considered: one in the department with a capacity of 100 kW and the other in the hostel with a capacity of 200 kW. Each has a battery and a load. The capital cost and energy savings of conventional methods are compared, and it is shown that dependence on grid energy is reduced when the solar micro-grid elements operate in a distributed environment. Within the smart grid framework, grid energy consumption is further reduced by optimal scheduling of the battery using reinforcement learning. Individual unit optimization is performed with a model-free reinforcement learning method, Q-learning, and is compared with distributed operation of the solar micro-grid using a multi-agent reinforcement learning method, Joint Q-learning. Energy planning is designed according to the predicted solar PV energy production and the observed load patterns of the department and the hostel. A simulation model was developed in Python.
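
A single-unit battery scheduler trained with tabular Q-learning might look roughly like the sketch below. Everything specific here is an assumption for illustration and is not taken from the paper: the three actions, the integer energy units, the `capacity` and `rate` values, the `(hour, battery)` state, and the reward defined as the negative of grid import.

```python
import random

def step(battery, pv, load, action, capacity=4, rate=1):
    """Apply one scheduling action; return (new_battery, grid_import).

    action: 0 = idle, 1 = charge from surplus PV, 2 = discharge to the load.
    """
    surplus = pv - load
    if action == 1 and surplus > 0:
        charge = min(rate, surplus, capacity - battery)
        battery += charge
        surplus -= charge
    elif action == 2 and surplus < 0:
        discharge = min(rate, -surplus, battery)
        battery -= discharge
        surplus += discharge
    # Any remaining deficit is drawn from the grid.
    return battery, max(0, -surplus)

def train(pv_profile, load_profile, episodes=200, alpha=0.1, gamma=0.95,
          eps=0.1, seed=0):
    """Tabular Q-learning over (hour, battery) states with epsilon-greedy actions."""
    rng = random.Random(seed)
    q = {}
    hours = len(pv_profile)
    for _ in range(episodes):
        battery = 0
        for h in range(hours):
            values = q.setdefault((h, battery), [0.0, 0.0, 0.0])
            if rng.random() < eps:
                a = rng.randrange(3)
            else:
                a = values.index(max(values))
            battery, grid = step(battery, pv_profile[h], load_profile[h], a)
            nxt = q.setdefault((h + 1, battery), [0.0, 0.0, 0.0])
            future = max(nxt) if h + 1 < hours else 0.0
            # Reward is the negative grid import for this hour.
            values[a] += alpha * (-grid + gamma * future - values[a])
    return q
```

A joint multi-agent variant would index the table by the combined state and actions of both units; that extension is not shown here.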


2012 ◽  
Vol 588-589 ◽  
pp. 1515-1518
Author(s):  
Yong Song ◽  
Bing Liu ◽  
Yi Bin Li

Reinforcement learning for multi-robot systems can become very slow as the number of robots increases, because the state space grows exponentially. A sequential Q-learning method based on knowledge sharing is presented. The rule repository of robot behaviors is first initialized in the process of reinforcement learning. Mobile robots obtain the present environmental state through sensors. The state is then matched to determine whether the relevant behavior rule has been stored in the database. If the rule is present, an action is chosen in accordance with the knowledge and the rules, and the matching weight is refined. Otherwise, the new rule is added to the database. The robots learn according to a given sequence and share the behavior database. We evaluate the algorithm on a multi-robot following-surrounding behavior and find that the improved algorithm effectively accelerates convergence.
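
The match, select, refine, or add cycle described above can be sketched with a shared repository object. This is only an illustration under stated assumptions: the class name `RuleRepository`, the use of a Q-learning style update as the weight refinement, and the values of `alpha` and `gamma` are hypothetical and not taken from the paper.

```python
class RuleRepository:
    """Behavior rules shared by all robots: state -> {action: weight}."""

    def __init__(self, actions):
        self.actions = list(actions)
        self.rules = {}

    def match(self, state):
        """Return the stored rule for `state`, or None if it is unknown."""
        return self.rules.get(state)

    def select(self, state):
        """Choose the highest-weight action, adding a new rule if none matches."""
        rule = self.match(state)
        if rule is None:
            rule = {a: 0.0 for a in self.actions}
            self.rules[state] = rule
        return max(rule, key=rule.get)

    def refine(self, state, action, reward, next_state, alpha=0.5, gamma=0.9):
        """Refine the matched weight with a Q-learning style update."""
        rule = self.rules.setdefault(state, {a: 0.0 for a in self.actions})
        nxt = self.rules.get(next_state)
        best_next = max(nxt.values()) if nxt else 0.0
        rule[action] += alpha * (reward + gamma * best_next - rule[action])
```

Sequential learning then amounts to letting the robots take turns calling `select` and `refine` on the same `RuleRepository` instance, so a rule learned by one robot is immediately available to the next.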


2012 ◽  
Vol 566 ◽  
pp. 572-579
Author(s):  
Abdolkarim Niazi ◽  
Norizah Redzuan ◽  
Raja Ishak Raja Hamzah ◽  
Sara Esfandiari

In this paper, a new algorithm based on case-based reasoning and reinforcement learning (RL) is proposed to increase the convergence rate of reinforcement learning algorithms. RL algorithms are useful for solving a wide variety of decision problems in which no model is available and a correct decision must be made in every state of the system, such as multi-agent systems, artificial control systems, robotics, and tool condition monitoring. The proposed method investigates how to improve action selection in RL algorithms: a combined model using case-based reasoning and a new optimized function is used to select the action, which led to an improvement in algorithms based on Q-learning. The algorithm was applied to cooperative Markov games, one of the Markov-based models of multi-agent systems. The experimental results indicate that the proposed algorithm performs better than existing algorithms in terms of speed and accuracy of reaching the optimal policy.


Author(s):  
Chao Yu ◽  
Yinzhao Dong ◽  
Yangning Li ◽  
Yatong Chen

2013 ◽  
Vol 823 ◽  
pp. 321-325
Author(s):  
Lu Jin ◽  
Yue Quan Yang ◽  
Chun Bo Ni ◽  
Zhi Qiang Cao ◽  
Yi Fei Kong

As the number of robots grows, information interaction in a multi-robot system becomes more sophisticated and more important in a community perception network environment. A community information sharing mechanism is proposed that exploits and fuses the learning information of robots in a perception community, together with update rules for the community Q-value table. Moreover, to account for delays in the transmission of learning information, an improved Q-learning method based on homogeneous delays is presented to improve robot learning efficiency over the community perception network. Finally, experiments demonstrate the effectiveness of the proposed scheme.
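
One way to picture a community Q-value table with a homogeneous transmission delay is the sketch below. It is an assumption-laden illustration, not the paper's update rule: the class name `CommunityQTable`, the tick-based clock, and the fusion rule (keep the largest reported value per state-action pair) are all hypothetical choices.

```python
from collections import deque

class CommunityQTable:
    """Shared Q-values where every published update arrives after `delay` ticks."""

    def __init__(self, delay):
        self.delay = delay
        self.q = {}                # (state, action) -> fused value
        self.in_flight = deque()   # (arrival_tick, state, action, value)
        self.tick = 0

    def publish(self, state, action, value):
        """A robot sends one learned Q-value to the community."""
        self.in_flight.append((self.tick + self.delay, state, action, value))

    def advance(self):
        """Move the clock forward one tick and fuse every update that has arrived."""
        self.tick += 1
        while self.in_flight and self.in_flight[0][0] <= self.tick:
            _, state, action, value = self.in_flight.popleft()
            key = (state, action)
            # Assumed fusion rule: keep the largest value reported so far.
            self.q[key] = max(self.q.get(key, value), value)
```

Because the delay is the same for every robot, updates arrive in the order they were published, which is why a simple FIFO queue is enough here.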

