One to Any: Distributed Conflict Resolution with Deep Multi-Agent Reinforcement Learning and Long Short-Term Memory

2021 ◽  
Author(s):  
Marc W. Brittain ◽  
Peng Wei
2019 ◽  
Vol 1 (2) ◽  
pp. 74-84
Author(s):  
Evan Kusuma Susanto ◽  
Yosi Kristian

Asynchronous Advantage Actor-Critic (A3C) is a deep reinforcement learning algorithm developed by Google DeepMind. It can be used to build an artificial intelligence architecture that masters many different kinds of games through trial and error, learning from the game's screen display and the score obtained as a result of its actions, without human intervention. An A3C network consists of a Convolutional Neural Network (CNN) at the front, a Long Short-Term Memory (LSTM) network in the middle, and an Actor-Critic network at the back. The CNN summarizes the screen output image by extracting the important features present on the screen. The LSTM remembers previous game states. The Actor-Critic network determines the best action to take when faced with a given condition. In the experiments conducted, the method proved fairly effective and could beat novice players in the 5 games used as test cases.
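The Actor-Critic stage described above can be illustrated with a minimal sketch. It assumes the CNN and LSTM stages have already produced, for each time step, a list of action logits and a scalar state-value estimate; the function names (`softmax`, `discounted_returns`, `actor_critic_losses`) and the default discount factor are hypothetical choices for illustration, not taken from the paper. The entropy bonus and the asynchronous multi-worker updates commonly used with A3C are omitted.

```python
import math


def softmax(logits):
    """Turn raw action scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """n-step returns R_t = r_t + gamma * R_{t+1}, seeded with a bootstrap value."""
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def actor_critic_losses(logits_seq, actions, values, rewards,
                        bootstrap_value, gamma=0.99):
    """Return (policy_loss, value_loss) for one rollout.

    The advantage (return minus the critic's value estimate) weights the
    log-probability of the action that was actually taken.
    """
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    policy_loss = 0.0
    value_loss = 0.0
    for logits, action, value, ret in zip(logits_seq, actions, values, returns):
        advantage = ret - value
        log_prob = math.log(softmax(logits)[action])
        policy_loss += -log_prob * advantage   # actor term
        value_loss += 0.5 * advantage ** 2     # critic term
    return policy_loss, value_loss
```

In a full training loop these two losses would be differentiated with respect to the network weights by an automatic differentiation library; the sketch only shows how the scalar quantities are formed.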


2020 ◽  
Vol 271 ◽  
pp. 114945
Author(s):  
Xiangyu Kong ◽  
Deqian Kong ◽  
Jingtao Yao ◽  
Linquan Bai ◽  
Jie Xiao

2021 ◽  
Vol 9 ◽  
Author(s):  
R. Lakshmana Kumar ◽  
Firoz Khan ◽  
Sadia Din ◽  
Shahab S. Band ◽  
Amir Mosavi ◽  
...  

Detection and prediction of the novel Coronavirus present new challenges for the medical research community because of its spread across the globe. Methods driven by Artificial Intelligence can help predict specific parameters, hazards, and outcomes of such a pandemic. Recently, deep learning-based approaches have offered a novel opportunity to address various difficulties in prediction. In this work, two learning algorithms, namely deep learning and reinforcement learning, were developed to forecast COVID-19. This article constructs a model using Recurrent Neural Networks (RNN), particularly the Modified Long Short-Term Memory (MLSTM) model, to forecast the count of newly affected individuals, losses, and cures in the following few days. This study also suggests deep reinforcement learning to optimize COVID-19's predictive outcome based on symptoms. Real-world data was used to analyze the success of the suggested system. The findings show that the established approach yields promising prognostic outcomes for the current COVID-19 pandemic and outperforms the Long Short-Term Memory (LSTM) model and the machine learning model Logistic Regression (LR) in terms of error rate.


2021 ◽  
Author(s):  
Jiachen Yao ◽  
Baochun Lu ◽  
Junli Zhang

Tool wear and faults affect the quality of the machined workpiece and disrupt the continuity of manufacturing. Accurate prediction of remaining useful life (RUL) is important for guaranteeing processing quality and improving the productivity of automated systems. At present, most methods for tool RUL prediction are trained on historical fault data. However, when studying new types of tools or machining high-value parts, fault datasets are difficult to acquire, which makes RUL prediction under limited fault data a challenge. To overcome the shortcomings of these prediction methods, this paper presents a deep transfer reinforcement learning (DTRL) network based on a long short-term memory (LSTM) network. Local features are extracted from consecutive sensor data to track the tool states, and the size of the trained network can be dynamically adjusted by controlling the time sequence length. In the DTRL network, the LSTM network is then employed to construct the value function approximation, so that temporal information is processed smoothly and long-term dependencies are mined. On this basis, novel strategies for Q-function update and transfer are presented to transfer the DRL network trained on historical fault data to a new tool for RUL prediction. Finally, tool wear experiments are performed to validate the effectiveness of the DTRL model. The prediction results demonstrate that the proposed method achieves high accuracy and generalizes to similar tools and cutting conditions.
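The abstract does not spell out its Q-function update and transfer strategies, so the following is only a generic sketch of the underlying idea, not the paper's method. It swaps in the standard tabular Q-learning update for the LSTM-based value function, and models transfer as initializing a new tool's Q-values from a copy of those learned on a source tool. The names `q_update` and `transfer_q`, the state and action labels, and the default learning rate and discount factor are all hypothetical.

```python
import copy


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a').

    `q` is a dict of dicts: q[state][action] -> estimated value.
    Returns the updated Q(s, a).
    """
    best_next = max(q[next_state].values())
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def transfer_q(source_q):
    """Initialize a target-task Q-table from a source-task Q-table.

    A deep copy keeps later fine-tuning on the new tool from altering
    the values learned on the source tool.
    """
    return copy.deepcopy(source_q)


# Hypothetical usage: learn on a source tool, then fine-tune on a new one.
source_q = {
    "worn": {"keep": 0.0, "replace": 0.0},
    "fresh": {"keep": 1.0, "replace": 0.0},
}
q_update(source_q, "worn", "replace", reward=1.0, next_state="fresh",
         alpha=0.5, gamma=0.5)
target_q = transfer_q(source_q)
```

With a neural approximator the same idea would typically amount to copying the trained weights into the new network and continuing training on the limited data available for the new tool.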


2022 ◽  
Vol 33 (1) ◽  
pp. 1-19
Author(s):  
S. Lakshmi Durga ◽  
Ch. Rajeshwari ◽  
Khalid Hamed Allehaibi ◽  
Nishu Gupta ◽  
Nasser Nammas Albaqami ◽  
...  
