Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention

Author(s):  
Abhishek Gupta ◽  
Justin Yu ◽  
Tony Z. Zhao ◽  
Vikash Kumar ◽  
Aaron Rovinsky ◽  
...  
Author(s):  
Abdelghafour Harraz ◽  
Mostapha Zbakh

Artificial intelligence makes it possible to build engines that explore and learn their environments and thereby derive policies for controlling them in real time with no human intervention. Through reinforcement learning techniques, using frameworks such as temporal-difference learning, State-Action-Reward-State-Action (SARSA), and Q-learning, to name a few, it can be applied to any system that can be modeled as a Markov decision process. This opens the door to applying reinforcement learning to cloud load balancing, so that load can be dispatched dynamically to a given cloud system. The authors describe different techniques that can be used to implement a reinforcement-learning-based engine in a cloud system.
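To make the connection concrete, the Q-learning framework mentioned above can be sketched for a load-dispatching setting. Everything below (the discretized load-level states, the decay model, the negative-load reward) is a toy illustration chosen for this sketch, not the authors' actual formulation:

```python
import random

# Toy tabular Q-learning for dispatching work to one of several servers.
# States are coarse total-load levels; the reward favors sending work to
# lightly loaded servers. All modeling choices here are illustrative.

N_SERVERS = 3          # actions: which server receives the next request
N_LEVELS = 4           # states: discretized total load of the system
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

Q = [[0.0] * N_SERVERS for _ in range(N_LEVELS)]
loads = [0.0] * N_SERVERS

def step(action):
    """Assign one unit of work to server `action`; loads decay over time."""
    loads[action] += 1.0
    for i in range(N_SERVERS):
        loads[i] *= 0.9                       # work completes over time
    reward = -loads[action]                   # penalize loading a busy server
    state = min(int(sum(loads)), N_LEVELS - 1)
    return state, reward

state = 0
for _ in range(5000):
    # epsilon-greedy action selection
    if random.random() < EPS:
        action = random.randrange(N_SERVERS)
    else:
        action = max(range(N_SERVERS), key=lambda a: Q[state][a])
    next_state, reward = step(action)
    # Q-learning temporal-difference update
    Q[state][action] += ALPHA * (
        reward + GAMMA * max(Q[next_state]) - Q[state][action])
    state = next_state
```

The same loop becomes SARSA by replacing `max(Q[next_state])` with the value of the action actually selected in the next state; a real cloud dispatcher would replace the toy `step` function with observed system metrics.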


2020 ◽  
Vol 34 (03) ◽  
pp. 2561-2568
Author(s):  
Morgane Ayle ◽  
Jimmy Tekli ◽  
Julia El-Zini ◽  
Boulos El-Asmar ◽  
Mariette Awad

Research has shown that deep neural networks are able to help and assist human workers throughout the industrial sector via different computer vision applications. However, such data-driven learning approaches require a very large number of labeled training images in order to generalize well and achieve accuracies that meet industry standards. Gathering and labeling large amounts of images is both expensive and time consuming, especially for industrial use-cases. In this work, we introduce BAR (Bounding-box Automated Refinement), a reinforcement learning agent that learns to correct inaccurate bounding-boxes that are weakly generated by certain detection methods, or wrongly annotated by a human, using either an offline training method with Deep Reinforcement Learning (BAR-DRL) or an online one using Contextual Bandits (BAR-CB). Our agent limits human intervention to correcting or verifying a subset of bounding-boxes instead of re-drawing new ones. Results on a car industry-related dataset and on the PASCAL VOC dataset show a consistent increase of up to 0.28 in the Intersection-over-Union of bounding-boxes with their desired ground-truths, while saving 30%-82% of human intervention time in either correcting or re-drawing inaccurate proposals.
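The Intersection-over-Union metric that BAR's improvement is reported in can be computed as follows; the `(x1, y1, x2, y2)` corner convention is an assumption for this sketch, not something specified in the abstract:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A refinement agent in this setting would be rewarded for actions (nudging box edges) that increase `iou(proposal, ground_truth)`; identical boxes score 1.0 and disjoint boxes score 0.0.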


Author(s):  
Naaima Suroor ◽  
Imran Hussain ◽  
Aqeel Khalique ◽  
Tabrej Ahamad Khan

Reinforcement learning is a flourishing machine learning concept that has greatly influenced how robots are designed and taught to solve problems without human intervention. Robotics is not an alien discipline anymore, and we have several great innovations in this field that promise to impact lives for the better. However, humanoid robots are still a baffling concept for scientists, although we have managed to develop a few great inventions which look, talk, work, and behave very similarly to humans. But can these machines actually exhibit the cognitive abilities of judgment, problem-solving, and perception as well as humans do? In this article, the authors analyze the probable impact and aspects of robots and their potential to behave like humans in every possible way through reinforcement learning techniques. The paper also discusses the gap between 'natural' and 'artificial' knowledge.


Author(s):  
Aravind Rajeswaran ◽  
Vikash Kumar ◽  
Abhishek Gupta ◽  
Giulia Vezzani ◽  
John Schulman ◽  
...  

10.29007/g7bg ◽  
2019 ◽  
Author(s):  
João Ribeiro ◽  
Francisco Melo ◽  
João Dias

In this paper we investigate two hypotheses regarding the use of deep reinforcement learning in multiple tasks. The first hypothesis is driven by the question of whether a deep reinforcement learning algorithm, trained on two similar tasks, is able to outperform two single-task, individually trained algorithms by more efficiently learning a new, similar task that none of the three algorithms has encountered before. The second hypothesis is driven by the question of whether the same multi-task deep RL algorithm, trained on two similar tasks and augmented with elastic weight consolidation (EWC), is able to retain performance on the new task similar to that of an algorithm without EWC, whilst being able to overcome catastrophic forgetting in the two previous tasks. We show that a multi-task Asynchronous Advantage Actor-Critic (GA3C) algorithm, trained on Space Invaders and Demon Attack, is in fact able to outperform two single-task GA3C versions, each trained individually on one task, when evaluated on a new, third task—namely, Phoenix. We also show that, when training two trained multi-task GA3C algorithms on the third task, if one is augmented with EWC, it is not only able to achieve similar performance on the new task, but also capable of overcoming a substantial amount of catastrophic forgetting on the two previous tasks.
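The EWC regularizer that counters catastrophic forgetting in the abstract above can be sketched in a few lines. After training on a previous task, the parameters and a Fisher-information estimate are stored, and new-task training is anchored toward weights that mattered for the old task. The penalty strength `lam` below is an illustrative hyperparameter, not a value from the paper:

```python
import numpy as np

def ewc_penalty(theta, theta_old, fisher, lam=0.4):
    """L_EWC = (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2

    theta:     current parameter vector (being trained on the new task)
    theta_old: parameters saved after the previous task
    fisher:    per-parameter Fisher-information estimate (importance weights)
    """
    return 0.5 * lam * np.sum(fisher * (theta - theta_old) ** 2)

def total_loss(new_task_loss, theta, theta_old, fisher, lam=0.4):
    # New-task objective plus the quadratic anchor toward the old weights:
    # parameters with high Fisher information are expensive to move, which
    # is what mitigates catastrophic forgetting.
    return new_task_loss + ewc_penalty(theta, theta_old, fisher, lam)
```

The penalty vanishes when the parameters have not moved, and grows quadratically with displacement weighted by each parameter's estimated importance to the old task.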

