A Visual Grasping Strategy for Improving Assembly Efficiency Based on Deep Reinforcement Learning

In compliant peg-in-hole assembly, the number of attitude-alignment adjustments fluctuates because disturbing moments cause the contact force signal to fluctuate. These fluctuations are difficult to measure or define accurately because of the many uncertain factors in the working environment. Gravitational disturbing moments and inertia moments contribute significantly to them, and changes in the mass and length of the peg have a crucial influence. In this paper, a visual grasping strategy based on deep reinforcement learning is proposed for peg-in-hole assembly. First, the disturbing moments of assembly are analyzed to identify the factors behind the fluctuation of assembly time. Then, a visual grasping strategy is designed that establishes a mapping between the grasping position and the assembly time in order to improve assembly efficiency. Finally, a robotic assembly system was built in V-REP to verify the effectiveness of the proposed method; during grasping training, the robot completes the training independently, without human intervention or manual labeling. The simulation results show that this method can improve assembly efficiency by 13.83%. The proposed method also remains effective in improving assembly efficiency when the mass and the length of the peg change.

Digital Tools as an Enabler for Educational and Training Processes: The Case Study of REFUGEEClassAssistance4Teachers Project

Proceedings ◽

10.3390/proceedings2021074014 ◽

2021 ◽

Vol 74 (1) ◽

pp. 14

Author(s):

Ourania Areta ◽

Karel Van Isacker

Keyword(s):

Social Interactions ◽

Education And Training ◽

Working Environment ◽

Digital Tools ◽

Training Process ◽

Training Activities ◽

And Training

Digitalization has transformed all aspects of life, from social interactions to the working environment and education, a change that accelerated with the emergence of COVID-19. The same holds for education and training activities, where the use of digital tools had been gradually advancing and then moved almost entirely online because of the virus. This brought forth the need to discuss further the applications, benefits, and challenges of digital tools within the framework of the education and training process, and the need to study examples of successful applications. This study aims to support both these requirements by presenting the case study of the REFUGEEClassAssistance4Teachers project and its outcomes.

Diversity oriented Deep Reinforcement Learning for targeted molecule generation

Journal of Cheminformatics ◽

10.1186/s13321-021-00498-z ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Tiago Pereira ◽

Maryam Abbasi ◽

Bernardete Ribeiro ◽

Joel P. Arrais

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Reinforcement Learning ◽

Deep Neural Networks ◽

Chemical Space ◽

Biological Properties ◽

Training Process ◽

Training Strategy ◽

Inhibitory Power ◽

Exploratory Strategy

Abstract: In this work, we explore the potential of deep learning to streamline the process of identifying new potential drugs through the computational generation of molecules with interesting biological properties. Two deep neural networks compose our targeted generation framework: the Generator, which is trained to learn the building rules of valid molecules using SMILES string notation, and the Predictor, which evaluates the newly generated compounds by predicting their affinity for the desired target. Then, the Generator is optimized through Reinforcement Learning to produce molecules with bespoke properties. The innovation of this approach is the exploratory strategy applied during the reinforcement training process, which seeks to add novelty to the generated compounds. This training strategy employs two Generators interchangeably to sample new SMILES: the initially trained model, which remains fixed, and a copy of it that is updated during training to uncover the most promising molecules. The evolution of the reward assigned by the Predictor determines how often each one is employed to select the next token of the molecule. This strategy establishes a compromise between the need to acquire more information about the chemical space and the need to sample new molecules using the experience gained so far. To demonstrate the effectiveness of the method, the Generator is trained to design molecules with an optimized partition coefficient and high inhibitory power against the adenosine $A_{2A}$ and $\kappa$ opioid receptors. The results reveal that the model can effectively steer the newly generated molecules in the desired direction. More importantly, it was possible to find promising sets of unique and diverse molecules, which was the main purpose of the newly implemented strategy.
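
The token-level alternation between the fixed and the updated Generator can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the abstract does not give the rule that links the reward to the mixing probability, so `adapt_p_updated` is a hypothetical stand-in, and the two stub generators simply emit constant tokens in place of trained networks.

```python
import random

def sample_smiles(gen_fixed, gen_updated, p_updated, rng, max_len=10, end_token="\n"):
    """Build one string token by token, picking a generator for every token.

    gen_fixed / gen_updated: callables (tokens_so_far, rng) -> next token.
    p_updated: probability of asking the updated generator for the next token.
    """
    tokens = []
    for _ in range(max_len):
        generator = gen_updated if rng.random() < p_updated else gen_fixed
        token = generator(tokens, rng)
        if token == end_token:
            break
        tokens.append(token)
    return "".join(tokens)

def adapt_p_updated(p_updated, prev_mean_reward, mean_reward, step=0.05):
    """Hypothetical rule (not from the paper): rely more on the updated
    generator while the mean reward improves, otherwise shift back toward
    the fixed generator to keep exploring."""
    if mean_reward > prev_mean_reward:
        return min(1.0, p_updated + step)
    return max(0.0, p_updated - step)

# Stub generators standing in for the trained networks (assumption).
def fixed_stub(tokens, rng):
    return "C"

def updated_stub(tokens, rng):
    return "N"

if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_smiles(fixed_stub, updated_stub, 0.5, rng))
```

With `p_updated` at 0 every token comes from the fixed model and with 1 every token comes from the updated copy; values in between mix the two, which is the compromise between exploration and exploitation the abstract describes.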

Robot obstacle avoidance system using deep reinforcement learning

Industrial Robot: the international journal of robotics research and application ◽

10.1108/ir-06-2021-0127 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Xiaojun Zhu ◽

Yinghao Liang ◽

Hanxu Sun ◽

Xueqian Wang ◽

Bin Ren

Keyword(s):

Reinforcement Learning ◽

Collision Avoidance ◽

Obstacle Avoidance ◽

Learning Algorithm ◽

Optimal Path ◽

Environmental Parameters ◽

Working Environment ◽

Content Type ◽

Practical Applications ◽

Human Operators

Purpose: Most manufacturing plants take the easy route of completely separating human operators from robots to prevent accidents, but this dramatically reduces the overall quality and speed expected from human–robot collaboration. Ensuring human safety once a person has entered a robot’s workspace is not easy, and the unstructured nature of those working environments makes it even harder. The purpose of this paper is to propose a real-time robot collision avoidance method to alleviate this problem.

Design/methodology/approach: In this paper, a model is trained to learn direct control commands from raw depth images through a self-supervised reinforcement learning algorithm. To mitigate sample inefficiency and safety risks during initial training, a virtual reality platform is used to simulate a natural working environment and generate obstacle avoidance data for training. To ensure a smooth transfer to a real robot, the automatic domain randomization technique is used to generate randomly distributed environmental parameters through the obstacle avoidance simulation of virtual robots in the virtual environment, contributing to better performance in the natural environment.

Findings: The method has been tested both in simulation and with a real UR3 robot in several practical applications. The results of this paper indicate that the proposed approach can effectively make the robot safety-aware and teach it to divert its trajectory to avoid accidents with humans within the workspace.

Research limitations/implications: The method has been tested both in simulation and with a real UR3 robot in several practical applications. The results indicate that the proposed approach can effectively make the robot aware of safety and teach it to change its trajectory to avoid accidents with persons within the workspace.

Originality/value: This paper provides a novel collision avoidance framework that allows robots to work alongside human operators in unstructured and complex environments. The method uses end-to-end policy training to directly extract the optimal path from the visual inputs for the scene.

Study on Characteristics Location of Pantograph–Catenary Contact Force Signal Based on Wavelet Transform

IEEE Transactions on Instrumentation and Measurement ◽

10.1109/tim.2018.2851422 ◽

2019 ◽

Vol 68 (2) ◽

pp. 402-411 ◽

Cited By ~ 8

Author(s):

Jian Zhang ◽

Wenzheng Liu ◽

Zongfang Zhang

Keyword(s):

Wavelet Transform ◽

Contact Force ◽

Force Signal

Cloud Load Balancing and Reinforcement Learning

Advances in Business Information Systems and Analytics - Cloud Computing Technologies for Green Enterprises ◽

10.4018/978-1-5225-3038-1.ch011 ◽

2018 ◽

pp. 266-291

Author(s):

Abdelghafour Harraz ◽

Mostapha Zbakh

Keyword(s):

Artificial Intelligence ◽

Reinforcement Learning ◽

Load Balancing ◽

Decision Process ◽

Cloud System ◽

Human Intervention ◽

Q Learning ◽

State Action ◽

Learning Techniques ◽

Markov Decision

Artificial Intelligence makes it possible to build engines that explore and learn environments and then derive policies to control them in real time with no human intervention. Through its Reinforcement Learning techniques, using frameworks such as temporal differences, State-Action-Reward-State-Action (SARSA), and Q Learning, to name a few, it can be applied to systems that can be perceived as a Markov Decision Process. This opens the door to applying Reinforcement Learning to Cloud Load Balancing, so that load can be dispatched dynamically to a given Cloud System. The authors describe different techniques that can be used to implement a Reinforcement Learning based engine in a cloud system.
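
To make the difference between the Q Learning and SARSA frameworks concrete, the following Python sketch shows both tabular updates next to a toy dispatcher. The two-server environment, its reward (the negative of the highest load), and every name and parameter value are assumptions chosen for illustration; they are not taken from the chapter.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # index of the server that receives the next job (toy setup)

def q_learning_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action available in next_state."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])

def sarsa_update(q, state, action, reward, next_state, next_action, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken in next_state."""
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])

def step(state, action, rng):
    """Toy environment: queue one job on the chosen server (loads capped at 3),
    then each busy server finishes one job with probability 0.5."""
    loads = list(state)
    loads[action] = min(loads[action] + 1, 3)
    for i in range(len(loads)):
        if loads[i] > 0 and rng.random() < 0.5:
            loads[i] -= 1
    reward = -max(loads)  # penalise the most heavily loaded server
    return tuple(loads), reward

def train(steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy Q-learning over the toy dispatcher; returns the Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    state = (0, 0)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        next_state, reward = step(state, action, rng)
        q_learning_update(q, state, action, reward, next_state, ACTIONS)
        state = next_state
    return q
```

The only difference between the two updates is the bootstrap term: Q Learning uses the maximum over next actions, while SARSA uses the value of the action the policy actually selects, which makes it sensitive to the exploration policy.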

A technique for measuring contact force distribution in minimally invasive surgical procedures

Proceedings of the Institution of Mechanical Engineers Part H Journal of Engineering in Medicine ◽

10.1243/0954411971534430 ◽

1997 ◽

Vol 211 (4) ◽

pp. 309-316 ◽

Cited By ~ 17

Author(s):

P N Brett ◽

R S W Stone

Keyword(s):

Contact Force ◽

Low Cost ◽

Working Environment ◽

Tactile Sense ◽

Sensory Data ◽

New Methods ◽

Minimally Invasive Surgical ◽

Software Algorithms ◽

Contact Force Distribution ◽

Sense Of Touch

This paper investigates new methods for measuring forces and tactile sense as a contribution towards relaying the sense of touch to the surgeon. The approach used is to determine a distribution of contact force using a small number of sensory outputs to detect the bending of a surface of known behaviour. Software algorithms have been produced to interpret the contacting force from sensory data, and have achieved a bandwidth of 30 Hz and an accuracy of 2 per cent. The sensor construction is of sufficiently low cost to produce a disposable unit and uses materials that are compatible with the invasive working environment.

Data-Driven Reinforcement-Learning-Based Automatic Bucket-Filling for Wheel Loaders

Applied Sciences ◽

10.3390/app11199191 ◽

2021 ◽

Vol 11 (19) ◽

pp. 9191

Author(s):

Jianfei Huang ◽

Dewen Kong ◽

Guangzong Gao ◽

Xinchun Cheng ◽

Jinshi Chen

Keyword(s):

Reinforcement Learning ◽

Statistical Model ◽

Physical Model ◽

Working Environment ◽

Data Driven ◽

Automated Systems ◽

Learning Capability ◽

Q Learning ◽

System A ◽

Wheel Loaders

Automation of bucket-filling is of crucial significance to the fully automated systems for wheel loaders. Most previous works are based on a physical model, which cannot adapt to the changeable and complicated working environment. Thus, in this paper, a data-driven reinforcement-learning (RL)-based approach is proposed to achieve automatic bucket-filling. An automatic bucket-filling algorithm based on Q-learning is developed to enhance the adaptability of the autonomous scooping system. A nonlinear, non-parametric statistical model is also built to approximate the real working environment using the actual data obtained from tests. The statistical model is used for predicting the state of wheel loaders in the bucket-filling process. Then, the proposed algorithm is trained on the prediction model. Finally, the results of the training confirm that the proposed algorithm has good performance in adaptability, convergence, and fuel consumption in the absence of a physical model. The results also demonstrate the transfer learning capability of the proposed approach. The proposed method can be applied to different machine-pile environments.

A Functionally Separate Autoencoder

10.36227/techrxiv.12045534.v4 ◽

2021 ◽

Author(s):

Jinxin Wei

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Supervised Learning ◽

Language Translation ◽

Inverse Function ◽

Training Process ◽

Modulation And Demodulation ◽

Round Function ◽

Encryption And Decryption ◽

New Knowledge

Inspired by the way children learn, an autoencoder that can be split into two parts is designed. The two parts can work well separately. The top half is an abstract network, which is trained by supervised learning and can be used for classification and regression. The bottom half is a concrete network, which is implemented as an inverse function and trained by self-supervised learning. It can generate the input of the abstract network from a concept or label. Tests with the MNIST dataset and a convolutional neural network show that the network achieves its intended functionality. A round function is added between the abstract network and the concrete network in order to obtain a representative generation for each class. The generation ability can be increased by adding jump connections and negative feedback. The characteristics of the network are then discussed. The input can be changed to any form by the encoder and then changed back by the decoder through the inverse function. The concrete network can be seen as memory stored in the parameters. Forgetting (lethe) occurs when new knowledge is input and the training process changes the parameters. Finally, the applications of the network are discussed. The network can be used for logic generation through deep reinforcement learning. It can also be used for language translation, zipping and unzipping, encryption and decryption, compiling and decompiling, and modulation and demodulation.

Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention

10.1109/icra48506.2021.9561384 ◽

2021 ◽

Author(s):

Abhishek Gupta ◽

Justin Yu ◽

Tony Z. Zhao ◽

Vikash Kumar ◽

Aaron Rovinsky ◽

...

Keyword(s):

Reinforcement Learning ◽

Human Intervention ◽

Dexterous Manipulation ◽

Task Learning

A Study on Factors Influencing the Job Satisfaction of Bank Employees in Nepal (With special reference to Kathmandu, Lalitpur, and Bhaktapur District)

NCC Journal ◽

10.3126/nccj.v4i1.24728 ◽

2019 ◽

Vol 4 (1) ◽

pp. 9-15 ◽

Cited By ~ 1

Author(s):

Bashu Neupane

Keyword(s):

Job Satisfaction ◽

Sampling Method ◽

Working Environment ◽

Positive Feeling ◽

Research Designs ◽

Behavioral Tendencies ◽

Analytical Research ◽

Bank Employees ◽

Convenience Sampling ◽

Structured Questionnaire

Job satisfaction means the positive feeling or attitude that employees have towards their job, which acts as a motivation to work. It is a combination of emotion, belief, feeling, sentiment, and other allied behavioral tendencies. This study focuses on analyzing the job satisfaction of banking employees on the basis of the working environment, cooperation among employees, training and promotion, and salaries. Employees of Nepalese commercial banks were selected for the study using a convenience sampling method. A total of 112 respondents were selected from banks located in Kathmandu, Lalitpur, and Bhaktapur. Descriptive as well as analytical research designs were used to analyze and draw conclusions about the job satisfaction of bank employees. A self-structured questionnaire was used. The major influencing factor for job satisfaction was salary, followed by training and promotion, working environment, and cooperation among employees.
