Alignment Method of Combined Perception for Peg-in-Hole Assembly with Deep Reinforcement Learning

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yongzhi Wang ◽  
Lei Zhao ◽  
Qian Zhang ◽  
Ran Zhou ◽  
Liping Wu ◽  
...  

Tactile perception can accurately reflect the contact state by collecting force and torque information, but it is not sensitive to changes in position and posture between the assembly objects. Visual perception is very sensitive to changes in position and posture between the assembly objects, but it cannot accurately reflect the contact state, especially when the objects occlude each other. The robot can perceive the environment more accurately if visual and tactile perception are combined. Therefore, this paper proposes an alignment method of combined perception for peg-in-hole assembly with self-supervised deep reinforcement learning. The agent first observes the environment through visual sensors and then predicts the alignment adjustment action from the visual features of the contact state. Subsequently, the agent judges the contact state from the force and torque information collected by the force/torque sensor, and the alignment adjustment action selected according to that contact state is used as the visual prediction label. The visual perception network then performs backpropagation to correct its weights according to this label. With iterative training, the agent learns the alignment skill of combined perception. The robot system is built in CoppeliaSim for simulation training and testing. The simulation results show that combined perception achieves higher assembly efficiency than single perception.
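The self-supervised loop described in this abstract (a visual prediction, a label derived from force/torque readings, then a weight correction) can be illustrated with a very small sketch. Everything concrete here is an assumption made for illustration: the four-action set, the torque-sign rule in `label_from_wrench`, the three-element feature vector, and the linear softmax model standing in for the visual network. None of it is taken from the paper.

```python
import math

# Hypothetical discrete alignment adjustments (not from the paper).
ACTIONS = ["+x", "-x", "+y", "-y"]

def label_from_wrench(tx, ty):
    """Assumed rule: choose the adjustment from the dominant contact torque."""
    if abs(ty) >= abs(tx):
        return 0 if ty > 0 else 1
    return 2 if tx > 0 else 3

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(W, feat):
    """Action probabilities from a linear model over the visual features."""
    logits = [sum(w * f for w, f in zip(row, feat)) for row in W]
    return softmax(logits)

def train_step(W, feat, label, lr=0.5):
    """One cross-entropy gradient step that uses the tactile label as target."""
    p = predict(W, feat)
    for k, row in enumerate(W):
        g = p[k] - (1.0 if k == label else 0.0)
        for j in range(len(row)):
            row[j] -= lr * g * feat[j]
    return -math.log(p[label])  # loss before the update

W = [[0.0, 0.0, 0.0] for _ in ACTIONS]
feat = [0.2, -0.1, 1.0]  # stand-in for features extracted from an image
label = label_from_wrench(tx=0.05, ty=-0.30)
losses = [train_step(W, feat, label) for _ in range(50)]
```

The loss falls as the visual model comes to agree with the label produced by the force/torque rule, which is the sense in which the tactile channel supervises the visual one.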

2021 ◽  
Vol 11 (7) ◽  
pp. 2895
Author(s):  
Ahmed Elfakharany ◽  
Zool Hilmi Ismail

In this paper, we present a novel deep reinforcement learning (DRL) based method that performs multi-robot task allocation (MRTA) and navigation in an end-to-end fashion. The policy operates in a decentralized manner, mapping raw sensor measurements to the robot’s steering commands without the need to construct a map of the environment. We also present a new metric called the Task Allocation Index (TAI), which measures how well a method that performs MRTA and navigation end-to-end handles the task-allocation part. The policy was trained in a simulated Gazebo environment using the centralized learning and decentralized execution paradigm. The policy was evaluated quantitatively and visually. The simulation results showed the effectiveness of the proposed method deployed on multiple Turtlebot3 robots.


2021 ◽  
Vol 11 (2) ◽  
pp. 546
Author(s):  
Jiajia Xie ◽  
Rui Zhou ◽  
Yuan Liu ◽  
Jun Luo ◽  
Shaorong Xie ◽  
...  

The high performance and efficiency of multiple unmanned surface vehicles (multi-USV) promote further civilian and military applications of coordinated USVs. As the basis of cooperative work among multiple USVs, decentralized formation control of the USV swarm has received considerable attention. Formation control of multiple USVs is a geometric problem of a multi-robot system. The main challenge is how to generate and maintain the formation. The rapid development of reinforcement learning provides a new way to deal with these problems. In this paper, we introduce a decentralized structure for the multi-USV system and employ reinforcement learning to deal with formation control in a leader–follower topology. Accordingly, we propose an asynchronous decentralized formation control scheme based on reinforcement learning for multiple USVs. First, a simplified USV model is established, together with a formation shape model that provides formation parameters and describes the physical relationship between USVs. Second, the advantage deep deterministic policy gradient algorithm (ADDPG) is proposed. Third, formation generation policies and formation maintenance policies based on the ADDPG are proposed to form and maintain the given geometric structure of the team of USVs during movement. Moreover, three new reward functions are designed and used to promote policy learning. Finally, various experiments are conducted to validate the performance of the proposed formation control scheme. Simulation results and contrast experiments demonstrate the efficiency and stability of the formation control scheme.
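As a rough illustration of the leader–follower geometry mentioned in this abstract, the sketch below computes where a follower should be, given the leader's planar pose and a fixed offset expressed in the leader's body frame. The function names, the (forward, left) offset convention, and the use of Euclidean distance as the formation error are assumptions for illustration; the paper's formation shape model and reward functions are not given in the abstract.

```python
import math

def follower_target(leader_pose, offset):
    """World-frame target for a follower.

    leader_pose: (x, y, heading in radians, counter-clockwise from +x)
    offset: (forward, left) in the leader's body frame (assumed convention)
    """
    x, y, heading = leader_pose
    forward, left = offset
    c, s = math.cos(heading), math.sin(heading)
    return (x + c * forward - s * left, y + s * forward + c * left)

def formation_error(follower_xy, target_xy):
    """Euclidean distance between a follower and its formation slot."""
    return math.hypot(follower_xy[0] - target_xy[0],
                      follower_xy[1] - target_xy[1])

# Leader at the origin heading along +y; slot is 2 m behind and 1 m to the left.
leader_pose = (0.0, 0.0, math.pi / 2)
target = follower_target(leader_pose, offset=(-2.0, 1.0))
error = formation_error((-1.0, -3.0), target)
```

A learning-based controller could use a quantity like `error` as part of its observation or reward, though that choice is an assumption here rather than something stated in the abstract.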


2013 ◽  
Vol 415 ◽  
pp. 143-148
Author(s):  
Li Hua Zhu ◽  
Xiang Hong Cheng

This paper presents the design of an improved alignment method for a strapdown inertial navigation system (SINS) on a swaying base. An FIR filter is used to reduce the impact of the lever-arm effect. The system also includes online estimation of the gyroscopes’ drift with a Kalman filter for compensation, and an inertial freezing alignment algorithm, chosen for its speed and robustness, which resolves the attitude matrix to provide the mathematical platform for the vehicle. Simulation results show that the proposed method is efficient for the initial alignment of the swaying-base navigation system.
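To make the idea of estimating gyroscope drift with a Kalman filter more concrete, here is a deliberately simplified scalar sketch. It assumes the drift is a constant bias observed directly through noisy rate readings, which is far simpler than a full SINS alignment filter on a swaying base; the function name, variances, and sample readings are all hypothetical.

```python
def estimate_bias(measurements, meas_var, init_var=1.0, process_var=0.0):
    """Scalar Kalman filter for a (nearly) constant bias.

    State model:       bias_k = bias_{k-1} + w,  Var(w) = process_var
    Measurement model: z_k    = bias_k + v,      Var(v) = meas_var
    Returns the final estimate and its variance.
    """
    x, p = 0.0, init_var
    for z in measurements:
        p += process_var          # predict: the bias may wander slightly
        k = p / (p + meas_var)    # Kalman gain
        x += k * (z - x)          # correct with the innovation
        p *= (1.0 - k)            # posterior variance
    return x, p

# Synthetic readings scattered around a bias of 0.02 (made-up numbers).
readings = [0.025, 0.015] * 50
bias, variance = estimate_bias(readings, meas_var=0.01)
```

The estimate settles close to the mean of the readings while its variance shrinks with each sample; the estimated bias would then be subtracted from the raw gyroscope output as compensation.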


2021 ◽  
Vol 10 (4) ◽  
pp. 1-27
Author(s):  
Shengxin Jia ◽  
Veronica J. Santos

The sense of touch is essential for locating buried objects when vision-based approaches are limited. We present an approach for tactile perception when sensorized robot fingertips are used to directly interact with granular media particles in teleoperated systems. We evaluate the effects of linear and nonlinear classifier model architectures and three tactile sensor modalities (vibration, internal fluid pressure, fingerpad deformation) on the accuracy of estimates of fingertip contact state. We propose an architecture called the Sparse-Fusion Recurrent Neural Network (SF-RNN) in which sparse features are autonomously extracted prior to fusing multimodal tactile data in a fully connected RNN input layer. The multimodal SF-RNN model achieved 98.7% test accuracy and was robust to modest variations in granular media type and particle size, fingertip orientation, fingertip speed, and object location. Fingerpad deformation was the most informative modality for haptic exploration within granular media while vibration and internal fluid pressure provided additional information with appropriate signal processing. We introduce a real-time visualization of tactile percepts for remote exploration by constructing a belief map that combines probabilistic contact state estimates and fingertip location. The belief map visualizes the probability of an object being buried in the search region and could be used for planning.
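The belief map described in this abstract combines probabilistic contact estimates with fingertip location. One common way to accumulate such evidence is a log-odds grid, sketched below. This is an assumed formulation, not necessarily the one used in the paper; the class name, grid size, and probabilities are illustrative, and the classifier output is treated as an independent per-cell observation.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def sigmoid(l):
    return 1.0 / (1.0 + math.exp(-l))

class BeliefMap:
    """Grid of log-odds that an object is buried in each cell."""

    def __init__(self, rows, cols, prior=0.5):
        self.prior_log_odds = logit(prior)
        self.log_odds = [[self.prior_log_odds] * cols for _ in range(rows)]

    def update(self, row, col, p_contact):
        """Fuse one contact-probability estimate at the fingertip's cell."""
        p = min(max(p_contact, 1e-6), 1.0 - 1e-6)  # avoid infinite log-odds
        self.log_odds[row][col] += logit(p) - self.prior_log_odds

    def probability(self, row, col):
        return sigmoid(self.log_odds[row][col])

belief = BeliefMap(rows=3, cols=4)
belief.update(1, 2, p_contact=0.9)   # fingertip over cell (1, 2), likely contact
belief.update(1, 2, p_contact=0.9)   # a second, consistent estimate
belief.update(0, 0, p_contact=0.2)   # probably no object here
```

Repeated consistent estimates push a cell's probability toward 0 or 1, while unvisited cells stay at the prior, which is what makes such a map usable for planning where to search next.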


2021 ◽  
Vol 01 ◽  
Author(s):  
Ying Li ◽  
Chubing Guo ◽  
Jianshe Wu ◽  
Xin Zhang ◽  
Jian Gao ◽  
...  

Background: Unmanned systems have been widely used in multiple fields. Many algorithms have been proposed to solve path planning problems. Each algorithm has its advantages and defects and cannot adapt to all kinds of requirements. An appropriate path planning method is needed for each application. Objective: To quickly select an appropriate algorithm for a given application, which could help improve the efficiency of path planning for unmanned systems. Methods: This paper proposes to represent and quantify the features of algorithms based on physical indicators of their results. In addition, an algorithmic collaborative scheme is developed to search for the appropriate algorithm according to the requirements of the application. As an illustration of the scheme, four algorithms, namely the A-star (A*) algorithm, reinforcement learning, the genetic algorithm, and the ant colony optimization algorithm, are implemented and represented by their features. Results: In different simulations, the algorithmic collaborative scheme can select an appropriate algorithm for a given application based on the representation of the algorithms, and the selected algorithm can plan a feasible and effective path. Conclusion: An algorithmic collaborative scheme is proposed, based on the representation of algorithms and the requirements of the application. The simulation results demonstrate the feasibility of the scheme and of the representation of algorithms.
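One simple way to picture selecting an algorithm from a feature representation is a weighted score over normalized indicators, as in the sketch below. The scoring rule, the indicator names, the candidate names `planner_a` and `planner_b`, and all numbers are placeholders invented for illustration; they do not describe the paper's scheme or the measured behavior of any real planner.

```python
def select_algorithm(candidates, weights):
    """Return the candidate whose weighted feature score is highest.

    candidates: {name: {indicator: value in [0, 1], higher is better}}
    weights:    {indicator: importance for the current application}
    """
    def score(name):
        feats = candidates[name]
        return sum(w * feats[key] for key, w in weights.items())
    return max(candidates, key=score)

# Placeholder feature representations (made-up values).
candidates = {
    "planner_a": {"path_quality": 0.9, "speed": 0.3},
    "planner_b": {"path_quality": 0.6, "speed": 0.8},
}

prefer_quality = select_algorithm(candidates, {"path_quality": 0.8, "speed": 0.2})
prefer_speed = select_algorithm(candidates, {"path_quality": 0.2, "speed": 0.8})
```

Changing only the application's weights changes which candidate wins, which is the basic behavior a requirement-driven selection scheme needs.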


Author(s):  
E. Tabarah ◽  
B. Benhabib ◽  
R. G. Fenton ◽  
G. Hexner

A new method is presented for the optimal coordination of a two-robot system performing contact operations. One of the robots carries a tool and performs the specific contact operation on a workpiece which is grasped and maneuvered by the second robot. The two robots move simultaneously relative to each other so that the tool maintains contact with the workpiece while moving along its prescribed trajectory at a constant speed. This prescribed trajectory, which is specified with respect to the workpiece frame, is thus resolved into a pair of conjugate trajectories, one for each robot, specified in the world coordinate frame. This resolution process does not yield a unique solution, i.e., there exists an infinite number of conjugate-trajectory pairs corresponding to a given tool trajectory. This paper presents a technique for resolving the original tool trajectory, where the robots’ conjugate trajectories are parameterized using polynomial functions. A method is then developed for selecting the optimal pair of conjugate trajectories by minimizing a chosen cost function. This optimization is further enhanced by coupling it to a procedure for selecting the optimal layout of the robots within the workcell, resulting in the best possible solutions. Numerical simulation results support the validity of the proposed technique.
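The non-uniqueness of conjugate trajectories can be seen with planar rigid transforms: for a fixed tool pose in the workpiece frame, any choice of workpiece pose in the world determines a matching tool pose in the world. The sketch below uses (x, y, theta) poses in the plane as a simplification of the general spatial case; the helper names and the numeric poses are illustrative assumptions.

```python
import math

def compose(a, b):
    """Pose of frame C in frame A, given B in A (a) and C in B (b)."""
    ax, ay, at = a
    bx, by, bt = b
    c, s = math.cos(at), math.sin(at)
    return (ax + c * bx - s * by, ay + s * bx + c * by, at + bt)

def inverse(a):
    """Pose of frame A in frame B, given B in A."""
    ax, ay, at = a
    c, s = math.cos(at), math.sin(at)
    return (-(c * ax + s * ay), s * ax - c * ay, -at)

# Prescribed tool pose relative to the workpiece at one trajectory point.
tool_in_workpiece = (0.2, 0.0, 0.0)

# Two different workpiece poses in the world (the free choice).
workpiece_a = (1.0, 0.5, math.pi / 2)
workpiece_b = (0.0, 0.0, 0.0)

# The conjugate tool poses in the world that keep the same relative pose.
tool_a = compose(workpiece_a, tool_in_workpiece)
tool_b = compose(workpiece_b, tool_in_workpiece)

# Both pairs reproduce the prescribed relative pose.
relative_a = compose(inverse(workpiece_a), tool_a)
relative_b = compose(inverse(workpiece_b), tool_b)
```

Because every workpiece pose yields a valid pair, an extra criterion such as a cost function is needed to pick one pair, which is the role optimization plays in the abstract.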


2019 ◽  
Vol 9 (22) ◽  
pp. 4964 ◽  
Author(s):  
Yue ◽  
Guan ◽  
Wang

In this paper, the important topic of cooperative searches for multi-dynamic targets in unknown sea areas by unmanned aerial vehicles (UAVs) is studied based on a reinforcement learning (RL) algorithm. A novel multi-UAV sea area search map is established, in which models of the environment, UAV dynamics, target dynamics, and sensor detection are involved. Then, the search map is updated and extended using the concept of the territory awareness information map. Finally, according to the search efficiency function, a reward and punishment function is designed, and an RL method is used to generate a multi-UAV cooperative search path online. The simulation results show that the proposed algorithm could effectively perform the search task in the sea area with no prior information.

