Reinforcement Learning for Uncooperative Space Objects Smart Imaging Path-Planning

Author(s):  
Andrea Brandonisio ◽  
Michèle Lavagna ◽  
Davide Guzzetti

Abstract Leading space agencies are increasingly investing in the gradual automation of space missions. Autonomous flight operations may be a key enabler for on-orbit servicing, assembly, and manufacturing (OSAM) missions, carrying inherent benefits such as cost and risk reduction. Within the spectrum of proximity operations, this work focuses on autonomous path-planning for the reconstruction of the geometric properties of an uncooperative target. This autonomous navigation problem is known as the active Simultaneous Localization and Mapping (SLAM) problem, and it has been studied extensively in robotics. The active SLAM problem may be formulated as a Partially Observable Markov Decision Process (POMDP). Previous works in astrodynamics have demonstrated that it is possible to use Reinforcement Learning (RL) techniques to teach an agent moving along a pre-determined orbit when to collect measurements to optimize a given mapping goal. In this work, different RL methods are explored to develop an artificial intelligence agent capable of planning sub-optimal paths for autonomous shape reconstruction of an unknown and uncooperative object via imaging. Proximity orbit dynamics are linearized and include orbit eccentricity. The geometry of the target object is rendered as a polyhedron shaped with a triangular mesh. Artificial intelligence agents are created using both the Deep Q-Network (DQN) and the Advantage Actor-Critic (A2C) methods. State-action value functions are approximated using Artificial Neural Networks (ANN) and trained according to RL principles. Training of the RL agent architecture occurs under fixed or random initial environment conditions, and a large database of training tests has been collected. Trained agents show promising performance in achieving extended coverage of the target. Policy learning is demonstrated by showing that RL agents, at minimum, achieve higher mapping performance than agents that behave randomly. Furthermore, RL agents may learn to maneuver the spacecraft to control target lighting conditions as a function of the Sun's location. This work therefore preliminarily demonstrates the applicability of RL to autonomous imaging of an uncooperative space object, setting a baseline for future works.
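
A minimal sketch of the kind of DQN setup the abstract describes: an ANN approximating the state-action value function over a discrete set of candidate maneuvers, trained against a one-step TD target. All dimensions, layer sizes, and hyperparameters below are illustrative assumptions, not the authors' values.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """ANN approximating the state-action value function Q(s, a).
    state_dim might hold relative position/velocity plus Sun direction
    (an assumption for illustration)."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def dqn_td_target(q_target: QNetwork, reward: torch.Tensor,
                  next_state: torch.Tensor, done: torch.Tensor,
                  gamma: float = 0.99) -> torch.Tensor:
    """One-step TD target: r + gamma * max_a' Q_target(s', a')."""
    with torch.no_grad():
        max_next_q = q_target(next_state).max(dim=1).values
    return reward + gamma * (1.0 - done) * max_next_q
```

The reward would encode the mapping goal (e.g., newly imaged mesh faces under favorable lighting), but the abstract does not specify its exact form.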

2021 ◽  
Vol 2138 (1) ◽  
pp. 012011
Author(s):  
Yanwei Zhao ◽  
Yinong Zhang ◽  
Shuying Wang

Abstract Path planning refers to a mobile robot's ability to obtain surrounding environment information and its own state information through onboard sensors, so that it can avoid obstacles and move toward a target point. Deep reinforcement learning, which combines reinforcement learning and deep learning and is mainly used to handle perception and decision-making problems, has become an important research branch in the field of artificial intelligence. This paper first introduces the basics of deep learning and reinforcement learning. Then, the research status of value-function-based and policy-gradient-based deep reinforcement learning algorithms in path planning is described, along with applications of deep reinforcement learning in computer games, video games, and autonomous navigation. Finally, a brief summary and outlook on the algorithms and applications of deep reinforcement learning are given.
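
To make the survey's two algorithm families concrete, here is a minimal sketch contrasting a value-function update (one-step Q-learning) with a policy-gradient loss (REINFORCE). Shapes and hyperparameters are illustrative placeholders, not tied to any specific paper.

```python
import numpy as np
import torch

def q_learning_update(Q: np.ndarray, s: int, a: int, r: float,
                      s_next: int, alpha: float = 0.1,
                      gamma: float = 0.99) -> None:
    """Value-function family: nudge Q(s, a) toward the bootstrapped
    TD target r + gamma * max_a' Q(s', a')."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])

def reinforce_loss(log_probs: torch.Tensor,
                   returns: torch.Tensor) -> torch.Tensor:
    """Policy-gradient family: maximize expected return by minimizing
    the negative of sum_t log pi(a_t | s_t) * G_t over one episode."""
    return -(log_probs * returns).sum()
```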


2021 ◽  
Author(s):  
Salvador Ortiz ◽  
Wen Yu

In this paper, sliding mode control is combined with the classical simultaneous localization and mapping (SLAM) method. This combination can overcome the problem of bounded uncertainties in SLAM. With the help of a genetic algorithm, our novel path planning method shows advantages over other popular methods.
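
A minimal sketch of genetic-algorithm path optimization in the spirit of this abstract, assuming a 2D workspace, a fixed start and goal, point obstacles, and a fitness that trades off path length against obstacle clearance. All names and parameters here are hypothetical illustrations, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
START, GOAL = np.array([0.0, 0.0]), np.array([10.0, 10.0])
OBSTACLES = [np.array([5.0, 5.0])]  # point obstacles for simplicity

def fitness(waypoints: np.ndarray) -> float:
    """Lower is better: path length plus a penalty for points that
    pass within 1 m of an obstacle (assumed clearance threshold)."""
    path = np.vstack([START, waypoints, GOAL])
    length = np.sum(np.linalg.norm(np.diff(path, axis=0), axis=1))
    penalty = sum(np.sum(np.linalg.norm(path - ob, axis=1) < 1.0)
                  for ob in OBSTACLES)
    return length + 10.0 * penalty

def evolve(pop_size=50, n_way=5, generations=100, sigma=0.5):
    """Evolve waypoint sequences: keep the fittest half (selection),
    then perturb copies of them with Gaussian noise (mutation)."""
    pop = rng.uniform(0, 10, size=(pop_size, n_way, 2))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[: pop_size // 2]]
        children = parents + rng.normal(0, sigma, parents.shape)
        pop = np.vstack([parents, children])
    return min(pop, key=fitness)

best_path = evolve()
```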


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Shuhuan Wen ◽  
Xiaohan Lv ◽  
Hak Keung Lam ◽  
Shaokang Fan ◽  
Xiao Yuan ◽  
...  

Purpose This paper aims to use the Monodepth method to improve the prediction speed of obstacle identification and proposes a Probability Dueling DQN algorithm to optimize the agent's path, allowing it to reach the destination more quickly than the Dueling DQN algorithm. The path planning algorithm based on Probability Dueling DQN is then combined with FastSLAM to accomplish autonomous navigation and map the environment.

Design/methodology/approach This paper proposes an active simultaneous localization and mapping (SLAM) framework for autonomous navigation in indoor environments with static and dynamic obstacles. It integrates a path planning algorithm with visual SLAM to decrease navigation uncertainty and build an environment map.

Findings The results show that the proposed method outperforms the existing Dueling DQN in reducing navigation uncertainty in real-world indoor environments with varying numbers and shapes of static and dynamic obstacles.

Originality/value This paper proposes a novel active SLAM framework composed of Probability Dueling DQN, an improved path planning algorithm based on Dueling DQN, and FastSLAM. The framework is used with the Monodepth depth-image prediction method, which offers faster prediction speed, to realize autonomous navigation in indoor environments with varying numbers and shapes of static and dynamic obstacles.
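
For context, here is a minimal sketch of the dueling network head that "Dueling DQN" refers to: separate value and advantage streams recombined as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a). The feature extractor and layer sizes are illustrative assumptions; the paper's "Probability" variant is not reproduced here.

```python
import torch
import torch.nn as nn

class DuelingHead(nn.Module):
    """Dueling architecture head on top of shared features."""
    def __init__(self, feat_dim: int, n_actions: int):
        super().__init__()
        self.value = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                   nn.Linear(64, 1))
        self.advantage = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                       nn.Linear(64, n_actions))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        v = self.value(features)                    # (batch, 1)
        a = self.advantage(features)                # (batch, n_actions)
        # Subtract the mean advantage for identifiability of V and A.
        return v + a - a.mean(dim=1, keepdim=True)
```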


Author(s):  
Jiaxuan Fan ◽  
Zhenya Wang ◽  
Jinlei Ren ◽  
Ying Lu ◽  
Yiheng Liu

Author(s):  
Jie Zhong ◽  
Tao Wang ◽  
Lianglun Cheng

Abstract In actual welding scenarios, an effective path planner is needed to find a collision-free path in the configuration space for a welding manipulator surrounded by obstacles. However, state-of-the-art sampling-based planners satisfy only probabilistic completeness, and their computational complexity is sensitive to the state dimension. In this paper, we propose a path planner for welding manipulators based on deep reinforcement learning that solves path planning problems in high-dimensional continuous state and action spaces. Compared with sampling-based methods, it is more robust and less sensitive to the state dimension. In detail, to improve learning efficiency, we introduce an inverse kinematics module to provide prior knowledge, and we design a gain module to avoid locally optimal policies; both are integrated into the training algorithm. To evaluate the proposed planning algorithm across multiple dimensions, we conducted several sets of path planning experiments for welding manipulators. The results show that our method not only improves convergence performance but is also superior in terms of planning optimality and robustness compared with most other planning algorithms.
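
One plausible reading of the inverse-kinematics prior, sketched minimally below: the IK solution for the goal pose supplies a reference joint configuration, and the learned policy contributes a residual around it, with a gain weighting how strongly the prior guides exploration. The IK routine, joint dimension, and blending gain are hypothetical stand-ins, not the paper's actual modules.

```python
import numpy as np

def ik_reference(goal_pose: np.ndarray) -> np.ndarray:
    """Placeholder IK: in practice this would call the manipulator's
    inverse kinematics solver for the goal end-effector pose."""
    return np.zeros(6)  # assumed 6-DoF joint angles, dummy value

def blended_action(policy_residual: np.ndarray,
                   goal_pose: np.ndarray,
                   gain: float = 0.5) -> np.ndarray:
    """Combine the IK prior with the learned residual; `gain` is a
    hypothetical weight on how strongly the prior steers the action."""
    q_ref = ik_reference(goal_pose)
    return gain * q_ref + policy_residual
```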

