Tensor Based Knowledge Transfer Across Skill Categories for Robot Control

Author(s):  
Chenyang Zhao ◽  
Timothy M. Hospedales ◽  
Freek Stulp ◽  
Olivier Sigaud

Advances in hardware and learning for control are enabling robots to perform increasingly dextrous and dynamic control tasks. These skills typically require a prohibitive amount of exploration for reinforcement learning, and so are commonly achieved by imitation learning from manual demonstration. The costly, non-scalable nature of manual demonstration has motivated work on skill generalisation, e.g., through contextual policies and options. Despite good results, existing work along these lines is limited to generalising across variants of one skill, such as throwing an object to different locations. In this paper we go significantly further and investigate generalisation across qualitatively different classes of control skills. In particular, we introduce a class of neural network controllers that can realise four distinct skill classes: reaching, object throwing, casting, and ball-in-cup. By factorising the weights of the neural network, we are able to extract transferable latent skills that enable dramatic acceleration of learning in cross-task transfer. With a suitable curriculum, this allows us to learn challenging dextrous control tasks like ball-in-cup from scratch with pure reinforcement learning.
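The idea of factorising network weights into shared latent skills can be illustrated with a minimal sketch. It assumes one simple form of factorisation, in which a task-specific weight matrix is a linear combination of K shared basis matrices; the abstract does not state the exact factorisation used in the paper, and the names `task_weights`, `basis`, `coeffs_reach`, and all sizes are hypothetical.

```python
import numpy as np

def task_weights(basis, coeffs):
    """Combine K shared latent weight matrices into one task-specific matrix.

    basis:  array of shape (K, n_out, n_in), shared across all tasks
    coeffs: array of shape (K,), specific to a single task
    """
    # Weighted sum over the K basis matrices.
    return np.einsum("k,koi->oi", coeffs, basis)

rng = np.random.default_rng(0)
K, n_out, n_in = 3, 4, 5                      # hypothetical sizes
basis = rng.normal(size=(K, n_out, n_in))     # shared "latent skills"
coeffs_reach = np.array([1.0, 0.0, 0.5])      # hypothetical coefficients for one task
W_reach = task_weights(basis, coeffs_reach)   # layer weights for that task
```

Under this assumption, transferring to a new task only requires learning a new K-dimensional coefficient vector while the shared basis is reused, which is one way such a factorisation could speed up cross-task learning.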

2009 ◽  
Author(s):  
Zhi Li

This research focuses on the design and implementation of an intelligent machine vision and sorting system that can be used to sort objects in an industrial environment. Machine vision systems used for sorting are either geometry driven or are based on the textural components of an object’s image. The vision system proposed in this research is based on the textural analysis of pixel content and uses an artificial neural network to perform the recognition task. The neural network has been chosen over other methods such as fuzzy logic and support vector machines because of its relative simplicity. A Bluetooth communication link facilitates the communication between the main computer housing the intelligent recognition system and the remote robot control computer located in a plant environment. Digital images of the workpiece are first compressed before the feature vectors are extracted using principal component analysis. The compressed data containing the feature vectors is transmitted via the Bluetooth channel to the remote control computer for recognition by the neural network. The network performs the recognition function and transmits a control signal to the robot control computer which guides the robot arm to place the object in an allocated position. The performance of the proposed intelligent vision and sorting system is tested under different conditions and the most attractive aspect of the design is its simplicity. The ability of the system to remain relatively immune to noise, its capacity to generalize and its fault tolerance when faced with missing data made the neural network an attractive option over fuzzy logic and support vector machines.


2008 ◽  
Vol 18 (05) ◽  
pp. 389-403 ◽  
Author(s):  
THOMAS D. JORGENSEN ◽  
BARRY P. HAYNES ◽  
CHARLOTTE C. F. NORLUND

This paper describes a new method for pruning artificial neural networks, using a measure of the neural complexity of the neural network. This measure is used to determine the connections that should be pruned. The measure computes the information-theoretic complexity of a neural network, which is similar to, yet different from, previous research on pruning. The method proposed here shows how overly large and complex networks can be reduced in size, whilst retaining learnt behaviour and fitness. The technique helps to discover a network topology that matches the complexity of the problem it is meant to solve. This novel pruning technique is tested in a robot control domain, simulating a racecar. It is shown that the proposed pruning method is a significant improvement over the most commonly used pruning method, Magnitude Based Pruning. Furthermore, some of the pruned networks prove to be faster learners than the benchmark network that they originate from. This means that this pruning method can also help to unleash hidden potential in a network, because the learning time decreases substantially for a pruned network, due to the reduction of dimensionality of the network.
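For reference, the baseline named in this abstract, Magnitude Based Pruning, can be sketched in a few lines: the connections with the smallest absolute weights are removed. This is a generic illustration of the baseline only, not the complexity-based method the paper proposes; the function name `magnitude_prune` and the example weights are hypothetical.

```python
import numpy as np

def magnitude_prune(weights, fraction):
    """Return a copy of `weights` with the smallest-magnitude entries set to zero."""
    w = np.asarray(weights, dtype=float)
    n_prune = int(fraction * w.size)
    pruned = w.copy()
    if n_prune == 0:
        return pruned
    # Flat indices of the n_prune entries with the smallest absolute value.
    idx = np.argsort(np.abs(w), axis=None)[:n_prune]
    pruned.flat[idx] = 0.0
    return pruned

W = np.array([[0.9, -0.05, 0.3],
              [-0.01, 0.6, -0.2]])
W_pruned = magnitude_prune(W, fraction=0.5)   # removes the 3 weakest connections
```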


2021 ◽  
Vol 19 (3) ◽  
pp. 55-64
Author(s):  
K. N. Maiorov ◽  

The paper examines the life cycle of field development and analyzes the processes of the field development design stage with a view to applying machine learning methods. For each process, relevant problems are highlighted, existing solutions based on machine learning methods are reviewed, and ideas and problems are proposed that could be effectively solved by machine learning methods. For most of the processes, examples of solutions are briefly described, and the advantages and disadvantages of the approaches are identified. The most common solution method is feed-forward neural networks. Subject to preliminary normalization of the input data, this is the most versatile algorithm for regression and classification problems. However, in the problem of selecting wells for hydraulic fracturing, a whole ensemble of machine learning models was used, where, in addition to a neural network, there was a random forest, gradient boosting and linear regression. For the problem of optimizing the placement of a grid of oil wells, the disadvantages of existing solutions based on a neural network and on a simple reinforcement learning approach based on a Markov decision process are identified. A deep reinforcement learning algorithm called AlphaZero is proposed, which has previously shown significant results as an artificial intelligence for games. This algorithm is a decision tree search directed by a neural network: only those branches that have received the best estimates from the neural network are considered more thoroughly. The paper highlights the similarities between the tasks for which AlphaZero was previously used and the task of optimizing the placement of a grid of oil producing wells. Conclusions are drawn about the possibility of using and modifying the algorithm for the optimization problem being solved. An approach is proposed to take into account symmetric states in a Monte Carlo tree to reduce the number of required simulations.
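One possible way to account for symmetric states in a Monte Carlo tree is to map every state to a canonical key, so that symmetric placements share one set of search statistics. The sketch below assumes a square grid stored as a 2-D integer array and assumes that all eight rotations and reflections are truly equivalent for the problem; neither assumption comes from the abstract, and the names `symmetries` and `canonical_key` are hypothetical.

```python
import numpy as np

def symmetries(grid):
    """Yield the 8 rotations and reflections of a square 2-D grid."""
    g = np.asarray(grid)
    for k in range(4):
        rotated = np.rot90(g, k)
        yield rotated
        yield np.fliplr(rotated)

def canonical_key(grid):
    """Return a key that is identical for all symmetric variants of `grid`."""
    # tobytes() serialises each variant in C order; the smallest one is the key.
    return min(s.tobytes() for s in symmetries(grid))

# Two well placements that differ only by a 180-degree rotation.
a = [[1, 0], [0, 0]]
b = [[0, 0], [0, 1]]
visit_counts = {}
for state in (a, b):
    key = canonical_key(state)
    visit_counts[key] = visit_counts.get(key, 0) + 1   # both update the same entry
```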


Author(s):  
Yukari Yamauchi ◽  
Shun'ichi Tano ◽  

The computational (numerical information) and symbolic (knowledge-based) processing used in intelligent processing has advantages and disadvantages. A simple model integrating symbols into a neural network was proposed as a first step toward fusing computational and symbolic processing. To verify the effectiveness of this model, we first analyze the trained neural network and generate symbols manually. Then we discuss generation methods that are able to discover effective symbols during training of the neural network. We evaluated these through simulations of reinforcement learning in simple football games. Results indicate that the integration of symbols into the neural network improved the performance of player agents.


Sensors ◽  
2018 ◽  
Vol 18 (12) ◽  
pp. 4331 ◽  
Author(s):  
Zhong Ma ◽  
Yuejiao Wang ◽  
Yidai Yang ◽  
Zhuping Wang ◽  
Lei Tang ◽  
...  

When a satellite performs complex tasks such as discarding a payload or capturing a non-cooperative target, it will encounter sudden changes in its attitude and mass parameters, causing unstable flying and rolling of the satellite. In such circumstances, the changes in the movement and mass characteristics are unpredictable. Thus, traditional attitude control methods are unable to stabilize the satellite, since they depend on the mass parameters of the controlled object. In this paper, we propose a reinforcement learning method to re-stabilize the attitude of a satellite under such circumstances. Specifically, we discretize the continuous control torque and build a neural network model that outputs the discretized control torque to control the satellite. A dynamics simulation environment of the satellite is built, and the Deep Q-Network (DQN) algorithm is then used to train the neural network in this simulation environment. The reward for training is the stabilization of the satellite. Simulation experiments illustrate that, as training progresses, the neural network model gradually learns to re-stabilize the attitude of a satellite after an unknown disturbance. By contrast, the traditional PD (proportional-derivative) controller was unable to re-stabilize the satellite due to its dependence on the mass parameters. The proposed method adopts self-learning to control satellite attitudes, shows considerable intelligence and a degree of generality, and has strong application potential for future intelligent control of satellites performing complex space tasks.
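The discretization step described here can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes three torque axes, evenly spaced torque levels, and epsilon-greedy action selection, and the names `build_action_set`, `select_action`, and the numeric values are hypothetical. The Q-values are a stand-in for the output of the trained network.

```python
import itertools
import numpy as np

def build_action_set(max_torque, levels):
    """All combinations of `levels` evenly spaced torques on each of 3 axes."""
    axis_values = np.linspace(-max_torque, max_torque, levels)
    return np.array(list(itertools.product(axis_values, repeat=3)))

def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over discrete action indices."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))   # explore
    return int(np.argmax(q_values))               # exploit

actions = build_action_set(max_torque=0.5, levels=3)   # 3**3 = 27 torque vectors
rng = np.random.default_rng(0)
q_values = np.zeros(len(actions))   # stand-in for the network's Q-value output
q_values[13] = 1.0
torque = actions[select_action(q_values, epsilon=0.0, rng=rng)]
```

With a discrete action set, the network needs only one output per torque combination, at the cost of an action count that grows as `levels ** 3`.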


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 9864-9874
Author(s):  
Shuo-Wen Chang ◽  
Chung-Ling Chang ◽  
Long-Tin Li ◽  
Shih-Wei Liao

2010 ◽  
Vol 29-32 ◽  
pp. 190-196
Author(s):  
Hong Ya Fu ◽  
Ping Fan Liu ◽  
Qing Chun Zhang ◽  
Guo Dong Li

In order to overcome the nonlinear instability and uncertainty inherent in magnetic bearing systems, two PID neural network controllers (BP-based and GA-based) are designed and trained to emulate the operation of a complete system. Based on theoretical derivation and simulation results, principles for choosing the parameters of the two neural network controllers are given. The feasibility of using the neural network to control nonlinear magnetic bearing systems with unknown dynamics is demonstrated. The robust performance and reinforcement learning capability of the two PID neural network controllers in controlling magnetic bearing systems are compared.


2005 ◽  
Vol 2 (2) ◽  
pp. 97-102 ◽  
Author(s):  
C. DaSalla ◽  
J. Kim ◽  
Y. Koike

The aim of this paper is to design a human–interface system, using EMG signals elicited by various wrist movements, to control a robot. EMG signals are normalized and based on joint torque. A three-layer neural network is used to estimate posture of the wrist and forearm from EMG signals. After training the neural network and obtaining appropriate weights, the subject was able to control the robot in real time using wrist and forearm movements.


Author(s):  
Alexsander Voevoda ◽  
Dmitry Romannikov ◽  

The application of neural networks for the synthesis of control systems is considered. Examples of the synthesis of control systems using reinforcement learning methods, in which the state vector is involved, are given. The synthesis of a neural controller for objects with an inaccessible state vector is also discussed in two variants: 1) a variant using a neural network with recurrent feedbacks; 2) a variant using the input error vector, where each error (except for the first one) enters the input of the neural network after passing through a delay element. The disadvantages of the first method include the fact that, for such a neural network structure, existing reinforcement learning methods cannot be applied, and training requires a data set obtained, for example, from a previously calculated linear controller. The structure of the neural network used in the second variant allows the application of reinforcement learning methods, but the article provides a statement, with proof, that a neural network without recurrent connections cannot be used to synthesize a control system for objects with three or more integrators. The application of the above structures is illustrated by examples of the synthesis of control systems for the objects 1/s^2 and 1/s^3 presented in discrete form.
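The second variant, in which the network input is the current error together with delayed copies of earlier errors, can be sketched as a tapped delay line. The sketch assumes a scalar error signal, unit delays, and zero initial conditions; the class name `ErrorDelayLine` and the number of delays are hypothetical, and the neural network itself is omitted.

```python
from collections import deque
import numpy as np

class ErrorDelayLine:
    """Builds the input vector [e(k), e(k-1), ..., e(k-n)] for a feed-forward controller."""

    def __init__(self, n_delays):
        # Zero initial conditions: all past errors start at 0.
        self.buffer = deque([0.0] * (n_delays + 1), maxlen=n_delays + 1)

    def push(self, error):
        """Insert the newest error and return the current input vector."""
        self.buffer.appendleft(error)   # the oldest error drops off the right end
        return np.array(list(self.buffer))

line = ErrorDelayLine(n_delays=2)
for e in (1.0, 2.0, 3.0, 4.0):
    x = line.push(e)   # x is what would be fed to the network at each step
```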


Author(s):  
C J Fourie

This paper describes the use of an artificial neural network in conjunction with reinforcement learning techniques to develop an intelligent scheduling system that is capable of learning from experience. In a simulated environment the model controls a mobile robot that transports material to machines. States of ‘happiness’ are defined for each machine, which are the inputs to the neural network. The output of the neural network is the decision on which machine to service next. After every decision, a critic evaluates the decision and a teacher ‘rewards’ the network to encourage good decisions and discourage bad decisions. From the results obtained, it is concluded that the proposed model is capable of learning from past experience and thereby improving the intelligence of the system.

