A GAN-Like Approach for Physics-Based Imitation Learning and Interactive Control

Author(s):  
Pei Xu ◽  
Ioannis Karamouzas

We present a simple and intuitive approach for interactive control of physically simulated characters. Our work builds upon generative adversarial networks (GANs) and reinforcement learning, and introduces an imitation learning framework in which an ensemble of classifiers and an imitation policy are trained in tandem given pre-processed reference clips. The classifiers are trained to discriminate the reference motion from the motion generated by the imitation policy, while the policy is rewarded for fooling the discriminators. Using our GAN-like approach, multiple motor control policies can be trained separately to imitate different behaviors. At runtime, our system can respond to external control signals provided by the user and interactively switch between different policies. Compared to existing methods, our proposed approach has the following attractive properties: 1) it achieves state-of-the-art imitation performance without manually designing and fine-tuning a reward function; 2) it directly controls the character without having to track any target reference pose explicitly or implicitly through a phase state; and 3) it supports interactive policy switching without requiring any motion generation or motion matching mechanism. We highlight the applicability of our approach in a range of imitation and interactive control tasks, while also demonstrating its ability to withstand external perturbations and to recover balance. Overall, our approach has low runtime cost and can be easily integrated into interactive applications and games.
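As a rough illustration of how a policy can be "rewarded for fooling the discriminators", the sketch below averages a per-discriminator score into one scalar reward. The function name `ensemble_reward`, the probability-valued discriminator outputs, and the `-log(1 - D)` reward form are assumptions borrowed from common GAIL-style setups; the abstract does not specify the reward actually used.

```python
import math

def ensemble_reward(probs, eps=1e-8):
    """Average a GAIL-style reward over an ensemble of discriminators.

    probs: each discriminator's estimated probability (0..1) that the
    policy's transition came from the reference motion. This interface is
    hypothetical; the paper's actual reward may differ.
    """
    if not probs:
        raise ValueError("need at least one discriminator output")
    # -log(1 - D) grows as a discriminator is fooled (D -> 1).
    rewards = [-math.log(max(1.0 - p, eps)) for p in probs]
    return sum(rewards) / len(rewards)

# Illustrative values only: a policy that fools the ensemble vs. one that does not.
fooled = ensemble_reward([0.9, 0.8, 0.95])
caught = ensemble_reward([0.1, 0.2, 0.05])
```

Averaging over the ensemble is one simple way to combine the classifiers; other aggregations (for example, taking the minimum) would be equally plausible under these assumptions.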

2022 ◽  
Vol 8 ◽  
Author(s):  
Yan Wang ◽  
Cristian C. Beltran-Hernandez ◽  
Weiwei Wan ◽  
Kensuke Harada

Complex contact-rich insertion is a ubiquitous robotic manipulation skill and usually involves nonlinear and low-clearance insertion trajectories as well as varying force requirements. A hybrid trajectory and force learning framework can be utilized to generate high-quality trajectories by imitation learning and to find suitable force control policies efficiently by reinforcement learning. However, with this approach, many human demonstrations are necessary to learn several tasks, even when those tasks require topologically similar trajectories. Therefore, to reduce repetitive human teaching effort for new tasks, we present an adaptive imitation framework for robot manipulation. The main contribution of this work is a framework that introduces dynamic movement primitives into a hybrid trajectory and force learning framework to learn a specific class of complex contact-rich insertion tasks based on the trajectory profile of a single task instance belonging to that class. Through experimental evaluations, we validate that the proposed framework is sample efficient, safer, and generalizes better at learning complex contact-rich insertion tasks, both in simulation environments and on real hardware.
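For readers unfamiliar with dynamic movement primitives (DMPs), the following is a minimal sketch of a standard one-dimensional discrete DMP rollout, assuming the usual spring-damper formulation with a phase-gated forcing term. The function name `rollout_dmp`, the gain values, and the basis-function parameters are illustrative assumptions and are not taken from the paper.

```python
import math

def rollout_dmp(y0, goal, weights, centers, widths,
                tau=1.0, alpha=25.0, beta=6.25, alpha_x=3.0,
                dt=0.001, duration=2.0):
    """Integrate a one-dimensional discrete DMP with explicit Euler steps."""
    y, z, x = y0, 0.0, 1.0  # position, scaled velocity, phase
    path = [y]
    for _ in range(round(duration / dt)):
        # Forcing term: normalized weighted sum of Gaussian basis functions,
        # scaled by the phase x so that it vanishes as x decays to zero.
        psi = [math.exp(-h * (x - c) ** 2) for c, h in zip(centers, widths)]
        total = sum(psi)
        f = 0.0
        if total > 1e-12:
            f = sum(w * p for w, p in zip(weights, psi)) / total * x * (goal - y0)
        dz = (alpha * (beta * (goal - y) - z) + f) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        path.append(y)
    return path

centers, widths = [1.0, 0.5, 0.1], [10.0, 10.0, 10.0]
# Zero weights give a plain critically damped reach to the goal.
flat = rollout_dmp(0.0, 1.0, [0.0, 0.0, 0.0], centers, widths)
# Non-zero (made-up) weights reshape the path but keep the same end point.
shaped = rollout_dmp(0.0, 1.0, [50.0, -30.0, 20.0], centers, widths)
```

Because the forcing term fades with the phase, changing `goal` moves the end point while the learned weights preserve the overall trajectory shape, which is the property that makes DMPs attractive for reusing one demonstration across similar tasks.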


Author(s):  
Cong Fei ◽  
Bin Wang ◽  
Yuzheng Zhuang ◽  
Zongzhang Zhang ◽  
Jianye Hao ◽  
...  

Generative adversarial imitation learning (GAIL) has shown promising results by taking advantage of generative adversarial nets, especially in the field of robot learning. However, its requirement for isolated single-modal demonstrations limits the scalability of the approach to real-world scenarios, such as autonomous vehicles' need for a proper understanding of human drivers' behavior. In this paper, we propose a novel multi-modal GAIL framework, named Triple-GAIL, that introduces an auxiliary selector to learn skill selection and imitation jointly from both expert demonstrations and continuously generated experiences, the latter serving as data augmentation. We provide theoretical guarantees on convergence to the optima for both the generator and the selector. Experiments on real driver trajectories and real-time strategy game datasets demonstrate that Triple-GAIL fits multi-modal behaviors more closely to the demonstrators and outperforms state-of-the-art methods.


Product reviews are valuable to prospective customers in helping them make decisions. To this end, numerous opinion mining techniques have been proposed, in which judging the orientation of a review sentence (e.g., positive or negative) is one of the key challenges. Recently, deep learning has emerged as a powerful technique for solving sentiment classification problems. A neural network intrinsically learns a useful representation automatically, without human effort. However, the success of deep learning relies heavily on the availability of large-scale training data. We propose a novel deep learning framework for product review sentiment classification that employs widely available ratings as weak supervision signals. The framework consists of two steps: (1) learning a high-level representation (an embedding space) that captures the general sentiment distribution of sentences through rating information; and (2) adding a classification layer on top of the embedding layer and using labelled sentences for supervised fine-tuning. We explore two kinds of low-level network structure for modelling review sentences, namely convolutional feature extractors and long short-term memory. To evaluate the proposed framework, we construct a dataset containing 1.1M weakly labelled review sentences and 11,754 labelled review sentences from Amazon. Experimental results show the efficacy of the proposed framework and its superiority over baselines. In future work, we aim to detect fake reviews submitted by bots or by malicious users acting for payment. Some companies may hire people to boost their product rankings by assigning fake ratings, and these malicious users or bots repeatedly rate or review such products. Such fake ratings can be detected by analysing the ratings and then removed, so that users see only genuine reviews.


Author(s):  
Luis A. Souza ◽  
Leandro A. Passos ◽  
Robert Mendel ◽  
Alanna Ebigbo ◽  
Andreas Probst ◽  
...  

2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Jianye Zhou ◽  
Xinyu Yang ◽  
Lin Zhang ◽  
Siyu Shao ◽  
Gangying Bian

To realize high-precision and high-efficiency machine fault diagnosis, a novel deep learning framework that combines transfer learning and transposed convolution is proposed. Compared with existing methods, this method trains faster, uses fewer training samples at a time, and achieves higher accuracy. First, the raw data collected by multiple sensors are combined into a graph and normalized to facilitate model training. Next, transposed convolution is used to expand the image resolution, and the resulting images are fed to the transfer learning model for training and fine-tuning. The proposed method adopts 512 time series to conduct experiments on two main mechanical datasets, covering the bearings and the gears of a variable-speed gearbox, which verifies the effectiveness and versatility of the method. Strong results are obtained on both gearbox datasets: the test accuracy reaches 99.99%, a significant improvement over 98.07%.
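To make the resolution-expanding step concrete, here is a minimal sketch of a transposed convolution. It is written in one dimension with no padding for brevity, whereas the abstract describes applying it to images; the function name `transposed_conv1d` and the example kernel are assumptions for illustration only.

```python
def transposed_conv1d(signal, kernel, stride=2):
    """Upsample a 1-D sequence with a transposed convolution (no padding).

    Each input value scatters a scaled copy of the kernel into the output,
    so the output length is (len(signal) - 1) * stride + len(kernel).
    """
    out_len = (len(signal) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, value in enumerate(signal):
        for j, weight in enumerate(kernel):
            out[i * stride + j] += value * weight
    return out

# Three samples become six; the kernel values here are arbitrary.
upsampled = transposed_conv1d([1.0, 2.0, 3.0], [1.0, 0.5], stride=2)
# When the kernel is longer than the stride, neighbouring copies overlap and add.
overlapped = transposed_conv1d([1.0, 1.0], [1.0, 1.0, 1.0], stride=2)
```

In a trained network the kernel weights are learned rather than fixed, so the layer learns how to fill in the added resolution instead of applying a hand-chosen interpolation.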


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Guisheng Hou ◽  
Shuo Xu ◽  
Nan Zhou ◽  
Lei Yang ◽  
Quanhao Fu

Accurate predictions of the remaining useful life (RUL) of important components play a crucial role in system reliability, which is the basis of prognostics and health management (PHM). This paper proposes an integrated deep learning approach for RUL prediction of a turbofan engine by integrating an autoencoder (AE) with a deep convolutional generative adversarial network (DCGAN). In the pretraining stage, the reconstructed data of the AE not only participate in its error reconstruction but also take part in the DCGAN parameter training as the generated data of the DCGAN. Through double-error reconstructions, the capability of feature extraction is enhanced, and high-level abstract information is obtained. In the fine-tuning stage, a long short-term memory (LSTM) network is used to extract the sequential information from the features to predict the RUL. The effectiveness of the proposed scheme is verified on the NASA commercial modular aero-propulsion system simulation (C-MAPSS) dataset. The superiority of the proposed method is demonstrated via excellent prediction performance and comparisons with other existing state-of-the-art prognostics. The results of this study suggest that the proposed data-driven prognostic method offers a new and promising prediction approach and an efficient feature extraction scheme.


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5034
Author(s):  
Yang Zhou ◽  
Rui Fu ◽  
Chang Wang ◽  
Ruibin Zhang

Building a human-like car-following model that can accurately simulate drivers’ car-following behaviors is helpful to the development of driving assistance systems and autonomous driving. Recent studies have shown the advantages of applying reinforcement learning methods in car-following modeling. However, it remains difficult to determine the reward function manually. This paper proposes a novel car-following model based on generative adversarial imitation learning. The proposed model can learn the strategy from drivers’ demonstrations without specifying the reward. Gated recurrent units were incorporated in the actor-critic network to enable the model to use historical information. Drivers’ car-following data, collected by a test vehicle equipped with a millimeter-wave radar and a controller area network acquisition card, were used. The participants were divided into two driving styles by K-means, with time-headway and time-headway when braking used as input features. Under five-fold cross-validation, the results show that the proposed model reproduces drivers’ car-following trajectories and driving styles more accurately than the intelligent driver model and the recurrent neural network-based model, with the lowest average spacing error (19.40%) and speed validation error (5.57%), as well as the lowest Kullback-Leibler divergences of the two indicators used for driving style clustering.
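The driving-style split described above can be sketched as follows, assuming the conventional definition of time headway (gap divided by follower speed) and a plain Lloyd's-algorithm K-means with two clusters. The helper names, the initialisation rule, and the feature values are hypothetical and do not come from the study's data.

```python
def time_headway(spacing_m, follower_speed_mps):
    """Time headway in seconds: gap to the leader divided by follower speed."""
    if follower_speed_mps <= 0:
        raise ValueError("time headway is undefined at zero speed")
    return spacing_m / follower_speed_mps

def kmeans_two_clusters(points, iterations=20):
    """Plain Lloyd's algorithm for k=2 on 2-D feature tuples.

    Initialisation (an assumption): the lexicographically smallest and
    largest points serve as the starting centroids.
    """
    centroids = [min(points), max(points)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min((0, 1), key=lambda k: (p[0] - centroids[k][0]) ** 2
                                      + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    return labels, centroids

# Made-up per-driver features: (mean time headway, mean time headway when braking).
drivers = [(1.0, 0.9), (1.1, 1.0), (0.9, 0.8), (2.4, 2.0), (2.6, 2.2), (2.5, 2.1)]
labels, centroids = kmeans_two_clusters(drivers)
```

With these illustrative values the short-headway drivers fall into one cluster and the long-headway drivers into the other, which mirrors the idea of separating a more aggressive style from a more conservative one.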

