An Online Training Method for Augmenting MPC with Deep Reinforcement Learning

Author(s):  
Guillaume Bellegarda ◽  
Katie Byl
Author(s):  
Chen Chen ◽  
Shuai Mu ◽  
Wanpeng Xiao ◽  
Zexiong Ye ◽  
Liesi Wu ◽  
...  

In this paper, we propose a novel conditional-generative-adversarial-nets-based image captioning framework as an extension of the traditional reinforcement-learning (RL)-based encoder-decoder architecture. To deal with the inconsistent evaluation problem among different objective language metrics, we design "discriminator" networks that automatically and progressively determine whether a generated caption is human described or machine generated. Two kinds of discriminator architectures (CNN-based and RNN-based structures) are introduced, since each has its own advantages. The proposed algorithm is generic, so it can enhance any existing RL-based image captioning framework, and we show that the conventional RL training method is just a special case of our approach. Empirically, we show consistent improvements over all language evaluation metrics for different state-of-the-art image captioning models. In addition, the well-trained discriminators can also be viewed as objective image captioning evaluators.
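One way to picture how a discriminator can extend an RL-based captioner is a reward that blends a language-metric score with the discriminator's "human described" probability. The sketch below is illustrative only: the function name `blended_reward`, the weight `lam`, and the linear blending form are assumptions and are not taken from the abstract. With `lam = 0` the reward falls back to the metric alone, which is one possible reading of the claim that conventional RL training is a special case.

```python
def blended_reward(metric_score: float, disc_prob: float, lam: float = 0.3) -> float:
    """Blend a language-metric score with a discriminator probability.

    metric_score: score of the caption under some language metric (assumed in [0, 1]).
    disc_prob:    discriminator's probability that the caption is human described.
    lam:          hypothetical mixing weight; lam = 0 ignores the discriminator.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * disc_prob + (1.0 - lam) * metric_score


# With lam = 0 the reward is the metric score alone (metric-only RL training).
metric_only = blended_reward(0.8, 0.1, lam=0.0)
# With lam = 0.5 both signals contribute equally.
half_and_half = blended_reward(0.8, 0.4, lam=0.5)
```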


Author(s):  
Qianlong Liu ◽  
Baoliang Cui ◽  
Zhongyu Wei ◽  
Baolin Peng ◽  
Haikuan Huang ◽  
...  

Interactive search, where a set of tags is recommended to users together with search results at each turn, is an effective way to guide users to identify their information need. It is a classical sequential decision problem, and a reinforcement-learning-based agent can be introduced as a solution. The training of the agent can be divided into two stages, offline and online. Existing reinforcement-learning-based systems tend to perform the offline training in a supervised way based on historical labeled data, while the online training is performed via reinforcement learning algorithms based on interactions with real users. The mismatch between online and offline training leads to a cold-start problem for the online usage of the agent. To address this issue, we propose to employ a simulator to mimic the environment for the offline training of the agent. User profiles are used to build a personalized simulator, and a model-based approach is used to train the simulator so that it uses the data efficiently. Experimental results on a real-world dataset demonstrate the effectiveness of our agent and personalized simulator.


Electronics ◽  
2020 ◽  
Vol 9 (6) ◽  
pp. 996
Author(s):  
Wooseok Song ◽  
Woong Hyun Suh ◽  
Chang Wook Ahn

This paper proposes a DRL-based training method for spellcaster units in StarCraft II, one of the most representative Real-Time Strategy (RTS) games. During combat in StarCraft II, micro-controlling various combat units is crucial to winning the game. Among the many combat units, the spellcaster unit is one of the most significant components and greatly influences combat results. Despite the importance of spellcaster units in combat, training methods for carefully controlling spellcasters have not been thoroughly considered in related studies because of their complexity. Therefore, we suggest a training method for spellcaster units in StarCraft II that uses the A3C algorithm. The main idea is to train two Protoss spellcaster units under three newly designed minigames, each representing a unique spell usage scenario, to use 'Force Field' and 'Psionic Storm' effectively. As a result, the trained agents show winning rates of more than 85% in each scenario. We present a new training method for spellcaster units that addresses this gap in StarCraft II AI research. We expect that our training method can be used for training other advanced and tactical units by applying transfer learning in more complex minigame scenarios or full game maps.
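For readers unfamiliar with A3C, the quantity each worker typically computes from a short rollout is an n-step advantage estimate: the discounted return, bootstrapped from the critic's value at the end of the rollout, minus the critic's value at each step. The sketch below shows only that calculation in plain Python. The function name, the discount factor, and the toy numbers are assumptions for illustration; nothing here reproduces the paper's networks, minigames, or hyperparameters.

```python
from typing import List


def n_step_advantages(
    rewards: List[float],
    values: List[float],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> List[float]:
    """Return advantage estimates A_t = R_t - V(s_t) for one rollout.

    rewards:         rewards r_t collected along the rollout.
    values:          critic estimates V(s_t) for the same steps.
    bootstrap_value: critic estimate for the state after the last step
                     (use 0.0 if the episode terminated).
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards so each return reuses the next one.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return [ret - v for ret, v in zip(returns, values)]


# Toy rollout: two steps, reward 1 each, critic predicting 0 everywhere.
toy = n_step_advantages([1.0, 1.0], [0.0, 0.0], bootstrap_value=0.0, gamma=0.5)
```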


2019 ◽  
Vol 7 (7) ◽  
pp. 1214-1219
Author(s):  
Es-hagh Ildarabadi ◽  
Mohammad Ghasem Tabei ◽  
Ameneh Mosaferi Khosh

BACKGROUND: Self-care training is one of the strategies used to control diabetes. There is some ambiguity about the appropriate method for educating middle-aged and older adults about self-care. AIM: This study aimed to compare the effects of face-to-face and online training on self-care levels in middle-aged and older adults with type 2 diabetes. MATERIAL AND METHODS: In a randomised clinical trial, 84 middle-aged and older adults with type 2 diabetes who had been referred to the Diabetes Clinic of Esfarayen in Iran were evaluated. Patients who met the inclusion criteria were randomly assigned to two groups. Diabetes self-care education (DSCE) was provided using a face-to-face training method in one group and an online training method in the other group. The summary of diabetes self-care activities (SDSCA) questionnaire was completed at baseline and 1 month after training. RESULTS: The mean and standard deviation of self-care scores before and 1 month after training were 43.16 ± 14.94 and 65.76 ± 10.65 in the face-to-face training group, and 37 ± 10.75 and 56.82 ± 12.06 in the online training group, respectively. The differences in the self-care scores were significant both before and after the intervention in the two groups (p < 0.05). Although the difference was greater in the face-to-face training group than in the online training group, it was not statistically significant (p > 0.05). CONCLUSION: Both face-to-face and online training had a similar effect on the self-care levels in middle-aged and older adults with type 2 diabetes. Therefore, both training methods could be used as effective techniques to meet the needs and educational requirements of middle-aged and older adults with type 2 diabetes.
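The mean gains implied by the reported scores can be worked out directly. The short calculation below uses only the four group means quoted in the abstract; the variable names are illustrative. It does not attempt to reproduce the significance tests, since the abstract does not report the standard deviation of the change scores.

```python
# Mean SDSCA self-care scores from the abstract: (baseline, 1 month after training).
mean_scores = {
    "face_to_face": (43.16, 65.76),
    "online": (37.00, 56.82),
}

# Mean gain per group, rounded to the two decimals used in the abstract.
gains = {
    group: round(after - before, 2)
    for group, (before, after) in mean_scores.items()
}

# How much larger the face-to-face gain is than the online gain.
gain_gap = round(gains["face_to_face"] - gains["online"], 2)

print(gains)
print(gain_gap)
```

The face-to-face group gained 22.60 points on average and the online group 19.82, a gap of 2.78 points, which matches the abstract's statement that the face-to-face change was larger.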


Author(s):  
Laurie Ehlhardt Powell ◽  
Tracey Wallace ◽  
Michelle Ranae Wild

Research shows that if clinicians are to deliver effective, evidence-based assistive technology for cognition (ATC) services to clients with acquired brain injury (ABI), they first need opportunities to gain knowledge and experience with ATC assessment and training practices (O'Neil-Pirozzi, Kendrick, Goldstein, & Glenn, 2004). This article describes three examples of train-the-trainer materials and programs that address this need: (a) a toolkit for trainers to learn more about assessing and training ATC; (b) a comprehensive, trans-disciplinary program for training staff to provide ATC services in a metropolitan area; and (c) an overview of an on-site/online training package for rehabilitation professionals working with individuals with ABI in remote locations.

