Stable Learning
Recently Published Documents


TOTAL DOCUMENTS: 73 (five years: 15)

H-INDEX: 16 (five years: 2)

2021 ◽ Author(s): Alejandro Guarneros-Sandoval, Mariana Ballesteros, Ivan Salgado, Julia Rodríguez-Santillán, Isaac Chairez

2021 ◽ Vol 4 (1) ◽ pp. 81 ◽ Author(s): Edgardo Leopoldo Maza-Ortega, Carmen Cecilia Espinoza-Melo

This research examines the change that occurs in a university course when narratives are used, an approach that confronts monumentalism in the classroom: in most courses, classes are still taught in a traditional way, without taking into account what students think about the topics covered. The study follows a qualitative method. The results establish the influence of the course on the learning strategies used by the students; as indicators of stable learning, two networks are presented, one corresponding to the use of narratives and the other to the incorporation of questions. Students are motivated to work collaboratively and hold a favorable opinion of the implementation.


Sensors ◽ 2021 ◽ Vol 21 (10) ◽ pp. 3409 ◽ Author(s): Eunjin Jung, Incheol Kim

This study proposes a novel hybrid imitation learning (HIL) framework in which behavior cloning (BC) and state cloning (SC) methods are combined in a mutually complementary manner to enhance the efficiency of robotic manipulation task learning. The proposed HIL framework efficiently combines the BC and SC losses using an adaptive loss mixing method. It uses pretrained dynamics networks to improve SC efficiency and performs stochastic state recovery to ensure stable learning of the policy networks, transforming the learner's task state into a demo state on the demo task trajectory during SC. The training efficiency and policy flexibility of the proposed HIL framework are demonstrated in a series of experiments on major robotic manipulation tasks (pick-up, pick-and-place, and stack). In these experiments, the HIL framework showed about 2.6 times higher performance improvement than pure BC and trained about four times faster than pure SC imitation learning. It also showed about 1.6 times higher performance improvement and about 2.2 times faster training than a hybrid method that combines BC and reinforcement learning (BC + RL).
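As a purely illustrative sketch (not the authors' implementation), an adaptive mixing of BC and SC losses could take the following shape in Python; the names policy, dynamics_model, and mix_ratio, and the use of mean-squared errors, are assumptions made for this example.

    import torch.nn.functional as F

    def hybrid_imitation_loss(policy, dynamics_model, batch, mix_ratio):
        """Blend a behavior-cloning loss with a state-cloning loss.

        mix_ratio in [0, 1]: 0 gives pure BC, 1 gives pure SC. How the
        ratio is adapted over training is left to the caller.
        """
        # Behavior cloning: match the demonstrated action directly.
        pred_action = policy(batch["state"])
        bc_loss = F.mse_loss(pred_action, batch["demo_action"])

        # State cloning: push the predicted action through a pretrained
        # dynamics model and match the demonstrated next state instead.
        pred_next_state = dynamics_model(batch["state"], pred_action)
        sc_loss = F.mse_loss(pred_next_state, batch["demo_next_state"])

        # Adaptive mixing of the two imitation signals.
        return (1.0 - mix_ratio) * bc_loss + mix_ratio * sc_loss

One plausible schedule would weight BC more heavily early in training and shift toward SC as the dynamics model's predictions become reliable, but the abstract does not specify the actual mixing rule.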


Author(s): Zheyan Shen, Peng Cui, Jiashuo Liu, Tong Zhang, Bo Li, ...

Author(s): Wenjie Shi, Shiji Song, Cheng Wu

Maximum entropy deep reinforcement learning (RL) methods have been demonstrated on a range of challenging continuous-control tasks. However, existing methods either suffer from severe instability when trained on large amounts of off-policy data or cannot scale to tasks with very high state and action dimensionality, such as 3D humanoid locomotion. Moreover, the optimality of the desired Boltzmann policy defined with respect to a non-optimal soft value function is not convincingly established. In this paper, we first derive the soft policy gradient from an entropy-regularized expected reward objective for RL with continuous actions. We then present an off-policy, model-free, actor-critic maximum entropy deep RL algorithm called deep soft policy gradient (DSPG), which combines the soft policy gradient with the soft Bellman equation. To ensure stable learning while eliminating the need for two separate critics for the soft value functions, we leverage a double sampling approach that makes the soft Bellman equation tractable. Experimental results demonstrate that our method outperforms prior off-policy methods.
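For orientation, a generic maximum entropy actor-critic update has the shape sketched below in Python. This is the standard entropy-regularized (soft) Bellman target and soft policy loss, not the paper's specific double-sampling estimator; the function names and the temperature alpha are assumptions made for this sketch.

    import torch

    def soft_bellman_target(reward, next_q, next_log_prob, alpha, gamma=0.99):
        # Soft state value V(s') = E_{a'~pi}[Q(s', a') - alpha * log pi(a'|s')],
        # estimated here with a single sampled next action.
        soft_value = next_q - alpha * next_log_prob
        # Entropy-regularized one-step target: r + gamma * V(s').
        return reward + gamma * soft_value

    def soft_policy_loss(q_value, log_prob, alpha):
        # Maximize E[Q(s, a) - alpha * log pi(a|s)], written as a loss to minimize.
        return (alpha * log_prob - q_value).mean()

    # Toy usage with dummy batch tensors:
    reward = torch.zeros(4)
    next_q = torch.randn(4)
    next_log_prob = torch.randn(4)
    target = soft_bellman_target(reward, next_q, next_log_prob, alpha=0.2)

The double sampling approach mentioned in the abstract presumably replaces the single-sample value estimate above so that separate soft value critics are unnecessary, but the abstract does not give those details.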

