A Q-Learning Framework for User QoE Enhanced Self-Organizing Spectrally Efficient Network Using a Novel Inter-Operator Proximal Spectrum Sharing

2016 · Vol 34 (11) · pp. 2887-2901
Author(s): Manikantan Srinivasan, Vijeth J. Kotagi, C. Siva Ram Murthy

2016 · Vol 13 (1) · pp. 85-98
Author(s): Stephen S. Mwanje, Lars Christoph Schmelz, Andreas Mitschele-Thiel

2011 · Vol 181 (13) · pp. 2813-2822
Author(s): Kao-Shing Hwang, Hsin-Yi Lin, Yuan-Pao Hsu, Hung-Hsiu Yu

Mathematics · 2020 · Vol 8 (9) · pp. 1479
Author(s): Francisco Martinez-Gil, Miguel Lozano, Ignacio García-Fernández, Pau Romero, Dolors Serra, ...

Reinforcement learning is one of the most promising machine learning techniques for producing intelligent behaviors in embodied agents in simulations. The output of the classic Temporal Difference family of Reinforcement Learning algorithms takes the form of a value function, expressed either as a numeric table or as a function approximator, and the learned behavior is then derived by acting greedily with respect to this value function. Nevertheless, the learned policy sometimes does not meet expectations, and authoring it is difficult and unsafe because modifying a single value or parameter in the learned value function has unpredictable consequences in the space of policies it represents. This rules out direct manipulation of the learned value function as a method for modifying the derived behaviors. In this paper, we propose using Inverse Reinforcement Learning to incorporate real behavior traces into the learning process to shape the learned behaviors, thus increasing their trustworthiness (in terms of conformance to reality). To do so, we adapt the Inverse Reinforcement Learning framework to the navigation problem domain. Specifically, we use Soft Q-learning, an algorithm based on the maximum causal entropy principle, with MARL-Ped (a Reinforcement Learning-based pedestrian simulator) to incorporate information from trajectories of real pedestrians into the process of learning how to navigate a virtual 3D space that represents the real environment. A comparison with behaviors learned using a classic Reinforcement Learning algorithm (Sarsa(λ)) shows that the Inverse Reinforcement Learning behaviors adjust significantly better to the real trajectories.
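The core mechanism referenced above is the soft Bellman backup of Soft Q-learning, which replaces the hard max used by classic Q-learning and Sarsa(λ) with a temperature-scaled log-sum-exp so the learned policy retains entropy, consistent with the maximum causal entropy principle. Below is a minimal tabular sketch of that update, assuming a placeholder environment interface (env.reset/env.step), state and action counts, and a generic reward signal; in the paper's Inverse Reinforcement Learning setting the reward would instead come from a model fitted to the real pedestrian trajectories, and MARL-Ped's actual API is not reproduced here.

import numpy as np

def soft_q_learning(env, n_states, n_actions, alpha=0.1, gamma=0.99,
                    temperature=1.0, episodes=500):
    # Tabular Soft Q-learning sketch; env is a hypothetical gridworld-style
    # navigation environment, not MARL-Ped.
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Sample an action from the soft (Boltzmann) policy derived from Q.
            logits = Q[s] / temperature
            probs = np.exp(logits - logits.max())
            probs /= probs.sum()
            a = np.random.choice(n_actions, p=probs)

            s_next, r, done = env.step(a)

            # Soft state value: temperature-scaled log-sum-exp instead of a hard max.
            m = Q[s_next].max()
            v_soft = m + temperature * np.log(np.sum(np.exp((Q[s_next] - m) / temperature)))
            target = r + (0.0 if done else gamma * v_soft)
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return Q

Sampling from the Boltzmann policy rather than acting greedily is what keeps exploration and the entropy term coupled to the same value function, which is the property the maximum causal entropy formulation relies on.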


2021
Author(s): Aqib A. Syed, Thanakorn Khamvilai, Yoobin Kim, Kyriakos G. Vamvoudakis

2019 · Vol 64 (9) · pp. 3756-3763
Author(s): Donghwan Lee, Jianghai Hu

Electronics · 2019 · Vol 8 (6) · pp. 615
Author(s): Ching-Chang Wong, Chih-Cheng Liu, Sheng-Ru Xiao, Hao-Yu Yang, Meng-Cheng Lau

In this paper, an oscillator-based gait pattern built from sinusoidal functions is designed and implemented on a field-programmable gate array (FPGA) chip to generate a trajectory plan and achieve bipedal locomotion for a small-sized humanoid robot. To let the robot walk straight, the turning direction is treated as a parameter of the gait pattern, and Q-learning is used to obtain a gait pattern that walks straight ahead. Moreover, an automatic training platform is designed so that the learning process is automated; in this way, the turning direction can be adjusted flexibly and efficiently under the supervision of the automatic training platform. The experimental results show that the proposed learning framework allows the humanoid robot to gradually learn to walk straight during the automated learning process.
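To make the two ingredients of this abstract concrete, the sketch below pairs a sinusoidal oscillator gait, with the turning direction exposed as one parameter, with a small tabular Q-learning update over a discretized set of turning offsets. The joint layout, amplitudes, state discretization, and drift-based reward are illustrative assumptions only; they do not reflect the paper's FPGA implementation or its automatic training platform.

import numpy as np

def gait_pattern(t, turn_offset, freq=1.0, hip_amp=0.3, ankle_amp=0.15):
    # Sinusoidal swing targets (radians) at time t; turn_offset biases the hips
    # in opposite directions to steer the walk left or right.
    phase = 2.0 * np.pi * freq * t
    return {
        "hip_left":    hip_amp * np.sin(phase) + turn_offset,
        "hip_right":   hip_amp * np.sin(phase + np.pi) - turn_offset,
        "ankle_left":  ankle_amp * np.sin(phase + np.pi / 2.0),
        "ankle_right": ankle_amp * np.sin(phase - np.pi / 2.0),
    }

# Tabular Q-learning over discretized turn offsets.
# States: 0 = drifting left, 1 = roughly straight, 2 = drifting right.
turn_actions = np.linspace(-0.05, 0.05, 5)   # candidate turn offsets (rad)
Q = np.zeros((3, len(turn_actions)))

def q_update(state, action, reward, next_state, alpha=0.2, gamma=0.9):
    # Standard Q-learning backup; here the reward would penalize lateral drift
    # measured by the (hypothetical) training platform after each walking trial.
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])

In this arrangement, each walking trial yields a drift measurement, the corresponding reward updates Q, and the next trial uses the turn offset that is currently best for the observed drift state, so the gait converges toward walking straight.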

