Reinforcement Learning Algorithm with CTRNN in Continuous Action Space

Author(s):  
Hiroaki Arie ◽  
Jun Namikawa ◽  
Tetsuya Ogata ◽  
Jun Tani ◽  
Shigeki Sugano
2021 ◽  
Vol 11 (20) ◽  
pp. 9367
Author(s):  
Usman Ahmad Usmani ◽  
Junzo Watada ◽  
Jafreezal Jaafar ◽  
Izzatdin Abdul Aziz ◽  
Arunava Roy

Skin cancers are increasing at an alarming rate, and early detection is essential for effective treatment. Current segmentation methods struggle to match the ground-truth images because the datasets contain numerous noisy expert annotations, yet precise boundary segmentation is essential to correctly locate and diagnose the various skin lesions. In this work, lesion segmentation is formulated as a Markov decision process and solved by training an agent with a deep reinforcement learning algorithm, in a manner similar to how physicians delineate a region of interest. The agent takes a series of actions to delineate the region, and the action space is defined as a set of continuous action parameters. The segmentation model learns in this continuous action space using the deep deterministic policy gradient (DDPG) algorithm, which allows performance to improve steadily from coarse to fine segmentation results. The proposed model is evaluated on the International Skin Imaging Collaboration (ISIC) 2017 dataset, the Human Against Machine (HAM10000) dataset, and the PH2 dataset. On the ISIC 2017 dataset, the algorithm achieves an accuracy of 96.33% for naevus cases, 95.39% for melanoma cases, and 94.27% for seborrheic keratosis cases. The remaining metrics evaluated on these datasets also rank above those of current state-of-the-art lesion segmentation algorithms.
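As an illustration of the continuous-action learning described above, the sketch below shows a generic DDPG actor-critic update in PyTorch. It is not the authors' implementation: the state and action dimensions, network sizes, and hyperparameters are placeholder assumptions standing in for the image features and boundary-adjustment parameters used in the paper.

```python
# Minimal DDPG sketch (assumptions, not the paper's code): actor outputs a
# continuous action (e.g. a boundary adjustment), critic scores state-action pairs.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, ACTION_DIM = 64, 4   # hypothetical feature / action sizes
GAMMA, TAU = 0.99, 0.005        # discount factor, soft-update rate

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, ACTION_DIM), nn.Tanh())
    def forward(self, s):                      # action in [-1, 1]^ACTION_DIM
        return self.net(s)

class Critic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, 1))
    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, critic = Actor(), Critic()
actor_tgt, critic_tgt = Actor(), Critic()
actor_tgt.load_state_dict(actor.state_dict())
critic_tgt.load_state_dict(critic.state_dict())
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s_next, done):
    """One gradient step on a replay minibatch (r, done shaped (batch, 1))."""
    with torch.no_grad():                      # bootstrapped target via target nets
        q_next = critic_tgt(s_next, actor_tgt(s_next))
        y = r + GAMMA * (1.0 - done) * q_next
    critic_loss = F.mse_loss(critic(s, a), y)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    actor_loss = -critic(s, actor(s)).mean()   # deterministic policy gradient
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    for tgt, src in ((actor_tgt, actor), (critic_tgt, critic)):  # soft target update
        for p_t, p in zip(tgt.parameters(), src.parameters()):
            p_t.data.mul_(1.0 - TAU).add_(TAU * p.data)
```

In this setup the actor's tanh output would be mapped to the paper's continuous boundary-adjustment parameters; the exact mapping is not specified in the abstract.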


Author(s):  
Yuntao Han ◽  
Qibin Zhou ◽  
Fuqing Duan

The digital curling game is a two-player zero-sum extensive game with a continuous action space. Several challenging problems remain unsolved, such as strategy uncertainty, searching the large game tree, and the reliance on large amounts of supervised data. In this work, we combine NFSP and KR-UCT for digital curling games: NFSP uses two adversarial learning networks and automatically produces supervised data, while KR-UCT handles large game-tree search in the continuous action space. We also propose two reward mechanisms that make the reinforcement learning converge quickly. Experimental results validate the proposed method and show that the strategy model can reach a Nash equilibrium.
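The KR-UCT component mentioned above selects among continuous shot parameters by smoothing value and visit statistics with a kernel over neighbouring actions. The sketch below is a minimal illustration of that selection rule, not the authors' code; the Gaussian kernel, bandwidth, exploration constant, and the (angle, velocity, spin) shot parameterisation are assumptions.

```python
# KR-UCT-style selection sketch (illustrative assumptions): each candidate
# action's value is a kernel-regression estimate over visited actions, and the
# UCB exploration term uses the kernel-weighted (effective) visit count.
import numpy as np

def gaussian_kernel(a, b, bandwidth=0.1):
    return np.exp(-np.sum((a - b) ** 2) / (2.0 * bandwidth ** 2))

def kr_uct_select(actions, values, visits, exploration=1.4):
    """actions: (N, d) visited shot parameters; values/visits: per-action stats."""
    total_visits = visits.sum()
    scores = []
    for a in actions:
        w = np.array([gaussian_kernel(a, b) for b in actions])  # kernel weights
        kr_value = np.dot(w, values) / w.sum()                   # smoothed value
        kr_visits = np.dot(w, visits)                            # effective visits
        ucb = kr_value + exploration * np.sqrt(np.log(total_visits + 1.0) / (kr_visits + 1e-8))
        scores.append(ucb)
    return actions[int(np.argmax(scores))]

# Hypothetical usage: shot = (angle, velocity, spin), values = mean end-of-end rewards.
rng = np.random.default_rng(0)
acts = rng.uniform(-1, 1, size=(8, 3))
vals = rng.uniform(0, 1, size=8)
vis = rng.integers(1, 10, size=8).astype(float)
best_shot = kr_uct_select(acts, vals, vis)
```

In the full method this selection would sit inside the game-tree search, with NFSP's networks supplying priors and evaluations; those pieces are omitted here.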


2021 ◽  
Vol 54 (3-4) ◽  
pp. 417-428
Author(s):  
Yanyan Dai ◽  
KiDong Lee ◽  
SukGyu Lee

Rotary inverted pendulum systems are a standard benchmark for nonlinear control in real applications. Without a deep understanding of control theory, it is difficult to control a rotary inverted pendulum platform using classic control engineering models, as shown in section 2.1. Therefore, instead of relying on classic control theory, this paper controls the platform by training and testing a reinforcement learning algorithm. Reinforcement learning (RL) has produced many recent achievements, but there is little research on quickly testing high-frequency RL algorithms in a real hardware environment. In this paper, we propose a real-time hardware-in-the-loop (HIL) control system to train and test a deep reinforcement learning algorithm from simulation through to real hardware implementation. The agent is implemented with the Double Deep Q-Network (DDQN) with prioritized experience replay, which requires no deep understanding of classical control engineering. For the real experiment, to swing up the rotary inverted pendulum and move it smoothly, we define 21 actions for swinging up and balancing the pendulum. Compared with the Deep Q-Network (DQN), DDQN with prioritized experience replay removes the overestimation of the Q value and decreases the training time. Finally, this paper presents experimental results comparing classic control theory with different reinforcement learning algorithms.
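The core update behind the approach above is the Double DQN target combined with importance-sampling weights from prioritized experience replay. The following sketch illustrates that loss; the 4-dimensional pendulum state, network sizes, and hyperparameters are illustrative assumptions, and only the 21-action discretisation comes from the text.

```python
# Double DQN + prioritized-replay loss sketch (assumptions, not the paper's code):
# the online net selects the next action, the target net evaluates it, and the
# squared TD error is weighted by the replay buffer's importance-sampling weights.
import torch
import torch.nn as nn

N_ACTIONS, GAMMA = 21, 0.99       # 21 discrete swing-up/balance actions (from the text)

class QNet(nn.Module):
    def __init__(self, state_dim=4):              # hypothetical pendulum state size
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                 nn.Linear(128, N_ACTIONS))
    def forward(self, s):
        return self.net(s)

q_net, target_net = QNet(), QNet()
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def ddqn_loss(s, a, r, s_next, done, is_weights):
    """s: (batch, 4); a: (batch,) long; r, done, is_weights: (batch,) float."""
    with torch.no_grad():
        next_a = q_net(s_next).argmax(dim=1, keepdim=True)        # action selection
        next_q = target_net(s_next).gather(1, next_a).squeeze(1)  # action evaluation
        y = r + GAMMA * (1.0 - done) * next_q
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    td_error = y - q
    loss = (is_weights * td_error.pow(2)).mean()   # prioritized-replay IS weighting
    return loss, td_error.abs().detach()            # |TD error| becomes the new priority
```

Decoupling action selection (online network) from action evaluation (target network) is what removes the Q-value overestimation the abstract attributes to plain DQN.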

