Automatic Data Augmentation by Upper Confidence Bounds for Deep Reinforcement Learning

Author(s): Yoonhee Gil ◽ Jongchan Baek ◽ Jonghyuk Park ◽ Soohee Han

2021 ◽ Vol 30 ◽ pp. 8483-8496
Author(s): Yi Tang ◽ Baopu Li ◽ Min Liu ◽ Boyu Chen ◽ Yaonan Wang ◽ ...

Author(s): Yonggang Li ◽ Guosheng Hu ◽ Yongtao Wang ◽ Timothy Hospedales ◽ Neil M. Robertson ◽ ...

2021 ◽ pp. 224-235
Author(s): Alexandra Rak ◽ Alexey Skrynnik ◽ Aleksandr I. Panov

2021 ◽ Vol 11 (12) ◽ pp. 5586
Author(s): Eunkyeong Kim ◽ Jinyong Kim ◽ Hansoo Lee ◽ Sungshin Kim

Artificial intelligence and robot vision systems are core technologies in smart factories, and there is growing scholarly interest in automatic feature extraction from factory data using deep learning networks. However, such networks require large amounts of training data, and even barely perceptible noise can degrade classification accuracy. Therefore, to enlarge the training set and achieve robustness against noise attacks, this study developed a data augmentation method based on the adaptive inverse peak signal-to-noise ratio (PSNR), which accounts for the color characteristics of the training images. The method automatically determines the optimal perturbation range of the color perturbation used to generate new images, with weights derived from those color characteristics. Experimental results showed that the proposed method generates new training images from the originals, classifies noisy images more accurately, and generally improves classification accuracy, demonstrating that it is effective and robust to noise even when training data are scarce.
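The abstract does not reproduce the exact adaptive inverse-PSNR formulation, so the following Python sketch only illustrates the general idea under stated assumptions: a random per-channel color shift is widened until the inverse PSNR of the perturbed image reaches a target weight. The function names, the 1.5 growth factor, and the target and cap values are all illustrative, not the authors' method.

```python
import numpy as np

def psnr(original, perturbed, max_val=255.0):
    """Peak signal-to-noise ratio; larger means the images are more alike."""
    mse = np.mean((original.astype(np.float64) - perturbed.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

def color_perturb(image, scale, rng):
    """Shift each color channel by a random offset drawn from [-scale, scale]."""
    shift = rng.uniform(-scale, scale, size=(1, 1, image.shape[2]))
    return np.clip(image.astype(np.float64) + shift, 0.0, 255.0).astype(np.uint8)

def augment_by_inverse_psnr(image, init_scale=4.0, target_inv_psnr=0.035, rng=None):
    """Widen the color-perturbation range until 1/PSNR reaches the target,
    so that images whose colors absorb perturbation get a larger range."""
    rng = rng or np.random.default_rng()
    scale = init_scale
    candidate = color_perturb(image, scale, rng)
    while 1.0 / psnr(image, candidate) < target_inv_psnr and scale < 128.0:
        scale *= 1.5  # illustrative growth factor, not from the paper
        candidate = color_perturb(image, scale, rng)
    return candidate
```

Calling `augment_by_inverse_psnr(image)` repeatedly on each original image would yield the additional, noise-diverse training samples the abstract describes.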


Electronics ◽ 2020 ◽ Vol 9 (9) ◽ pp. 1384
Author(s): Yuyu Yuan ◽ Wen Wen ◽ Jincui Yang

In algorithmic trading, an adequate training data set is key to making profits. However, stock trading data at daily granularity cannot meet the great data demands of reinforcement learning. To address this problem, we propose a framework named data augmentation based reinforcement learning (DARL), which uses minute-candle data (open, high, low, close) to train the agent; the agent is then used to guide daily stock trading. In this way, we increase the number of training instances available by hundreds of times, which substantially improves the reinforcement learning effect. However, not all stocks are suitable for this kind of trading. Therefore, we propose an access mechanism based on skewness and kurtosis to select stocks that can be traded properly using this algorithm. In our experiments, we find that proximal policy optimization (PPO) is the most stable algorithm for achieving high risk-adjusted returns, while deep Q-learning (DQN) and soft actor-critic (SAC) can beat the market in Sharpe ratio.
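The abstract does not give the exact admission rule, so here is a minimal Python sketch of such a skewness-and-kurtosis access mechanism; the threshold values, the function name, and the use of log returns on minute closes are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import kurtosis, skew

def admit_for_darl(minute_close: np.ndarray,
                   max_abs_skew: float = 1.0,
                   max_excess_kurt: float = 5.0) -> bool:
    """Access-mechanism sketch: admit a stock for minute-candle training only
    if its minute-level log returns are not too asymmetric (skewness) and not
    too heavy-tailed (excess kurtosis). Thresholds here are illustrative."""
    returns = np.diff(np.log(minute_close))
    return bool(abs(skew(returns)) <= max_abs_skew
                and kurtosis(returns) <= max_excess_kurt)  # Fisher (excess) kurtosis
```

A stock passing `admit_for_darl(prices)` would then contribute its minute candles as training instances for the daily-trading agent.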

