Deep Reinforcement Learning and Genetic Algorithm for a Pairs Trading Task on commodities

2020 ◽  
Author(s):  
Georgios Sermpinis ◽  
Charalampos Stasinakis ◽  
Xiangyu Zong
Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 471
Author(s):  
Jai Hoon Park ◽  
Kang Hoon Lee

Designing novel robots that can cope with a specific task is a challenging problem because of the enormous design space that involves both morphological structures and control mechanisms. To this end, we present a computational method for automating the design of modular robots. Our method employs a genetic algorithm to evolve robotic structures as an outer optimization, and it applies a reinforcement learning algorithm to each candidate structure to train its behavior and evaluate its potential learning ability as an inner optimization. The size of the design space is reduced significantly by evolving only the robotic structure and by performing behavioral optimization using a separate training algorithm compared to that when both the structure and behavior are evolved simultaneously. Mutual dependence between evolution and learning is achieved by regarding the mean cumulative rewards of a candidate structure in the reinforcement learning as its fitness in the genetic algorithm. Therefore, our method searches for prospective robotic structures that can potentially lead to near-optimal behaviors if trained sufficiently. We demonstrate the usefulness of our method through several effective design results that were automatically generated in the process of experimenting with actual modular robotics kit.


2021 ◽  
Vol 01 ◽  
Author(s):  
Ying Li ◽  
Chubing Guo ◽  
Jianshe Wu ◽  
Xin Zhang ◽  
Jian Gao ◽  
...  

Background: Unmanned systems have been widely used in multiple fields. Many algorithms have been proposed to solve path planning problems. Each algorithm has its advantages and defects and cannot adapt to all kinds of requirements. An appropriate path planning method is needed for various applications. Objective: To select an appropriate algorithm fastly in a given application. This could be helpful for improving the efficiency of path planning for Unmanned systems. Methods: This paper proposes to represent and quantify the features of algorithms based on the physical indicators of results. At the same time, an algorithmic collaborative scheme is developed to search the appropriate algorithm according to the requirement of the application. As an illustration of the scheme, four algorithms, including the A-star (A*) algorithm, reinforcement learning, genetic algorithm, and ant colony optimization algorithm, are implemented in the representation of their features. Results: In different simulations, the algorithmic collaborative scheme can select an appropriate algorithm in a given application based on the representation of algorithms. And the algorithm could plan a feasible and effective path. Conclusion: An algorithmic collaborative scheme is proposed, which is based on the representation of algorithms and requirement of the application. The simulation results prove the feasibility of the scheme and the representation of algorithms.


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-20 ◽  
Author(s):  
Taewook Kim ◽  
Ha Young Kim

Many researchers have tried to optimize pairs trading as the numbers of opportunities for arbitrage profit have gradually decreased. Pairs trading is a market-neutral strategy; it profits if the given condition is satisfied within a given trading window, and if not, there is a risk of loss. In this study, we propose an optimized pairs-trading strategy using deep reinforcement learning—particularly with the deep Q-network—utilizing various trading and stop-loss boundaries. More specifically, if spreads hit trading thresholds and reverse to the mean, the agent receives a positive reward. However, if spreads hit stop-loss thresholds or fail to reverse to the mean after hitting the trading thresholds, the agent receives a negative reward. The agent is trained to select the optimum level of discretized trading and stop-loss boundaries given a spread to maximize the expected sum of discounted future profits. Pairs are selected from stocks on the S&P 500 Index using a cointegration test. We compared our proposed method with traditional pairs-trading strategies which use constant trading and stop-loss boundaries. We find that our proposed model is trained well and outperforms traditional pairs-trading strategies.


1995 ◽  
Vol 3 (2) ◽  
pp. 149-175 ◽  
Author(s):  
Stewart W. Wilson

In many classifier systems, the classifier strength parameter serves as a predictor of future payoff and as the classifier's fitness for the genetic algorithm. We investigate a classifier system, XCS, in which each classifier maintains a prediction of expected payoff, but the classifier's fitness is given by a measure of the prediction's accuracy. The system executes the genetic algorithm in niches defined by the match sets, instead of panmictically. These aspects of XCS result in its population tending to form a complete and accurate mapping X × A → P from inputs and actions to payoff predictions. Further, XCS tends to evolve classifiers that are maximally general, subject to an accuracy criterion. Besides introducing a new direction for classifier system research, these properties of XCS make it suitable for a wide range of reinforcement learning situations where generalization over states is desirable.


2016 ◽  
Vol 20 (12) ◽  
pp. 5051-5066 ◽  
Author(s):  
Saeid Fallahpour ◽  
Hasan Hakimian ◽  
Khalil Taheri ◽  
Ehsan Ramezanifar

Author(s):  
Joao Paulo Queiroz dos Santos ◽  
Francisco Chagas de Lima ◽  
Rafael Marrocos Magalhaes ◽  
Jorge Dantas de Melo ◽  
Adriao Duarte Doria Neto

Sign in / Sign up

Export Citation Format

Share Document