Deep Reinforcement Learning and Genetic Algorithm for a Pairs Trading Task on commodities

Designing novel robots that can cope with a specific task is a challenging problem because of the enormous design space, which involves both morphological structures and control mechanisms. To this end, we present a computational method for automating the design of modular robots. Our method employs a genetic algorithm to evolve robotic structures as an outer optimization, and it applies a reinforcement learning algorithm to each candidate structure to train its behavior and evaluate its potential learning ability as an inner optimization. Evolving only the robotic structure and optimizing behavior with a separate training algorithm significantly reduces the size of the design space compared with evolving both the structure and the behavior simultaneously. Mutual dependence between evolution and learning is achieved by regarding the mean cumulative reward of a candidate structure in the reinforcement learning as its fitness in the genetic algorithm. Therefore, our method searches for prospective robotic structures that can potentially lead to near-optimal behaviors if trained sufficiently. We demonstrate the usefulness of our method through several effective design results that were automatically generated in the process of experimenting with an actual modular robotics kit.
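
The nested optimization in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the bit-string encoding of a structure, the toy three-action task, the epsilon-greedy learner, and the names `train_and_evaluate` and `evolve` are all assumptions introduced here. The only element taken from the abstract is that the mean cumulative reward obtained by the inner learner serves as the fitness in the outer genetic algorithm.

```python
import random


def train_and_evaluate(structure, episodes=200, epsilon=0.1):
    """Inner loop (stand-in): train an epsilon-greedy learner on a toy task
    whose payoffs depend on the structure, and return the mean reward.
    Each episode is a single step, so episode reward == cumulative reward."""
    # Seed from the structure so the fitness of a given structure is repeatable.
    rng = random.Random(int("".join(map(str, structure)), 2))
    n_actions = 3
    # Hypothetical task: action a pays the number of active modules
    # at positions a, a + 3, a + 6, ...
    true_means = [float(sum(structure[a::n_actions])) for a in range(n_actions)]
    q = [0.0] * n_actions       # running value estimates
    counts = [0] * n_actions
    total = 0.0
    for _ in range(episodes):
        if rng.random() < epsilon:
            a = rng.randrange(n_actions)                       # explore
        else:
            a = max(range(n_actions), key=lambda i: q[i])      # exploit
        r = true_means[a] + rng.gauss(0.0, 0.1)
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]                         # incremental mean
        total += r
    return total / episodes


def evolve(n_modules=9, pop_size=12, generations=15, mutation_rate=0.1, seed=0):
    """Outer loop: genetic algorithm over structures, fitness from the inner loop."""
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 1) for _ in range(n_modules))
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(((train_and_evaluate(s), s) for s in population),
                        reverse=True)
        history.append(scored[0][0])
        next_pop = [scored[0][1]]                  # elitism: keep the best
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(scored, 3))[1]     # tournament selection
            p2 = max(rng.sample(scored, 3))[1]
            cut = rng.randrange(1, n_modules)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = tuple(b ^ 1 if rng.random() < mutation_rate else b
                          for b in child)          # bit-flip mutation
            next_pop.append(child)
        population = next_pop
    best_fitness, best = max((train_and_evaluate(s), s) for s in population)
    history.append(best_fitness)
    return best, best_fitness, history


if __name__ == "__main__":
    best, fitness, _ = evolve()
    print(best, round(fitness, 3))
```

Because the genetic algorithm only ever sees the scalar returned by the inner loop, the learner can be swapped for any reinforcement learning method without changing the outer search.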

Optimization of Multilayer Optical Films with Unsupervised Learning, reinforcement learning and genetic algorithm

Frontiers in Optics / Laser Science ◽

10.1364/fio.2020.jm6a.5 ◽

2020 ◽

Author(s):

Jiang Anqing ◽

Osamu Yoshie

Keyword(s):

Genetic Algorithm ◽

Reinforcement Learning ◽

Unsupervised Learning ◽

Optical Films ◽

Learning Reinforcement

Collaborative Scheduling of Algorithms for Path Planning of Unmanned Systems

Current Chinese Science ◽

10.2174/2210298101666210211094253 ◽

2021 ◽

Vol 01 ◽

Author(s):

Ying Li ◽

Chubing Guo ◽

Jianshe Wu ◽

Xin Zhang ◽

Jian Gao ◽

...

Keyword(s):

Genetic Algorithm ◽

Reinforcement Learning ◽

Path Planning ◽

Unmanned Systems ◽

Ant Colony Optimization Algorithm ◽

A* Algorithm ◽

Collaborative Scheduling ◽

Simulation Results ◽

Planning Problems ◽

Effective Path

Background: Unmanned systems have been widely used in multiple fields, and many algorithms have been proposed to solve their path planning problems. Each algorithm has its own advantages and defects, and none can adapt to all kinds of requirements, so an appropriate path planning method must be chosen for each application. Objective: To quickly select an appropriate algorithm for a given application, which could help improve the efficiency of path planning for unmanned systems. Methods: This paper proposes to represent and quantify the features of algorithms based on the physical indicators of their results. In addition, an algorithmic collaborative scheme is developed to search for the appropriate algorithm according to the requirements of the application. As an illustration of the scheme, four algorithms (the A-star (A*) algorithm, reinforcement learning, a genetic algorithm, and an ant colony optimization algorithm) are implemented and characterized using this feature representation. Results: In different simulations, the algorithmic collaborative scheme can select an appropriate algorithm for a given application based on the representation of the algorithms, and the selected algorithm can plan a feasible and effective path. Conclusion: An algorithmic collaborative scheme is proposed, based on the representation of algorithms and the requirements of the application. The simulation results support the feasibility of the scheme and of the representation of algorithms.
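
One simple way to picture the selection step described above is a weighted score over quantified indicators. The abstract does not say which indicators or which matching rule the paper uses, so everything in this sketch is an assumption: the function name `select_algorithm`, the indicator names, the normalization to [0, 1], and the placeholder planners `planner_x` and `planner_y`, whose numbers are invented for illustration and are not measurements of any real algorithm.

```python
def select_algorithm(profiles, weights):
    """Return the name of the algorithm whose indicator profile best matches
    the application's requirement.

    profiles: name -> {indicator: value in [0, 1], higher is better}
    weights:  indicator -> importance for this application
    """
    def score(name):
        # Weighted sum; indicators missing from a profile count as 0.
        return sum(w * profiles[name].get(k, 0.0) for k, w in weights.items())

    return max(profiles, key=score)


# Placeholder profiles (hypothetical values, for illustration only).
profiles = {
    "planner_x": {"path_quality": 0.9, "speed": 0.2},
    "planner_y": {"path_quality": 0.6, "speed": 0.9},
}

if __name__ == "__main__":
    print(select_algorithm(profiles, {"path_quality": 0.2, "speed": 0.8}))
    print(select_algorithm(profiles, {"path_quality": 0.8, "speed": 0.2}))
```

Changing only the weights changes which planner is chosen, which is the behavior the abstract attributes to the collaborative scheme.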

Improving Pairs Trading Strategies via Reinforcement Learning

2021 International Conference on Applied Artificial Intelligence (ICAPAI) ◽

10.1109/icapai49758.2021.9462067 ◽

2021 ◽

Author(s):

Cheng Wang ◽

Patrik Sandas ◽

Peter Beling

Keyword(s):

Reinforcement Learning ◽

Trading Strategies ◽

Pairs Trading

Optimizing the Pairs-Trading Strategy Using Deep Reinforcement Learning with Trading and Stop-Loss Boundaries

Complexity ◽

10.1155/2019/3582516 ◽

2019 ◽

Vol 2019 ◽

pp. 1-20 ◽

Cited By ~ 1

Author(s):

Taewook Kim ◽

Ha Young Kim

Keyword(s):

Reinforcement Learning ◽

Trading Strategy ◽

Trading Strategies ◽

Cointegration Test ◽

Optimum Level ◽

Pairs Trading ◽

Proposed Model ◽

The Mean ◽

Stop Loss ◽

The Given

Many researchers have tried to optimize pairs trading as the number of opportunities for arbitrage profit has gradually decreased. Pairs trading is a market-neutral strategy; it profits if the given condition is satisfied within a given trading window, and if not, there is a risk of loss. In this study, we propose an optimized pairs-trading strategy using deep reinforcement learning, specifically the deep Q-network, with various trading and stop-loss boundaries. More specifically, if the spread hits a trading threshold and then reverts to the mean, the agent receives a positive reward. However, if the spread hits a stop-loss threshold or fails to revert to the mean after hitting the trading threshold, the agent receives a negative reward. The agent is trained to select the optimum level of discretized trading and stop-loss boundaries for a given spread to maximize the expected sum of discounted future profits. Pairs are selected from stocks on the S&P 500 Index using a cointegration test. We compared our proposed method with traditional pairs-trading strategies, which use constant trading and stop-loss boundaries. We find that our proposed model is trained well and outperforms traditional pairs-trading strategies.
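
The reward rule in this abstract can be sketched as a small function over one trading window. This is an illustration under stated assumptions, not the paper's code: the reward magnitudes of +1, -1, and 0, the function names `episode_reward` and `best_action`, and the candidate boundary pairs in `ACTIONS` are hypothetical. The exhaustive search in `best_action` only stands in for the choice that the paper's deep Q-network learns to make.

```python
def episode_reward(spread, trade_thr, stop_thr, mean=0.0):
    """Reward for one trading window under given boundaries.

    +1.0  spread hits the trading threshold, then reverts to the mean
    -1.0  spread hits the stop-loss threshold, or the window ends
          with the position still open
     0.0  spread never reaches the trading threshold (no trade)
    """
    assert 0 < trade_thr < stop_thr
    side = 0  # 0: no position, +1: opened above the mean, -1: opened below
    for s in spread:
        dev = s - mean
        if side == 0 and abs(dev) >= trade_thr:
            side = 1 if dev > 0 else -1           # open the position
        if side != 0:
            if side * dev >= stop_thr:
                return -1.0                       # stop-loss boundary hit
            if side * dev <= 0:
                return 1.0                        # reverted to the mean
    return -1.0 if side != 0 else 0.0


# Hypothetical discretized (trading, stop-loss) boundary pairs.
ACTIONS = [(0.5, 2.5), (1.0, 3.0), (1.5, 3.5)]


def best_action(spread):
    """Brute-force stand-in for the agent: pick the pair with the highest reward."""
    return max(ACTIONS, key=lambda a: episode_reward(spread, *a))


if __name__ == "__main__":
    window = [0.0, 0.7, 0.0]
    print(best_action(window), episode_reward(window, 0.5, 2.5))
```

A constant-boundary baseline corresponds to always using the same entry of `ACTIONS`, whereas the learned agent chooses the entry per spread.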

Classifier Fitness Based on Accuracy

Evolutionary Computation ◽

10.1162/evco.1995.3.2.149 ◽

1995 ◽

Vol 3 (2) ◽

pp. 149-175 ◽

Cited By ~ 837

Author(s):

Stewart W. Wilson

Keyword(s):

Genetic Algorithm ◽

Reinforcement Learning ◽

Strength Parameter ◽

System Research ◽

Classifier Systems ◽

Expected Payoff ◽

Classifier System ◽

Wide Range ◽

Accuracy Criterion

In many classifier systems, the classifier strength parameter serves as a predictor of future payoff and as the classifier's fitness for the genetic algorithm. We investigate a classifier system, XCS, in which each classifier maintains a prediction of expected payoff, but the classifier's fitness is given by a measure of the prediction's accuracy. The system executes the genetic algorithm in niches defined by the match sets, instead of panmictically. These aspects of XCS result in its population tending to form a complete and accurate mapping X × A → P from inputs and actions to payoff predictions. Further, XCS tends to evolve classifiers that are maximally general, subject to an accuracy criterion. Besides introducing a new direction for classifier system research, these properties of XCS make it suitable for a wide range of reinforcement learning situations where generalization over states is desirable.
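
The idea of basing fitness on prediction accuracy rather than on predicted payoff can be sketched as below. This is a simplified illustration that assumes the power-law accuracy function and parameter names (`beta`, `eps0`, `alpha`, `nu`) used in later algorithmic descriptions of XCS, which may differ in form and update order from the 1995 paper. Numerosity, the match-set niche genetic algorithm, and generalization are omitted, and the `Classifier` class and the two-state demo are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Classifier:
    prediction: float = 10.0   # estimate of expected payoff
    error: float = 0.0         # estimate of absolute prediction error
    fitness: float = 0.01      # accuracy-based fitness


def update_action_set(action_set, payoff, beta=0.2, eps0=1.0, alpha=0.1, nu=5.0):
    """Update all classifiers that advocated the chosen action."""
    for cl in action_set:
        # Error is updated from the prediction held before this step.
        cl.error += beta * (abs(payoff - cl.prediction) - cl.error)
        cl.prediction += beta * (payoff - cl.prediction)
    # Accuracy is 1 below the error threshold and falls off steeply above it.
    kappas = [1.0 if cl.error < eps0 else alpha * (cl.error / eps0) ** -nu
              for cl in action_set]
    total = sum(kappas)
    for cl, kappa in zip(action_set, kappas):
        # Fitness tracks accuracy relative to the rest of the action set.
        cl.fitness += beta * (kappa / total - cl.fitness)


def demo(rounds=100):
    """Two states with payoffs 100 and 0; one overgeneral classifier matches both."""
    specific_a, specific_b, general = Classifier(), Classifier(), Classifier()
    for _ in range(rounds):
        update_action_set([specific_a, general], payoff=100.0)  # state A
        update_action_set([specific_b, general], payoff=0.0)    # state B
    return specific_a, specific_b, general


if __name__ == "__main__":
    for cl in demo():
        print(cl)
```

In the demo, the overgeneral classifier predicts a payoff between the two true values, accumulates a large error, and ends with low fitness, while the two specific classifiers become accurate and fit even though one of them predicts a payoff of zero.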
