Adaptive Quantitative Trading: An Imitative Deep Reinforcement Learning Approach

2020 ◽  
Vol 34 (02) ◽  
pp. 2128-2135
Author(s):  
Yang Liu ◽  
Qi Liu ◽  
Hongke Zhao ◽  
Zhen Pan ◽  
Chuanren Liu

In recent years, considerable effort has been devoted to developing AI techniques for finance research and applications. For instance, AI techniques (e.g., machine learning) can help traders in quantitative trading (QT) by automating two tasks: market condition recognition and trading strategy execution. However, existing methods in QT face challenges such as representing noisy, high-frequency financial data and balancing exploration and exploitation of the trading agent with AI techniques. To address these challenges, we propose an adaptive trading model, namely iRDPG, to automatically develop QT strategies through an intelligent trading agent. Our model is enhanced by deep reinforcement learning (DRL) and imitation learning techniques. Specifically, given the noisy financial data, we formulate the QT process as a Partially Observable Markov Decision Process (POMDP). We also introduce imitation learning to leverage classical trading strategies, which help balance exploration and exploitation. For better simulation, we train our trading agent on real financial market data at minute frequency. Experimental results demonstrate that our model can extract robust market features and adapt to different markets.
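As a rough illustration of the POMDP framing described above (a sketch under our own assumptions, not the authors' implementation), the agent can be limited to a short window of recent minute bars as its observation, while a classical dual moving-average rule supplies demonstration actions that an imitation term could regularize the learned policy toward. The names MinuteBarPOMDP, window, and demo_action are illustrative.

```python
# Minimal sketch of a partially observable minute-bar trading environment with a
# classical strategy available as an imitation signal. Transaction costs omitted.
import numpy as np

class MinuteBarPOMDP:
    def __init__(self, prices, window=30):
        self.prices = np.asarray(prices, dtype=float)
        self.window = window
        self.t = window
        self.position = 0  # -1 short, 0 flat, +1 long

    def observe(self):
        # Partial observation: only the last `window` log-returns, not the full history.
        w = self.prices[self.t - self.window:self.t + 1]
        return np.diff(np.log(w))

    def demo_action(self, fast=5, slow=20):
        # Classical strategy used as the imitation signal: dual moving-average crossover.
        hist = self.prices[:self.t + 1]
        return 1 if hist[-fast:].mean() > hist[-slow:].mean() else -1

    def step(self, action):
        # Reward: position times the next-minute log-return.
        r = action * (np.log(self.prices[self.t + 1]) - np.log(self.prices[self.t]))
        self.position, self.t = action, self.t + 1
        done = self.t >= len(self.prices) - 1
        return self.observe(), r, done

env = MinuteBarPOMDP(np.cumsum(np.random.randn(2000) * 0.1) + 100.0)
total, done = 0.0, False
while not done:
    obs, r, done = env.step(env.demo_action())   # roll out the imitation baseline
    total += r
print(f"baseline cumulative log-return: {total:.4f}")
```

In the full method, a recurrent DRL agent would replace the crossover rule at decision time, with the rule's actions retained only as demonstrations.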

Algorithms ◽  
2020 ◽  
Vol 13 (11) ◽  
pp. 307
Author(s):  
Luca Pasqualini ◽  
Maurizio Parton

A Pseudo-Random Number Generator (PRNG) is any algorithm generating a sequence of numbers approximating the properties of random numbers. These numbers are widely employed in mid-level cryptography and in software applications. Test suites are used to evaluate the quality of PRNGs by checking statistical properties of the generated sequences, which are commonly represented bit by bit. This paper proposes a Reinforcement Learning (RL) approach to the task of generating PRNGs from scratch by learning a policy to solve a partially observable Markov Decision Process (MDP), where the full state is the period of the generated sequence and the observation at each time-step is the latest sequence of bits appended to that state. We use a Long Short-Term Memory (LSTM) architecture to model the temporal relationship between observations at different time-steps by tasking the LSTM memory with extracting significant features of the hidden portion of the MDP's states. We show that modeling a PRNG with a partially observable MDP and an LSTM architecture largely improves the results of the fully observable feedforward RL approach introduced in previous work.
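A minimal sketch of this setup (our own assumptions, not the paper's code): an LSTM policy is fed only the most recently appended bits as its observation and emits the next bit of the sequence. Reward shaping from a statistical test suite is omitted, and the sizes chunk and hidden_size are illustrative.

```python
# LSTM policy over bit sequences: the observation is the last `chunk` bits only,
# so the recurrent state must summarize the hidden remainder of the sequence.
import torch
import torch.nn as nn

class BitLSTMPolicy(nn.Module):
    def __init__(self, chunk=8, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=chunk, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 2)  # logits over the next bit {0, 1}

    def forward(self, obs, state=None):
        # obs: (batch, 1, chunk) -- only the last chunk of bits, not the whole period.
        out, state = self.lstm(obs, state)
        return self.head(out[:, -1]), state

policy = BitLSTMPolicy()
obs = torch.zeros(1, 1, 8)            # start from an all-zero chunk
state, bits = None, []
with torch.no_grad():
    for _ in range(64):               # roll out 64 bits
        logits, state = policy(obs, state)
        bit = int(torch.distributions.Categorical(logits=logits).sample())
        bits.append(bit)
        obs = torch.roll(obs, -1, dims=-1)
        obs[0, 0, -1] = float(bit)
print("".join(map(str, bits)))
```

A training loop would score the emitted sequence with a statistical test suite and update the policy with a policy-gradient method.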


Author(s):  
John Aslanides ◽  
Jan Leike ◽  
Marcus Hutter

Many state-of-the-art reinforcement learning (RL) algorithms assume that the environment is an ergodic Markov Decision Process (MDP). In contrast, the field of universal reinforcement learning (URL) is concerned with algorithms that make as few assumptions as possible about the environment. The universal Bayesian agent AIXI and a family of related URL algorithms have been developed in this setting. While numerous theoretical optimality results have been proven for these agents, there has been no empirical investigation of their behavior to date. We present a short and accessible survey of these URL algorithms under a unified notation and framework, along with experimental results that qualitatively illustrate properties of the resulting policies and their relative performance on partially observable gridworld environments. We also present an open-source reference implementation of the algorithms, which we hope will facilitate further understanding of, and experimentation with, these ideas.
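The toy example below (our own illustration, not the paper's testbed) shows why a gridworld becomes partially observable when the agent perceives only its immediate neighborhood rather than its coordinates: distinct cells can yield identical observations, so a purely reactive policy cannot distinguish them and history must be used.

```python
# Observation aliasing in a partially observable gridworld.
import numpy as np

GRID = np.array([
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 2],   # 1 = wall, 2 = goal
])

def local_observation(pos):
    """4-bit observation: is there a wall (or border) to the N/S/E/W of the agent?"""
    r, c = pos
    obs = []
    for dr, dc in [(-1, 0), (1, 0), (0, 1), (0, -1)]:
        rr, cc = r + dr, c + dc
        blocked = not (0 <= rr < GRID.shape[0] and 0 <= cc < GRID.shape[1]) or GRID[rr, cc] == 1
        obs.append(int(blocked))
    return tuple(obs)

# Two different states the agent cannot tell apart from a single observation:
print(local_observation((0, 1)), local_observation((0, 2)))  # both (1, 1, 0, 0)
```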


2020 ◽  
Vol 13 (4) ◽  
pp. 78
Author(s):  
Nico Zengeler ◽  
Uwe Handmann

We present a deep reinforcement learning framework for the automatic trading of contracts for difference (CfD) on indices at high frequency. Our contribution shows that reinforcement learning agents with recurrent long short-term memory (LSTM) networks can learn from recent market history and outperform the market. Such approaches usually depend on low latency; in a real-world example, we show that an increased model size may compensate for higher latency. Because the noisy nature of economic trends complicates predictions, especially for speculative assets, our approach does not predict prices but instead uses a reinforcement learning agent to learn an overall profitable trading policy. To this end, we simulate a virtual market environment based on historical trading data. Our environment provides a partially observable Markov decision process (POMDP) to reinforcement learners and allows the training of various strategies.
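A hedged sketch of the recurrent-agent idea: an LSTM carries a hidden state across time-steps so the policy can condition on recent market history that a single partial observation does not contain. The network sizes, the three-action scheme (short/flat/long), and the feature dimensionality below are our assumptions, not details from the paper.

```python
# Recurrent trading policy: the LSTM hidden state summarizes past observations.
import torch
import torch.nn as nn

class RecurrentTraderPolicy(nn.Module):
    def __init__(self, n_features=4, hidden_size=128, n_actions=3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_actions)

    def act(self, features, state=None):
        # features: (1, 1, n_features) -- one time-step of the partial observation.
        out, state = self.lstm(features, state)
        logits = self.head(out[:, -1])
        action = torch.distributions.Categorical(logits=logits).sample()
        return int(action) - 1, state   # map {0, 1, 2} -> {-1 short, 0 flat, +1 long}

policy = RecurrentTraderPolicy()
state = None
with torch.no_grad():
    for step in range(5):
        obs = torch.randn(1, 1, 4)     # stand-in for one minute of market features
        action, state = policy.act(obs, state)
        print(f"step {step}: position {action:+d}")
```

Training such a policy against the historical-data market simulation would provide the rewards that drive the policy-gradient updates.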


2021 ◽  
Vol 3 (3) ◽  
pp. 554-581
Author(s):  
Xuanchen Xiang ◽  
Simon Foo

The first part of a two-part series of papers provides a survey of recent advances in Deep Reinforcement Learning (DRL) applications for solving partially observable Markov decision process (POMDP) problems. Reinforcement Learning (RL) is an approach that mimics the natural learning process of humans: the agent learns by interacting with a stochastic environment. Because the agent has only limited access to information about the environment, this approach can be applied efficiently in most fields that require self-learning. Although efficient algorithms are widely used, an organized investigation seems essential so that good comparisons can be made and the best structures or algorithms chosen when applying DRL to various applications. In this overview, we introduce Markov Decision Process (MDP) problems and Reinforcement Learning, and we survey applications of DRL for solving POMDP problems in games, robotics, and natural language processing. A follow-up paper will cover applications in transportation, communications and networking, and industry.
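As a small worked example of what separates a POMDP from an MDP (our own illustration, not taken from the survey), the agent cannot observe the state directly and instead maintains a belief that is updated after each observation via b'(s') ∝ O(o|s') · Σ_s T(s'|s) b(s). The matrices below are toy values for a two-state, one-action, two-observation POMDP.

```python
# Discrete Bayesian belief update for a toy POMDP.
import numpy as np

T = np.array([[0.9, 0.1],     # T[s, s'] = P(s' | s, a) for the single action a
              [0.2, 0.8]])
O = np.array([[0.7, 0.3],     # O[s', o] = P(o | s', a)
              [0.4, 0.6]])

def belief_update(b, o):
    b_pred = b @ T                # predict: sum over s of T(s'|s,a) * b(s)
    b_new = b_pred * O[:, o]      # correct: weight by the observation likelihood
    return b_new / b_new.sum()    # normalize to a probability distribution

b = np.array([0.5, 0.5])
for o in [0, 0, 1]:               # an example observation sequence
    b = belief_update(b, o)
    print(np.round(b, 3))
```

A DRL approach to POMDPs typically replaces this explicit belief with a learned recurrent state, which is why LSTM-based agents recur throughout the applications surveyed above.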

