MAPS: Multi-Agent reinforcement learning-based Portfolio management System.

Author(s):  
Jinho Lee ◽  
Raehyun Kim ◽  
Seok-Won Yi ◽  
Jaewoo Kang

Generating investment strategies using advanced deep learning methods in stock markets has recently been a topic of interest. Most existing deep learning methods focus on proposing an optimal model or network architecture that maximizes return. However, these models often fail to consider and adapt to continuously changing market conditions. In this paper, we propose the Multi-Agent reinforcement learning-based Portfolio management System (MAPS). MAPS is a cooperative system in which each agent is an independent "investor" creating its own portfolio. During training, each agent is guided to act as diversely as possible while maximizing its own return, using a carefully designed loss function. As a result, MAPS as a system ends up with a diversified portfolio. Experimental results on 12 years of US market data show that MAPS outperforms most of the baselines in terms of Sharpe ratio. Furthermore, our results show that adding more agents to the system yields a higher Sharpe ratio by lowering risk through a more diversified portfolio.
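To make the diversification idea concrete, here is a minimal sketch, not the authors' loss, of how a per-agent return objective can be combined with a penalty on inter-agent similarity so that the agents spread out into distinct portfolios. The cosine-similarity penalty and the weight `lam` are illustrative assumptions, not details from the paper.

```python
import torch

def maps_style_loss(agent_weights, asset_returns, lam=0.1):
    """agent_weights: (n_agents, n_assets) portfolio weights per agent.
    asset_returns: (n_assets,) next-period asset returns."""
    # Each agent maximizes its own portfolio return.
    port_returns = agent_weights @ asset_returns           # (n_agents,)
    return_term = -port_returns.mean()
    # Hypothetical diversity term: penalize pairwise cosine similarity
    # between agents' portfolios, pushing them toward distinct allocations.
    normed = torch.nn.functional.normalize(agent_weights, dim=1)
    sim = normed @ normed.T                                # (n_agents, n_agents)
    off_diag = sim - torch.diag(torch.diag(sim))           # drop self-similarity
    diversity_penalty = off_diag.abs().mean()
    return return_term + lam * diversity_penalty
```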

Author(s):  
Mu-En Wu ◽  
Jia-Hao Syu ◽  
Jerry Chun-Wei Lin ◽  
Jan-Ming Ho

Portfolio management involves position sizing and resource allocation. Traditional and generic portfolio strategies require forecasts of future stock prices as model inputs, which is not a trivial task since those values are difficult to obtain in real-world applications. To overcome this limitation and provide a better solution for portfolio management, we developed a Portfolio Management System (PMS) using reinforcement learning with two neural networks (CNN and RNN). A novel reward function based on the Sharpe ratio is also proposed to evaluate the performance of the developed systems. Experimental results indicate that the PMS with the Sharpe ratio reward function exhibits outstanding performance, increasing return by 39.0% and decreasing drawdown by 13.7% on average compared to the reward function based on trading return. In addition, the CNN-based PMS is more suitable for constructing a reinforcement learning portfolio, but carries 1.98 times more drawdown risk than the RNN-based PMS. Across the tested datasets, the PMS outperforms the benchmark strategies on TW50 and traditional stocks, but is inferior to a benchmark strategy on the financial dataset. The PMS is profitable and effective, and offers lower investment risk on almost all datasets. The novel Sharpe ratio reward function enhances performance and supports resource allocation for empirical stock trading.
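The Sharpe ratio reward is simple to state. Below is a minimal sketch, assuming the reward is computed over a window of realized trading returns; the window length, risk-free rate, and `eps` stabilizer are assumptions rather than details from the paper.

```python
import numpy as np

def sharpe_reward(returns, risk_free=0.0, eps=1e-8):
    """Reward a window of trading returns by its Sharpe ratio
    instead of by raw return alone."""
    excess = np.asarray(returns, dtype=float) - risk_free
    return excess.mean() / (excess.std() + eps)

# e.g. sharpe_reward([0.01, -0.005, 0.02]) rewards steady gains over volatile ones
```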


2021 ◽  
Author(s):  
P. Ravichandran ◽  
C. Saravanakumar ◽  
J. Dafni Rose ◽  
M. Vijayakumar ◽  
V. Muthu Lakshmi

2021 ◽  
Vol 4 (1) ◽  
pp. 9 ◽  
Author(s):  
Zexin Hu ◽  
Yiqi Zhao ◽  
Matloob Khushi

Predictions of stock and foreign exchange (Forex) prices have always been a hot and profitable area of study. Deep learning applications have been proven to yield better accuracy and return in the field of financial prediction and forecasting. In this survey, we selected papers from the Digital Bibliography & Library Project (DBLP) database for comparison and analysis. We classified papers according to different deep learning methods, including Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), Deep Neural Networks (DNN), Recurrent Neural Networks (RNN), reinforcement learning, and other deep learning methods such as Hybrid Attention Networks (HAN), self-paced learning mechanisms, and WaveNet. Furthermore, this paper reviews the dataset, variables, model, and results of each article. The survey presents results through the most commonly used performance metrics: Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), Mean Square Error (MSE), accuracy, Sharpe ratio, and return rate. We identified that recent models combining LSTM with other methods, for example DNN, are widely researched. Reinforcement learning and other deep learning methods yielded great returns and performances. We conclude that, in recent years, the trend of using deep-learning-based methods for financial modeling has been rising exponentially.
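For reference, the error metrics listed above are computed as follows; this is the standard formulation, not code from the survey.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, and MAPE as commonly reported in the surveyed papers."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "MAPE": np.mean(np.abs(err / y_true)) * 100.0,  # assumes nonzero targets
    }
```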


Author(s):  
Cheng Li ◽  
Levi Fussell ◽  
Taku Komura

Simultaneous control of multiple characters has been extensively pursued for computer games and computer animation, with applications such as crowd simulation, controlling two characters carrying objects or fighting with one another, and controlling a team of characters playing collective sports. With the advances in deep learning and reinforcement learning, there is a growing interest in applying multi-agent reinforcement learning (MARL) to intelligently control characters and produce realistic movements. In this paper we survey the state-of-the-art MARL techniques that are applicable to character control. We then survey papers that make use of MARL for multi-character control and discuss possible future directions of research.


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2375
Author(s):  
Jingjing Xiong ◽  
Lai-Man Po ◽  
Kwok Wai Cheung ◽  
Pengfei Xian ◽  
Yuzhi Zhao ◽  
...  

Deep reinforcement learning (DRL) has been utilized in numerous computer vision tasks, such as object detection and autonomous driving. However, relatively few DRL methods have been proposed in the area of image segmentation, particularly left ventricle (LV) segmentation. Reinforcement learning-based methods in earlier works often rely on learning proper thresholds to perform segmentation, and the segmentation results are inaccurate due to the sensitivity of the threshold. To tackle this problem, a novel DRL agent is designed to imitate the way a human performs LV segmentation. For this purpose, we formulate the segmentation problem as a Markov decision process and optimize it through DRL. The proposed DRL agent consists of two neural networks, i.e., First-P-Net and Next-P-Net. First-P-Net locates the initial edge point, and Next-P-Net locates the remaining edge points successively, ultimately producing a closed segmentation result. The experimental results show that the proposed model outperforms previous reinforcement learning methods and achieves performance comparable to deep learning baselines on two widely used LV endocardium segmentation datasets, namely the Automated Cardiac Diagnosis Challenge (ACDC) 2017 dataset and the Sunnybrook 2009 dataset. Moreover, the proposed model achieves higher F-measure accuracy than deep learning methods when trained with a very limited number of samples.
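The two-network pipeline can be summarized schematically. The sketch below is an interpretation of the abstract, not the authors' code: `first_p_net` and `next_p_net` stand in for the trained networks, and the closure test is a placeholder stopping rule.

```python
def segment_lv(image, first_p_net, next_p_net, max_steps=200):
    """Trace a closed LV contour one edge point at a time."""
    contour = [first_p_net(image)]            # First-P-Net: initial edge point
    for _ in range(max_steps):
        nxt = next_p_net(image, contour)      # Next-P-Net: next point given history
        contour.append(nxt)
        if nxt == contour[0]:                 # contour closed on itself
            break
    return contour
```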


Author(s):  
Fumito Uwano ◽  
Keiki Takadama

This study discusses important factors for zero-communication multi-agent cooperation by comparing two modified reinforcement learning methods that assign goals differently in cooperative tasks. The first method, Profit Minimizing Reinforcement Learning (PMRL), forces agents to learn how to reach the farthest goal, after which the agent closest to a goal is directed to it. The second method, Yielding Action Reinforcement Learning (YARL), forces agents to learn through Q-learning; if agents come into conflict, the agent closest to the goal learns to reach the next closest goal. To compare the two methods, we designed experiments that adjust the following maze factors: (1) the location of the start point and goal; (2) the number of agents; and (3) the size of the maze. The intensive simulations performed on the maze problem for the agent cooperation task revealed that both methods successfully enable the agents to exhibit cooperative behavior, even as the size of the maze and the number of agents change. The PMRL mechanism always enables the agents to learn cooperative behavior, whereas the YARL mechanism makes the agents learn cooperative behavior in a small number of learning iterations. In zero-communication multi-agent cooperation, it is important that only the agents involved in a conflict cooperate with each other.
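Both methods build on the standard tabular Q-learning update, sketched below; the maze-specific goal selection and conflict handling described above would sit on top of this rule. The function and table names are illustrative.

```python
from collections import defaultdict

Q = defaultdict(float)  # Q-table keyed by (state, action)

def q_update(state, action, reward, next_state, actions, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```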


Author(s):  
Bo Yang ◽  
Min Liu

Effective collaboration among autonomous unmanned aerial vehicles (UAVs) relies on timely information sharing. However, the time-varying flight environment and intermittent link connectivity pose great challenges to message delivery. In this paper, we leverage deep reinforcement learning (DRL) to address the UAVs' optimal link discovery and selection problem in uncertain environments. As multi-agent learning efficiency is constrained by the high-dimensional and continuous action spaces, we slice the whole action space into a number of tractable fractions to achieve efficient convergence of optimal policies in continuous domains. Moreover, to address the nonstationarity issue that particularly challenges multi-agent DRL with local perceptions, we present a multi-agent mutual sampling method that jointly exploits intra-agent and inter-agent state-action information to stabilize and expedite the training procedure. We evaluate the proposed algorithm on the UAVs' continuous network connection task. Results show that the associated UAVs can quickly select the optimal connected links, which significantly facilitates the UAVs' teamwork.
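One way to read the action-space slicing is as a partition of the continuous action range into sub-intervals that can be handled separately; the sketch below illustrates that reading and is an assumption about the method, not the authors' implementation.

```python
import numpy as np

def slice_action_space(low, high, n_slices):
    """Partition a continuous 1-D action range into tractable sub-intervals."""
    edges = np.linspace(low, high, n_slices + 1)
    return list(zip(edges[:-1], edges[1:]))

# e.g. slice_action_space(0.0, 1.0, 4) -> [(0.0, 0.25), (0.25, 0.5), (0.5, 0.75), (0.75, 1.0)]
```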

