Deep Reinforcement Learning for Stock Recommendation

2021 · Vol 2050 (1) · pp. 012012
Author(s): Yifei Shen, Tian Liu, Wenke Liu, Ruiqing Xu, Zhuo Li, et al.

Abstract: Recommending stocks is very important for investment companies and investors. However, without enough analysts, no stock selection strategy can capture the dynamics of all S&P 500 stocks. Moreover, most existing recommendation strategies are based on predictive models that buy and hold stocks with high return potential, and they fail to recommend stocks from different industrial sectors to reduce risk. In this article, we propose a novel solution that recommends a stock portfolio from the S&P 500 index with reinforcement learning. Our basic idea is to construct a stock relation graph (RG), which provides rich relations among stocks and industrial sectors, to generate diversified recommendation results. To this end, we design a new method to select high-quality stocks from the constructed relation graph with reinforcement learning. Specifically, the reinforcement learning agent jumps across industrial sectors and selects stocks based on feedback signals from the market. Finally, we apply portfolio allocation methods (i.e., mean-variance and minimum-variance) to test the validity of the recommendation. The empirical results show that portfolio allocation based on the selected stocks outperforms the long-term strategy on the S&P 500 index in terms of cumulative returns.
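The abstract names mean-variance and minimum-variance allocation as the final step but gives no formulas. As a minimal sketch, assuming a T-by-N matrix of historical returns for the N recommended stocks (the normalization conventions below are assumptions, not the paper's):

```python
import numpy as np

def min_variance_weights(returns):
    """Minimum-variance portfolio: w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    cov = np.cov(returns, rowvar=False)   # (N, N) sample covariance
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)        # Sigma^{-1} 1
    return w / w.sum()                    # weights sum to 1

def mean_variance_weights(returns, risk_aversion=1.0):
    """Unconstrained Markowitz weights: w proportional to Sigma^{-1} mu."""
    mu = returns.mean(axis=0)             # mean return per stock
    cov = np.cov(returns, rowvar=False)
    w = np.linalg.solve(cov, mu) / risk_aversion
    return w / np.abs(w).sum()            # scale to unit gross exposure
```

Both use the closed-form solutions of the respective quadratic programs; a production implementation would add constraints (e.g., long-only) and a shrinkage estimator for the covariance.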

Author(s): Prahlad Koratamaddi, Karan Wadhwani, Mridul Gupta, Dr. Sriram G. Sanjeevi

2020 · Vol 34 (03) · pp. 2561-2568
Author(s): Morgane Ayle, Jimmy Tekli, Julia El-Zini, Boulos El-Asmar, Mariette Awad

Research has shown that deep neural networks can assist human workers throughout the industrial sector via different computer vision applications. However, such data-driven learning approaches require a very large number of labeled training images in order to generalize well and achieve the high accuracies that industry standards demand. Gathering and labeling large amounts of images is both expensive and time consuming, especially for industrial use cases. In this work, we introduce BAR (Bounding-box Automated Refinement), a reinforcement learning agent that learns to correct inaccurate bounding boxes that are weakly generated by certain detection methods, or wrongly annotated by a human, using either an offline training method with Deep Reinforcement Learning (BAR-DRL) or an online one using Contextual Bandits (BAR-CB). Our agent limits human intervention to correcting or verifying a subset of bounding boxes instead of re-drawing new ones. Results on a car-industry dataset and on the PASCAL VOC dataset show a consistent increase of up to 0.28 in the Intersection-over-Union of bounding boxes with their desired ground truths, while saving 30%-82% of the human intervention time spent correcting or re-drawing inaccurate proposals.
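Since the reported improvement is measured in Intersection-over-Union, the metric is worth making concrete. A minimal helper (the reward comment is an assumption about how a refinement step could be scored, not necessarily the paper's exact reward):

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A refinement agent could be rewarded by the IoU gain of each adjustment:
# reward = iou(refined_box, ground_truth) - iou(previous_box, ground_truth)
```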


2020 · Vol 16 (9) · pp. 1674-1697
Author(s): O.P. Smirnova, A.O. Ponomareva

Subject. The article focuses on contemporary trends in the industrial and socio-economic development of Russia during the technological transformation of its sectors.
Objectives. The study attempts to analyze what opportunities and difficulties may arise in the development of Russia's industrial sectors. We also examine the dynamics of key development indicators of the industrial sectors and identify factors that inhibit their competitiveness.
Methods. The methodological framework comprises general methods of the systems, structural-functional and comprehensive approaches to analyzing economic phenomena. We applied graphic and economic-statistical research methods, conventional methods of grouping, comparison and generalization, and logical, systems and statistical analysis.
Results. We show how the industrial sectors develop over time by type of economic activity. The article provides the rationale for structural rearrangements and further innovation-driven development of the industries. We show that Russian industry is technologically dependent on imported production technologies, and substantiate the need to renew assets and technologies at industrial enterprises and to retain and develop human capital.
Conclusions and Relevance. Primarily, the Russian economy should be digitalized as a source of long-term economic growth. Notably, industrial enterprises should replace their linear production method with that of the circular economy and implement resource-saving innovative technologies. The State evidently acts as the leading driver of the technological retrofitting of the industrial sector. If the State pursues a reasonable and appropriate industrial policy at the federal and regional levels and configures its tools to ensure a modern approach to developing the industries in a competitive fashion, the industrial complex will successfully transform into an innovative economy.


Biomimetics · 2021 · Vol 6 (1) · pp. 13
Author(s): Adam Bignold, Francisco Cruz, Richard Dazeley, Peter Vamplew, Cameron Foale

Interactive reinforcement learning methods utilise an external information source to evaluate decisions and accelerate learning. Previous work has shown that human advice can significantly improve a learning agent's performance. When evaluating reinforcement learning algorithms, it is common to repeat experiments as parameters are altered or to gain a sufficient sample size. In this regard, requiring human interaction every time an experiment is restarted is undesirable, particularly when the expense of doing so can be considerable. Additionally, reusing the same people for the experiment introduces bias, as they will learn the behaviour of the agent and the dynamics of the environment. This paper presents a methodology for evaluating interactive reinforcement learning agents by employing simulated users. Simulated users allow human knowledge, bias, and interaction to be simulated. Their use allows the development and testing of reinforcement learning agents, and can provide indicative results of agent performance under defined human constraints. While simulated users are no replacement for actual humans, they do offer an affordable and fast alternative for evaluating assisted agents. We introduce a method for performing a preliminary evaluation utilising simulated users to show how performance changes depending on the type of user assisting the agent. Moreover, we describe how human interaction may be simulated, and present an experiment illustrating the applicability of simulated users in evaluating agent performance when assisted by different types of trainers. Experimental results show that this methodology allows for greater insight into the performance of interactive reinforcement learning agents when advised by different users, and that simulating users with varying characteristics allows the impact of those characteristics on the behaviour of the learning agent to be evaluated.
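The abstract does not specify how a simulated user is parameterised. One plausible minimal sketch, where `accuracy` and `availability` are hypothetical knobs standing in for the "different types of trainers" the paper varies:

```python
import random

class SimulatedUser:
    """Stands in for a human trainer during repeated experiments."""

    def __init__(self, accuracy=0.9, availability=0.3):
        self.accuracy = accuracy          # P(advice is the optimal action)
        self.availability = availability  # P(advice is offered at all)

    def advise(self, optimal_action, action_space):
        if random.random() > self.availability:
            return None                   # silent this step; agent explores
        if random.random() < self.accuracy:
            return optimal_action         # correct advice
        # mistaken advice: any non-optimal action
        return random.choice([a for a in action_space if a != optimal_action])
```

Sweeping these two parameters over a grid then reproduces, cheaply and repeatably, the spectrum of trainer types that would otherwise require recruiting new human participants for every run.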


2021 · Vol 2 (1) · pp. 1-25
Author(s): Yongsen Ma, Sheheryar Arshad, Swetha Muniraju, Eric Torkildson, Enrico Rantala, et al.

In recent years, Channel State Information (CSI) measured by WiFi has been widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of the CSI data. The state machine learns temporal dependency information from past classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations and orientations, and multiple persons. It achieves 97% average accuracy when the test devices and persons are not seen during training, and accuracies of 80% and 83% on two public datasets. The design needs very little human effort for ground-truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.
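The recognition network itself is found by the RL-based architecture search, so no fixed architecture is given in the abstract. Purely to illustrate the 2D-CNN-over-CSI idea, a minimal PyTorch sketch in which every layer size is an assumption:

```python
import torch.nn as nn

class CSIRecognizer(nn.Module):
    """Toy 2D CNN over a CSI 'image' (subcarriers x time samples)."""

    def __init__(self, n_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),     # fixed-size feature map
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_classes)

    def forward(self, x):                     # x: (batch, 1, subcarriers, time)
        return self.classifier(self.features(x).flatten(1))
```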


Algorithms · 2021 · Vol 14 (1) · pp. 26
Author(s): Yiran Xue, Rui Wu, Jiafeng Liu, Xianglong Tang

Existing crowd evacuation guidance systems require the manual design of models and input parameters, incurring a significant workload and a potential for errors. This paper proposes an end-to-end intelligent evacuation guidance method based on deep reinforcement learning, and designs an interactive simulation environment based on the social force model. The agent can automatically learn a scene model and a path-planning strategy with only scene images as input, and directly outputs dynamic signage information. To address the curse of dimensionality that the deep Q network (DQN) algorithm suffers from in crowd evacuation, this paper proposes a combined action-space DQN (CA-DQN) algorithm that groups Q-network output-layer nodes according to action dimensions, which significantly reduces the network complexity and improves the system's practicality in complex scenes. The evacuation guidance system is defined as a reinforcement learning agent and implemented with the CA-DQN method, providing a novel approach to the evacuation guidance problem. The experiments demonstrate that the proposed method is superior to the static guidance method, and on par with the manually designed model method.
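The key trick in CA-DQN is grouping output nodes by action dimension, so the number of Q outputs grows with the sum of the per-dimension choices rather than their product. A sketch of that output structure, with all sizes as illustrative assumptions:

```python
import torch.nn as nn

class CADQN(nn.Module):
    """One Q head per action dimension instead of one node per joint action.

    For three dimensions with 4 choices each: 3 * 4 = 12 outputs here,
    versus 4 ** 3 = 64 for a flat joint-action DQN.
    """

    def __init__(self, action_dims=(4, 4, 4)):   # e.g., one signage per exit
        super().__init__()
        self.encoder = nn.Sequential(            # scene image -> feature vector
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)), nn.Flatten(),
        )
        self.heads = nn.ModuleList(nn.Linear(32 * 16, n) for n in action_dims)

    def forward(self, x):
        z = self.encoder(x)
        return [head(z) for head in self.heads]  # per-dimension Q-values
```

At action-selection time each head can be argmaxed independently, and the selected indices combined into one joint signage configuration.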


2021 · Vol 7 (1)
Author(s): Pankaj Rajak, Aravind Krishnamoorthy, Ankit Mishra, Rajiv Kalia, Aiichiro Nakano, et al.

Abstract: Predictive materials synthesis is the primary bottleneck in realizing functional and quantum materials. Strategies for synthesizing promising materials are currently identified by time-consuming trial and error, and there are no known predictive schemes for designing synthesis parameters. We use offline reinforcement learning (RL) to predict optimal synthesis schedules, i.e., time-sequences of reaction conditions such as temperatures and concentrations, for the synthesis of semiconducting monolayer MoS2 by chemical vapor deposition. The RL agent, trained on 10,000 computational synthesis simulations, learned the threshold temperatures and chemical potentials for the onset of chemical reactions and predicted previously unknown synthesis schedules that produce well-sulfidized, crystalline, phase-pure MoS2. The model can be extended to multi-task objectives, such as predicting profiles for the synthesis of complex structures including multi-phase heterostructures, and can predict the long-time behavior of reacting systems, far beyond the domain of molecular dynamics simulations, making these predictions directly relevant to experimental synthesis.
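To make "synthesis schedule" concrete: a schedule is a time-sequence of reaction conditions, and offline RL fits its policy to the 10,000 logged simulations without running new ones during training. A hypothetical encoding (field names and values are illustrative, not the paper's):

```python
from dataclasses import dataclass

@dataclass
class ControlStep:
    temperature_K: float   # reactor temperature setpoint
    mu_S_eV: float         # sulfur chemical potential

# One candidate schedule a trained agent might emit: heat past the
# reaction-onset threshold it learned, then anneal.
schedule = [
    ControlStep(temperature_K=900.0,  mu_S_eV=-0.5),
    ControlStep(temperature_K=1100.0, mu_S_eV=-0.2),
    ControlStep(temperature_K=1000.0, mu_S_eV=-0.4),
]
```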


Symmetry · 2020 · Vol 12 (4) · pp. 631
Author(s): Chunyang Hu

In this paper, deep reinforcement learning (DRL) and knowledge transfer are used to achieve effective control of the learning agents for confrontation in multi-agent systems. First, a multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm with parameter sharing is proposed to achieve confrontation decision-making for multiple agents. During training, the information of the other agents is fed into the critic network to improve the confrontation strategy, while the parameter-sharing mechanism reduces the cost of experience storage. In the DDPG algorithm, we use four neural networks to generate real-time actions and Q-value estimates, and use a momentum mechanism to optimize the training process and accelerate the convergence of the neural networks. Second, this paper introduces an auxiliary controller using a policy-based reinforcement learning (RL) method to provide assistant decision-making for the game agent. In addition, an effective reward function is used to help agents balance the losses of the enemy side and our own. Furthermore, this paper uses knowledge transfer to extend the learning model to more complex scenes and improve the generalization of the proposed confrontation model. Two confrontation decision-making experiments are designed to verify the effectiveness of the proposed method. In a small-scale task scenario, the trained agent successfully learns to fight its competitors and achieves a good winning rate. For large-scale confrontation scenarios, the knowledge transfer method gradually improves the decision-making level of the learning agent.
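The centralized-critic-with-parameter-sharing idea described here is close in spirit to MADDPG: a single critic, shared by all agents, that conditions on every agent's observation and action. A minimal sketch with illustrative dimensions:

```python
import torch
import torch.nn as nn

class SharedCritic(nn.Module):
    """One critic shared across agents; sees all observations and actions."""

    def __init__(self, n_agents=3, obs_dim=16, act_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_agents * (obs_dim + act_dim), 128), nn.ReLU(),
            nn.Linear(128, 1),            # Q(o_1..o_n, a_1..a_n)
        )

    def forward(self, all_obs, all_acts):
        # all_obs: (batch, n_agents, obs_dim); all_acts: (batch, n_agents, act_dim)
        x = torch.cat([all_obs.flatten(1), all_acts.flatten(1)], dim=1)
        return self.net(x)
```

Sharing one critic (and, per the abstract, sharing parameters more broadly) keeps the parameter count and stored experience from growing with the number of agents.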


2010 · Vol 13 (04) · pp. 621-645
Author(s): Wen-Rong Jerry Ho, C. H. Liu, H. W. Chen

This research uses all of the listed electronic stocks on the Taiwan Stock Exchange as a sample to test the performance of stock-price return rates, and compares it with the returns of the electronics index. The empirical result shows that, no matter which stock selection strategy we choose, a majority of the return rates are higher than that of the electronics index. The results also show that the predictive performance of the back-propagation neural network (BPNN) is better than that of the general average decentralized investment strategy. Furthermore, the low price-to-earnings ratio and the low book-to-market ratio have a significant long-term influence.

