Can we imitate stock price behavior to reinforcement learn option price?

Author(s):  
Xin Jin

This paper presents a framework that imitates the price behavior of the underlying stock in order to learn an option price by reinforcement learning. We use accessible features of equity pricing data to construct a non-deterministic Markov decision process that models stock price behavior as driven by the principal investor's decision making. However, the low signal-to-noise ratio and instability that appear inherent in equity markets make it difficult both to determine the state transition (price change) after an action (the principal investor's decision) is executed and to decide on an action based on the current state (spot price). To overcome these challenges, we resort to a Bayesian deep neural network to compute the predictive distribution of the state transition induced by an action. Additionally, instead of exploring a state-action relationship to formulate a policy, we seek an episode-based visible-hidden state-action relationship to probabilistically imitate the principal investor's successive decisions. Our algorithm then maps the imitated decisions to simulated stock price paths through a Bayesian deep neural network. Finally, the optimal option price is learned by reinforcement learning, by maximizing the cumulative risk-adjusted return of a dynamically hedged portfolio over simulated price paths of the underlying.
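The last step of the abstract, pricing from a dynamically hedged portfolio over simulated paths, can be illustrated with a minimal sketch. It is not the paper's method: geometric Brownian motion stands in for the Bayesian deep neural network simulator, a Black-Scholes delta with zero interest rate stands in for the learned hedging policy, and "risk-adjusted" is taken to mean the average hedging cost plus a penalty on its standard deviation. All function names and parameters are hypothetical.

```python
import math
import random
import statistics


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_delta(spot, strike, vol, tau):
    """Black-Scholes call delta with zero interest rate (assumed hedging rule)."""
    if tau <= 0.0:
        return 1.0 if spot > strike else 0.0
    d1 = (math.log(spot / strike) + 0.5 * vol * vol * tau) / (vol * math.sqrt(tau))
    return norm_cdf(d1)


def simulate_path(spot0, vol, maturity, steps, rng):
    """Zero-drift GBM path: a stand-in for the learned price simulator."""
    dt = maturity / steps
    path = [spot0]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)
        path.append(path[-1] * math.exp(-0.5 * vol * vol * dt + vol * math.sqrt(dt) * z))
    return path


def hedging_cost(path, strike, vol, maturity):
    """Call payoff minus the gains of a delta hedge rebalanced at every step."""
    steps = len(path) - 1
    dt = maturity / steps
    gains = 0.0
    for t in range(steps):
        delta = bs_delta(path[t], strike, vol, maturity - t * dt)
        gains += delta * (path[t + 1] - path[t])
    return max(path[-1] - strike, 0.0) - gains


def risk_adjusted_price(spot0, strike, vol, maturity, steps, n_paths,
                        risk_aversion, seed=0):
    """Mean hedging cost plus a penalty on its dispersion across paths."""
    rng = random.Random(seed)
    costs = [
        hedging_cost(simulate_path(spot0, vol, maturity, steps, rng),
                     strike, vol, maturity)
        for _ in range(n_paths)
    ]
    return statistics.mean(costs) + risk_aversion * statistics.stdev(costs)
```

With `risk_aversion=0` the estimate reduces to the average cost of replicating the payoff; a positive value charges extra for the hedging risk that remains under discrete rebalancing.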

2021 ◽  


2019 ◽  
Vol 8 (2) ◽  
pp. 3231-3241

The non-deterministic behavior of the stock market creates ambiguity for buyers, and this ambiguity often leads to the loss of their financial assets. Price variation makes predicting the option price a very difficult task. Various non-parametric models, such as artificial neural networks, machine learning, and deep neural networks, have been used to predict option prices. Prediction accuracy remains a challenge for both individual and hybrid models. The gap between the hypothesis value and the predicted value reflects the nature of the stock market. This paper uses the bagging method of machine learning to predict the option price. The bagging process merges different machine learning algorithms and reduces the variation gap of the stock price.
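Bagging (bootstrap aggregating) can be sketched in a few lines. The example below is only an illustration under stated assumptions: the base learner is a one-variable least-squares line, whereas the abstract describes merging different machine learning algorithms, and the spot and price values are synthetic toy data, not results from the paper. All names are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x equal): fall back to a constant model.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def bagged_fit(xs, ys, n_models, rng):
    """Fit one base model per bootstrap resample (sampling with replacement)."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagged_predict(models, x):
    """Aggregate by averaging the base models' predictions."""
    return statistics.mean(a + b * x for a, b in models)


# Synthetic toy data: a noisy linear relation between spot and option price.
rng = random.Random(42)
spots = [90.0 + i for i in range(21)]
prices = [0.5 * s - 40.0 + rng.gauss(0.0, 0.5) for s in spots]

models = bagged_fit(spots, prices, n_models=25, rng=rng)
estimate = bagged_predict(models, 100.0)
```

Averaging over resampled fits lowers the variance of the prediction compared with a single fit, which is the effect the abstract attributes to bagging.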


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Xiali Li ◽  
Zhengyu Lv ◽  
Licheng Wu ◽  
Yue Zhao ◽  
Xiaona Xu

In this study, hybrid state-action-reward-state-action (SARSA(λ)) and Q-learning algorithms are applied to different stages of an upper confidence bound applied to trees (UCT) search for Tibetan Jiu chess. Q-learning is also used to update all the nodes on the search path when each game ends. A learning strategy is proposed that uses the SARSA(λ) and Q-learning algorithms, combined with domain knowledge, to build a feedback function for the layout and battle stages. An improved deep neural network based on ResNet18 is used for self-play training. Experimental results show that hybrid online and offline reinforcement learning with a deep neural network can improve the game program’s learning efficiency and understanding ability for Tibetan Jiu chess.
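The two update rules named in this abstract differ in how they bootstrap. The sketch below shows the generic tabular forms only, assuming accumulating eligibility traces for SARSA(λ); it does not reproduce how the paper embeds them in the tree search, and the function names and arguments are hypothetical.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, next_actions,
                      alpha, gamma):
    """Off-policy update: bootstrap from the best action in the next state."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def sarsa_lambda_update(q, traces, state, action, reward, next_state,
                        next_action, alpha, gamma, lam):
    """On-policy update with accumulating eligibility traces.

    Pass next_action=None when next_state is terminal.
    """
    next_value = 0.0 if next_action is None else q[(next_state, next_action)]
    td_error = reward + gamma * next_value - q[(state, action)]
    traces[(state, action)] += 1.0
    # Every recently visited pair shares the TD error, weighted by its trace.
    for key in list(traces):
        q[key] += alpha * td_error * traces[key]
        traces[key] *= gamma * lam


# Usage: values and traces default to zero for unseen state-action pairs.
q_table = defaultdict(float)
traces = defaultdict(float)
sarsa_lambda_update(q_table, traces, "s0", "a", 0.0, "s1", "b",
                    alpha=0.5, gamma=0.9, lam=0.5)
sarsa_lambda_update(q_table, traces, "s1", "b", 1.0, "end", None,
                    alpha=0.5, gamma=0.9, lam=0.5)
```

In the usage above the reward arrives only on the second step, yet the trace carries part of it back to the earlier pair `("s0", "a")`. A one-step Q-learning update would change only the pair that was just visited.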


Author(s):  
Hongrui Zhao ◽  
Jin Yu ◽  
Yanan Li ◽  
Donghui Wang ◽  
Jie Liu ◽  
...  

Both online shopping and video sharing have grown exponentially. Although internet celebrities in videos are an ideal showcase for fashion corporations to sell their products, audiences do not always know where to buy the fashion products they see in videos, a cross-domain problem called video-to-shop. In this paper, we propose a novel deep neural network, called Detect, Pick, and Retrieval Network (DPRNet), to bridge the gap between fashion products in videos and audiences. On the video side, we modify the traditional object detector so that it automatically picks out the best object proposals for every commodity in a video without duplication, improving performance on the video-to-shop task. On the fashion retrieval side, a simple but effective multi-task loss network obtains new state-of-the-art results on DeepFashion. Extensive experiments conducted on a new large-scale cross-domain video-to-shop dataset show that DPRNet is efficient and outperforms state-of-the-art methods on the video-to-shop task.


2021 ◽  
Author(s):  
Yida Xin ◽  
Henry Lieberman ◽  
Peter Chin

Syntactic parsing technologies have become significantly more robust thanks to advancements in their underlying statistical and Deep Neural Network (DNN) techniques: most modern syntactic parsers can produce a syntactic parse tree for almost any sentence, including ones that may not be strictly grammatical. Despite improved robustness, such parsers still do not reflect the alternatives in parsing that are intrinsic to syntactic ambiguities. The two most notable such ambiguities are prepositional phrase (PP) attachment ambiguities and pronoun coreference ambiguities. In this paper, we discuss PatchComm, which uses commonsense knowledge to help resolve both kinds of ambiguities. To the best of our knowledge, we are the first to propose the general-purpose approach of using external commonsense knowledge bases to guide syntactic parsers. We evaluated PatchComm against the state-of-the-art (SOTA) spaCy parser on a PP attachment task and against the SOTA NeuralCoref module on a coreference task. Results show that PatchComm is successful at detecting syntactic ambiguities and using commonsense knowledge to help resolve them.

