Robo-Advising: Learning Investors’ Risk Preferences via Portfolio Choices

Journal of Financial Econometrics ◽

10.1093/jjfinec/nbz040 ◽

2020 ◽

Author(s):

Humoud Alsabah ◽

Agostino Capponi ◽

Octavio Ruiz Lacedelli ◽

Matt Stern

Keyword(s):

Opportunity Cost ◽

Value Function ◽

Risk Preference ◽

Portfolio Decisions ◽

Learning Framework ◽

Portfolio Choices ◽

Trading Decisions ◽

Exploration Exploitation ◽

The Value Function ◽

Over Time

We introduce a reinforcement learning framework for retail robo-advising. The robo-advisor does not know the investor’s risk preference but learns it over time by observing her portfolio choices in different market environments. We develop an exploration–exploitation algorithm that trades off costly solicitations of portfolio choices from the investor against autonomous trading decisions based on stale estimates of the investor’s risk aversion. We show that the approximate value function constructed by the algorithm converges to the value function of an omniscient robo-advisor over a number of periods that is polynomial in the size of the state and action spaces. By correcting for the investor’s mistakes, the robo-advisor may outperform a stand-alone investor, regardless of the investor’s opportunity cost for making portfolio decisions.

Surveys without Questions: A Reinforcement Learning Approach

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.3301257 ◽

2019 ◽

Vol 33 ◽

pp. 257-264 ◽

Cited By ~ 1

Author(s):

Atanu R Sinha ◽

Deepali Jain ◽

Nikhil Sheoran ◽

Sopan Khosla ◽

Reshmi Sasidharan

Keyword(s):

Reinforcement Learning ◽

Survey Data ◽

Value Function ◽

Specific Interactions ◽

Aggregate Level ◽

Clickstream Data ◽

Online Interactions ◽

Performance Results ◽

The Value Function ◽

Over Time

The ‘old world’ instrument, the survey, remains a tool of choice for firms to obtain ratings of the satisfaction and experience that customers realize while interacting online with firms. While avenues for surveys have evolved from emails and links to pop-ups while browsing, the deficiencies persist. These include: reliance on the ratings of very few respondents to draw inferences about all customers’ online interactions; failure to capture a customer’s interactions over time, since the rating is a one-time snapshot; and the inability to tie customers’ ratings back to specific interactions, because the ratings provided relate to all interactions. To overcome these deficiencies, we extract proxy ratings from clickstream data, which is typically collected for every customer’s online interactions, by developing an approach based on Reinforcement Learning (RL). We introduce a new way to interpret the values generated by the value function of RL, as proxy ratings. Our approach does not need any survey data for training. Yet, on validation against actual survey data, the proxy ratings yield reasonable performance results. Additionally, we offer a new way to draw insights from the values of the value function, which allows specific interactions to be associated with their proxy ratings. We introduce two new metrics to represent ratings: one at the customer level and the other at the aggregate level, for click actions across customers. Both are defined around the proportion of all pairwise, successive actions that show an increase in proxy ratings. The intuitive customer-level metric enables gauging the dynamics of ratings over time and is a better predictor of purchase than customer ratings from surveys. The aggregate-level metric allows pinpointing actions that help or hurt experience. In sum, proxy ratings computed unobtrusively from clickstream data, for every action, for each customer, and for every session, can offer an interpretable and more insightful alternative to surveys.
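The customer-level metric described in this abstract can be illustrated with a minimal sketch. It assumes one reading of the definition: the share of successive action pairs in a session whose proxy rating goes up. The function name `increase_proportion` and the sample values are hypothetical and do not come from the paper.

```python
def increase_proportion(proxy_ratings):
    """Share of successive action pairs whose proxy rating increases.

    `proxy_ratings` is the sequence of value-function outputs for one
    customer's actions, in the order the actions occurred (assumed input).
    Returns None when there are fewer than two actions.
    """
    pairs = list(zip(proxy_ratings, proxy_ratings[1:]))
    if not pairs:
        return None
    ups = sum(1 for prev, curr in pairs if curr > prev)
    return ups / len(pairs)


# Hypothetical proxy ratings for five successive click actions.
session_values = [0.10, 0.25, 0.20, 0.40, 0.55]
print(increase_proportion(session_values))  # 3 of 4 pairs increase -> 0.75
```

An aggregate-level variant could apply the same count to all pairs that end in a given click action across customers, which is again an assumption about how the paper groups the pairs.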

The value function for time-related decisions

PsycEXTRA Dataset ◽

10.1037/e653632011-006 ◽

2011 ◽

Author(s):

Anouk Festjens ◽

Siegfried Dewitte ◽

Enrico Diecidue ◽

Sabrina Bruyneel

Keyword(s):

Value Function ◽

The Value Function

The Equal Tails: A Method to Elicit the Value Function

SSRN Electronic Journal ◽

10.2139/ssrn.893748 ◽

2006 ◽

Author(s):

Manel Baucells ◽

Antonio Villasis

Keyword(s):

Value Function ◽

The Value Function

Solving flow-shop scheduling problem with a reinforcement learning algorithm that generalizes the value function with neural network

Alexandria Engineering Journal ◽

10.1016/j.aej.2021.01.030 ◽

2021 ◽

Vol 60 (3) ◽

pp. 2787-2800

Author(s):

Jianfeng Ren ◽

Chunming Ye ◽

Feng Yang

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Value Function ◽

Flow Shop ◽

Learning Algorithm ◽

Flow Shop Scheduling ◽

Scheduling Problem ◽

Shop Scheduling ◽

The Value Function ◽

Reinforcement Learning Algorithm

Pricing Perpetual American Put Options with Asset-Dependent Discounting

Journal of Risk and Financial Management ◽

10.3390/jrfm14030130 ◽

2021 ◽

Vol 14 (3) ◽

pp. 130

Author(s):

Jonas Al-Hadad ◽

Zbigniew Palmowski

Keyword(s):

Value Function ◽

Asset Price ◽

Stopping Times ◽

Martingale Measure ◽

Put Options ◽

American Put Options ◽

Exact Calculations ◽

Negative Exponential ◽

American Put ◽

The Value Function

The main objective of this paper is to present an algorithm for pricing perpetual American put options with asset-dependent discounting. The value function of such an instrument can be described as $V^{\omega}_{\text{A}^{\text{Put}}}(s)=\sup_{\tau\in\mathcal{T}}\mathbb{E}_s\left[e^{-\int_0^{\tau}\omega(S_w)\,dw}(K-S_{\tau})^{+}\right]$, where $\mathcal{T}$ is a family of stopping times, $\omega$ is a discount function and $\mathbb{E}$ is an expectation taken with respect to a martingale measure. Moreover, we assume that the asset price process $S_t$ is a geometric Lévy process with negative exponential jumps, i.e., $S_t=s\,e^{\zeta t+\sigma B_t-\sum_{i=1}^{N_t}Y_i}$. The asset-dependent discounting is reflected in the $\omega$ function, so this approach is a generalisation of the classic case in which $\omega$ is constant. It turns out that under certain conditions on the $\omega$ function, the value function $V^{\omega}_{\text{A}^{\text{Put}}}(s)$ is convex and can be represented in a closed form. We provide an option pricing algorithm in this scenario and we present exact calculations for particular choices of $\omega$ such that $V^{\omega}_{\text{A}^{\text{Put}}}(s)$ takes a simplified form.
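For orientation, the classic case mentioned in this abstract can be sketched in code. The sketch below is not the paper's algorithm: it assumes constant discounting at a rate r, no jumps, and a geometric Brownian motion with risk-neutral drift r and no dividends, for which the perpetual American put has a well-known closed form. The function and parameter names are hypothetical.

```python
def perpetual_put_value(s, strike, rate, sigma):
    """Perpetual American put under constant discounting and no jumps.

    Assumes dS = rate * S dt + sigma * S dW under the martingale measure.
    With gamma = 2 * rate / sigma**2, the optimal exercise boundary is
    s_star = gamma * strike / (1 + gamma); below it the put is exercised.
    """
    gamma = 2.0 * rate / sigma ** 2
    s_star = gamma * strike / (1.0 + gamma)
    if s <= s_star:
        # Immediate exercise region: value equals intrinsic value.
        return strike - s
    # Continuation region: value decays as a power of the asset price.
    return (strike - s_star) * (s / s_star) ** (-gamma)


# Hypothetical inputs: strike 100, rate 5%, volatility 20%.
for spot in (50.0, 80.0, 100.0, 150.0):
    print(spot, perpetual_put_value(spot, 100.0, 0.05, 0.2))
```

The value is convex and decreasing in the asset price in this special case, which is consistent with the convexity the abstract reports for the asset-dependent setting under certain conditions on ω.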

Metric for Disassembly and Reuse Decisions: Formulation and Validation

Volume 5: 13th Design for Manufacturability and the Lifecycle Conference; 5th Symposium on International Design and Design Education; 10th International Conference on Advanced Vehicle and Tire Technologies ◽

10.1115/detc2008-49878 ◽

2008 ◽

Cited By ~ 1

Author(s):

Vijitashwa Pandey ◽

Deborah Thurston

Keyword(s):

Maximum Likelihood ◽

Value Function ◽

Choice Theory ◽

Decision Makers ◽

Maximum Likelihood Estimates ◽

Likelihood Method ◽

Maximum Value ◽

Long Range Planning ◽

The Relationship ◽

The Value Function

Design for disassembly and reuse focuses on developing methods to minimize difficulty in disassembly for maintenance or reuse. These methods can gain substantially if the relationship between component attributes (material mix, ease of disassembly, etc.) and their likelihood of reuse or disposal is understood. For products already in the marketplace, a feedback approach that evaluates the willingness of manufacturers or customers (decision makers) to reuse a component can reveal how the attributes of a component affect reuse decisions. This paper introduces some metrics and combines them with those proposed in the literature into a measure that captures the overall value of a decision made by the decision makers. The premise is that the decision makers would choose the decision that has the maximum value. Four decisions are considered regarding a component’s fate after recovery, ranging from direct reuse to disposal. A method along the lines of discrete choice theory is used, which relies on maximum likelihood estimates to determine the parameters that define the value function. The maximum likelihood method can take inputs from actual decisions made by the decision makers to assess the value function. This function can be used to determine the likelihood that the component takes a certain path (one of the four decisions), taking as input its attributes, which can facilitate long-range planning and also help determine ways in which reuse decisions can be influenced.
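The estimation idea in this abstract can be illustrated with a minimal sketch. The abstract does not specify the model, so the sketch assumes a multinomial logit form, a common discrete choice model, with a value that is linear in the component attributes. The attribute values, the parameter vectors, and all function names are hypothetical.

```python
import math


def choice_probabilities(values):
    """Multinomial logit probabilities for a list of decision values."""
    top = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def log_likelihood(beta, observations):
    """Log-likelihood of observed decisions under parameters `beta`.

    Each observation is (options, chosen): `options` holds one attribute
    vector per decision, and `chosen` is the index of the decision made.
    """
    ll = 0.0
    for options, chosen in observations:
        values = [sum(b * x for b, x in zip(beta, attrs)) for attrs in options]
        ll += math.log(choice_probabilities(values)[chosen])
    return ll


# Hypothetical data: four decisions per component, two attributes each.
observations = [
    ([[1.0, 0.2], [0.6, 0.4], [0.3, 0.7], [0.0, 1.0]], 0),
    ([[0.2, 0.9], [0.5, 0.5], [0.8, 0.3], [0.1, 0.1]], 2),
]
# Maximum likelihood picks the candidate with the higher log-likelihood.
for beta in ([0.0, 0.0], [2.0, -1.0]):
    print(beta, log_likelihood(beta, observations))
```

A full estimate would maximize the log-likelihood over `beta` with a numerical optimizer rather than compare two candidates by hand.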

The value function in ergodic control of diffusion processes with partial observations

Stochastics and Stochastics Reports ◽

10.1080/17442509908834213 ◽

1999 ◽

Vol 67 (3-4) ◽

pp. 255-266 ◽

Cited By ~ 4

Author(s):

V. S. Borkar

Keyword(s):

Value Function ◽

Diffusion Processes ◽

Ergodic Control ◽

Partial Observations ◽

The Value Function

Regularity of the Value Function for a Two-Dimensional Singular Stochastic Control Problem

SIAM Journal on Control and Optimization ◽

10.1137/0327047 ◽

1989 ◽

Vol 27 (4) ◽

pp. 876-907 ◽

Cited By ~ 66

Author(s):

H. Mete Soner ◽

Steven E. Shreve

Keyword(s):

Control Problem ◽

Stochastic Control ◽

Value Function ◽

Two Dimensional ◽

Singular Stochastic Control ◽

Stochastic Control Problem ◽

The Value Function

Directional derivatives for the value-function in semi-infinite programming

Mathematical Programming ◽

10.1007/bf02592018 ◽

1987 ◽

Vol 38 (3) ◽

pp. 323-340 ◽

Cited By ~ 23

Author(s):

P. Zencke ◽

R. Hettich

Keyword(s):

Value Function ◽

Directional Derivatives ◽

Infinite Programming ◽

The Value Function

MORDUKHOVICH SUBGRADIENTS OF THE VALUE FUNCTION IN A PARAMETRIC OPTIMAL CONTROL PROBLEM

Taiwanese Journal of Mathematics ◽

10.11650/tjm.19.2015.3635 ◽

2015 ◽

Vol 19 (4) ◽

pp. 1051-1072 ◽

Cited By ~ 4

Author(s):

Nguyen Toan

Keyword(s):

Optimal Control ◽

Optimal Control Problem ◽

Control Problem ◽

Value Function ◽

Parametric Optimal Control ◽

The Value Function
