The Suboptimality of Early Exercise of Futures-Style Options: A Model-Free Result, Robust to Market Imperfections and Performance Bond Requirements

Author(s):  
Rodolfo Oviedo
2021
Author(s):  
Wenlin Dai ◽  
Stavros Athanasiadis ◽  
Tomáš Mrkvička

Clustering is an essential task in functional data analysis. In this study, we propose a framework for clustering procedures based on functional rankings or depth. Our methods naturally combine various types of between-cluster variation on an equal footing, which accommodates the diverse discriminative sources of functional data; for example, they can combine raw data with transformed data, or the components of multivariate functional data with their covariance. Our methods also enhance the clustering results with a visualization tool that allows an intrinsic graphical interpretation. Finally, our methods are model-free and nonparametric, and hence robust to heavy-tailed distributions and potential outliers. The implementation and performance of the proposed methods are illustrated with a simulation study and three real-world applications.
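The abstract does not specify which depth notion underlies the ranking, so as a rough illustration the sketch below computes the modified band depth (a standard functional depth) that could serve as the ranking ingredient of such a procedure; the J = 2 choice and the toy data are assumptions for the example, not the authors' setup.

```python
import numpy as np

def modified_band_depth(curves):
    """Modified band depth (J = 2) of each curve in a sample.

    curves: (n, T) array, n curves observed on a common grid of T points.
    For every pair of curves (j, k), a curve gains credit for the fraction
    of grid points where it lies inside the band the pair spans.
    """
    n, T = curves.shape
    depth = np.zeros(n)
    for j in range(n):
        for k in range(j + 1, n):
            lo = np.minimum(curves[j], curves[k])
            hi = np.maximum(curves[j], curves[k])
            inside = (curves >= lo) & (curves <= hi)   # (n, T) boolean mask
            depth += inside.mean(axis=1)
    return depth / (n * (n - 1) / 2)                   # average over pairs

# Toy example: the shifted curve receives one of the lowest depths (ranks).
rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 50)
sample = np.sin(2 * np.pi * grid) + 0.1 * rng.standard_normal((20, 50))
sample[0] += 2.0                                       # move one curve away
print(np.argsort(modified_band_depth(sample))[:3])     # index 0 ranks shallow
```

A depth-based clustering procedure could then, for instance, assign each curve to the cluster within which it attains the highest depth, which is what makes the approach robust to outliers: outlying curves simply receive low depth everywhere.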


Author(s):  
Yinlam Chow ◽  
Brandon Cui ◽  
Moonkyung Ryu ◽  
Mohammad Ghavamzadeh

Model-based reinforcement learning (RL) algorithms allow us to combine model-generated data with data collected from interaction with the real system in order to alleviate the data-efficiency problem in RL. However, designing such algorithms is often challenging because the bias in simulated data may overshadow the ease of data generation. A potential solution to this challenge is to jointly learn and improve the model and policy using a universal objective function. In this paper, we leverage the connection between RL and probabilistic inference and formulate such an objective function as a variational lower bound on a log-likelihood. This allows us to use expectation maximization (EM): we iteratively fix a baseline policy and learn a variational distribution consisting of a model and a policy (E-step), then improve the baseline policy given the learned variational distribution (M-step). We propose model-based and model-free policy iteration (actor-critic) style algorithms for the E-step and show how the variational distribution they learn can be used to optimize the M-step in a fully model-based fashion. Our experiments on a number of continuous control tasks show that our model-based (E-step) algorithm, called variational model-based policy optimization (VMBPO), is more sample-efficient and more robust to hyper-parameter tuning than its model-free (E-step) counterpart. On the same control tasks, we also compare VMBPO with several state-of-the-art model-based and model-free RL algorithms and demonstrate its sample efficiency and performance.
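As a purely schematic sketch of the EM alternation described above (not the VMBPO algorithm itself), the loop below fixes a baseline policy, fits a variational pair (model, policy) in the E-step, and improves the baseline from model rollouts in the M-step; every numeric update rule here is an invented placeholder standing in for the paper's actor-critic updates on the variational bound.

```python
import numpy as np

rng = np.random.default_rng(1)

def e_step(baseline_policy, data):
    """E-step: fit a variational distribution q = (model, policy) that
    tightens the lower bound, holding the baseline policy fixed.
    (Placeholder updates; the paper uses actor-critic style algorithms.)"""
    model = {"mean": data.mean()}                     # toy 'learned model'
    policy = baseline_policy + 0.5 * (model["mean"] - baseline_policy)
    return model, policy

def m_step(model, variational_policy):
    """M-step: improve the baseline policy against the learned variational
    distribution, fully model-based (rolls out the model, not the system)."""
    simulated_return = -abs(variational_policy - model["mean"])  # toy rollout
    return variational_policy if simulated_return > -1.0 else model["mean"]

baseline = 0.0
for it in range(10):
    data = rng.normal(loc=3.0, scale=0.2, size=64)    # real-system interaction
    model, q_policy = e_step(baseline, data)          # E-step
    baseline = m_step(model, q_policy)                # M-step
print(f"baseline policy parameter after EM: {baseline:.2f}")
```

The point of the structure, as in the paper, is that only the E-step touches real data; the M-step improvement runs entirely against the learned model.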


2014
Vol 02 (01)
pp. 39-52
Author(s):  
Iman Sadeghzadeh ◽  
Mahyar Abdolhosseini ◽  
Youmin Zhang

Two useful control techniques are investigated and applied experimentally to an unmanned quadrotor helicopter for a practical and important scenario: using an Unmanned Aerial Vehicle (UAV) to drop a payload in circumstances where search and rescue or the delivery of supplies and goods is dangerous and the environment is difficult to reach, such as forest or high-rise building fires, earthquakes, floods, and nuclear disasters. The two control techniques considered for such applications are Gain-Scheduled Proportional-Integral-Derivative (GS-PID) control and Model Predictive Control (MPC). Both the model-free (GS-PID) and model-based (MPC) algorithms show very promising performance during the take-off, height-holding, payload-dropping, and landing periods of a payload-dropping mission. Finally, both algorithms are successfully implemented on an unmanned quadrotor helicopter testbed (known as Qball-X4), available at the Networked Autonomous Vehicles Lab (NAVL) of Concordia University, in payload-dropping tests that illustrate the effectiveness of the two control techniques and compare their performance.
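To make the gain-scheduling idea concrete, here is a minimal sketch of a PID altitude loop whose gains switch with the mission phase; the phase names mirror the periods listed above, while the gain values and the overall structure are illustrative assumptions, not the Qball-X4 tuning from the paper.

```python
# Hedged sketch of gain-scheduled PID for altitude control: the gains switch
# with the mission phase, as in the take-off / hover / drop / landing periods.
# All numeric gains below are invented for illustration.

SCHEDULE = {                       # (Kp, Ki, Kd) per phase: illustrative only
    "takeoff": (2.0, 0.10, 0.8),
    "hover":   (1.2, 0.05, 0.6),
    "drop":    (3.0, 0.20, 1.0),   # stiffer gains to reject the mass step
    "landing": (0.8, 0.02, 0.5),
}

class GainScheduledPID:
    def __init__(self, dt: float):
        self.dt, self.integral, self.prev_error = dt, 0.0, 0.0

    def update(self, phase: str, setpoint: float, measurement: float) -> float:
        kp, ki, kd = SCHEDULE[phase]
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return kp * error + ki * self.integral + kd * derivative

pid = GainScheduledPID(dt=0.01)
thrust = pid.update("drop", setpoint=5.0, measurement=4.8)  # hold 5 m altitude
print(f"commanded thrust correction: {thrust:.2f}")
```

Scheduling the gains rather than re-deriving a model is what keeps the approach model-free: only the operating phase, not the plant dynamics, selects the controller.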


Author(s):  
Xi Nowak ◽  
Dirk Söffker

This contribution considers a new realization of the cognitive stabilizer, an adaptive stabilization control method based on a cognition-based framework. The model of the system to be controlled is assumed to be unknown; the only prior knowledge assumed within this approach is the system inputs, outputs, and equilibrium points. A new, improved realization of the cognitive stabilizer is designed in this contribution using 1) a neural network estimating suitable inputs for the desired outputs, 2) a Lyapunov stability criterion based on a chosen Lyapunov function, and 3) an optimization method that determines the desired system outputs with respect to the system energy. The proposed cognitive stabilizer is able to stabilize an unknown nonlinear MIMO system at any of its equilibrium points. Suitable control inputs can be designed automatically to guarantee the stability of the system's motion throughout the process, even as the system behavior or the environment changes. Numerical examples demonstrate the successful application and performance of this method.
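A minimal sketch of the accept/reject idea only, under strong assumptions: candidate inputs (generated by a neural network in the paper, by random sampling here for brevity) are kept only if they decrease a quadratic Lyapunov function over one step, and the lowest-energy accepted input is applied. The toy plant stands in for the unknown system and is queried only as a stand-in for measured responses.

```python
import numpy as np

def plant(y, u):          # unknown to the controller; stand-in for measurement
    return 0.9 * y + 0.2 * np.tanh(u)

def V(y):                 # quadratic Lyapunov candidate V(y) = y^T y
    return float(y @ y)

rng = np.random.default_rng(2)
y = np.array([1.5, -0.7])                  # start away from the equilibrium 0
for step in range(30):
    candidates = rng.uniform(-2, 2, size=(16, 2))    # stand-in for the NN
    next_states = np.array([plant(y, u) for u in candidates])
    # keep candidates whose one-step response decreases V ...
    decreases = np.array([V(yn) < V(y) for yn in next_states])
    if decreases.any():
        # ... and among those, apply the lowest-energy input (smallest ||u||)
        ok = candidates[decreases]
        u = ok[np.argmin((ok ** 2).sum(axis=1))]
        y = plant(y, u)
print(f"||y|| after 30 steps: {np.linalg.norm(y):.3f}")  # driven toward 0
```

The energy-minimizing selection among the stabilizing candidates corresponds loosely to item 3) above: among all inputs that satisfy the Lyapunov criterion, the cheapest one is preferred.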


2019 ◽  
Author(s):  
Noah Zarr ◽  
Joshua W. Brown

The question of how animals and humans can solve arbitrary problems and achieve arbitrary goals remains open. Model-based and model-free reinforcement learning methods have addressed these problems, but they generally lack the ability to flexibly reassign reward value to various states as the reward structure of the environment changes. Research on cognitive control has generally focused on inhibition, rule-guided behavior, and performance monitoring, with relatively less focus on goal representations. From the engineering literature, control theory suggests a solution: an animal can be seen as trying to minimize the difference between the actual and desired states of the world, and Dijkstra's algorithm further suggests a conceptual framework for moving a system toward a goal state. Here we present a purely localist neural network model that can autonomously learn the structure of an environment and then achieve any arbitrary goal state in a changing environment without re-learning reward values. The model clarifies a number of issues inherent in biological constraints on such a system, including the essential role of oscillations in learning and performance. We demonstrate that the model can efficiently learn to solve arbitrary problems, including, for example, the Tower of Hanoi problem.
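Since the abstract explicitly invokes Dijkstra's algorithm as the conceptual framework, the sketch below shows the underlying idea: once the environment's structure is known as a weighted graph, a least-cost route to any goal state can be recovered without re-learning reward values. The toy graph is invented for the example; the paper's model is a localist neural network, not this explicit search.

```python
import heapq

def dijkstra(graph, start, goal):
    """graph: {state: [(neighbor, cost), ...]}; returns (cost, path)."""
    frontier, seen = [(0.0, start, [start])], set()
    while frontier:
        cost, state, path = heapq.heappop(frontier)
        if state == goal:
            return cost, path
        if state in seen:
            continue
        seen.add(state)
        for nxt, c in graph.get(state, []):
            if nxt not in seen:
                heapq.heappush(frontier, (cost + c, nxt, path + [nxt]))
    return float("inf"), []

world = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)],
         "C": [("D", 1)], "D": []}
for goal in ("C", "D"):                       # goals change, the graph does not
    print(goal, dijkstra(world, "A", goal))   # no reward re-learning needed
```

This is the contrast with standard reinforcement learning drawn above: changing the goal changes only the search target, not the learned structure of the environment.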


Author(s):  
Elmira Madadi ◽  
Dirk Söffker

This contribution considers a model-free control method based on an optimal iterative learning control framework to design a suitable controller. Within this framework, the controller requires neither information about the system's dynamical structure nor knowledge of the system's physical behavior. The task is solved using only the system inputs and outputs, which are assumed to be measurable. The proposed method consists of three parts. First, an intelligent PID controller is implemented on the system. Second, a robust second-order sliding-mode differentiator is applied to accurately estimate the evolution of the state function. Third, an optimal iterative learning control is chosen to improve the performance. Numerical examples demonstrate the successful application and performance of the method.
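As a hedged illustration of the iterative-learning part only, the sketch below applies a standard P-type ILC update, u_{k+1}(t) = u_k(t) + L e_k(t+1), across repeated trials of the same tracking task, using only measured inputs and outputs in the update law; the first-order plant, the gain L, and the reference are invented for the example and are not from the paper.

```python
import numpy as np

T, trials, L = 50, 30, 0.8
ref = np.sin(np.linspace(0, np.pi, T))        # reference output trajectory

def run_trial(u):
    """Simulate one trial; stands in for running the real system once."""
    y = np.zeros(T)
    for t in range(T - 1):
        y[t + 1] = 0.7 * y[t] + 0.3 * u[t]    # toy plant, unknown to the ILC
    return y

u = np.zeros(T)
for k in range(trials):
    y = run_trial(u)                          # measure outputs for trial k
    e = ref - y                               # trial-k tracking error
    u[:-1] += L * e[1:]                       # u_{k+1}(t) = u_k(t) + L e_k(t+1)
print(f"max tracking error after {trials} trials: "
      f"{np.abs(ref - run_trial(u)).max():.4f}")
```

Note that the update law never uses the plant equation, only the measured error from the previous trial, which is what qualifies the scheme as model-free.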

