Assumed Density Filtering Q-learning

While off-policy temporal difference (TD) methods have been widely used in reinforcement learning due to their efficiency and simple implementation, their Bayesian counterparts have not been utilized as frequently. One reason is that the non-linear max operation in the Bellman optimality equation makes it difficult to define conjugate distributions over the value functions. In this paper, we introduce a novel Bayesian approach to off-policy TD methods, called ADFQ, which updates beliefs on state-action values, Q, through an online Bayesian inference method known as Assumed Density Filtering. We formulate an efficient closed-form solution for the value update by approximately estimating analytic parameters of the posterior of the Q-beliefs. Uncertainty measures in the beliefs are not only used in exploration but also provide a natural regularization for the value update, considering all next available actions. ADFQ converges to Q-learning as the uncertainty measures of the Q-beliefs decrease, and it mitigates common drawbacks of other Bayesian RL algorithms such as computational complexity. We extend ADFQ with a neural network. Our empirical results demonstrate that ADFQ outperforms comparable algorithms on various Atari 2600 games, with drastic improvements in highly stochastic domains or domains with a large action space.
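Assumed Density Filtering, in its general form, keeps a belief inside a fixed family (often Gaussian) by projecting each exact posterior back onto that family through moment matching. The sketch below illustrates only that generic projection step for a one-dimensional Gaussian mixture. The function name `moment_match` and the example numbers are hypothetical, and this is not the ADFQ value update itself, whose closed form is derived in the paper.

```python
def moment_match(weights, means, variances):
    """Collapse a 1-D Gaussian mixture into a single Gaussian
    that has the same mean and variance as the mixture."""
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalise in case weights do not sum to 1
    mean = sum(wi * mi for wi, mi in zip(w, means))
    # E[x^2] of a mixture is the weighted sum of (variance + mean^2) per component
    second_moment = sum(wi * (vi + mi * mi) for wi, mi, vi in zip(w, means, variances))
    return mean, second_moment - mean * mean


# Hypothetical belief: two equally weighted components at -1 and +1, unit variance each
mean, var = moment_match([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
```

The matched variance exceeds each component's variance whenever the component means disagree, which is how the spread between candidate values shows up as uncertainty in the collapsed belief.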

MULTI-ASSET PORTFOLIO OPTIMIZATION WITH STOCHASTIC SHARPE RATIO UNDER DRAWDOWN CONSTRAINT

Annals of Financial Economics ◽

10.1142/s2010495220800019 ◽

2020 ◽

Vol 15 (01) ◽

pp. 2080001

Author(s):

SUBHOJIT BISWAS ◽

SAIF JAWAID ◽

DIGANTA MUKHERJEE

Keyword(s):

Stochastic Volatility ◽

Closed Form Solution ◽

Fixed Time ◽

Form Solution ◽

Risk Tolerance ◽

Finite Difference Schemes ◽

Value Functions ◽

Pairs Trading ◽

Risky Assets ◽

Drawdown Constraint

We consider an investor who seeks to maximize the expected utility of a portfolio, consisting of multiple risky assets and one risk-free asset, derived from the terminal wealth relative to the maximum wealth achieved over a fixed time horizon. This is achieved under a portfolio drawdown constraint, in a market with local stochastic volatility. In the empirical application, which considers two risky assets, the assets are identified with the help of pairs trading. In the absence of a closed-form solution for the value function and the optimal strategy, we obtain approximations of these quantities using coefficient series expansion techniques and finite difference schemes. We utilize the risk tolerance factor function to ease our approximations of the value function and the strategies. All the parameters were estimated from the triplets and used to illustrate and compare the stochastic volatility case with the constant volatility case, and to show how an investor can deploy different portfolio plans.

An efficient closed-form solution for wideband source direction-of-arrival estimation

10.32469/10355/44737 ◽

2013 ◽

Author(s):

Wenjia Shi

Keyword(s):

Closed Form ◽

Closed Form Solution ◽

Direction Of Arrival ◽

Form Solution ◽

Direction Of Arrival Estimation

On a Closed-Form Solution of Prandtl's System of Equations

International Journal of Fluid Mechanics Research ◽

10.1615/interjfluidmechres.v40.i2.20 ◽

2013 ◽

Vol 40 (2) ◽

pp. 106-114

Author(s):

J. Venetis ◽

Aimilios (Preferred name Emilios) Sideridis

Keyword(s):

Closed Form ◽

Closed Form Solution ◽

Form Solution ◽

System Of Equations

Plane Wave Resonance in the Tire Air Cavity as a Vehicle Interior Noise Source

Tire Science and Technology ◽

10.2346/1.2137495 ◽

1995 ◽

Vol 23 (1) ◽

pp. 2-10 ◽

Cited By ~ 23

Author(s):

J. K. Thompson

Keyword(s):

Closed Form Solution ◽

Form Solution ◽

Interior Noise ◽

Cavity Resonance ◽

Test Results ◽

Vehicle Interior ◽

Air Cavity ◽

Vehicle Interior Noise ◽

Wave Resonance ◽

Resonance Frequencies

Vehicle interior noise is the result of numerous sources of excitation. One source involving tire-pavement interaction is the tire air cavity resonance and the forcing it provides to the vehicle spindle. This paper applies fundamental principles combined with experimental verification to describe the tire cavity resonance. A closed-form solution is developed to predict the resonance frequencies from geometric data. Tire test results are used to examine the accuracy of predictions of undeflected and deflected tire resonances. Errors between predicted and actual frequencies are shown to be less than 2%. The nature of the forcing that this resonance applies to the vehicle spindle is also examined.
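To give a feel for predicting a cavity resonance from geometric data, the sketch below uses a common idealisation, assumed here rather than taken from the paper: in an undeflected tire, the first plane-wave mode occurs when one acoustic wavelength spans the mean circumference of the air cavity. The function name, the default speed of sound, and the example diameter are all hypothetical, and the paper's own formula (particularly for the deflected tire) may differ.

```python
import math


def first_cavity_resonance_hz(mean_diameter_m, sound_speed_m_s=343.0):
    """First plane-wave resonance of an idealised, undeflected annular cavity.

    Assumes one wavelength fits the mean circumference, so f = c / (pi * D).
    """
    circumference_m = math.pi * mean_diameter_m
    return sound_speed_m_s / circumference_m


# Hypothetical cavity with a 0.5 m mean diameter, air at roughly 20 degrees C
f1 = first_cavity_resonance_hz(0.5)
```

Under these assumptions the estimate depends only on the cavity's mean diameter and the speed of sound in the fill gas, so a larger tire or a colder cavity lowers the predicted frequency.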

A Closed-Form Solution to Multi-Point Scheduling Problems

AIAA Modeling and Simulation Technologies Conference ◽

10.2514/6.2010-7911 ◽

2010 ◽

Cited By ~ 6

Author(s):

Larry Meyn

Keyword(s):

Closed Form ◽

Closed Form Solution ◽

Form Solution ◽

Scheduling Problems

Capacity Analysis for Correlated Multi-Hop MIMO Channels under Colored Noise

Journal of Science and Technology Issue on Information and Communications Technology ◽

10.31130/jst.2015.10 ◽

2015 ◽

Vol 1 ◽

pp. 41

Author(s):

Nguyen N. Tran ◽

Ha X. Nguyen

Keyword(s):

Computational Complexity ◽

Mutual Information ◽

Closed Form Solution ◽

Optimal Solution ◽

Form Solution ◽

Maximization Problem ◽

Capacity Analysis ◽

Mimo Channels ◽

Average Mutual Information ◽

Channel Output

A capacity analysis for generally correlated wireless multi-hop multi-input multi-output (MIMO) channels is presented in this paper. The channel at each hop is spatially correlated, the source symbols are mutually correlated, and the additive Gaussian noises are colored. First, by invoking the Karush-Kuhn-Tucker conditions for the optimality of convex programming, we derive the optimal source symbol covariance that maximizes the mutual information between the channel input and the channel output when full channel knowledge is available at the transmitter. Second, we formulate the average mutual information maximization problem for the case where only the channel statistics are available at the transmitter. Since this problem is almost impossible to solve analytically, a numerical interior-point method is employed to obtain the optimal solution. Furthermore, to reduce the computational complexity, an asymptotic closed-form solution is derived by maximizing an upper bound of the objective function. Simulation results show that the average mutual information obtained by the asymptotic design is very close to that obtained by the optimal design, while greatly reducing the computational complexity.

A Simple Closed-Form Solution of Bending Stiffness for Laminated Composite Tubes

Journal of Reinforced Plastics and Composites ◽

10.1106/y1h1-25tk-1m0j-wj2x ◽

2000 ◽

Vol 19 (4) ◽

pp. 278-291 ◽

Cited By ~ 4

Author(s):

WEN S. CHAN ◽

KAZIM C. DEMIRHAN

Keyword(s):

Closed Form ◽

Closed Form Solution ◽

Laminated Composite ◽

Bending Stiffness ◽

Form Solution ◽

Composite Tubes

Geometric Average Asian Option Pricing with Paying Dividend Yield under Non-Extensive Statistical Mechanics for Time-Varying Model

Entropy ◽

10.3390/e20110828 ◽

2018 ◽

Vol 20 (11) ◽

pp. 828 ◽

Cited By ~ 3

Author(s):

Jixia Wang ◽

Yameng Zhang

Keyword(s):

Statistical Mechanics ◽

Option Pricing ◽

Asset Price ◽

Closed Form Solution ◽

Real Data ◽

Form Solution ◽

Asian Option ◽

Time Varying ◽

Dividend Yield ◽

Geometric Average

This paper is dedicated to the study of the geometric average Asian call option pricing under non-extensive statistical mechanics for a time-varying coefficient diffusion model. We employed the non-extensive Tsallis entropy distribution, which can describe the leptokurtosis and fat-tail characteristics of returns, to model the motion of the underlying asset price. Considering that economic variables change over time, we allowed the drift and diffusion terms in our model to be time-varying functions. We used the Itô formula, Feynman–Kac formula, and Padé ansatz to obtain a closed-form solution of geometric average Asian option pricing with a paying dividend yield for a time-varying model. Moreover, the simulation study shows that the results obtained by our method fit the simulation data better than that of Zhao et al. From the analysis of real data, we identify the best value for q which can fit the real stock data, and the result shows that investors underestimate the risk using the Black–Scholes model compared to our model.

Closed-form solution of repeat ground track orbit design and constellation deployment strategy

Acta Astronautica ◽

10.1016/j.actaastro.2020.12.021 ◽

2020 ◽

Author(s):

Soung Sub Lee

Keyword(s):

Closed Form ◽

Closed Form Solution ◽

Form Solution ◽

Orbit Design ◽

Ground Track ◽

Deployment Strategy

A Closed-Form Solution to Planar Feature-Based Registration of LiDAR Point Clouds

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10070435 ◽

2021 ◽

Vol 10 (7) ◽

pp. 435

Author(s):

Yongbo Wang ◽

Nanshan Zheng ◽

Zhengfu Bian

Keyword(s):

Closed Form ◽

Closed Form Solution ◽

Simulated Data ◽

Real Data ◽

Point Clouds ◽

Form Solution ◽

Spatial Transformation ◽

Dual Quaternions ◽

Feature Based ◽

Planar Feature

Since pairwise registration is a necessary step for the seamless fusion of point clouds from neighboring stations, a closed-form solution to planar feature-based registration of LiDAR (Light Detection and Ranging) point clouds is proposed in this paper. Based on the Plücker coordinate-based representation of linear features in three-dimensional space, a quad tuple-based representation of planar features is introduced, which makes it possible to directly determine the difference between any two planar features. Dual quaternions are employed to represent spatial transformation, and operations between dual quaternions and the quad tuple-based representation of planar features are given, with which an error norm is constructed. Based on L2-norm minimization, detailed derivations of the proposed solution are explained step by step. Two experiments were designed in which simulated data and real data were both used to verify the correctness and the feasibility of the proposed solution. With the simulated data, the calculated registration results were consistent with the pre-established parameters, which verifies the correctness of the presented solution. With the real data, the calculated registration results were consistent with the results calculated by iterative methods. Two conclusions can be drawn from the experiments: (1) the proposed solution does not require any initial estimates of the unknown parameters, which assures the stability and robustness of the solution; (2) using dual quaternions to represent spatial transformation greatly reduces the additional constraints in the estimation process.
