A New Improved Penalty Avoiding Rational Policy Making Algorithm for Keepaway with Continuous State Spaces

Author(s):  
Takuji Watanabe ◽  
Kazuteru Miyazaki ◽  
Hiroaki Kobayashi ◽  
...  

The penalty avoiding rational policy making algorithm (PARP) [1], which was previously improved to save memory and cope with uncertainty (IPARP [2]), requires that states be discretized in real environments with continuous state spaces, using function approximation or some other method. For PARP in particular, a method that discretizes states using basis functions is known [3]. However, because this method creates a new basis function from the current input and its next observation, an unsuitable basis function may be generated in some asynchronous multiagent environments. We therefore propose a uniform basis function whose range is estimated before learning. We show the effectiveness of our proposal using a soccer game task called “Keepaway.”
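As a rough illustration of discretizing a continuous state with uniform basis functions whose range is fixed before learning, the sketch below maps a state vector to the index of the uniform cell that contains it. The abstract gives no implementation details, so the grid layout and the names `basis_index`, `lows`, and `widths` are assumptions made for this example, not the authors' method.

```python
from typing import Sequence, Tuple


def basis_index(state: Sequence[float],
                lows: Sequence[float],
                widths: Sequence[float]) -> Tuple[int, ...]:
    """Return the index of the uniform cell (basis function) covering `state`.

    `lows` is the assumed lower bound of each state variable and `widths`
    is the range of one basis function per variable. Both are chosen
    before learning, so they do not depend on consecutive observations.
    """
    if not (len(state) == len(lows) == len(widths)):
        raise ValueError("state, lows and widths must have the same length")
    # Floor division picks the cell; int() converts the float result.
    return tuple(int((s - lo) // w) for s, lo, w in zip(state, lows, widths))


# Hypothetical two-variable state with cell widths 0.1 and 1.0.
print(basis_index([0.25, 3.0], lows=[0.0, 0.0], widths=[0.1, 1.0]))  # (2, 3)
```

Because the cell width is fixed in advance, two observations that arrive out of step in an asynchronous setting still fall into cells of the same size.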

Author(s):  
Kazuteru Miyazaki ◽  
Hajime Kimura ◽  
Shigenobu Kobayashi

1996 ◽  
Vol 07 (02) ◽  
pp. 167-179 ◽  
Author(s):  
ROBERT SHORTEN ◽  
RODERICK MURRAY-SMITH

Normalisation of the basis function activations in a Radial Basis Function (RBF) network is a common way of achieving the partition of unity often desired for modelling applications. It results in the basis functions covering the whole of the input space to the same degree. However, normalisation of the basis functions can lead to other effects which are sometimes less desirable for modelling applications. This paper describes some side effects of normalisation which fundamentally alter properties of the basis functions, e.g. the shape is no longer uniform, maxima of basis functions can be shifted from their centres, and the basis functions are no longer guaranteed to decrease monotonically as distance from their centre increases—in many cases basis functions can ‘reactivate’, i.e. re-appear far from the basis function centre. This paper examines how these phenomena occur, discusses their relevance for non-linear function approximation and examines the effect of normalisation on the network condition number and weights.
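The reactivation effect described above can be reproduced with a very small example. The sketch below assumes two one-dimensional Gaussian basis functions of different widths; the centres, widths, and function names are chosen for illustration only and are not taken from the paper.

```python
import math
from typing import List, Sequence


def gaussian(x: float, centre: float, width: float) -> float:
    """Unnormalised Gaussian basis function."""
    return math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))


def normalised_activations(x: float,
                           centres: Sequence[float],
                           widths: Sequence[float]) -> List[float]:
    """Divide each activation by the sum, so the outputs add up to one."""
    raw = [gaussian(x, c, w) for c, w in zip(centres, widths)]
    total = sum(raw)
    return [r / total for r in raw]


# A wide basis function centred at 0 and a narrow one centred at 1.
centres = [0.0, 1.0]
widths = [1.0, 0.2]
for x in (0.0, 1.0, 3.0):
    wide, narrow = normalised_activations(x, centres, widths)
    print(f"x={x:.1f}  wide={wide:.3f}  narrow={narrow:.3f}")
```

In this setup the normalised activation of the wide basis function is close to one at its centre, drops below 0.4 at x = 1 where the narrow function dominates, and returns to nearly one at x = 3, so it no longer decreases monotonically with distance from its centre.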


Author(s):  
Kazuteru Miyazaki ◽  
Shigenobu Kobayashi ◽  

Reinforcement learning involves learning to adapt to environments through the presentation of rewards, special inputs that serve as clues. To obtain rational policies quickly, profit sharing (PS) [6], the rational policy making algorithm (RPM) [7], the penalty avoiding rational policy making algorithm (PARP) [8], and PS-r* [9] are used. These are called PS-based methods. When applying reinforcement learning to actual problems, treatment of continuous-valued input is sometimes required. A method [10] based on RPM has been proposed as a PS-based method that handles continuous-valued input, but it assumes that only rewards exist and cannot suitably handle penalties. We studied the treatment of continuous-valued input suitable for a PS-based method in environments that include both rewards and penalties. Specifically, we propose extending PARP to continuous-valued input while simultaneously targeting the attainment of rewards and the avoidance of penalties. We applied our proposal to the pole-cart balancing problem and confirmed its validity.


Heat Transfer ◽  
2021 ◽  
Author(s):  
Maryam Fallah Najafabadi ◽  
Hossein Talebi Rostami ◽  
Khashayar Hosseinzadeh ◽  
Davood Domiri Ganji

2014 ◽  
Vol 986-987 ◽  
pp. 1418-1421
Author(s):  
Jun Shan Li

In this paper, we propose a meshless method for solving the mathematical model of the leakage problem that arises when pressure is tested in a gas pipeline. The radial basis function (RBF) method can be used to solve partial differential equations by writing the solution as a linear combination of radial basis functions; by incorporating the definite conditions, one can find the combination coefficients and hence the numerical solution. The leak problem is a kind of inverse problem that has drawn the attention of many engineers and mathematical researchers. The strength of the leak can be found easily from the additional conditions and the numerical solution.
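The idea of writing a solution as a linear combination of radial basis functions and solving for the combination coefficients can be sketched in its simplest form, one-dimensional interpolation. The example below is not the pipeline leak model: it assumes a multiquadric kernel with shape parameter `c`, uses interpolation conditions in place of the PDE and its definite conditions, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def multiquadric(r: float, c: float = 1.0) -> float:
    """Multiquadric kernel; the shape parameter c is an assumed value."""
    return math.sqrt(r * r + c * c)


def solve(a: List[List[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(nodes: Sequence[float], values: Sequence[float], c: float = 1.0) -> List[float]:
    """Find the coefficients so the RBF combination matches values at nodes."""
    a = [[multiquadric(abs(xi - xj), c) for xj in nodes] for xi in nodes]
    return solve(a, values)


def evaluate(x: float, nodes: Sequence[float], coeffs: Sequence[float], c: float = 1.0) -> float:
    """Evaluate the linear combination of radial basis functions at x."""
    return sum(w * multiquadric(abs(x - xj), c) for w, xj in zip(coeffs, nodes))


nodes = [0.0, 0.5, 1.0, 1.5, 2.0]
coeffs = fit(nodes, [x * x for x in nodes])
print(evaluate(1.0, nodes, coeffs))  # close to 1.0, the value imposed at x = 1
```

In a collocation solver the rows of the matrix would instead apply the differential operator and the boundary conditions to each basis function, but the step of solving a linear system for the coefficients is the same.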


2014 ◽  
Vol 2014 ◽  
pp. 1-5
Author(s):  
Guohua Wang ◽  
Yufa Sun

A broadband radar cross section (RCS) calculation approach is proposed based on the characteristic basis function method (CBFM). In the proposed approach, the desired arbitrary frequency band is adaptively divided into multiple subbands according to the number of characteristic basis functions (CBFs), which can reduce the number of universal characteristic basis functions (UCBFs) remaining after the singular value decomposition (SVD) procedure in the lower subfrequency bands. Then, the desired RCS data can be obtained by splicing the RCS data from each subfrequency band. Numerical results demonstrate that the proposed method achieves high accuracy and efficiency over a wide frequency range.


2020 ◽  
Vol 20 (4) ◽  
pp. 60-83
Author(s):  
Vinícius Magalhães Pinto Marques ◽  
Gisele Tessari Santos ◽  
Mauri Fortes

ABSTRACT Objective: This article aims to solve the non-linear Black Scholes (BS) equation for European call options using the Radial Basis Function (RBF) Multi-Quadratic (MQ) Method. Methodology / Approach: This work applies the MQ RBF method to the solution of two complex models of the nonlinear BS equation for prices of European call options with modified volatility. Linear BS models are also solved to visualize the effects of modified volatility. Additionally, a time-adaptive scheme based on the Runge-Kutta-Fehlberg (RKF) method is implemented.


2008 ◽  
Vol 5 (1) ◽  
pp. 143-148 ◽  
Author(s):  
Baghdad Science Journal

A method for the approximate evaluation of linear functional differential equations is described, in which the function is approximated as a linear combination of a set of orthogonal basis functions, namely Chebyshev polynomials. The coefficients of the approximation are determined by least-squares and Galerkin methods. The properties of Chebyshev polynomials lead to good results, which are demonstrated with examples.
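The approximation step, a function written as a linear combination of Chebyshev polynomials with least-squares coefficients, can be sketched as follows. This example only approximates a known function on [-1, 1] and does not solve a functional differential equation. It assumes a discrete least-squares fit at Chebyshev nodes, where the orthogonality of the polynomials gives the coefficients in closed form; the function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def chebyshev_t(k: int, x: float) -> float:
    """Chebyshev polynomial T_k(x) for x in [-1, 1]."""
    return math.cos(k * math.acos(x))


def chebyshev_fit(f: Callable[[float], float], degree: int, num_nodes: int) -> List[float]:
    """Discrete least-squares coefficients of f at the Chebyshev nodes.

    Requires degree < num_nodes so that discrete orthogonality holds
    and the normal equations are diagonal.
    """
    if degree >= num_nodes:
        raise ValueError("degree must be smaller than num_nodes")
    nodes = [math.cos(math.pi * (j + 0.5) / num_nodes) for j in range(num_nodes)]
    coeffs = []
    for k in range(degree + 1):
        s = sum(f(x) * chebyshev_t(k, x) for x in nodes)
        coeffs.append((1.0 if k == 0 else 2.0) * s / num_nodes)
    return coeffs


def chebyshev_eval(coeffs: Sequence[float], x: float) -> float:
    """Evaluate the linear combination sum_k c_k T_k(x)."""
    return sum(c * chebyshev_t(k, x) for k, c in enumerate(coeffs))


coeffs = chebyshev_fit(math.exp, degree=8, num_nodes=16)
print(abs(chebyshev_eval(coeffs, 0.3) - math.exp(0.3)))  # small approximation error
```

Because the basis is orthogonal at these nodes, each coefficient is computed independently, which is one reason Chebyshev polynomials tend to give well-behaved approximations.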

