An Extension of the Rational Policy Making algorithm to Continuous State Spaces

The penalty avoiding rational policy making algorithm (PARP) [1], previously improved to save memory and cope with uncertainty as IPARP [2], requires that states be discretized in real environments with continuous state spaces, using function approximation or some other method. For PARP in particular, a method that discretizes the state using basis functions is known [3]. However, because this method creates a new basis function from the current input and its next observation, it may generate an unsuitable basis function in some asynchronous multiagent environments. We therefore propose a uniform basis function whose range is estimated before learning. We show the effectiveness of our proposal using a soccer game task called “Keepaway.”
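
As a rough illustration of discretizing a continuous state with uniform cells over a range fixed before learning, the following minimal sketch maps an observation to one cell index per dimension. The abstract does not specify the form of the basis functions, so the equal-width cells, the clipping of out-of-range values, and the names `uniform_state_index`, `ranges`, and `bins_per_dim` are all assumptions made for this example, not the authors' method.

```python
from typing import Sequence, Tuple


def uniform_state_index(
    obs: Sequence[float],
    ranges: Sequence[Tuple[float, float]],
    bins_per_dim: int,
) -> Tuple[int, ...]:
    """Map a continuous observation to one cell index per dimension.

    Hypothetical helper: each dimension's pre-estimated range [low, high]
    is split into `bins_per_dim` equal-width cells, and out-of-range
    values are clipped to the nearest edge cell.
    """
    if bins_per_dim < 1:
        raise ValueError("bins_per_dim must be at least 1")
    if len(obs) != len(ranges):
        raise ValueError("obs and ranges must have the same length")
    index = []
    for x, (low, high) in zip(obs, ranges):
        if high <= low:
            raise ValueError("each range must satisfy low < high")
        width = (high - low) / bins_per_dim
        cell = int((x - low) // width)
        # Clip so that values outside the estimated range still get a cell.
        index.append(min(max(cell, 0), bins_per_dim - 1))
    return tuple(index)


# Example: two state variables whose ranges are assumed to be known in advance.
print(uniform_state_index([0.0, 5.0], [(0.0, 10.0), (0.0, 10.0)], 5))
```

Because the cells are fixed in advance, the resulting index does not depend on the order in which observations arrive, which is the property the abstract contrasts with basis functions created on the fly.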

Reinforcement Learning for Penalty Avoidance in Continuous State Spaces

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2007.p0668 ◽

2007 ◽

Vol 11 (6) ◽

pp. 668-676 ◽

Cited By ~ 9

Author(s):

Kazuteru Miyazaki ◽

Shigenobu Kobayashi ◽

Keyword(s):

Reinforcement Learning ◽

Policy Making ◽

Profit Sharing ◽

State Spaces ◽

Continuous State

Reinforcement learning involves learning to adapt to environments through the presentation of rewards (special inputs that serve as clues). To obtain rational policies quickly, profit sharing (PS) [6], the rational policy making algorithm (RPM) [7], the penalty avoiding rational policy making algorithm (PARP) [8], and PS-r* [9] are used. These are called PS-based methods. When applying reinforcement learning to actual problems, treatment of continuous-valued input is sometimes required. A method [10] based on RPM has been proposed as a PS-based method that handles continuous-valued input, but it assumes that only rewards exist and cannot handle penalties suitably. We studied a treatment of continuous-valued input suitable for a PS-based method in which the environment includes both rewards and penalties. Specifically, we propose extending PARP to handle continuous-valued input while simultaneously targeting the attainment of rewards and the avoidance of penalties. We applied our proposal to the pole-cart balancing problem and confirmed its validity.

Maximum entropy inverse reinforcement learning in continuous state spaces with path integrals

2011 IEEE/RSJ International Conference on Intelligent Robots and Systems ◽

10.1109/iros.2011.6048804 ◽

2011 ◽

Cited By ~ 1

Author(s):

N. Aghasadeghi ◽

T. Bretl

Keyword(s):

Reinforcement Learning ◽

Maximum Entropy ◽

Path Integrals ◽

Inverse Reinforcement Learning ◽

State Spaces ◽

Continuous State

State Aggregation by Growing Neural Gas for Reinforcement Learning in Continuous State Spaces

2011 10th International Conference on Machine Learning and Applications and Workshops ◽

10.1109/icmla.2011.134 ◽

2011 ◽

Cited By ~ 7

Author(s):

Michael Baumann ◽

Hans Kleine Buning

Keyword(s):

Reinforcement Learning ◽

Growing Neural Gas ◽

State Spaces ◽

Neural Gas ◽

State Aggregation ◽

Continuous State

Stable running of a planar underactuated biped robot

Robotica ◽

10.1017/s0263574710000512 ◽

2010 ◽

Vol 29 (5) ◽

pp. 657-665 ◽

Cited By ~ 14

Author(s):

Yong Hu ◽

Gangfeng Yan ◽

Zhiyun Lin

Keyword(s):

Limit Cycle ◽

Stance Phase ◽

Control Strategies ◽

Basin Of Attraction ◽

Biped Robot ◽

State Spaces ◽

Continuous State ◽

Event Based ◽

Telescopic Legs ◽

The Basin Of Attraction

SUMMARY: This paper investigates the stable-running problem of a planar underactuated biped robot, which has two springy telescopic legs and one actuated joint in the hip. After modeling the robot as a hybrid system with multiple continuous state spaces, a natural passive limit cycle, which preserves the system energy at touchdown, is found using the method of Poincaré shooting. It is then checked that the passive limit cycle is not stable. To stabilize the passive limit cycle, an event-based feedback control law is proposed, and also to enlarge the basin of attraction, an additive passivity-based control term is introduced only in the stance phase. The validity of our control strategies is illustrated by a series of numerical simulations.

Adaptive importance sampling of random walks on continuous state spaces

10.2172/677157 ◽

1998 ◽

Cited By ~ 1

Author(s):

K. Baggerly ◽

D. Cox ◽

R. Picard

Keyword(s):

Random Walks ◽

Importance Sampling ◽

State Spaces ◽

Adaptive Importance Sampling ◽

Continuous State

COVID-19: A Data-Driven Mean-Field-Type Game Perspective

10.1101/2020.07.23.20160853 ◽

2020 ◽

Author(s):

Hamidou Tembine

Keyword(s):

Mean Field ◽

Decision Makers ◽

Data Driven ◽

Feedback Form ◽

State Spaces ◽

Hospital Capacity ◽

Field Type ◽

Adjoint Systems ◽

Continuous State ◽

Reported Data

In this article, a class of mean-field-type games with discrete-continuous state spaces is considered. We establish Bellman systems which provide sufficiency conditions for mean-field-type equilibria in state-and-mean-field-type feedback form. We then derive unnormalized master adjoint systems (MASS). The methodology is shown to be flexible enough to capture multi-class interaction in epidemic propagation in which multiple authorities are risk-aware atomic decision-makers and individuals are risk-aware non-atomic decision-makers. Based on MASS, we present data-driven modelling and analytics for mitigating Coronavirus Disease 2019 (COVID-19). The model integrates untested cases, age-structure, decision-making, gender, pre-existing health conditions, location, testing capacity, hospital capacity, and a mobility map of local areas, including in-city, inter-city, and international movement. It is shown that the data-driven model can capture most of the reported data on COVID-19 on confirmed cases, deaths, recovered, number of tests, and number of active cases in 66+ countries. The model also reports non-Gaussianity and non-exponential properties in 15+ countries.

Seven easy lessons introducing non-equilibrium statistical mechanics and (bio)physical chemistry

10.31219/osf.io/98ypj ◽

2020 ◽

Author(s):

Daniel Zuckerman

Keyword(s):

State Spaces ◽

Chemical Systems ◽

Markov State Models ◽

Continuous State ◽

Modern Physics ◽

Three State Model ◽

State Models ◽

Introductory Discussion ◽

The Relationship ◽

Non Equilibrium

The non-equilibrium principles and methods which are fundamental to understanding modern physics, biology, and chemistry can be intimidating. The key to learning them is working through simple, concrete examples. The beginner must do calculations by hand to truly grasp how the equations and ideas work together. Hence, seven sets of homework problems are given (using two- and three-state model chemical systems) along with introductory discussion, complete solutions, and links to further discussion. The problems use simple math and graphics to introduce equilibrium, relaxation, non-equilibrium steady states and the associated timescales, using discrete and continuous state spaces as well as continuous and discrete time ("Markov state models"). The relationship between these descriptions is explained. The overall aim is to provide a set of easily approachable material that will significantly increase the student's confidence in approaching more advanced material.
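
To make the idea of relaxation in a discrete-time three-state model concrete, here is a minimal sketch that propagates a probability vector under a fixed transition matrix. The matrix entries and the function names `step` and `relax` are hypothetical values chosen for illustration and are not taken from the lessons themselves. The matrix is symmetric, so the equilibrium distribution it relaxes to is uniform.

```python
from typing import List


def step(p: List[float], T: List[List[float]]) -> List[float]:
    """One discrete-time update: p_next[j] = sum_i p[i] * T[i][j]."""
    n = len(p)
    return [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]


def relax(p0: List[float], T: List[List[float]], n_steps: int) -> List[float]:
    """Apply `step` repeatedly, starting from the distribution p0."""
    p = list(p0)
    for _ in range(n_steps):
        p = step(p, T)
    return p


# Hypothetical transition probabilities for a linear scheme A <-> B <-> C.
# Each row sums to 1; T[i][j] is the probability of moving from i to j.
T = [
    [0.8, 0.2, 0.0],
    [0.2, 0.6, 0.2],
    [0.0, 0.2, 0.8],
]

# Start with all probability in state A and let the system relax.
print(relax([1.0, 0.0, 0.0], T, 200))
```

Printing the vector after a few steps instead of 200 shows the relaxation itself: probability leaves state A gradually, and the approach to the uniform distribution slows as the populations even out.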

COVID-19: Data-Driven Mean-Field-Type Game Perspective

Games ◽

10.3390/g11040051 ◽

2020 ◽

Vol 11 (4) ◽

pp. 51

Author(s):

Hamidou Tembine

Keyword(s):

Mean Field ◽

Decision Makers ◽

Data Driven ◽

Feedback Form ◽

State Spaces ◽

Field Type ◽

Adjoint Systems ◽

Continuous State ◽

Non Gaussian ◽

Reported Data

In this article, a class of mean-field-type games with discrete-continuous state spaces is considered. We establish Bellman systems which provide sufficiency conditions for mean-field-type equilibria in state-and-mean-field-type feedback form. We then derive unnormalized master adjoint systems (MASS). The methodology is shown to be flexible enough to capture multi-class interaction in epidemic propagation in which multiple authorities are risk-aware atomic decision-makers and individuals are risk-aware non-atomic decision-makers. Based on MASS, we present data-driven modelling and analytics for mitigating Coronavirus Disease 2019 (COVID-19). The model integrates untested cases, age-structure, decision-making, gender, pre-existing health conditions, location, testing capacity, hospital capacity, and a mobility map of local areas, including in-cities, inter-cities, and internationally. It is shown that the data-driven model can capture most of the reported data on COVID-19 on confirmed cases, deaths, recovered, number of tests, and number of active cases in 66+ countries. The model also reports non-Gaussian and non-exponential properties in 15+ countries.

A Reinforcement Learning Algorithm for Continuous State Spaces using Multiple Fuzzy-ART Networks

2006 SICE-ICASE International Joint Conference ◽

10.1109/sice.2006.315140 ◽

2006 ◽

Cited By ~ 6

Author(s):

Takeshi Tateyama ◽

Seiichi Kawata ◽

Yoshiki Shimomura

Keyword(s):

Reinforcement Learning ◽

Learning Algorithm ◽

State Spaces ◽

Fuzzy Art ◽

Continuous State ◽

Reinforcement Learning Algorithm
