Non-Markovian Reinforcement-Based on Self-Optimizing Memory Controller

Volume 3: ASME/IEEE 2009 International Conference on Mechatronic and Embedded Systems and Applications; 20th Reliability, Stress Analysis, and Failure Prevention Conference ◽

10.1115/detc2009-86326 ◽

2009 ◽

Author(s):

Hassab Elgawi Osman

Keyword(s):

Value Function ◽

Learning Capability ◽

Memory Controller ◽

State Action ◽

Proposed Model ◽

On Line ◽

Memory Contents ◽

Past Experiences ◽

The Value Function

This paper contributes on designing robotic self-optimizing memory controller for non-Markovian reinforcement tasks. Rather than holistic search for the whole memory contents the model adopts associated feature analysis to successively memorize a newly event state-action pair as an action of past experience. Actor-Critic learning is used to adaptively tuning the control parameters, while on-line variant of random forests (RF) learner is used as memory-capable to approximate the policy of Actor and the value function of Critic. Learning capability of the proposed model is experimentally examined through non-markovian cart-pole balancing task. The result shows that our self-optimizing memory controller acquired complex behaviors such as balancing two poles simultaneously, displays long-term planning and generalization capacity based on past experiences.

Download Full-text

Basel III and the Net Stable Funding Ratio

ISRN Applied Mathematics ◽

10.1155/2013/582707 ◽

2013 ◽

Vol 2013 ◽

pp. 1-20 ◽

Cited By ~ 3

Author(s):

F. Gideon ◽

Mark A. Petersen ◽

Janine Mukuddem-Petersen ◽

LNP Hlatshwayo

Keyword(s):

Optimal Control ◽

Optimal Control Problem ◽

Continuous Time ◽

Value Function ◽

Analytic Solution ◽

Market Liquidity ◽

Basel Iii ◽

Quantitative Manner ◽

The Value Function

We validate the new Basel liquidity standards as encapsulated by the net stable funding ratio in a quantitative manner. In this regard, we consider the dynamics of inverse net stable funding ratio as a measure to quantify the bank’s prospects for a stable funding over a period of a year. In essence, this justifies how Basel III liquidity standards can be effectively implemented in mitigating liquidity problems. We also discuss various classes of available stable funding and required stable funding. Furthermore, we discuss an optimal control problem for a continuous-time inverse net stable funding ratio. In particular, we make optimal choices for the inverse net stable funding targets in order to formulate its cost. This is normally done by obtaining analytic solution of the value function. Finally, we provide a numerical example for the dynamics of the inverse net stable funding ratio to identify trends in which banks behavior convey forward looking information on long-term market liquidity developments.

Download Full-text

Reinforcement Learning for Optimizing Driving Policies on Cruising Taxis Services

Sustainability ◽

10.3390/su12218883 ◽

2020 ◽

Vol 12 (21) ◽

pp. 8883

Author(s):

Kun Jin ◽

Wei Wang ◽

Xuedong Hua ◽

Wei Zhou

Keyword(s):

Reinforcement Learning ◽

Value Function ◽

State Action ◽

Future Reward ◽

Long Run ◽

Markov Decision ◽

Action Value ◽

Data Expansion ◽

Taking Action ◽

The Value Function

As the key element of urban transportation, taxis services significantly provide convenience and comfort for residents’ travel. However, the reality has not shown much efficiency. Previous researchers mainly aimed to optimize policies by order dispatch on ride-hailing services, which cannot be applied in cruising taxis services. This paper developed the reinforcement learning (RL) framework to optimize driving policies on cruising taxis services. Firstly, we formulated the drivers’ behaviours as the Markov decision process (MDP) progress, considering the influences after taking action in the long run. The RL framework using dynamic programming and data expansion was employed to calculate the state-action value function. Following the value function, drivers can determine the best choice and then quantify the expected future reward at a particular state. By utilizing historic orders data in Chengdu, we analysed the function value’s spatial distribution and demonstrated how the model could optimize the driving policies. Finally, the realistic simulation of the on-demand platform was built. Compared with other benchmark methods, the results verified that the new model performs better in increasing total revenue, answer rate and decreasing waiting time, with the relative percentages of 4.8%, 6.2% and −27.27% at most.

Download Full-text

Optimal Occupation in the Complete Graph

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964800002989 ◽

1993 ◽

Vol 7 (3) ◽

pp. 369-385 ◽

Cited By ~ 1

Author(s):

Kyle Siegrist

Keyword(s):

Markov Decision Process ◽

Complete Graph ◽

Decision Process ◽

Value Function ◽

Comparison Result ◽

State Action ◽

Optimal Policies ◽

Markov Decision ◽

The Cost ◽

The Value Function

We consider N sites (N ≤ ∞), each of which may be either occupied or unoccupied. Time is discrete, and at each time unit a set of occupied sites may attempt to capture a previously unoccupied site. The attempt will be successful with a probability that depends on the number of sites making the attempt, in which case the new site will also be occupied. A benefit is gained when new sites are occupied, but capture attempts are costly. The problem of optimal occupation is formulated as a Markov decision process in which the admissible actions are occupation strategies and the cost is a function of the strategy and the number of occupied sites. A partial order on the state-action pairs is used to obtain a comparison result for stationary policies and qualitative results concerning monotonicity of the value function for the n-stage problem (n ≤ ∞). The optimal policies are partially characterized when the cost depends on the action only through the total number of occupation attempts made.

Download Full-text

The value function for time-related decisions

PsycEXTRA Dataset ◽

10.1037/e653632011-006 ◽

2011 ◽

Author(s):

Anouk Festjens ◽

Siegfried Dewitte ◽

Enrico Diecidue ◽

Sabrina Bruyneel

Keyword(s):

Value Function ◽

The Value Function

Download Full-text

A Modified Dugoff Tire Model for Combined-slip Forces

Tire Science and Technology ◽

10.2346/1.3481696 ◽

2010 ◽

Vol 38 (3) ◽

pp. 228-244 ◽

Cited By ~ 19

Author(s):

Nenggen Ding ◽

Saied Taheri

Keyword(s):

Vehicle Dynamics ◽

Tire Model ◽

Magic Formula ◽

Model Based ◽

Proposed Model ◽

Road Friction ◽

On Line ◽

Double Lane Change ◽

Tire Forces ◽

Tire Models

Abstract Easy-to-use tire models for vehicle dynamics have been persistently studied for such applications as control design and model-based on-line estimation. This paper proposes a modified combined-slip tire model based on Dugoff tire. The proposed model takes emphasis on less time consumption for calculation and uses a minimum set of parameters to express tire forces. Modification of Dugoff tire model is made on two aspects: one is taking different tire/road friction coefficients for different magnitudes of slip and the other is employing the concept of friction ellipse. The proposed model is evaluated by comparison with the LuGre tire model. Although there are some discrepancies between the two models, the proposed combined-slip model is generally acceptable due to its simplicity and easiness to use. Extracting parameters from the coefficients of a Magic Formula tire model based on measured tire data, the proposed model is further evaluated by conducting a double lane change maneuver, and simulation results show that the trajectory using the proposed tire model is closer to that using the Magic Formula tire model than Dugoff tire model.

Download Full-text

Aspects of the Evolution of the Romanian Tourists’ Preferences Concerning the Domestic Tourist Destinations

Valahian Journal of Economic Studies ◽

10.2478/vjes-2019-0002 ◽

2019 ◽

Vol 10 (1) ◽

pp. 21-28

Author(s):

Aniela Bălăcescu ◽

Radu Șerban Zaharia

Keyword(s):

Logistic Model ◽

Local Level ◽

Medium Term ◽

Tourist Destinations ◽

Tourism Management ◽

Production And Consumption ◽

On Line ◽

The One ◽

Long Term Studies

Abstract Tourist services represent a category of services in which the inseparability of production and consumption, the inability to be storable, the immateriality, and last but not least non-durability, induces in tourism management a number of peculiarities and difficulties. Under these circumstances the development of medium-term strategies involves long-term studies regarding on the one hand the developments and characteristics of the demand, and on the other hand the tourist potential analysis at regional and local level. Although in the past 20 years there has been tremendous growth of on-line booking made by household users, the tour operators agencies as well as those with sales activity continue to offer the specific services for a large number of tourists, that number, in the case of domestic tourism, increased by 1.6 times in case of the tour operators and by 4.44 times in case of the agencies with sales activity. At the same time, there have been changes in the preferences of tourists regarding their holiday destinations in Romania. Started on these considerations, paper based on a logistic model, examines the evolution of the probabilities and scores corresponding to the way the Romanian tourists spend their holidays on the types of tourism agencies, actions and tourist areas in Romania.

Download Full-text

SUMBER-SUMBER RESILIENSI PADA REMAJA AKHIR YANG MENGALAMI KEKERASAN DARI ORANGTUA PADA MASA KANAK-KANAK

Psibernetika ◽

10.30813/psibernetika.v11i1.1161 ◽

2018 ◽

Vol 11 (1) ◽

Author(s):

Devina Calista ◽

Garvin Garvin

Keyword(s):

Late Adolescents ◽

Long Term Effects ◽

Short Term ◽

Parental Affection ◽

Resilience Factors ◽

Childhood Violence ◽

Past Experiences ◽

The Impact ◽

Depth Interviews

Child abuse by parents is common in households. The impact of violence on children will bring short-term effects and long-term effects that can be attributed to their various emotional, behavioral and social problems in the future; especially in late adolescence that will enter adulthood. Resilience factors increase the likelihood that adolescents who are victims of childhood violence recover from their past experiences, become more powerful individuals and have a better life. The purpose of this study was to determine the source of resilience in late adolescents who experienced violence from parents in their childhood. This research uses qualitative research methods with in-depth interviews as a method of data collection. The result shows that the three research participants have the aspects of "I Have", "I Am", and "I Can"; a participant has "I Can" aspects as a source of resilience, and one other subject has no source of resilience. The study concluded that parental affection and acceptance of the past experience have role to the three sources of resilience (I Have, I Am, and I Can) Keyword : Resilience, adolescence, violence, parents

Download Full-text

Designing a CRM Framework for online & offline sports footwear retailers

"Carácter" Revista Cientifica de la Universidad Del Pacifico ISSN 2602-8476 ◽

10.35936/caracter.v3i1.41 ◽

2015 ◽

Vol 3 (1) ◽

Author(s):

Rodrigo Cueva ◽

Guillem Rufian ◽

Maria Gabriela Valdes

Keyword(s):

Environmental Changes ◽

Customer Relationship ◽

Business Strategies ◽

Economic Sectors ◽

Running Shoes ◽

Brick And Mortar ◽

Key Factor ◽

On Line ◽

The Impact

The use of Customer Relationship Managers to foster customers loyalty has become one of the most common business strategies in the past years. However, CRM solutions do not fill the abundance of happily ever-after relationships that business needs, and each client’s perception is different in the buying process. Therefore, the experience must be precise, in order to extend the loyalty period of a customer as much as possible. One of the economic sectors in which CRM’s have improved this experience is retailing, where the personalized attention to the customer is a key factor. However, brick and mortar experiences are not enough to be aware in how environmental changes could affect the industry trends in the long term. A base unified theoretical framework must be taken into consideration, in order to develop an adaptable model for constructing or implementing CRMs into companies. Thanks to this approximation, the information is complemented, and the outcome will increment the quality in any Marketing/Sales initiative. The goal of this article is to explore the different factors grouped by three main domains within the impact of service quality, from a consumer’s perspective, in both on-line and off-line retailing sector. Secondly, we plan to go a step further and extract base guidelines about previous analysis for designing CRM’s solutions focused on the loyalty of the customers for a specific retailing sector and its product: Sports Running Shoes.

Download Full-text

The Equal Tails: A Method to Elicit the Value Function

SSRN Electronic Journal ◽

10.2139/ssrn.893748 ◽

2006 ◽

Author(s):

Manel Baucells ◽

Antonio Villasis

Keyword(s):

Value Function ◽

The Value Function

Download Full-text

Fusing Dynamic Images and Depth Motion Maps for Action Recognition in Surveillance Systems

International Journal of Sensors Wireless Communications and Control ◽

10.2174/2210327909666191209155141 ◽

2019 ◽

Vol 09 ◽

Author(s):

Rajat Khurana ◽

Alok Kumar Singh Kushwaha

Keyword(s):

Action Recognition ◽

Depth Map ◽

Good Choice ◽

Surveillance Systems ◽

Activity Detection ◽

Learning Capability ◽

Proposed Model ◽

Care Activity ◽

Increasing Demand ◽

Depth Motion Maps

Background & Objective: Identification of human actions from video has gathered much attention in past few years. Most of the computer vision tasks such as Health Care Activity Detection, Suspicious Activity detection, Human Computer Interactions etc. are based on the principle of activity detection. Automatic labelling of activity from videos frames is known as activity detection. Motivation of this work is to use most out of the data generated from sensors and use them for recognition of classes. Recognition of actions from videos sequences is a growing field with the upcoming trends of deep neural networks. Automatic learning capability of Convolutional Neural Network (CNN) make them good choice as compared to traditional handcrafted based approaches. With the increasing demand of RGB-D sensors combination of RGB and depth data is in great demand. This work comprises of the use of dynamic images generated from RGB combined with depth map for action recognition purpose. We have experimented our approach on pre trained VGG-F model using MSR Daily activity dataset and UTD MHAD Dataset. We achieve state of the art results. To support our research, we have calculated different parameters apart from accuracy such as precision, F score, recall. Conclusion: Accordingly, the investigation confirms improvement in term of accuracy, precision, F-Score and Recall. The proposed model is 4 Stream model is prone to occlusion, used in real time and also the data from the RGB-D sensor is fully utilized.

Download Full-text