Markov Reward
Recently Published Documents

TOTAL DOCUMENTS: 146 (five years: 23)
H-INDEX: 18 (five years: 1)

2021
Author(s): Jalaj Bhandari, Daniel Russo, Raghav Singal

Temporal difference learning (TD) is a simple iterative algorithm widely used for policy evaluation in Markov reward processes. Bhandari et al. prove finite-time convergence rates for TD learning with linear function approximation. The analysis rests on a key insight that establishes a rigorous connection between TD updates and those of online gradient descent. In a model where observations are corrupted by i.i.d. noise, convergence results for TD follow by essentially mirroring the analysis for online gradient descent. Using an information-theoretic technique, the authors also provide results for the case where TD is applied to a single Markovian data stream, in which the algorithm's updates can be severely biased. Their analysis extends seamlessly to TD learning with eligibility traces and to Q-learning for high-dimensional optimal stopping problems.
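To make the TD update concrete, here is a minimal TD(0) sketch with linear function approximation on a made-up five-state Markov reward process; the transition matrix, rewards, features, discount factor, and step size are illustrative assumptions, not values from the paper.

```python
import numpy as np

# A minimal TD(0) sketch with linear function approximation, assuming a
# made-up 5-state Markov reward process; P, r, phi, gamma, and alpha are
# illustrative choices, not values from the paper.

rng = np.random.default_rng(0)
n_states, n_features = 5, 3

P = rng.dirichlet(np.ones(n_states), size=n_states)  # row-stochastic transitions
r = rng.standard_normal(n_states)                    # expected reward per state
phi = rng.standard_normal((n_states, n_features))    # feature map phi(s)

gamma, alpha = 0.9, 0.05   # discount factor and step size
theta = np.zeros(n_features)

s = 0
for _ in range(50_000):
    s_next = rng.choice(n_states, p=P[s])
    # TD error: delta = r(s) + gamma * v(s') - v(s), with v(s) = phi(s) @ theta
    delta = r[s] + gamma * phi[s_next] @ theta - phi[s] @ theta
    # The update theta += alpha * delta * phi(s) has the form of an online
    # (semi-)gradient step, which is the connection the analysis exploits.
    theta += alpha * delta * phi[s]
    s = s_next

print("estimated state values:", phi @ theta)
```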


2021
Vol 1792 (1), pp. 012039
Author(s): Jing Cao, Xiaoqiang Liu, Hui Guo, Lizhi Cai, Yun Hu

Author(s): Zeng Wenbin, Shen Guixiang, Ilia B. Frenkel, Lev Khvatsckin, Igor Bolvashenkov, ...

This paper applies the Markov reward method to availability and importance evaluation for CNC machine tools. It regards a CNC machine tool as a multi-state system whose state transitions are caused by element and subsystem failures and the corresponding maintenance activities over its lifetime. To capture the time-varying failure rates of the subsystems, a non-homogeneous Markov reward model is introduced for availability evaluation and subsystem importance identification for aging multi-state CNC machine tools under minimal repair. Corresponding procedures for defining the failure rates, the state transition matrix, and the reward matrix are suggested for the availability and importance measures. A numerical example illustrates the approach.
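As a rough illustration of the machinery used in this line of work, the sketch below integrates Howard's differential equations dV/dt = r + Q(t)V forward from V(0) = 0 for a toy three-state machine with linearly aging failure rates; dividing the accumulated reward by the horizon gives average availability when the reward rate is 1 in acceptable states. The states, rates, aging form, and reward structure are assumptions for illustration, not the paper's model.

```python
import numpy as np

# A minimal sketch of a non-homogeneous Markov reward availability computation,
# assuming a toy three-state machine (2 = nominal, 1 = degraded, 0 = failed)
# with linearly aging failure rates and minimal repair back to the degraded
# state. All rates and the reward structure are illustrative assumptions.

def Q(t):
    """Time-varying generator matrix; aging enters through failure rates
    that grow linearly in t (an assumed form, for illustration only)."""
    lam21 = 1e-3 * (1 + 0.05 * t)   # nominal -> degraded
    lam10 = 2e-3 * (1 + 0.05 * t)   # degraded -> failed
    mu = 5e-2                       # minimal repair: failed -> degraded
    return np.array([
        [-mu,     mu,      0.0   ],  # state 0: failed
        [ lam10, -lam10,   0.0   ],  # state 1: degraded
        [ 0.0,    lam21,  -lam21 ],  # state 2: nominal
    ])

# Reward rate 1 in acceptable states {1, 2}; the expected accumulated reward
# V_i(T) over [0, T] from state i, divided by T, is the average availability.
# V solves Howard's equations dV/dt = r + Q(t) V with V(0) = 0 (Euler steps).
r = np.array([0.0, 1.0, 1.0])
T, dt = 100.0, 0.01
V = np.zeros(3)
for k in range(int(T / dt)):
    V += dt * (r + Q(k * dt) @ V)

print("average availability from the nominal state:", V[2] / T)
```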


2020
Vol 37 (9/10), pp. 1301-1323
Author(s): Neama Temraz

Purpose: In this paper, new procedures for a fuzzy Markov reward model are introduced to find reliability measures.

Design/methodology/approach: The system considered consists of n identical units connected in parallel, and each unit has m different types of failures. Each unit is also allowed to have d levels of degradation from a working state to complete failure. A non-homogeneous Markov reward model is used to construct the system of differential equations of the model. Procedures are proposed to obtain the reliability measures of the model under the assumption that the failure and repair rates of the system's units are fuzzy. An application analyzing a 2-unit system is constructed, and the results are shown graphically.

Findings: A non-homogeneous Markov reward model is used to construct the system of differential equations of the model.

Originality/value: Previous papers in the literature assumed a Markov reward model with deterministic parameters. In this paper, a generalization of the classical Markov reward model is introduced.
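One common way to realize such a fuzzy model is via alpha-cuts: represent each fuzzy rate as a triangular fuzzy number, take the interval at a chosen alpha level, and propagate the interval endpoints through a crisp reliability computation. The sketch below does this for the steady-state availability of a 2-unit parallel system; the triangular rates, the alpha level, and the monotonicity shortcut are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

# A sketch of the fuzzy idea via alpha-cuts: the failure and repair rates of a
# 2-unit parallel system are triangular fuzzy numbers, and an alpha-cut
# interval is propagated through a crisp availability computation. The rates,
# alpha level, and monotonicity shortcut are illustrative assumptions.

def alpha_cut(tri, alpha):
    """Interval [lo, hi] of a triangular fuzzy number tri = (a, b, c)."""
    a, b, c = tri
    return a + alpha * (b - a), c - alpha * (c - b)

def availability(lam, mu):
    """Steady-state availability of a 2-unit parallel system; states are the
    number of working units, and the system is up if at least one works."""
    # Birth-death generator: 2 -> 1 at 2*lam, 1 -> 0 at lam,
    # 0 -> 1 at mu, 1 -> 2 at mu (one repair crew).
    Q = np.array([
        [-mu,     mu,          0.0      ],
        [ lam,   -(lam + mu),  mu       ],
        [ 0.0,    2 * lam,    -2 * lam  ],
    ])
    # Solve pi Q = 0 with sum(pi) = 1 as a least-squares system.
    A = np.vstack([Q.T, np.ones(3)])
    b = np.array([0.0, 0.0, 0.0, 1.0])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi[1] + pi[2]

lam_tri, mu_tri = (0.01, 0.02, 0.03), (0.4, 0.5, 0.6)   # assumed fuzzy rates
lam_lo, lam_hi = alpha_cut(lam_tri, 0.5)
mu_lo, mu_hi = alpha_cut(mu_tri, 0.5)
# Availability decreases in lam and increases in mu, so evaluating at the
# interval endpoints gives the alpha-cut bounds on availability.
print("availability alpha-cut:",
      (availability(lam_hi, mu_lo), availability(lam_lo, mu_hi)))
```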


2019
Vol 12 (4), pp. 96
Author(s): Oni Omoyemi Abimbola, Akinyemi Bodunde Odunola, Aladesanmi Adegboye Temitope, Ganiyu Adesola Aderounmu, Kamagaté Beman Hamidja

Most existing solutions in cybersecurity analysis have centered on identifying threats and vulnerabilities and on providing suitable defense mechanisms to improve the robustness of a cyberspace network. These solutions lack effective capabilities to counter the effects of risks and to perform long-term prediction. In this paper, an improved risk assessment model for cyberspace security that effectively predicts and mitigates the consequences of risk was developed. Real-time vulnerabilities of a selected network were scanned and analysed, and the ease of exploiting each vulnerability was assessed. A risk assessment model was formulated using the synergy of an absorbing Markov chain and a Markov reward model, and it was used to analyse the cybersecurity state of the selected network. The proposed model was simulated using the R statistical package, and its performance was evaluated by benchmarking against an existing model, using reliability and availability as metrics. The results showed that the proposed model has higher reliability and availability than the existing model, implying a significant improvement in the assessment of security situations in a cyberspace network.
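The absorbing-Markov-chain part of such a model typically works through the fundamental matrix: with Q_sub the transient-to-transient block of the transition matrix, N = (I - Q_sub)^{-1} gives expected visit counts before absorption, and N times a per-state cost vector accumulates a Markov reward. The sketch below illustrates this on a hypothetical four-state security chain; the state names, probabilities, and costs are invented for illustration and are not from the paper.

```python
import numpy as np

# A minimal sketch of the absorbing-Markov-chain machinery behind this kind
# of model. The 3 transient security states, the absorbing "compromised"
# state, and all probabilities/costs are illustrative assumptions.

# Transition matrix in canonical form: transient states {healthy, probed,
# vulnerable}, one absorbing state {compromised}.
P = np.array([
    [0.80, 0.15, 0.04, 0.01],
    [0.30, 0.40, 0.25, 0.05],
    [0.10, 0.20, 0.50, 0.20],
    [0.00, 0.00, 0.00, 1.00],
])

Qs = P[:3, :3]                       # transient-to-transient block
N = np.linalg.inv(np.eye(3) - Qs)    # fundamental matrix: expected visit counts

time_to_absorb = N @ np.ones(3)      # expected steps before compromise
cost = np.array([0.0, 1.0, 5.0])     # assumed per-step cost (reward) rates
expected_cost = N @ cost             # expected accumulated cost per start state

print("expected steps to compromise:", time_to_absorb)
print("expected accumulated risk cost:", expected_cost)
```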

