policy iteration algorithm Latest Research Papers

In hydro scheduling, unit commitment is a complex sub-problem. This paper proposes a new approximate dynamic programming technique to solve it. The method introduced, the Least-Squares Policy Iteration (LSPI) algorithm, is described as efficient and fast to converge. It takes the properties of the widely used least-squares temporal difference (LSTD) algorithm, extends them, and makes them useful for optimization problems. First, the value function of a fixed policy is found with the least-squares temporal difference Q (LSTDQ) algorithm, which is similar to LSTD; LSPI then builds a policy iteration algorithm on the results of LSTDQ. It combines the data efficiency of LSTDQ with the policy-search efficiency of policy iteration.
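The LSTDQ-then-improve loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the 4-state chain problem, the one-hot feature map `phi`, the small ridge term, and all function names are assumptions made for the example, and no hydro unit commitment model is included.

```python
import numpy as np

N_STATES, N_ACTIONS, GAMMA = 4, 2, 0.9  # hypothetical toy chain, not a hydro model


def step(s, a):
    """Deterministic chain: action 0 moves left, action 1 moves right."""
    s_next = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return reward, s_next


def phi(s, a):
    """One-hot feature vector over (state, action) pairs (assumed basis)."""
    f = np.zeros(N_STATES * N_ACTIONS)
    f[s * N_ACTIONS + a] = 1.0
    return f


def lstdq(samples, policy, n_features, gamma):
    """Estimate w so that Q(s, a) ~ phi(s, a) @ w for a fixed policy."""
    A = np.zeros((n_features, n_features))
    b = np.zeros(n_features)
    for s, a, r, s_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, policy(s_next))
        A += np.outer(f, f - gamma * f_next)
        b += f * r
    # Small ridge term (an assumption) guards against a singular A.
    return np.linalg.solve(A + 1e-6 * np.eye(n_features), b)


def lspi(samples, n_features, gamma=GAMMA, max_iter=20, tol=1e-6):
    """Alternate LSTDQ evaluation and greedy improvement until w settles."""
    w = np.zeros(n_features)
    for _ in range(max_iter):
        def greedy(s, w=w):
            return int(np.argmax([phi(s, a) @ w for a in range(N_ACTIONS)]))

        w_new = lstdq(samples, greedy, n_features, gamma)
        converged = np.linalg.norm(w_new - w) < tol
        w = w_new
        if converged:
            break
    return w


# One sample per (state, action) pair; the same batch is reused every iteration.
samples = []
for s in range(N_STATES):
    for a in range(N_ACTIONS):
        r, s_next = step(s, a)
        samples.append((s, a, r, s_next))

w = lspi(samples, N_STATES * N_ACTIONS)
policy = [int(np.argmax([phi(s, a) @ w for a in range(N_ACTIONS)]))
          for s in range(N_STATES)]
print(policy)
```

Reusing one fixed batch of samples across all iterations is what the abstract refers to as the data efficiency of LSTDQ; the outer loop is ordinary policy iteration with a greedy improvement step.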

Filter based Explorized Policy Iteration Algorithm for On-Policy Approximate LQR

2019 IEEE Symposium Series on Computational Intelligence (SSCI) ◽ 10.1109/ssci44817.2019.9002891 ◽ 2019

Author(s): Sumit Kumar Jha ◽ Sayan Basu Roy ◽ Shubhendu Bhasin

Keyword(s): Policy Iteration ◽ Iteration Algorithm ◽ Policy Iteration Algorithm

Template-based Analyses and Min-policy Iteration

Formal Verification of Control Systems Software ◽ 10.23943/princeton/9780691181301.003.0006 ◽ 2019 ◽ pp. 111-126

Author(s): Pierre-Loïc Garoche

Keyword(s): Direct Synthesis ◽ Policy Iteration ◽ Sum Of Squares ◽ Iteration Algorithm ◽ First Case ◽ Policy Iteration Algorithm ◽ Multiple Templates

This chapter considers configurations other than the direct synthesis of invariants as bound templates. A first case arises when the methods shown in the previous chapter synthesize only the template but not the bound. A second appears when one wants to analyze a system with multiple templates. The chapter looks at bounds on each variable and considers the templates p(x) = x_i^2 for each variable x_i in the state characterization x ∈ Σ. It then proposes a policy iteration algorithm, based on sum-of-squares (SOS) optimization, to refine such template bounds. In practice, the chapter applies it by combining a Lyapunov-based template, obtained with one of the previous methods, with additional templates that encode bounds on some variables or property-specific templates.

policy iteration algorithm
Recently Published Documents

Policy Iteration Algorithm for Constrained Cost Optimal Control of Discrete-Time Nonlinear System

Optimal Consensus Control for Second-Order Discrete-Time Multi-Agent Systems: Using Online Policy Iteration Algorithm

Neuro-Optimal Control for Discrete Stochastic Processes via a Novel Policy Iteration Algorithm

A policy iteration algorithm for the American put option and free boundary control problems

Neuro-Control for Continuous-Time Stochastic Nonlinear Systems via Online Policy Iteration Algorithm

A Safety-Certified Policy Iteration Algorithm for Control of Constrained Nonlinear Systems

A Neural Network-Based Policy Iteration Algorithm with Global $$H^2$$-Superlinear Convergence for Stochastic Games on Domains

Algorithms of approximate dynamic programming for hydro scheduling

Filter based Explorized Policy Iteration Algorithm for On-Policy Approximate LQR

Template-based Analyses and Min-policy Iteration
