A Generic Markov Decision Process Model and Reinforcement Learning Method for Scheduling Agile Earth Observation Satellites

Author(s):  
Yongming He ◽  
Lining Xing ◽  
Yingwu Chen ◽  
Witold Pedrycz ◽  
Ling Wang ◽  
...  
2010 ◽  
Vol 44-47 ◽  
pp. 3611-3615 ◽  
Author(s):  
Zhi Cong Zhang ◽  
Kai Shun Hu ◽  
Hui Yu Huang ◽  
Shuai Li ◽  
Shao Yong Zhao

Reinforcement learning (RL) is a machine learning method based on state or action values that approximately solves large-scale Markov decision processes (MDPs) or semi-Markov decision processes (SMDPs). A multi-step RL algorithm called Sarsa(λ,k) is proposed as a compromise between Sarsa and Sarsa(λ). It is equivalent to Sarsa when k is 1 and to Sarsa(λ) when k is infinite. Sarsa(λ,k) adjusts its performance through the choice of k. Two forms of Sarsa(λ,k), forward-view Sarsa(λ,k) and backward-view Sarsa(λ,k), are constructed and proved equivalent under off-line updating.
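The abstract does not give the update rule, so the following is only a rough sketch of one way a forward-view target with the stated limiting behaviour could look: a λ-return truncated after k steps. The function names, the trajectory layout (`rewards[i]` is the reward after taking action i, `q_values[i]` is the current estimate Q(s_i, a_i)), and the weighting scheme are all assumptions for illustration and may differ from the authors' definition.

```python
def n_step_return(rewards, q_values, t, n, gamma):
    """n-step Sarsa return from time t (hypothetical helper).

    Sums up to n discounted rewards, then bootstraps from Q(s_{t+n}, a_{t+n})
    unless the episode has already ended.
    """
    T = len(rewards)            # episode terminates after T rewards
    end = min(t + n, T)
    g = sum(gamma ** (i - t) * rewards[i] for i in range(t, end))
    if end < T:                 # bootstrap only from non-terminal steps
        g += gamma ** (end - t) * q_values[end]
    return g


def truncated_lambda_return(rewards, q_values, t, k, lam, gamma):
    """Assumed forward-view target: lambda-weighted mix of the 1..k-step returns.

    The first k-1 returns get weight (1 - lam) * lam**(n-1); the k-step
    return takes the remaining weight lam**(k-1), so the weights sum to 1.
    """
    g = 0.0
    for n in range(1, k):
        g += (1 - lam) * lam ** (n - 1) * n_step_return(rewards, q_values, t, n, gamma)
    g += lam ** (k - 1) * n_step_return(rewards, q_values, t, k, gamma)
    return g
```

Under these assumptions, k = 1 reduces the target to the one-step Sarsa target r + γQ(s', a'), and any k at least as long as the remaining episode gives the full λ-return used by forward-view Sarsa(λ), which matches the two limiting cases described in the abstract.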


2009 ◽  
Vol 13 (2) ◽  
pp. 538-542 ◽  
Author(s):  
Nagahisa Kogawa ◽  
Masanao Obayashi ◽  
Kunikazu Kobayashi ◽  
Takashi Kuremoto

2004 ◽  
Vol 38 (1) ◽  
pp. 107-118 ◽  
Author(s):  
Jason H. Goto ◽  
Mark E. Lewis ◽  
Martin L. Puterman
