Using the XCS Classifier System for Multi-objective Reinforcement Learning Problems
We investigate the performance of a learning classifier system in some simple multi-objective, multi-step maze problems, using both random and biased action-selection policies for exploration. Results show that the choice of action-selection policy can significantly affect the performance of the system in such environments. Further, this effect is directly related to population size, and we relate this finding to recent theoretical studies of learning classifier systems in single-step problems.