A Graph-Based Evolutionary Algorithm: Genetic Network Programming (GNP) and Its Extension Using Reinforcement Learning

2007, Vol. 15(3), pp. 369-398
Author(s): Shingo Mabu, Kotaro Hirasawa, Jinglu Hu

This paper proposes a graph-based evolutionary algorithm called Genetic Network Programming (GNP). Our goal is to develop GNP, which can deal with dynamic environments efficiently and effectively, based on the distinguished expression ability of graph (network) structures. The characteristics of GNP are as follows. 1) GNP programs are composed of a number of nodes that execute simple judgment/processing, and these nodes are connected to each other by directed links. 2) The graph structure enables GNP to reuse nodes, so the structure can be very compact. 3) Node transitions in GNP follow the node connections without any terminal nodes, so the past history of node transitions affects which node is used next; this characteristic works as an implicit memory function. These structural characteristics are useful for dealing with dynamic environments. Furthermore, we propose an extended algorithm, "GNP with Reinforcement Learning (GNP-RL)", which combines evolution and reinforcement learning in order to create effective graph structures and obtain better results in dynamic environments. To evaluate its effectiveness, we applied GNP to the problem of determining agents' behavior, using Tileworld as the simulation environment. The results show some advantages of GNP over conventional methods.
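For intuition, here is a minimal Python sketch of the node-transition mechanism described above. The node interfaces and the `env.observe()`/`env.step()` calls are assumptions for illustration, not the authors' implementation: judgment nodes branch on an observation, processing nodes act, and execution simply follows directed links with no terminal node, so the node where execution currently sits reflects the history of past transitions.

```python
# Hypothetical GNP-style interpreter: a directed graph of judgment and
# processing nodes with no terminal node. The current node implicitly
# "remembers" the path taken so far, giving the memory effect noted above.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class JudgmentNode:
    judge: Callable[[dict], int]   # maps an observation to a branch index
    branches: List[int]            # next node id for each judgment result

@dataclass
class ProcessingNode:
    action: str                    # action to execute in the environment
    next_node: int                 # single outgoing link

def run_gnp(nodes, env, start=0, steps=20):
    """Follow directed links from the start node, acting at processing nodes."""
    current = start
    actions = []
    for _ in range(steps):
        node = nodes[current]
        if isinstance(node, JudgmentNode):
            current = node.branches[node.judge(env.observe())]
        else:
            actions.append(node.action)
            env.step(node.action)        # assumed environment interface
            current = node.next_node
    return actions
```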

Author(s): Hiroyuki Hatakeyama, Shingo Mabu, Kotaro Hirasawa, Jinglu Hu, ...

A new graph-based evolutionary algorithm named "Genetic Network Programming" (GNP) has already been proposed. GNP represents its solutions as graph structures, which improves expression ability and performance. In addition, GNP with Reinforcement Learning (GNP-RL) was proposed a few years ago. Since GNP-RL can carry out reinforcement learning during task execution, in addition to evolution after task execution, it can search for solutions efficiently. In this paper, GNP with Actor-Critic (GNP-AC), a new type of GNP-RL, is proposed. Originally, GNP deals with discrete information, whereas GNP-AC aims to deal with continuous information. The proposed method is applied to the controller of the Khepera simulator and its performance is evaluated.
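As a rough sketch of the actor-critic idea behind GNP-AC, consider a linear TD(0) actor-critic that outputs a continuous control signal (e.g., a wheel speed in the Khepera simulator). All names, features, and parameters below are illustrative assumptions; the paper embeds actor-critic learning inside GNP's node structure rather than in a standalone learner.

```python
# Minimal linear actor-critic for a continuous action (illustrative only).
import numpy as np

class ActorCritic:
    def __init__(self, n_features, alpha=0.01, beta=0.05, gamma=0.95, sigma=0.3):
        self.w = np.zeros(n_features)      # critic weights: V(s) = w . phi(s)
        self.theta = np.zeros(n_features)  # actor weights: mean action = theta . phi(s)
        self.alpha, self.beta, self.gamma, self.sigma = alpha, beta, gamma, sigma

    def act(self, phi):
        # Gaussian exploration around the actor's mean output
        return np.random.normal(self.theta @ phi, self.sigma)

    def update(self, phi, action, reward, phi_next):
        # TD error from the critic's value estimates
        td_error = reward + self.gamma * (self.w @ phi_next) - (self.w @ phi)
        self.w += self.beta * td_error * phi
        # Gaussian policy-gradient step (the 1/sigma^2 factor is folded
        # into the learning rate alpha for brevity)
        self.theta += self.alpha * td_error * (action - self.theta @ phi) * phi
```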


Author(s): Yan Chen, Shingo Mabu, Kaoru Shimada, Kotaro Hirasawa

In this paper, Genetic Network Programming (GNP) for creating stock trading rules is described. GNP is an evolutionary computation method that represents its solutions using graph structures and inherently has some useful features. It has been clarified that GNP works well, especially in dynamic environments, since GNP can create quite compact programs and has an implicit memory function. In this paper, GNP is applied to creating a stock trading model. There are three important points. First, GNP is combined with Sarsa learning, one of the reinforcement learning algorithms: evolution-based methods can only evolve their programs after task execution because they must calculate fitness values, while reinforcement learning can change programs during task execution, so the programs can be created efficiently. Second, GNP uses candlestick charts and selects appropriate technical indices to judge the timing of buying and selling stocks. Third, sub-nodes are used in each node to determine appropriate actions (buying/selling) and to select appropriate stock price information depending on the situation. In the simulations, the trading model is trained using the stock prices of 16 brands in 2001, 2002 and 2003, and its generalization ability is then tested using the stock prices in 2004. The simulation results clarify that the trading rules of the proposed method obtain much higher profits than the Buy&Hold method, confirming its effectiveness.
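The first point rests on the standard on-policy Sarsa update, Q(s,a) ← Q(s,a) + α[r + γQ(s',a') − Q(s,a)], which can run while the trading program executes. The sketch below shows that update in isolation; the states, actions, and their mapping onto GNP sub-nodes are assumptions for illustration, not the paper's implementation.

```python
# Tabular Sarsa: on-policy temporal-difference learning (illustrative only).
from collections import defaultdict
import random

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One Sarsa step: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])

def epsilon_greedy(Q, s, actions, eps=0.1):
    """Mostly exploit the best known action, sometimes explore."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

Q = defaultdict(float)                      # Q-values default to 0.0
actions = ["buy", "sell", "hold"]           # hypothetical trading actions
```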


Author(s): QingBiao Meng, Shingo Mabu, Lu Yu, Kotaro Hirasawa

Taxi services benefit people by providing comfortable and flexible ride experiences. However, an inherent problem, the insufficient number of taxis at traffic peaks, has troubled taxi services ever since they existed. This paper therefore proposes a Multi-Customer Taxi Dispatch System (MCTDS), in which taxis are allowed to serve customers with different Origin-Destination (OD) pairs simultaneously, in order to shorten the total waiting time and traveling time. In addition, to mitigate the cost of detours, MCTDS is built on Genetic Network Programming (GNP), a graph-based evolutionary algorithm that has previously shown excellent performance in some complicated applications. We also modify the structure of GNP to improve its performance. In the simulations, we demonstrate that MCTDS outperforms the conventional GNP and some heuristic taxi dispatch approaches.
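For intuition only, the greedy baseline below assigns each request to the taxi whose current route grows the least when the new OD pair is appended, allowing several OD pairs per taxi; MCTDS itself makes these decisions with an evolved GNP program, and all names here (`dist`, the route layout, the capacity) are assumptions.

```python
# Greedy multi-customer dispatch baseline (not the GNP-based MCTDS rule).
def detour_cost(route, origin, dest, dist):
    """Extra distance if (origin, dest) is appended to a taxi's route."""
    last = route[-1] if route else None
    base = dist(last, origin) if last is not None else 0.0
    return base + dist(origin, dest)

def assign(requests, taxis, dist, capacity=3):
    """Each taxi is a dict with a 'route' list; requests are (origin, dest)."""
    for origin, dest in requests:
        candidates = [t for t in taxis if len(t["route"]) // 2 < capacity]
        if not candidates:
            continue                       # all taxis full; request waits
        best = min(candidates,
                   key=lambda t: detour_cost(t["route"], origin, dest, dist))
        best["route"] += [origin, dest]
```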


Author(s): Lutao Wang, Shingo Mabu, Fengming Ye, Shinji Eto, ...

Genetic Network Programming (GNP) is an evolutionary algorithm derived from GA and GP. Its directed graph structure, reusability of nodes, and implicit memory function enable GNP to deal with complex problems in dynamic environments efficiently and effectively, as many papers have demonstrated. This paper proposes a new method to optimize GNP by extracting and using rules. The basic idea of GNP with Rule Accumulation (GNP with RA) is to extract rules with higher fitness values from the elite individuals and store them in a rule pool every generation. A rule is defined as a sequence of successive judgment results followed by a processing node, which represents the good experience of past behavior. As a result, the rule pool serves as an experience set accumulated by GNP during the evolutionary process. By extracting rules during evolution and then matching them with the situations of the environment, we can, for example, guide agents' behavior properly and obtain better agent performance. In this paper, we apply GNP with RA to the problem of determining agents' behavior in the Tileworld environment in order to evaluate its effectiveness. The simulation results demonstrate that GNP with RA achieves better performance than the conventional GNP in both the average fitness value and its stability.
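A minimal sketch of the rule-pool idea follows (the data layout and names are assumptions, not the paper's code): a rule pairs a sequence of judgment results with the processing action it led to, rules are harvested from elite individuals each generation together with their fitness, and at run time the best-fitness rule whose judgments match the current situation supplies the action.

```python
# Illustrative rule pool for GNP with RA (assumed data layout).
from collections import defaultdict

def extract_rules(elite_traces):
    """Each trace: (judgment_results, action, fitness), where
    judgment_results is a tuple of (attribute, value) pairs."""
    pool = defaultdict(float)
    for judgments, action, fitness in elite_traces:
        key = (judgments, action)
        pool[key] = max(pool[key], fitness)   # keep the best fitness seen
    return pool

def match(pool, situation):
    """Return the action of the highest-fitness rule matching the situation."""
    matched = [(fitness, action) for (judgments, action), fitness in pool.items()
               if all(situation.get(attr) == val for attr, val in judgments)]
    return max(matched)[1] if matched else None
```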

