An Approach for Interesting Subgraph Mining from Web Log Data Using W-Gaston Algorithm
Graph-Based Data Mining (GBDM) is an emerging research topic nowadays, for the retrieval of the essential information from the graph database. There exist many algorithms that find frequent patterns in a given graph database. One such algorithm, GASTON uses support based on frequency to discover frequent patterns. The discovery phase in the Gaston algorithm is time-consuming, and the pages captured the interest of the users are ignored by the existing GASTON algorithm. This paper proposes an algorithm, Weighted-Gaston (W-Gaston) algorithm, by modifying the existing Gaston algorithm. Here, four interesting measures are developed based on the frequency, entropy, and the page duration, for the retrieval of the interesting sub-graphs. The proposed interesting measures include four types of support: (1) Support based on the page duration (W-Support), (2) Support based on the entropy (E-Support), (3) Support based on the page duration and the entropy (WE-Support), and (4) Support based on the frequency, page duration, and the entropy (FWE-Support). The simulation of the proposed work is done using the MSNBC and the weblog databases. The experimental results show that the proposed algorithm performed well as compared with the existing algorithms.