An Approach for Interesting Subgraph Mining from Web Log Data Using W-Gaston Algorithm

Graph-Based Data Mining (GBDM) is an emerging research topic concerned with retrieving essential information from graph databases. Many algorithms exist that find frequent patterns in a given graph database. One such algorithm, Gaston, uses frequency-based support to discover frequent patterns. The discovery phase of the Gaston algorithm is time-consuming, and the algorithm ignores the pages that captured the users' interest. This paper proposes the Weighted-Gaston (W-Gaston) algorithm, a modification of the existing Gaston algorithm. Four interestingness measures are developed, based on frequency, entropy, and page duration, for retrieving interesting subgraphs. The proposed measures comprise four types of support: (1) support based on page duration (W-Support), (2) support based on entropy (E-Support), (3) support based on page duration and entropy (WE-Support), and (4) support based on frequency, page duration, and entropy (FWE-Support). The proposed work is simulated using the MSNBC and weblog databases. The experimental results show that the proposed algorithm performs well compared with the existing algorithms.

The Application and Improvement of ID3 Algorithm in WEB Log Data Mining

Lecture Notes in Electrical Engineering - Advanced Multimedia and Ubiquitous Engineering, 2016, pp. 577-583

DOI: 10.1007/978-981-10-1536-6_75

Cited by: ~1

Author(s): Weihua Feng, Xingquan Cai

Keyword(s): Data Mining, Log Data, Web Log, ID3 Algorithm

Research of Massive Web Log Data Mining Based on Cloud Computing

2013 International Conference on Computational and Information Sciences, 2013

DOI: 10.1109/iccis.2013.162

Cited by: ~8

Author(s): Zhen Qi Wang, Hai Long Li

Keyword(s): Data Mining, Cloud Computing, Log Data, Web Log

Ethical aspects of web log data mining

International Journal of Information Technology and Management, 2008, Vol 7 (2), pp. 190

DOI: 10.1504/ijitm.2008.016605

Cited by: ~2

Author(s): David L. Olson

Keyword(s): Data Mining, Ethical Aspects, Log Data, Web Log

Clients Behavior Analysis of Securities Company Based on the Web-Log Data Mining

2011 International Conference on Management and Service Science, 2011

DOI: 10.1109/icmss.2011.5999197

Author(s): Yahong Li, Jian Li, Kegang Hao

Keyword(s): Data Mining, Behavior Analysis, Log Data, Web Log, Securities Company, The Web

Web log data mining based on association rule

2011 Eighth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), 2011

DOI: 10.1109/fskd.2011.6019878

Cited by: ~1

Author(s): Jian Liu, Yan-Qing Wang

Keyword(s): Data Mining, Association Rule, Log Data, Web Log

HIGH CANDIDATES GENERATION: A NEW EFFICIENT METHOD FOR MINING SHARE-FREQUENT PATTERNS

Jurnal Teknologi, 2017, Vol 79 (7)

DOI: 10.11113/jt.v79.10292

Author(s): Chayanan Nawapornanan, Sarun Intakosum, Veera Boonjing

Keyword(s): Data Mining, Efficient Method, Execution Time, Experimental Results, Closure Property, Frequent Patterns, Research Issue, Important Research, Useful Knowledge, Downward Closure

Share-frequent pattern mining is more practical than traditional frequent pattern set mining because it can reflect useful knowledge such as the total costs and profits of patterns. Mining share-frequent patterns has become one of the most important research issues in data mining. However, previous algorithms extract a large number of candidates and spend a lot of time generating and testing many useless candidates during the mining process. This paper proposes a new efficient method for discovering share-frequent patterns. The new method reduces the number of candidates by generating candidates only from high transaction-measure-value patterns. The downward closure property of transaction-measure-value patterns assures the correctness of the proposed method. Experimental results on dense and sparse datasets show that the proposed method is very efficient in terms of execution time. It also decreases the number of useless candidates generated during the mining process by at least 70%.

Finding Generalized Path Patterns for Web Log Data Mining

Current Issues in Databases and Information Systems - Lecture Notes in Computer Science, 2000, pp. 215-228

DOI: 10.1007/3-540-44472-6_17

Cited by: ~14

Author(s): Alex Nanopoulos, Yannis Manolopoulos

Keyword(s): Data Mining, Log Data, Web Log

Analysis of Web Log Data Mining Based on Improved Fuzzy Clustering Algorithm

Advanced Materials Research, 2013, Vol 760-762, pp. 1896-1901

DOI: 10.4028/www.scientific.net/amr.760-762.1896

Cited by: ~1

Author(s): Chuan Qi Chen

Keyword(s): Data Mining, Best Practices, Fuzzy Clustering, Clustering Analysis, Clustering Algorithm, Pattern Mining, Log Data, Web Log, Fuzzy Clustering Analysis, Fuzzy Clustering Algorithm

Fuzzy clustering analysis is a clustering approach that optimizes a cost function using calculus-based techniques. In fuzzy clustering, a sample no longer belongs to a single class; instead, it belongs to every class with a certain degree of membership. In this paper, knowledge is obtained by sequential pattern mining of Web logs, and visitors with the same browsing pattern are identified from the interaction of users with the Web information space. The paper presents an analysis of Web log data mining based on an improved fuzzy clustering algorithm. The experiment demonstrates that the improved algorithm has better scalability.
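
The notion of belonging to every class "with a certain degree of membership" can be illustrated with the textbook fuzzy c-means membership formula. This is only a sketch of the standard formula, not the improved algorithm of the paper, which the abstract does not specify; the function name `fuzzy_memberships`, the fuzzifier default `m=2.0`, and the sample points are assumptions made for illustration.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of `point` in each cluster.

    Hypothetical helper: u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    where d_i is the Euclidean distance from the point to center i and
    m > 1 is the fuzzifier. Not taken from the paper.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers is shared equally among them.
    on_center = [d == 0.0 for d in dists]
    if any(on_center):
        count = sum(on_center)
        return [1.0 / count if z else 0.0 for z in on_center]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

# Illustrative data: one sample and two cluster centers on a line.
memberships = fuzzy_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
print(memberships)
```

The memberships always sum to 1, and the closer center receives the larger share, which is the behavior that distinguishes fuzzy clustering from a hard assignment.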

FP-outlier: Frequent pattern based outlier detection

Computer Science and Information Systems, 2005, Vol 2 (1), pp. 103-118

DOI: 10.2298/csis0501103h

Cited by: ~86

Author(s): Zengyou He, Xiaofei Xu, Zhexue Huang, Shengchun Deng

Keyword(s): Data Mining, Outlier Detection, Frequent Itemsets, Research Community, Experimental Results, New Method, Frequent Pattern, Data Detection, Frequent Patterns, Data Set

An outlier in a dataset is an observation or a point that is considerably dissimilar to or inconsistent with the remainder of the data. Detection of such outliers is important for many applications and has recently attracted much attention in the data mining research community. In this paper, we present a new method to detect outliers by discovering frequent patterns (or frequent itemsets) from the data set. The outliers are defined as the data transactions that contain fewer frequent patterns in their itemsets. We define a measure called FPOF (Frequent Pattern Outlier Factor) to detect the outlier transactions and propose the FindFPOF algorithm to discover outliers. The experimental results show that our approach outperforms the existing methods in identifying interesting outliers.
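
The idea of scoring a transaction by the frequent patterns it contains can be sketched as follows. The abstract does not state the FPOF formula; this sketch assumes FPOF is the sum of the supports of the frequent patterns contained in a transaction, divided by the total number of frequent patterns, so that low scores indicate outliers. The brute-force miner, the function names, and the toy data are illustrative assumptions and are not the FindFPOF algorithm itself.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute-force miner (illustrative only): map each frequent itemset to its support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(cset <= t for t in transactions) / n
            if support >= min_support:
                result[cset] = support
                found = True
        if not found:
            break  # downward closure: no larger itemset can be frequent
    return result

def fpof(transaction, patterns):
    """Assumed FPOF: summed support of contained frequent patterns / number of patterns."""
    if not patterns:
        return 0.0
    contained = sum(s for p, s in patterns.items() if p <= transaction)
    return contained / len(patterns)

# Toy data (hypothetical): three similar transactions and one unusual one.
transactions = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"c"}]
patterns = frequent_itemsets(transactions, min_support=0.5)
scores = [fpof(t, patterns) for t in transactions]
print(scores)
```

Ranking transactions by ascending score puts the candidates for outliers first; here the transaction `{"c"}` contains no frequent pattern and receives the lowest score.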

COMPARISON OF DECISION AND RANDOM TREE ALGORITHMS ON A WEB LOG DATA FOR FINDING FREQUENT PATTERNS

International Journal of Research in Engineering and Technology, 2014, Vol 03 (19), pp. 155-161

DOI: 10.15623/ijret.2014.0319029

Author(s): A. Jameela

Keyword(s): Random Tree, Frequent Patterns, Log Data, Web Log, Tree Algorithms
