Web Usage Mining

Web Mining ◽  
2011 ◽  
pp. 373-392 ◽  
Author(s):  
Yew-Kwong Woon ◽  
Wee-Keong Ng ◽  
Ee-Peng Lim

The rising popularity of electronic commerce makes data mining an indispensable technology for several applications, especially online business competitiveness. The World Wide Web provides abundant raw data in the form of Web access logs. However, without data mining techniques, it is difficult to make any sense out of such massive data. In this chapter, we focus on the mining of Web access logs, commonly known as Web usage mining. We analyze algorithms for preprocessing and extracting knowledge from such logs. We will also propose our own techniques to mine the logs in a more holistic manner. Experiments conducted on real Web server logs verify the practicality as well as the efficiency of the proposed techniques as compared to an existing technique. Finally, challenges in Web usage mining are discussed.

Author(s):  
P. K. Nizar Banu ◽  
H. Inbarani

As websites increase in complexity, locating needed information becomes a difficult task. Such difficulty is often related to the websites’ design but also ineffective and inefficient navigation processes. Research in web mining addresses this problem by applying techniques from data mining and machine learning to web data and documents. In this study, the authors examine web usage mining, applying data mining techniques to web server logs. Web usage mining has gained much attention as a potential approach to fulfill the requirement of web personalization. In this paper, the authors propose K-means biclustering, rough biclustering and fuzzy biclustering approaches to disclose the duality between users and pages by grouping them in both dimensions simultaneously. The simultaneous clustering of users and pages discovers biclusters that correspond to groups of users that exhibit highly correlated ratings on groups of pages. The results indicate that the fuzzy C-means biclustering algorithm best and is able to detect partial matching of preferences.


Author(s):  
Wen-Chen Hu ◽  
Hung-Jen Yang ◽  
Chung-wei Lee ◽  
Jyh-haw Yeh

World Wide Web data mining includes content mining, hyperlink structure mining, and usage mining. All three approaches attempt to extract knowledge from the Web, produce some useful results from the knowledge extracted, and apply the results to certain real-world problems. The first two apply the data mining techniques to Web page contents and hyperlink structures, respectively. The third approach, Web usage mining (the theme of this article), is the application of data mining techniques to the usage logs of large Web data repositories in order to produce results that can be applied to many practical subjects, such as improving Web sites/pages, making additional topic or product recommendations, user/customer behavior studies, and so forth. This article provides a survey and analysis of current Web usage mining technologies and systems. A Web usage mining system must be able to perform five major functions: (i) data gathering, (ii) data preparation, (iii) navigation pattern discovery, (iv) pattern analysis and visualization, and (v) pattern applications. Many Web usage mining technologies have been proposed, and each technology employs a different approach. This article first describes a generalized Web usage mining system, which includes five individual functions. Each system function is then explained and analyzed in detail. Related surveys of Web usage mining techniques also can be found in Hu, et al. (2003) and Kosala and Blockeel (2000).


Author(s):  
Yongjian Fu

With the rapid development of the World Wide Web or the Web, many organizations now put their information on the Web and provide Web-based services such as online shopping, user feedback, technical support, and so on. Understanding Web usage through data mining techniques is recognized as an important area.


Author(s):  
Dirk Spennemann

The increased commercialisation of Internet domain sales created the unanticipated side effect that domain extensions no longer signify the residence of the domain user. As a result, the analysis of the domain attributes in the Web access logs no longer provides accurate information on the origin of the users and thus of the geographical ‘reach’ of a given site. This study provides an alternative method to assess the geographical ‘reach’ by calculating the average demand for Web pages in hourly intervals originating from each time zone. The resulting analysis tool, which relates to Greenwich Mean Time, is location independent and can be applied to Web sites world wide.


2012 ◽  
Vol 3 (1) ◽  
pp. 144-148 ◽  
Author(s):  
Ketul Patel ◽  
Dr. A. R. Patel

The traffic on World Wide Web is increasing rapidly and huge amount of data is generated due to users’ numerous interactions with web sites. Web Usage Mining is the application of data mining techniques to discover the useful and interesting patterns from web usage data. It supports to know frequently accessed pages, predict user navigation, improve web site structure etc. In order to apply Web Usage Mining, various steps are performed. This paper discusses the process of Web Usage Mining consisting steps: Data Collection, Pre-processing, Pattern Discovery and Pattern Analysis. It has also presented Web Usage Mining applications and some Web Mining software.


Author(s):  
Xiangji Huang

With the rapid growth of the World Wide Web, the use of automated Web-mining techniques to discover useful and relevant information has become increasingly important. One challenging direction is Web usage mining, wherein one attempts to discover user navigation patterns of Web usage from Web access logs. Properly exploited, the information obtained from Web usage log can assist us to improve the design of a Web site, refine queries for effective Web search, and build personalized search engines. However, Web log data are usually large in size and extremely detailed, because they are likely to record every aspect of a user request to a Web server. It is thus of great importance to process the raw Web log data in an appropriate way, and identify the target information intelligently. In this chapter, we first briefly review the concept of Web Usage Mining and discuss its difference from classic Knowledge Discovery techniques, and then focus on exploiting Web log sessions, defined as a group of requests made by a single user for a single navigation purpose, in Web usage mining. We also compare some of the state-of-the-art techniques in identifying log sessions from Web servers, and present some popular Web mining techniques, including Association Rule Mining, Clustering, Classification, Collaborative Filtering, and Sequential Pattern Learning, that can be exploited on the Web log data for different research and application purposes.


Author(s):  
P. K. Nizar Banu ◽  
H. Inbarani

As websites increase in complexity, locating needed information becomes a difficult task. Such difficulty is often related to the websites’ design but also ineffective and inefficient navigation processes. Research in web mining addresses this problem by applying techniques from data mining and machine learning to web data and documents. In this study, the authors examine web usage mining, applying data mining techniques to web server logs. Web usage mining has gained much attention as a potential approach to fulfill the requirement of web personalization. In this paper, the authors propose K-means biclustering, rough biclustering and fuzzy biclustering approaches to disclose the duality between users and pages by grouping them in both dimensions simultaneously. The simultaneous clustering of users and pages discovers biclusters that correspond to groups of users that exhibit highly correlated ratings on groups of pages. The results indicate that the fuzzy C-means biclustering algorithm best and is able to detect partial matching of preferences.


Sign in / Sign up

Export Citation Format

Share Document