Ant-Based Data Reduction in Web Usage Mining Using the K-means Clustering Algorithm

Data reduction is the process of minimizing the amount of data that needs to be stored in a data storage environment. Data reduction can increase storage efficiency and reduce costs. Data cleaning is a step in data preprocessing for web usage mining. In existing work on cleaning web server logs, irrelevant items and useless data cannot be completely removed, and overlapping data causes difficulty when retrieving data from the database. In this paper, we present an Ant Based Pattern Clustering Algorithm to obtain pattern data for mining. We also present a Log Cleaner that can filter out a large amount of irrelevant, inconsistent data based on their URLs. Essentially, this work removes unwanted records, and for that purpose we use the k-means clustering algorithm. The methodology from this research can be applied to e-commerce platforms such as Amazon and Flipkart.
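
The URL-based filtering that the Log Cleaner performs can be illustrated with a minimal sketch. The abstract does not say which URLs count as irrelevant or what log format is used, so the suffix list, the Common Log Format pattern, the status and method checks, and the name `clean_log` below are all assumptions made for illustration, not the authors' implementation.

```python
import re

# Assumed list of "irrelevant" URL suffixes; the abstract does not name them.
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")

# Assumed Common Log Format:
# host ident user [time] "METHOD url PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*" (\d{3}) \S+'
)


def clean_log(lines):
    """Return (host, path) pairs for successful GET requests to page-like URLs."""
    records = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip malformed entries
        host, method, url, status = match.groups()
        path = url.split("?", 1)[0].lower()  # drop the query string
        if method != "GET" or status != "200":
            continue  # keep only successful page requests (an assumption)
        if path.endswith(IRRELEVANT_SUFFIXES):
            continue  # drop images, stylesheets, scripts
        records.append((host, path))
    return records
```

Applied to a server log, `clean_log` keeps only the requests that are likely to reflect deliberate page views, which is the kind of reduced record set that a later clustering step would work on.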

Author(s):  
P. K. Nizar Banu ◽  
H. Inbarani

As websites increase in complexity, locating needed information becomes a difficult task. Such difficulty is often related to the websites’ design but also to ineffective and inefficient navigation processes. Research in web mining addresses this problem by applying techniques from data mining and machine learning to web data and documents. In this study, the authors examine web usage mining, applying data mining techniques to web server logs. Web usage mining has gained much attention as a potential approach to fulfill the requirement of web personalization. In this paper, the authors propose K-means biclustering, rough biclustering and fuzzy biclustering approaches to disclose the duality between users and pages by grouping them in both dimensions simultaneously. The simultaneous clustering of users and pages discovers biclusters that correspond to groups of users that exhibit highly correlated ratings on groups of pages. The results indicate that the fuzzy C-means biclustering algorithm performs best and is able to detect partial matching of preferences.
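
The idea of grouping users and pages in both dimensions can be sketched in a simplified form. The code below is not the authors' K-means biclustering algorithm, whose details the abstract does not give. It is a plain two-way clustering, assuming a user-by-page rating matrix: k-means is run once on the rows (users) and once on the columns (pages), and each pair of a user cluster and a page cluster is treated as a candidate bicluster. The function names and the farthest-point seeding are illustrative choices.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iterations=20):
    """Lloyd's k-means with deterministic farthest-point seeding."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def two_way_clusters(matrix, k_users, k_pages):
    """Cluster rows (users) and columns (pages) of a rating matrix separately."""
    user_labels = kmeans_labels(matrix, k_users)
    columns = [list(col) for col in zip(*matrix)]
    page_labels = kmeans_labels(columns, k_pages)
    return user_labels, page_labels


def bicluster_mean(matrix, user_labels, page_labels, u, p):
    """Mean rating inside the block formed by user cluster u and page cluster p."""
    cells = [
        matrix[i][j]
        for i, ul in enumerate(user_labels) if ul == u
        for j, pl in enumerate(page_labels) if pl == p
    ]
    return sum(cells) / len(cells) if cells else 0.0
```

A block with a high mean rating marks a group of users who rate the same group of pages highly. Because each user and page receives exactly one label here, this hard clustering cannot express the partial matching of preferences that the abstract attributes to the fuzzy variant.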


Author(s):  
Stu Westin

Studies that rely on Web usage mining can be experimental or observational in nature. The focus of such studies is quite varied and may involve such topics as predicting online purchase intentions (Hooker & Finkelman, 2004; Moe, 2003; Montgomery, Li, Srinivasan, & Liechty, 2004), designing recommender systems for e-commerce products and sites (Cho & Kim, 2004; Kim & Cho, 2003), understanding navigation and search behavior (Chiang, Dholakia, & Westin, 2004; Gery & Haddad, 2003; Johnson, Moe, Fader, Bellman, & Lohse, 2004; Li & Zaiane, 2004), or a myriad of other subjects. Regardless of the issue being studied, data collection for Web usage mining studies often proves to be a vexing problem, and ideal research designs are frequently sacrificed in the interest of finding a reasonable data capture or collection mechanism. Despite the difficulties involved, the research community has recognized the value of Web-based experimental research (Saeed, Hwang, & Yi, 2003; Zinkhan, 2005), and has, in fact, called on investigators to exploit “non-intrusive means of collecting usage and exploration data” (Gao, 2003, p. 31) in future Web studies. In this article we discuss some of the methodological complexities that arise when conducting studies that involve Web usage mining. We then describe an innovative, software-based methodology that addresses many of these problems. The methods described here are most applicable to experimental studies, but they can be applied in ex-post observational research settings, as well.


Author(s):  
Bamshad Mobasher

Web usage mining refers to the automatic discovery and analysis of patterns in clickstream and associated data collected or generated as a result of user interactions with Web resources on one or more Web sites. The goal of Web usage mining is to capture, model, and analyze the behavioral patterns and profiles of users interacting with a Web site. Analyzing such data can help these organizations determine the lifetime value of clients, design cross marketing strategies across products and services, evaluate the effectiveness of promotional campaigns, optimize the functionality of Web-based applications, provide more personalized content to visitors, and find the most effective logical structure for their Web space.


Author(s):  
Yongjian Fu

With the rapid development of the World Wide Web or the Web, many organizations now put their information on the Web and provide Web-based services such as online shopping, user feedback, technical support, and so on. Understanding Web usage through data mining techniques is recognized as an important area.


Web Mining ◽  
2011 ◽  
pp. 373-392 ◽  
Author(s):  
Yew-Kwong Woon ◽  
Wee-Keong Ng ◽  
Ee-Peng Lim

The rising popularity of electronic commerce makes data mining an indispensable technology for several applications, especially online business competitiveness. The World Wide Web provides abundant raw data in the form of Web access logs. However, without data mining techniques, it is difficult to make any sense out of such massive data. In this chapter, we focus on the mining of Web access logs, commonly known as Web usage mining. We analyze algorithms for preprocessing and extracting knowledge from such logs. We will also propose our own techniques to mine the logs in a more holistic manner. Experiments conducted on real Web server logs verify the practicality as well as the efficiency of the proposed techniques as compared to an existing technique. Finally, challenges in Web usage mining are discussed.


Author(s):  
Xiangji Huang

A common problem in mining association rules or sequential patterns is that a large number of rules or patterns can be generated from a database, making it impossible for a human analyst to digest the results. Solutions to the problem include, among others, using interestingness measures to identify interesting rules or patterns and pruning rules that are considered redundant. Various interestingness measures have been proposed, but little work has been reported on the effectiveness of the measures on real-world applications. We present an application of Web usage mining to a large collection of Livelink log data. Livelink is a web-based product of Open Text Corporation, which provides automatic management and retrieval of different types of information objects over an intranet, an extranet or the Internet. We report our experience in preprocessing raw log data, mining association rules and sequential patterns from the log data, and identifying interesting rules and patterns by use of interestingness measures and some pruning methods. In particular, we evaluate a number of interestingness measures in terms of their effectiveness in finding interesting association rules and sequential patterns. Our results show that some measures are much more effective than others.
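
To make the notion of an interestingness measure concrete, the sketch below computes three standard ones (support, confidence, and lift) for a single association rule over a set of sessions. The abstract does not list which measures were evaluated, so these three are used only as common examples, and the function name and the session representation (a collection of page sets) are assumptions.

```python
def rule_measures(transactions, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    if n == 0:
        raise ValueError("at least one transaction is required")
    # Count transactions containing each side of the rule, and both sides.
    count_a = sum(1 for t in transactions if antecedent <= set(t))
    count_c = sum(1 for t in transactions if consequent <= set(t))
    count_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / (count_c / n) if count_c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}
```

A lift above 1 suggests the antecedent makes the consequent more likely than its base rate, while a lift at or below 1 marks a rule that an analyst could reasonably prune, which is one way a measure can cut down the number of rules to inspect.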

