Web usage mining based on probabilistic latent semantic analysis

Author(s):  
Xin Jin ◽  
Yanzan Zhou ◽  
Bamshad Mobasher
Author(s):  
Guandong Xu

Nowadays Web users are facing the problems of information overload and drowning due to the significant and rapid growth in the amount of information and the large number of users. As a result, how to provide Web users more exactly needed information is becoming a critical issue in Web-based information retrieval and data management. In order to address the above difficulties, Web mining was proposed as an efficient means to discover the intrinsic relationships among Web data. In particular, Web usage mining is to discover Web usage patterns and utilize the discovered usage knowledge for constructing interest-oriented user communities, which could be, in turn, used for presenting Web users more personalized Web contents, i.e. Web recommendation. On the other hand, Latent Semantic Analysis (LSA) is one kind of approaches that is used to reveal the inherent correlation resided in co-occurrence activities, such as Web usage data. Moreover, LSA possesses the capability of capturing the hidden knowledge at semantic level that can’t be achieved by traditional methods. In this chapter, we aim to address building user communities of interests via combining Web usage mining and latent semantic analysis. Meanwhile we also present the application of user communities for Web recommendation.


2011 ◽  
Vol 219-220 ◽  
pp. 887-891
Author(s):  
Jiang Zhong ◽  
Yi Feng Cheng ◽  
Shi Tao Deng

Web usage mining technique is widely used for Web recommendation, which customizes Web content to user-preferred style. Traditional techniques of Web usage mining can only discover usage pattern explicitly. In order to employ the users’ feature and web pages’ attributes to get more accuracy recommendation, we propose a unified collaborative filtering model for web recommendation which combined the latent and external features of users and web page through back propagation neural networks. In the algorithm, we employ Probabilistic Latent Semantic Analysis (PLSA) method to get latent features. The main advantages of this technique over standard memory-based methods are the higher accuracy, constant time prediction, and an explicit and compact model representation. The preliminary experimental evaluation shows that substantial improvements in accuracy over existing methods can be obtained.


Sign in / Sign up

Export Citation Format

Share Document