Computing the Entropy of User Navigation in the Web

Author(s):  
Mark Levene ◽  
George Loizou

Navigation through the web, colloquially known as "surfing", is one of the main activities of users during web interaction. When users follow a navigation trail, they often become disoriented with respect to the goals of their original query, so the discovery of typical user trails could be useful in providing navigation assistance. Herein, we give a theoretical underpinning of user navigation in terms of the entropy of an underlying Markov chain modelling the web topology. We present a novel method for online incremental computation of the entropy and a large deviation result regarding the length of a trail needed to realize the said entropy. We provide an error analysis for our estimation of the entropy in terms of the divergence between the empirical and actual probabilities. We then indicate applications of our algorithm in the area of web data mining. Finally, we present an extension of our technique to higher-order Markov chains by a suitable reduction of a higher-order Markov chain model to a first-order one.
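
To make the idea of an incrementally updated entropy concrete, the following is a minimal sketch and not the authors' algorithm, which the abstract does not spell out. It assumes the empirical entropy rate H = -(1/N) Σ n_ij log2(n_ij / n_i), where n_ij counts observed transitions from page i to page j, n_i counts transitions leaving page i, and N is the total number of transitions. The names `IncrementalEntropy`, `observe` and `feed_trail` are hypothetical.

```python
import math
from collections import defaultdict


def _f(n):
    """n * log2(n), with the convention 0 * log2(0) = 0."""
    return n * math.log2(n) if n > 0 else 0.0


class IncrementalEntropy:
    """Empirical entropy rate of a first-order Markov chain (illustrative sketch).

    Maintains a = sum_i f(n_i) and b = sum_ij f(n_ij), so that
    H = (a - b) / N can be read off in constant time after each click.
    """

    def __init__(self):
        self.row = defaultdict(int)   # n_i: transitions leaving page i
        self.pair = defaultdict(int)  # n_ij: transitions from i to j
        self.n = 0                    # N: total transitions seen
        self.a = 0.0
        self.b = 0.0

    def observe(self, src, dst):
        """Record one transition src -> dst and update the running sums."""
        self.a += _f(self.row[src] + 1) - _f(self.row[src])
        self.b += _f(self.pair[(src, dst)] + 1) - _f(self.pair[(src, dst)])
        self.row[src] += 1
        self.pair[(src, dst)] += 1
        self.n += 1

    def entropy(self):
        """Current estimate in bits per transition (0.0 before any data)."""
        return (self.a - self.b) / self.n if self.n else 0.0


def feed_trail(estimator, trail):
    """Feed every consecutive pair of states in a trail to the estimator."""
    for src, dst in zip(trail, trail[1:]):
        estimator.observe(src, dst)


est = IncrementalEntropy()
feed_trail(est, ["A", "B", "A", "C", "A"])
print(est.entropy())  # 0.5 bits: only page A has an uncertain successor
```

Under the same assumptions, a second-order chain can be reduced to a first-order one by treating consecutive page pairs as states, for example `feed_trail(est, list(zip(trail, trail[1:])))`.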

2018 ◽  
Vol 19 (3) ◽  
pp. 449
Author(s):  
A. G. C. Pereira ◽  
F. A. S. Sousa ◽  
B. B. Andrade ◽  
Viviane Simioli Medeiros Campos

The aim of this study is to investigate further the two-state Markov chain model for the synthetic generation of daily streamflows. The model proposed in Aksoy and Bayazit (2000) and Aksoy (2003) is based on two Markov chains for determining the state of the stream. The ascension curve of the hydrograph is modeled by a two-parameter Gamma probability distribution function, and the recession curve of the hydrograph is assumed to follow an exponential function. In this work, instead of assuming a pre-defined order for the Markov chains involved in the modelling of streamflows, a BIC test is performed to establish the Markov chain order that best fits the data. The methodology was applied to data from seven Brazilian sites. The model proposed here performed better than the one proposed by Aksoy at all but two sites, which have the lowest time series and are located in the driest regions.
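
As an illustration of choosing a Markov chain order by BIC, the sketch below scores each candidate order on a sequence of discrete states and keeps the order with the smallest score. It is a generic textbook-style procedure, not necessarily the test used in the study. The function names, the default `max_order=3`, and the choice to score every order on the same observations (those from position `max_order` onward) are assumptions made here.

```python
import math
from collections import Counter


def markov_bic(seq, order, start, n_states=2):
    """BIC = -2 log L + k log n for a Markov chain of the given order.

    Only positions t >= start are scored, so that different orders are
    compared on the same observations (requires start >= order).
    """
    ctx_counts = Counter()
    trans_counts = Counter()
    for t in range(start, len(seq)):
        ctx = tuple(seq[t - order:t])  # the previous `order` states
        ctx_counts[ctx] += 1
        trans_counts[(ctx, seq[t])] += 1
    # Maximised log-likelihood using empirical transition frequencies.
    log_lik = sum(
        c * math.log(c / ctx_counts[ctx])
        for (ctx, _), c in trans_counts.items()
    )
    n_params = (n_states ** order) * (n_states - 1)  # free probabilities
    n_obs = len(seq) - start
    return -2.0 * log_lik + n_params * math.log(n_obs)


def select_order(seq, max_order=3, n_states=2):
    """Return the order in 0..max_order with the smallest BIC."""
    return min(
        range(max_order + 1),
        key=lambda k: markov_bic(seq, k, start=max_order, n_states=n_states),
    )


# A strictly alternating wet/dry (1/0) sequence is explained by order 1.
print(select_order([0, 1] * 100))  # 1
```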


Author(s):  
Ratnesh Kumar Jain ◽  
Rahul Singhai

A web server log file contains information about every access to the web pages hosted on a server, such as when they were requested, the Internet Protocol (IP) address of the request, the error code, the number of bytes sent to the user, and the type of browser used. Web servers can also capture referrer logs, which show the page from which a visitor makes the next request. As visits to web sites increase exponentially, web logs are becoming huge data repositories that can be mined to extract useful information for decision making. In this chapter, we propose a Markov chain based method to categorize users as Faithful, Partially Impatient, or Completely Impatient, and we then analyze their browsing behavior. We also derive some theorems to study the browsing behavior of each user type and add numerical illustrations to show how behavior differs across categories. Finally, we extend this work by approximating the theorems.
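
The fields listed above can be pulled out of a log line with a short parser. The sketch below assumes the Apache Combined Log Format, which the chapter abstract does not name, and the sample line is invented for illustration. The names `LOG_RE` and `parse_line` are hypothetical.

```python
import re

# Assumed layout (Apache Combined Log Format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of the named fields, or None if the line does not match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None


sample = (
    '203.0.113.7 - - [10/Oct/2000:13:55:36 -0700] '
    '"GET /index.html HTTP/1.0" 200 2326 '
    '"http://example.com/start.html" "Mozilla/4.08"'
)
record = parse_line(sample)
print(record["ip"], record["status"], record["referrer"])
```

Pairing each record's referrer with its requested page yields the page-to-page transitions from which a Markov chain of browsing behavior could be estimated.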


2014 ◽  
Vol 13 (04) ◽  
pp. 721-753 ◽  
Author(s):  
Suresh Shirgave ◽  
Prakash Kulkarni ◽  
José Borges

The rapid growth of the World Wide Web has resulted in intricate Web sites, demanding enhanced user skills to find the required information and more sophisticated tools that are able to generate apt recommendations. Markov Chains have been widely used to generate next-page recommendations; however, the accuracy of such models is limited. Herein, we propose the novel Semantic Variable Length Markov Chain Model (SVLMC) that combines the fields of Web Usage Mining and Semantic Web by enriching the Markov transition probability matrix with rich semantic information extracted from Web pages. We show that the method is able to enhance the prediction accuracy relative to usage-based higher order Markov models and to semantic higher order Markov models based on an ontology of concepts. In addition, the proposed model is able to handle the problem of ambiguous predictions. An extensive experimental evaluation was conducted on two real-world data sets and on one partially generated data set. The results show that the proposed model is able to achieve 15–20% better accuracy than the usage-based Markov model, 8–15% better than the semantic ontology Markov model and 7–12% better than the semantic-pruned Selective Markov Model. In summary, the SVLMC is the first work proposing the integration of a rich set of detailed semantic information into higher order Web usage Markov models, and experimental results reveal that the inclusion of detailed semantic data enhances the prediction ability of Markov models.
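
For orientation, the usage-based baseline that such models are compared against can be sketched as a first-order Markov recommender. This is not the SVLMC itself, whose semantic enrichment is not detailed in the abstract. The function names and the toy sessions below are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count page-to-page transitions over a list of sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    return counts


def recommend(counts, page, k=1):
    """Return up to k (next_page, probability) pairs, most likely first."""
    row = counts.get(page)
    if not row:
        return []  # page never seen as a source: no recommendation
    total = sum(row.values())
    return [(dst, n / total) for dst, n in row.most_common(k)]


sessions = [
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "shop"],
]
model = fit_transitions(sessions)
print(recommend(model, "home"))  # "news" is followed in 2 of 3 sessions
```

A page such as "news" in this toy data, whose successors are equally likely, shows the kind of ambiguous prediction that a purely usage-based model cannot resolve on its own.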


2016 ◽  
Vol 19 (1) ◽  
pp. 21-35 ◽  
Author(s):  
S. Suresh ◽  
K. Senthamarai Kannan ◽  
P. Venkatesan
