Query Recommendation Using Large-Scale Web Access Logs and Web Page Archive

Author(s):  
Lin Li ◽  
Shingo Otsuka ◽  
Masaru Kitsuregawa
2002 ◽  
Vol 22 (1 Supplement) ◽  
pp. 111-114
Author(s):  
Yumi YAMAGUCHI ◽  
Yuko IKEHATA ◽  
Takayuki ITOH ◽  
Yasumasa KAJINAGA

2004 ◽  
pp. 305-334
Author(s):  
Yannis Manolopoulos ◽  
Mikolaj Morzy ◽  
Tadeusz Morzy ◽  
Alexandros Nanopoulos ◽  
Marek Wojciechowski ◽  
...  

Access histories of users visiting a web server are automatically recorded in web access logs. Conceptually, web-log data can be regarded as a collection of clients’ access sequences, where each sequence is a list of pages accessed by a single user in a single session. This chapter presents novel indexing techniques that support efficient processing of so-called pattern queries, which find all access sequences that contain a given subsequence. Pattern queries are a key element of advanced analyses of web-log data, especially those concerning typical navigation schemes. In this chapter, we discuss the particularities of efficiently processing user access sequences with pattern queries, compared to the case of searching unordered sets. Extensive experimental results are given, which examine a variety of factors and illustrate the superiority of the proposed methods over indexing techniques for unordered data adapted to access sequences.
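To make the notion of a pattern query concrete, the following is a minimal sketch of the query semantics only, written as a naive linear scan. It is not the indexing technique the chapter proposes. It assumes that "contains a given subsequence" means the pattern's pages appear in order with gaps allowed; the function names and the sample sessions are hypothetical.

```python
from typing import List, Sequence


def contains_subsequence(sequence: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if the pages of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `page in remaining` consumes the iterator up to the first match,
    # so each later page must be found after the previous one.
    return all(page in remaining for page in pattern)


def pattern_query(sequences: List[Sequence[str]], pattern: Sequence[str]) -> List[int]:
    """Naive scan: indices of all access sequences that contain `pattern`."""
    return [i for i, seq in enumerate(sequences) if contains_subsequence(seq, pattern)]


# Hypothetical sessions; each letter stands for one page.
sessions = [
    ["A", "B", "C", "D"],
    ["B", "A", "C"],
    ["A", "C", "B"],
]

pattern = ["A", "B"]
ordered_matches = pattern_query(sessions, pattern)

# Treating sessions as unordered sets ignores page order and over-matches.
set_matches = [i for i, s in enumerate(sessions) if set(pattern) <= set(s)]

print(ordered_matches)  # [0, 2]
print(set_matches)      # [0, 1, 2]
```

The second session contains both pages but in the wrong order, so it satisfies the set query and fails the pattern query. This is one way to see why techniques built for unordered sets need adaptation before they can answer queries over access sequences.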


2019 ◽  
Vol 23 (22) ◽  
pp. 11947-11965
Author(s):  
Te-En Wei ◽  
Hahn-Ming Lee ◽  
Albert B. Jeng ◽  
Hemank Lamba ◽  
Christos Faloutsos

Web Mining ◽  
2011 ◽  
pp. 373-392
Author(s):  
Yew-Kwong Woon ◽  
Wee-Keong Ng ◽  
Ee-Peng Lim

The rising popularity of electronic commerce makes data mining an indispensable technology for several applications, especially for online business competitiveness. The World Wide Web provides abundant raw data in the form of Web access logs. However, without data mining techniques, it is difficult to make sense of such massive data. In this chapter, we focus on the mining of Web access logs, commonly known as Web usage mining. We analyze algorithms for preprocessing and extracting knowledge from such logs. We also propose our own techniques to mine the logs in a more holistic manner. Experiments conducted on real Web server logs verify the practicality and the efficiency of the proposed techniques compared to an existing technique. Finally, challenges in Web usage mining are discussed.


Author(s):  
Yannis Manolopoulos ◽  
Alexandros Nanopoulos ◽  
Mikolaj Morzy ◽  
Tadeusz Morzy ◽  
Marek Wojciechowski ◽  
...  

Web servers have recently become the main source of information on the Internet. Every Web server uses a Web log to automatically record its users’ accesses. Each Web-log entry represents a single user’s access to a Web resource (e.g., an HTML document) and contains the client’s IP address, the timestamp, the URL of the requested resource, and some additional information. An example log file is depicted in Figure 1. Each row contains the IP address of the requesting client, the timestamp of the request, the name of the method used with the URL of the resource, the return code issued by the server, and the size of the requested object.
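The fields listed above can be pulled out of a log row with a short parser. The sketch below assumes the rows follow the widely used Common Log Format, which carries the same fields; the abstract does not name the format and Figure 1 is not reproduced here, so the sample line, the regular expression, and the function name are illustrative assumptions.

```python
import re
from datetime import datetime
from typing import Any, Dict, Optional

# Assumed layout (Common Log Format):
# ip identity user [timestamp] "METHOD url protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)(?: [^"]*)?" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line: str) -> Optional[Dict[str, Any]]:
    """Parse one log row into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "ip": fields["ip"],
        # Example timestamp: 10/Oct/2000:13:55:36 -0700
        "timestamp": datetime.strptime(fields["timestamp"], "%d/%b/%Y:%H:%M:%S %z"),
        "method": fields["method"],
        "url": fields["url"],
        "status": int(fields["status"]),
        # Servers commonly write "-" when no body was returned.
        "size": 0 if fields["size"] == "-" else int(fields["size"]),
    }


# Hypothetical row, not taken from the source's Figure 1.
sample = '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
entry = parse_log_line(sample)
print(entry["ip"], entry["method"], entry["url"], entry["status"], entry["size"])
```

Returning `None` for rows that do not match keeps malformed lines from aborting a pass over a large log; a caller can count or skip them as needed.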


2000 ◽  
Author(s):  
Anupam Joshi ◽  
Raghu Krishnapuram

Author(s):  
Dirk Spennemann

The increased commercialisation of Internet domain sales has had the unanticipated side effect that domain extensions no longer signify the residence of the domain user. As a result, analysing the domain attributes in Web access logs no longer provides accurate information on the origin of users, and thus on the geographical ‘reach’ of a given site. This study provides an alternative method of assessing geographical ‘reach’ by calculating the average demand for Web pages in hourly intervals originating from each time zone. The resulting analysis tool, which is referenced to Greenwich Mean Time, is location independent and can be applied to Web sites worldwide.
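The first step of such an analysis, averaging demand per hourly interval on a common GMT clock, might look like the sketch below. It covers only the hourly binning. The abstract does not describe how demand is attributed to individual time zones, so that step is not attempted here. The function name and sample timestamps are hypothetical, and timestamps are assumed to be timezone-aware.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable


def average_hourly_demand(timestamps: Iterable[datetime]) -> Dict[int, float]:
    """Average number of requests per GMT hour of day, over the days observed."""
    per_hour: Counter = Counter()
    days = set()
    for ts in timestamps:
        gmt = ts.astimezone(timezone.utc)  # normalise to a single reference clock
        per_hour[gmt.hour] += 1
        days.add(gmt.date())
    n_days = len(days) or 1  # avoid division by zero on empty input
    return {hour: per_hour[hour] / n_days for hour in range(24)}


# Hypothetical request times; the third is logged at UTC+2 and lands in GMT hour 21.
requests = [
    datetime(2024, 1, 1, 9, 15, tzinfo=timezone.utc),
    datetime(2024, 1, 2, 9, 45, tzinfo=timezone.utc),
    datetime(2024, 1, 2, 23, 30, tzinfo=timezone(timedelta(hours=2))),
]

demand = average_hourly_demand(requests)
print(demand[9], demand[21])  # 1.0 0.5
```

Because every timestamp is converted to one reference clock before binning, the resulting 24-value profile does not depend on where the server is located.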

