Novel Approaches for Integrating MART1 Clustering Based Pre-Fetching Technique with Web Caching

Author(s):  
Sathiyamoorthi,
Murali Bhaskaran

Web caching and Web pre-fetching are two important techniques for improving the performance of Web-based information retrieval systems. The two techniques complement each other: Web caching exploits the temporal locality of Web objects, whereas Web pre-fetching exploits their spatial locality. However, if caching and pre-fetching are integrated inefficiently, they can increase both network traffic and Web server load. Conventional replacement policies suit memory caching, where pages have a fixed size, but Web caching deals with objects of varying sizes and therefore needs an algorithm designed for the Web cache environment. Moreover, conventional replacement policies are unsuitable for clustering-based pre-fetching, where multiple objects are pre-fetched at once and cannot be handled by conventional algorithms. Care must therefore be taken when integrating Web caching with Web pre-fetching in order to overcome these limitations. In this paper, novel algorithms are proposed for integrating Web caching with a clustering-based pre-fetching technique that uses Modified ART1 for clustering. The proposed algorithm outperforms traditional algorithms in terms of hit rate and the number of objects to be pre-fetched, and hence saves bandwidth.
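The paper's Modified ART1 algorithm is not reproduced in the abstract, but a minimal sketch of plain ART1 clustering over binary session vectors, with hypothetical URLs, sessions, and vigilance value, illustrates the mechanism such pre-fetching builds on: sessions with similar access patterns resonate with a shared prototype, and the prototype's remaining URLs become pre-fetch candidates.

```python
import numpy as np

class ART1:
    """Minimal ART1: clusters binary vectors using a vigilance test."""
    def __init__(self, vigilance=0.4, alpha=0.001):
        self.rho, self.alpha = vigilance, alpha
        self.prototypes = []                       # one binary vector per cluster

    def match(self, x):
        """Index of the best cluster passing the vigilance test, or -1."""
        x = np.asarray(x, dtype=bool)
        ranked = sorted(range(len(self.prototypes)),
                        key=lambda j: -np.sum(x & self.prototypes[j]) /
                                       (self.alpha + np.sum(self.prototypes[j])))
        for j in ranked:
            if np.sum(x & self.prototypes[j]) / max(x.sum(), 1) >= self.rho:
                return j
        return -1

    def train(self, x):
        """Assign x to a cluster, shrinking its prototype, or start a new one."""
        x = np.asarray(x, dtype=bool)
        j = self.match(x)
        if j < 0:
            self.prototypes.append(x.copy())
            return len(self.prototypes) - 1
        self.prototypes[j] &= x                    # resonance: intersect prototype
        return j

# Hypothetical URL catalogue and user sessions encoded as binary vectors.
urls = ["/", "/news", "/sports", "/tech", "/login", "/cart"]
sessions = [[1, 1, 0, 1, 0, 0],   # news reader
            [1, 1, 0, 0, 0, 0],
            [1, 0, 0, 0, 1, 1],   # shopper
            [0, 0, 0, 0, 1, 1]]
net = ART1(vigilance=0.4)
for s in sessions:
    net.train(s)

# Pre-fetch hint: on a request for /news, fetch the rest of that cluster's prototype.
req = np.array([0, 1, 0, 0, 0, 0], dtype=bool)
j = net.match(req)
if j >= 0:
    print([u for u, p, r in zip(urls, net.prototypes[j], req) if p and not r])
```

Because ART1 prototypes shrink to the intersection of their members, the pre-fetch set stays conservative: only URLs common to every session in the cluster are suggested, which limits how many objects are pulled into the cache.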

Author(s):  
V. Sathiyamoorthi

Network congestion remains one of the main barriers to the continuing success of the Internet and Web-based services. In this setting, proxy caching is one of the most successful techniques for improving Web performance, since it reduces network traffic and Web server load and improves user-perceived response time. The most popular Web objects, those likely to be revisited in the near future, are stored in the proxy server, which improves response time and saves network bandwidth. The central component of Web caching is its cache replacement policy, which decides which existing objects to evict when the cache is full and a new object must be admitted. Conventional replacement policies perform poorly in Web caching environments: they were designed for memory caching, which involves fixed-size objects, whereas Web caching involves objects of varying sizes and therefore needs a policy tailored to the Web cache environment. Moreover, most existing Web caching policies consider only a few factors and ignore others that affect the efficiency of Web proxy caching. This paper therefore proposes a novel policy for the Web cache environment that incorporates size, cost, frequency, ageing, time of entry into the cache, and popularity of Web objects into its removal decision, and uses Web usage mining as a technique to improve the caching policy. Empirical analysis shows that the proposed policy performs better than existing policies in terms of performance metrics such as hit rate and byte hit rate.
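The abstract does not give the scoring formula, but a minimal sketch shows how a removal policy combining these factors might be organized: each cached object records its size, cost, frequency, age, and time of entry, and the lowest-scoring object is evicted first. The weighting below is a hypothetical GDSF-style combination, not the paper's policy.

```python
import time

class MultiFactorCache:
    """Evicts the object with the lowest combined score when space runs out."""
    def __init__(self, capacity_bytes):
        self.capacity, self.used = capacity_bytes, 0
        self.objects = {}   # url -> {size, cost, freq, entered, last_used}

    def _score(self, o, now):
        age = max(now - o["last_used"], 1e-6)          # ageing factor
        # Hypothetical weighting: keep frequent, costly-to-fetch, small,
        # recently used objects; evict large, stale, cheap ones first.
        return o["freq"] * o["cost"] / (o["size"] * age)

    def get(self, url):
        o = self.objects.get(url)
        if o is not None:                              # hit: refresh statistics
            o["freq"] += 1
            o["last_used"] = time.time()
        return o

    def put(self, url, size, cost):
        now = time.time()
        while self.objects and self.used + size > self.capacity:
            victim = min(self.objects,
                         key=lambda u: self._score(self.objects[u], now))
            self.used -= self.objects.pop(victim)["size"]
        if size <= self.capacity:
            # "entered" records the time-of-entry factor named in the abstract.
            self.objects[url] = dict(size=size, cost=cost, freq=1,
                                     entered=now, last_used=now)
            self.used += size

cache = MultiFactorCache(capacity_bytes=10_000)
cache.put("/index.html", size=4_000, cost=1.0)
cache.put("/big-report.pdf", size=7_000, cost=3.0)   # forces one eviction
print(list(cache.objects))
```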


1997
Author(s):  
L. Rodney Long,
Stanley R. Pillemer,
Reva C. Lawrence,
Gin-Hua Goh,
Leif Neve,
...  

2019, Vol 12, pp. 1-10
Author(s):  
Kar Tim Chan

The World Wide Web is an information retrieval system accessible via the Internet. Since Web resources and documents are interlinked by hypertext links, they form a huge and complex information network. Beyond information access, the Web is also a primary tool for commerce, entertainment, and connecting people around the world. Studying its network topology therefore gives a better understanding of the sociology of content on the Web, as well as the possibility of predicting newly emerging phenomena. In this paper, we construct networks using a random walk process that traverses the Web starting from two popular websites, namely google.com (global) and mudah.my (local). We take measurements such as degree distribution, diameter, and average path length to determine various structural properties of the networks. We also analyse the networks at the domain level, identifying the top-level domains appearing in both, in order to understand the connectivity of the Web in different regions. Using centrality analysis, we further reveal important and popular websites and domains in the networks.
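A minimal sketch of this measurement pipeline, with the live crawl of google.com and mudah.my replaced by a hypothetical synthetic link graph, shows how a random-walk traversal and the reported measurements (degree distribution, diameter, average path length, centrality) fit together; networkx is assumed available.

```python
import random
import networkx as nx

random.seed(42)

# Toy "web": each page links to a few deterministic pseudo-random pages.
# A real crawler would fetch the page and parse its hyperlinks instead.
def outlinks(page, n_pages=200):
    rnd = random.Random(page)            # deterministic per-page link set
    return [rnd.randrange(n_pages) for _ in range(3)]

# Random-walk traversal with occasional restarts at the start page.
G = nx.DiGraph()
node = 0
for _ in range(2000):
    for nbr in outlinks(node):
        G.add_edge(node, nbr)
    node = random.choice(outlinks(node)) if random.random() > 0.1 else 0

# Structural measurements; path metrics need a strongly connected core.
core = G.subgraph(max(nx.strongly_connected_components(G), key=len))
print("nodes:", G.number_of_nodes(), "edges:", G.number_of_edges())
print("degree histogram (first bins):", nx.degree_histogram(G)[:10])
print("diameter (core):", nx.diameter(core))
print("avg path length (core):", nx.average_shortest_path_length(core))

# Centrality analysis: the walk's most important and popular pages.
top = sorted(nx.pagerank(G).items(), key=lambda kv: -kv[1])[:5]
print("top pages by PageRank:", top)
```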


Author(s):  
Sathiyamoorthi V.,
Suresh P.,
Jayapandian N.,
Kanmani P.,
Deva Priya M.,
...  

With an increasing number of Web users, the traffic these users generate is tremendous, and requests take a long time to reach the Web server; the main reason is the distance between the clients making requests and the servers responding to them. Using a content delivery network (CDN) is one strategy for minimizing latency, but it incurs additional cost. Web caching and pre-fetching are the most viable alternatives. This paper therefore introduces a novel Web caching strategy, the optimized popularity-aware modified least frequently used (PMLFU) policy, for information retrieval based on users' past access histories and trend analysis. It enhances proxy-driven Web caching by analyzing user access requests and caching the most popular Web pages based on user preferences. Experimental results show that the proposed system can significantly reduce user delay in accessing Web pages; its performance is measured on real-time IRCache data sets.
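PMLFU itself is not specified in the abstract; the sketch below shows only the general idea of popularity-aware frequency counting, here as an LFU whose hit counts decay with age so that stale pages lose rank. The decay scheme and parameters are hypothetical, not the authors' formulation.

```python
import time

class PopularityAwareLFU:
    """LFU variant whose hit counts decay with age, so popularity stays recent."""
    def __init__(self, capacity, half_life=3600.0):
        self.capacity = capacity
        self.half_life = half_life       # seconds for a hit's weight to halve
        self.pages = {}                  # url -> (decayed_count, last_access)

    def access(self, url):
        now = time.time()
        hit = url in self.pages
        count, last = self.pages.get(url, (0.0, now))
        decay = 0.5 ** ((now - last) / self.half_life)   # age earlier hits
        self.pages[url] = (count * decay + 1.0, now)
        if not hit and len(self.pages) > self.capacity:
            # Evict the least popular page other than the one just admitted.
            victim = min((u for u in self.pages if u != url),
                         key=lambda u: self.pages[u][0])
            del self.pages[victim]
        return hit

policy = PopularityAwareLFU(capacity=2)
for u in ["/a", "/a", "/b", "/c"]:        # admitting /c forces an eviction
    policy.access(u)
print(sorted(policy.pages))               # /a survives: higher decayed count
```

Unlike plain LFU, a page hammered yesterday but idle today loses rank here, which is one way a policy can track the trends the abstract refers to.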


Author(s):  
Mouhcine El Hassani ◽  
Noureddine Falih ◽  
Belaid Bouikhalene

As information becomes increasingly abundant and accessible on the Web, researchers no longer need to dig through books in libraries; what they need instead is a knowledge extraction system from text (KEST). The authors' goal in this chapter is to identify what a person needs in order to search a text, which may be unstructured, retrieve the terms of information related to the subject of research, and then structure them into classes of useful information. From this, they outline the general architecture of an information retrieval system for text documents with a view to developing it, and finally identify the parameters used to evaluate its performance and the retrieved results.
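As a rough illustration of the extraction step such a system implies, the hypothetical sketch below pulls terms co-occurring with a research subject from unstructured text and buckets them into user-defined classes; the class vocabularies and scoring are placeholders, not the chapter's architecture.

```python
import re
from collections import Counter, defaultdict

def extract_terms(text, subject, window=5):
    """Collect words co-occurring with the subject within a small window."""
    words = re.findall(r"[a-z]+", text.lower())
    terms = Counter()
    for i, w in enumerate(words):
        if w == subject.lower():
            terms.update(words[max(0, i - window):i + window + 1])
    terms.pop(subject.lower(), None)     # the subject itself is not a finding
    return terms

def classify(terms, classes):
    """Bucket extracted terms into classes of useful information."""
    buckets = defaultdict(list)
    for term, freq in terms.most_common():
        for label, vocab in classes.items():
            if term in vocab:
                buckets[label].append((term, freq))
    return dict(buckets)

text = ("Caching improves retrieval latency. Proxy caching reduces server "
        "load, and caching policies control eviction of cached objects.")
terms = extract_terms(text, subject="caching")
print(classify(terms, {"performance": {"latency", "load"},
                       "mechanism": {"proxy", "policies", "eviction"}}))
```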

