Extracting Network Structure for International and Malaysia Website via Random Walk

2019, Vol 12, pp. 1-10
Author(s):  
Kar Tim Chan

The World Wide Web is an information retrieval system accessible via the Internet. Since all web resources and documents are interlinked with hypertext links, they form a huge and complex information network. Besides information, the web is also a primary tool for commerce, entertainment and connecting people around the world. Hence, studying its network topology gives us a better understanding of the sociology of content on the web, as well as the possibility of predicting newly emerging phenomena. In this paper, we construct networks using a random walk process that traverses the web starting from two popular websites, namely google.com (global) and mudah.my (local). We perform measurements such as degree distribution, diameter and average path length on the networks to determine various structural properties. We also analyse the networks at the domain level to identify the top-level domains appearing in both networks, in order to understand the connectivity of the web in different regions. Using centrality analysis, we also reveal some important and popular websites and domains in the networks.
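The random-walk construction described above can be sketched on a toy hyperlink graph. The domain names and adjacency list below are purely illustrative, not the crawled data from the paper:

```python
import random
from collections import Counter

# Toy hyperlink graph as an adjacency list; a real crawl would follow
# the <a href> links discovered on each fetched page.
web = {
    "google.com": ["news.example", "shop.example"],
    "news.example": ["google.com", "blog.example"],
    "shop.example": ["google.com"],
    "blog.example": ["news.example"],
}

def random_walk(graph, start, steps, seed=0):
    """Traverse the graph by repeatedly moving to a random out-link,
    restarting at `start` if a dead end is reached."""
    rng = random.Random(seed)
    node, visited = start, [start]
    for _ in range(steps):
        links = graph.get(node) or [start]
        node = rng.choice(links)
        visited.append(node)
    return visited

def degree_distribution(graph):
    """Out-degree frequency over the sampled graph."""
    return Counter(len(links) for links in graph.values())

walk = random_walk(web, "google.com", 100)
print(degree_distribution(web))   # Counter({2: 2, 1: 2})
```

The same visit log can then feed the diameter, average-path-length and centrality measurements the abstract mentions, computed over the sampled subgraph rather than the full web.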

Author(s):  
R. D. Gaharwar ◽  
D. B. Shah

Search engines are used by most of the world's population as their basic information retrieval system for obtaining useful information from the Internet. For a service provider that uses the Internet for digital marketing, ranking highly in search engines is essential, and search engine optimization (SEO) techniques are used for this purpose. Black-hat SEO techniques promise quick results but are prohibited by most search engines. Hence, web space users and website developers should be well aware of SEO techniques and how to use them optimally. This paper presents some of the most commonly used black-hat SEO techniques and the countermeasures taken by different search engines to prohibit them.


Author(s):  
Sathiyamoorthi ◽  
Murali Bhaskaran

Web caching and Web pre-fetching are two important techniques for improving the performance of a Web-based information retrieval system. The two techniques complement each other: Web caching exploits the temporal locality of Web objects, whereas Web pre-fetching exploits their spatial locality. However, if caching and pre-fetching are integrated inefficiently, they can increase both network traffic and Web server load. Conventional replacement policies are best suited to memory caching, which involves a fixed page size; Web caching, in contrast, involves pages of differing sizes, so an efficient algorithm tailored to the Web cache environment is needed. Moreover, conventional replacement policies are unsuitable in a clustering-based pre-fetching environment, where multiple objects are pre-fetched at once and cannot be handled by conventional algorithms. Care must therefore be taken when integrating Web caching with Web pre-fetching to overcome these limitations. In this paper, novel algorithms are proposed for integrating Web caching with a clustering-based pre-fetching technique, using Modified ART1 for the clustering. The proposed algorithm outperforms the traditional algorithms in terms of hit rate and the number of objects to be pre-fetched, and hence saves bandwidth.
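The size-aware replacement problem the abstract raises can be sketched with a GDSF-style priority (value grows with access frequency, shrinks with object size, and is aged by the last eviction). This is a minimal illustration, not the paper's algorithm; all names are made up:

```python
# Minimal sketch of a size-aware replacement policy (GDSF-style),
# assuming each cached object carries a byte size.
class SizeAwareCache:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.store = {}    # url -> (size, priority)
        self.clock = 0.0   # aging term: priority of last evicted object

    def _evict(self):
        # Evict the object with the lowest priority = clock + freq/size,
        # so large, rarely used objects go first.
        url = min(self.store, key=lambda u: self.store[u][1])
        size, prio = self.store.pop(url)
        self.used -= size
        self.clock = prio

    def put(self, url, size, freq=1):
        if size > self.capacity:
            return            # object can never fit
        while self.used + size > self.capacity:
            self._evict()
        self.store[url] = (size, self.clock + freq / size)
        self.used += size

cache = SizeAwareCache(100)
cache.put("a", 60)
cache.put("b", 50)   # does not fit alongside "a", so "a" is evicted
```

Under a fixed-page-size policy such as plain LRU, both objects would count equally; weighting by size is what adapts the policy to Web objects of differing sizes.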


Author(s):  
Mouhcine El Hassani ◽  
Noureddine Falih ◽  
Belaid Bouikhalene

As information becomes increasingly abundant and accessible on the web, researchers no longer need to dig through books in libraries; instead, they require a knowledge extraction system from text (KEST). The goal of the authors in this chapter is to identify what a person needs in order to search a text, which may be unstructured; to retrieve the terms of information related to the research subject and structure them into classes of useful information; to identify the general architecture of an information retrieval system for text documents in order to develop it; and finally to identify the parameters for evaluating its performance and the results it retrieves.


2016 ◽  
Vol 43 (3) ◽  
pp. 316-327 ◽  
Author(s):  
Mohammad Sadeghi ◽  
Jesús Vegas

The performance evaluation of an information retrieval system is a decisive aspect of measuring improvements in search technology. The Google search engine, as a tool for retrieving information on the Web, is used by almost 92% of Iranian users. The purpose of this paper is to study Google's performance in retrieving relevant information from Persian documents. The retrieval effectiveness is based on precision measures of search results issued against a website we built from the documents of a TREC standard corpus. We queried Google with 100 topics available in the corpus and compared the retrieved webpages with the relevant documents. The results indicate that the morphological analysis of the Persian language is not fully taken into account by the Google search engine. Incorrect text tokenisation, treating stop words as the content keywords of a document, and the wrong 'variants encountered' of words found by Google are the main reasons affecting the relevance of Persian information retrieval on the Web for this search engine.
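The precision measure underlying this kind of TREC-style evaluation can be illustrated with a minimal sketch. The document IDs and relevance judgements below are invented for illustration, not taken from the corpus:

```python
# Precision@k: the fraction of the top-k retrieved documents
# that appear in the relevance judgements for the topic.
def precision_at_k(retrieved, relevant, k):
    top = retrieved[:k]
    return sum(1 for doc in top if doc in relevant) / k

retrieved = ["d3", "d7", "d1", "d9", "d4"]   # ranked engine output
relevant = {"d1", "d3", "d4"}                # judged relevant for the topic
print(precision_at_k(retrieved, relevant, 5))   # 0.6
```

Averaging this value over all 100 topics gives a single effectiveness figure per engine configuration, which is how per-topic precision scores are typically aggregated.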


Author(s):  
Edeama O. Onwuchekwa

Since the 19th century, the world has witnessed an exponential growth in the number and variety of information products, sources, and services. This development has resulted in technological innovations for faster and more efficient processing and storage of information, as individuals and organisations strive to keep up with increasing demands. The value of information organisation cannot be overemphasized. The volume of information generated, transmitted and stored is of such immense proportion that without adequate organisation, the retrieval process would be cumbersome and frustrating. This chapter will highlight and describe the roles of an information retrieval system and the context of information organisation in several institutions. It will also discuss the various information retrieval tools and the different models used in information retrieval process. The ultimate goal of this chapter is to enable students, practicing librarians, and others interested in information services to understand the concepts, principles, and tools behind information organisation and retrieval. The conclusion of the chapter will emphasize the need for continuous evaluation of these principles and tools for sustained improvement.


2016 ◽  
Vol 14 (2) ◽  
pp. 28-34 ◽  
Author(s):  
V. Sathiyamoorthi

It is generally observed that over the last two decades, while the average speed of computers has doubled roughly every eighteen months, the average speed of the network has doubled in just eight months. To improve performance, more and more researchers are focusing their research on computers and related technologies. The World Wide Web (WWW) acts as a medium for sharing information. As a result, millions of applications run on the Internet, increasing network traffic and placing great demand on the available network infrastructure. Slow retrieval of Web pages may reduce users' interest in accessing them. Web caching and Web pre-fetching are used to deal with this problem. This paper focuses on a methodology for improving a proxy-based Web caching system using Web mining, integrating Web caching and pre-fetching through an efficient clustering-based pre-fetching technique.
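The clustering-based pre-fetching idea can be sketched as follows: when a page belonging to a cluster is requested, the proxy queues the rest of that cluster for pre-fetch. The clusters and paths below are hand-made for illustration; the paper derives its clusters with Web mining (Modified ART1), not by hand:

```python
# Illustrative clusters of related pages, e.g. mined from access logs.
clusters = [
    {"/news", "/news/sports", "/news/weather"},
    {"/shop", "/shop/cart"},
]

def prefetch_candidates(url, clusters):
    """Return the sibling pages of `url`'s cluster, to be fetched
    into the proxy cache before the user asks for them."""
    for cluster in clusters:
        if url in cluster:
            return sorted(cluster - {url})
    return []

print(prefetch_candidates("/news", clusters))
# ['/news/sports', '/news/weather']
```

Because a single request can pull several objects into the cache at once, the replacement policy must account for pre-fetched objects that have not yet been accessed, which is the integration problem this line of work addresses.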

