Design of the Distributed Web Crawler

2011 ◽  
Vol 204-210 ◽  
pp. 1454-1458
Author(s):  
Xing Chen ◽  
Wei Jiang Li ◽  
Tie Jun Zhao ◽  
Xing Hai Piao

At the current scale of the Internet, a single web crawler cannot visit the entire web within a useful time frame, so we develop a distributed web crawler system. Our design considers two facets of parallelism: multi-threading within each node, and distribution across nodes. We focus on distribution and parallelism between nodes. We address two issues of the distributed web crawler: the crawl strategy and dynamic configuration. The experimental results show that a hash function based on the web site achieves the goal of the distributed web crawler. While pursuing load balance across the system, we also aim to reduce communication and management overhead as much as possible.
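
The abstract does not give the hash function itself. As a rough illustration only, a site-based assignment could look like the sketch below, which is an assumption and not the authors' implementation: the function name `node_for_url`, the choice of SHA-256, and the modulo mapping are all hypothetical. Hashing the host name rather than the full URL keeps every page of one site on the same node, which is one way to limit communication between nodes.

```python
import hashlib
from urllib.parse import urlsplit


def node_for_url(url: str, num_nodes: int) -> int:
    """Map a URL to a crawler node index based on its web site (host name).

    Hypothetical helper: the name, the SHA-256 digest, and the modulo
    step are illustrative assumptions, not taken from the paper.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    # urlsplit(...).hostname is already lower-cased; it is None when the
    # URL has no network location, so fall back to an empty string.
    host = urlsplit(url).hostname or ""
    # Use a stable digest rather than the built-in hash(), which is
    # randomized per process for strings.
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


if __name__ == "__main__":
    for u in ("http://example.com/a", "http://example.com/b?page=2"):
        print(u, "->", node_for_url(u, 4))
```

Both URLs in the usage lines share a host, so they are assigned to the same node. A plain modulo mapping reassigns most sites whenever the number of nodes changes, so a system with dynamic configuration would likely need something more careful; the abstract does not say how the authors handle this.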

Author(s):  
Petar Halachev ◽  
Victoria Radeva ◽  
Albena Nikiforova ◽  
Miglena Veneva

This report is dedicated to the role of the web site as an important tool for presenting a business on the Internet. Site types are classified by their application in business and by the types of structure used in their construction. The life cycle models for designing business websites are analyzed, and their strengths and weaknesses are outlined. The stages in the design, construction, commissioning, and maintenance of a business website are distinguished, and the activities and requirements of each stage are specified.


1997 ◽  
Vol 11 (1) ◽  
pp. 21-27 ◽  
Author(s):  
M.P.S.F. Gomes ◽  
J.H. Vaux ◽  
J-N. Ezingeard ◽  
R.J. Grieve ◽  
P. Race ◽  
...  

The authors discuss issues relating to the feasibility of a Web-based database for facilitating communications between university researchers and industry. They have constructed an experimental Web-based Technology Bank that provides examples of university research which might be of interest to manufacturing companies. They are using this database as a focus of discussion on the usefulness of electronic communications for technology dissemination. The portfolio of research products, and the Web site on which it is housed, are currently being presented in a series of workshops for senior executives in small and medium sized manufacturing companies. Views are also being gathered from technology intermediaries. Analysis of the data so far has highlighted potential problems in disseminating information on the Internet and has also enabled the authors to identify and understand users' profiles.


Author(s):  
Lauren Rosewarne

Despite the widespread embrace of the Internet and the second nature way we each turn to Google for information, to social media to see our friends, to netporn and Netflix for recreation, film and television tells a very different story. On screen, a character dating online, gaming online or shopping online, invariably serves as a clue that they’re somewhat troubled: they may be a socially excluded nerd at one end of the spectrum, through to being a paedophile or homicidal maniac seeking prey at the other. On screen, the Internet is frequently presented as a clue, a risk factor and a rationale for a character’s deviance or danger. While the Internet has come to play a significant role in screen narratives, an undercurrent of many depictions – in varying degrees of fervour – is that the Web is complicated, elusive and potentially even hazardous. This paper draws from research conducted for my book Cyberbullies, Cyberactivists, Cyberpredators: Film, TV, and Internet Stereotypes (Rosewarne, 2016). While that volume provided an analysis of the denizens of the Internet through the examination of over 500 film and television examples – profiling screen stereotypes such as netgeeks, neckbeards, and netaddicts – this paper focuses on some of the recurring themes in portrayals of the Internet, shedding light on the how, and perhaps most importantly why, the fear of the technology is so common. This paper presents a series of themes used to frame the Internet as negative on screen including dehumanisation, the Internet as a badlands, the Web as possessing inherent vulnerabilities and the cyberbogeyman.


2010 ◽  
pp. 2298-2309
Author(s):  
Justin Meza ◽  
Qin Zhu

Knowledge is the fact of knowing something from experience or via association. Knowledge organization is the systematic management and organization of knowledge (Hodge, 2000). With the advent of Web 2.0, mashups have become a hot new thing on the Web. A mashup is a Web site or a Web application that combines content from more than one source and delivers it in an integrated way (Fichter, 2006). In this article, we will first explore the concept of mashups and look at the components of a mashup. We will provide an overview of various mashups on the Internet. We will look at the literature about knowledge and knowledge organization. Then, we will elaborate on our experiment with a mashup in an enterprise environment. We will describe how we mixed the content from two sets of sources and created a new source: a novel way of organizing and displaying HP Labs Technical Reports. The findings from our project will be included and some best practices for creating enterprise mashups will be given. The future of enterprise mashups will be discussed as well.


2030 ◽  
2010 ◽  
Author(s):  
Rutger van Santen ◽  
Djan Khoe ◽  
Bram Vermeer

Our lives seem to revolve around schedules. If we don’t honor them with second-to-second precision, we miss our trains and our workplace rosters fall apart. We’re reliant on one another, and we constantly have to coordinate our schedules with those of others. Planning is crucial to our industry, too. If you unexpectedly run out of nuts and bolts, you can’t make any more cars, and the entire production process grinds to a halt. No manufacturer can afford that, so industrial companies employ large teams of specialists whose job is to ensure there are never any shortages of key parts. A worldwide logistic network has become our industry’s lifeblood. The central issue facing logistics is that of reliability. How do you keep your supply network intact? And how do you limit the consequences if it fails? These are questions that go far beyond the supply of nuts and bolts for new cars. Reliable logistics touches equally on the web of interactions that determine food production and the optimization of the Internet. It also extends to power supply, telecommunications, and workforce. Reliable networks make our society tick. But they face uncertainties of various kinds. That lends a broader significance to insights gained from industrial logistics, which offer us tools we can use to optimize networks and account for uncertainties in other areas as well. The reliability of a supply network is intimately bound up with the inventories you need to maintain. Businesses hold millions of dollars’ worth of supplies in their warehouses to make absolutely certain they never cease production due to a failure in the supply chain. So the key question is how large a stock do you need to hold of each component? Smart planning to hold down inventory levels in your warehouse generates immediate savings. On the other hand, you need enough stock to ensure continuity should anything go wrong. Optimizing storage is a common problem in supply networks. 
There is always a trade-off between the reliability of the network and the need for it to be profitable in an economic sense.


Author(s):  
S. Park

Based on the weekly data of listings and Web site usage of eBay and Yahoo!Auctions, as well as fee schedules and available auction mechanisms, this chapter provides empirical support of the network effect in Internet auctions: A seller’s expected auction revenue increases with page views per listing on one hand and increased listings raise page views per listing on the other hand. The existence of the network effect between Web site usage and listings explains the first mover’s advantage and the dominance of eBay even with higher fees in the Internet auctions market. Our empirical findings also highlight unique features of Internet auctions, especially in the entry behavior of potential bidders into specific auctions, inviting more theoretical studies of the market microstructure of Internet auctions.


Author(s):  
Shaoyi He

The World Wide Web (the Web), a distributed hypermedia information system that provides global access to the Internet, has been most widely used for exchanging information, providing services, and doing business across national boundaries. It is difficult to find out exactly when the first multilingual Web site was up and running on the Internet, but as early as January 1, 1993, EuroNews, the first multilingual Web site in Europe, was launched to simultaneously cover world news from a European perspective in seven languages: English, French, German, Italian, Portuguese, Russian, and Spanish (EuroNews, 2005). In North America, Web site multilinguality has become an important aspect of electronic commerce (e-commerce) as more and more Fortune 500 companies rely on the Internet and the Web to reach out to millions of customers and clients. Having a successful multilingual Web site goes beyond just translating the original Web content into different languages for different locales. Besides the language issue, there are other important issues involved in Web site multilinguality: culture, technology, content, design, accessibility, usability, and management (Bingi, Mir, & Khamalah, 2000; Dempsey, 1999; Hillier, 2003; Lindenberg, 2003; MacLeod, 2000). This article will briefly address the issues related to: (1) language, which is one of the many elements that make up culture, (2) culture, which greatly affects the functionality and communication of multilingual Web sites, and (3) technology, which enables the multilingual support of e-commerce Web sites, focusing on the challenges and strategies of Web site multilinguality in global e-commerce.


2018 ◽  
Vol 7 (2.7) ◽  
pp. 320
Author(s):  
Dr JKR Sastry ◽  
N Sreenidhi ◽  
K Sasidhar

Information dissemination these days relies heavily on web sites hosted on the Internet. The effectiveness and efficiency of a web site's design greatly affect how the content hosted on it can be accessed. The quality of a web site plays a vital role in making the required information available to the end user with ease, satisfying the user's content requirements. A framework comprising 42 quality metrics has been proposed for measuring the quality of a web site. However, its computational procedures have not been stated in realistic terms. This paper presents computational procedures for measuring the “usability” of a web site, which can be included in the overall computation of the quality of a web site.


2007 ◽  
Vol 16 (05) ◽  
pp. 793-828 ◽  
Author(s):  
JUAN D. VELÁSQUEZ ◽  
VASILE PALADE

Understanding the web user browsing behaviour in order to adapt a web site to the needs of a particular user represents a key issue for many commercial companies that do their business over the Internet. This paper presents the implementation of a Knowledge Base (KB) for building web-based computerized recommender systems. The Knowledge Base consists of a Pattern Repository that contains patterns extracted from web logs and web pages, by applying various web mining tools, and a Rule Repository containing rules that describe the use of discovered patterns for building navigation or web site modification recommendations. The paper also focuses on testing the effectiveness of the proposed online and offline recommendations. An ample real-world experiment is carried out on a web site of a bank.


First Monday ◽  
2022 ◽  
Author(s):  
Rashika Tasnim Keya ◽  
Pietro Murano

In this paper a novel and significant study into the usability of carousel interaction in the context of desktop interaction is presented. Two equivalent prototypes in an e-commerce context were developed. One version had a carousel and the other version did not have a carousel. These were then compared with each other in an empirical experiment with 40 participants. The data collected were statistically analysed and overall results showed that in terms of performance the Web site version without carousel outperformed the version with carousel. Furthermore, the subjective preferences of the participants were strongly in favour of the without carousel version of the site. The results of this study make an important contribution to knowledge suggesting that in many cases implementing a carousel is not the best design decision. The results of this paper are particularly significant in relation to desktop versioned Web sites and goal-driven tasks. Serendipitous-type tasks and mobile versioned web sites used on mobile devices with touch screens were not part of the scope of this work.

