A Simulation of the Structure of the World-Wide Web

2002 ◽  
Vol 7 (1) ◽  
pp. 9-25 ◽  
Author(s):  
Moses Boudourides ◽  
Gerasimos Antypas

In this paper we present a simple simulation of the World-Wide Web, in which one observes the appearance of web pages belonging to different web sites, covering a number of different thematic topics and possessing links to other web pages. The goal of our simulation is to reproduce the form of the observed World-Wide Web and of its growth using a small number of simple assumptions. In our simulation, existing web pages may generate new ones as follows: first, each web page is equipped with a topic describing its contents; second, links between web pages are established according to common topics; finally, new web pages may be randomly generated and subsequently equipped with a topic and assigned to web sites. By repeated iteration of these rules, our simulation appears to exhibit the observed structure of the World-Wide Web and, in particular, a power-law type of growth. In order to visualise the network of web pages, we have followed N. Gilbert's (1997) methodology of scientometric simulation, assuming that web pages can be represented by points in the plane. Furthermore, the simulated graph is found to possess the small-world property, as is the case with a large number of other complex networks.
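The abstract describes the model only in outline, so the following is a minimal sketch of a topic-driven growth process in its spirit: each new page receives a random topic and links to existing pages sharing that topic. The parameters (NUM_TOPICS, STEPS, LINKS_PER_PAGE) and the in-degree-weighted attachment rule are illustrative assumptions; preferential attachment is one common way such models produce a power-law degree distribution, though the paper's exact rule may differ.

```python
import random
from collections import Counter, defaultdict

NUM_TOPICS = 10       # illustrative parameter choices, not the paper's
STEPS = 5000
LINKS_PER_PAGE = 3

pages = []                     # pages[i] is the topic of page i
by_topic = defaultdict(list)   # topic -> ids of pages with that topic
in_degree = defaultdict(int)   # page id -> number of incoming links

for _ in range(STEPS):
    page_id = len(pages)
    topic = random.randrange(NUM_TOPICS)   # each new page gets a topic
    pages.append(topic)
    # Link the new page to existing pages sharing its topic, weighting
    # candidates by in-degree (an assumed preferential-attachment rule).
    candidates = by_topic[topic]
    if candidates:
        weights = [in_degree[p] + 1 for p in candidates]
        for target in random.choices(candidates, weights=weights,
                                     k=min(LINKS_PER_PAGE, len(candidates))):
            in_degree[target] += 1
    by_topic[topic].append(page_id)

# Inspect the in-degree distribution; under preferential attachment the
# counts should fall off roughly as a power law.
histogram = Counter(in_degree.values())
for degree in sorted(histogram)[:10]:
    print(degree, histogram[degree])
```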

Author(s):  
Vijay Kasi ◽  
Radhika Jain

In the context of the Internet, a search engine can be defined as a software program designed to help one access information, documents, and other content on the World Wide Web. The adoption and growth of the Internet in the last decade has been unprecedented. The World Wide Web has always been applauded for its simplicity and ease of use, evident in how little knowledge one requires to build a Web page. The flexible nature of the Internet has enabled its rapid growth and adoption, but it has also made it hard to search for relevant information on the Web. The number of Web pages has been increasing at an astronomical pace, from around 2 million registered domains in 1995 to 233 million registered domains in 2004 (Consortium, 2004). The Internet, considered a distributed database of information, has the CRUD (create, retrieve, update, and delete) rule applied to it. While the Internet has been effective at creating, updating, and deleting content, it has lagged considerably in enabling the retrieval of relevant information. After all, there is no point in having a Web page that has little or no visibility on the Web. Since the 1990s, when the first search program was released, we have come a long way in terms of searching for information. Although we are currently witnessing tremendous growth in search engine technology, the growth of the Internet has overtaken it, leaving existing search engine technology falling short. When we apply the metrics of relevance, rigor, efficiency, and effectiveness to the search domain, it becomes clear that we have progressed on the rigor and efficiency metrics by utilizing abundant computing power to produce faster searches over large amounts of information. Rigor and efficiency are evident in the large number of pages indexed by the leading search engines (Barroso, Dean, & Holzle, 2003). However, more research needs to be done to address the relevance and effectiveness metrics. Users typically type in two to three keywords when searching, only to end up with a search result of thousands of Web pages, which makes it increasingly hard to find useful, relevant information. Search engines today face a number of challenges that require them to perform rigorous searches with relevant results efficiently so that they are effective. These challenges include the following (“Search Engines,” 2004):

1. The Web is growing at a much faster rate than any present search engine technology can index.
2. Web pages are updated frequently, forcing search engines to revisit them periodically.
3. Dynamically generated Web sites may be slow or difficult to index, or may result in excessive results from a single Web site.
4. Many dynamically generated Web sites cannot be indexed by search engines at all.
5. The commercial interests of a search engine can interfere with the order of relevant results it shows.
6. Content that is behind a firewall or that is password protected (such as that found in several digital libraries) is not accessible to search engines.
7. Some Web sites have started using tricks such as spamdexing and cloaking to manipulate search engines into displaying them as the top results for a set of keywords. This pollutes the search results, with more relevant links being pushed down the result list, and is a consequence of the popularity of Web searches and the business potential search engines can generate today.
8. Search engines index all the content of the Web without any bounds on the sensitivity of information, which has raised security and privacy flags.

With the above background and challenges in mind, we lay out the article as follows. In the next section, we begin with a discussion of search engine evolution. To facilitate the examination and discussion of search engine development's progress, we break this discussion down into the three generations of search engines. Figure 1 depicts this evolution pictorially and highlights the need for better search engine technologies. Next, we present a brief discussion of the contemporary state of search engine technology and the various types of content searches available today. With this background, the following section documents various concerns about existing search engines, setting the stage for better search engine technology. These concerns include information overload, relevance, representation, and categorization. Finally, we briefly address the research efforts under way to alleviate these concerns and then present our conclusion.
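The retrieval half of the CRUD analogy above is what an index makes tractable. The following is only a minimal sketch of the inverted index underlying keyword search; the sample documents and the ranking rule (count of matched query terms) are illustrative assumptions, far simpler than anything a production search engine uses.

```python
from collections import defaultdict

# Illustrative toy corpus; real engines index billions of pages.
documents = {
    1: "search engines index pages on the world wide web",
    2: "dynamically generated web sites are difficult to index",
    3: "users type two or three keywords when searching the web",
}

index = defaultdict(set)          # term -> set of document ids
for doc_id, text in documents.items():
    for term in text.split():
        index[term].add(doc_id)

def search(query):
    """Rank documents by how many query terms they contain."""
    scores = defaultdict(int)
    for term in query.split():
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    return sorted(scores, key=scores.get, reverse=True)

print(search("index web"))        # documents ordered by matched terms
```

Even this toy version shows where the relevance problem comes from: two or three query terms match many documents, and everything then hinges on the ranking function.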


2020 ◽  
Vol 18 (06) ◽  
pp. 1119-1125 ◽  
Author(s):  
Kessia Nepomuceno ◽  
Thyago Nepomuceno ◽  
Djamel Sadok

1997 ◽  
Vol 3 (5) ◽  
pp. 276-280
Author(s):  
Nicholas P. Poolos

There has been an explosion in the number of World Wide Web sites on the Internet dedicated to neuroscience. With a little direction, it is possible to navigate around the Web and find databases containing information indispensable to both basic and clinical neuroscientists. This article reviews some Web sites of particular interest.


10.28945/2556 ◽  
2002 ◽  
Author(s):  
Sanjeev Phukan

Issues of IT ethics have recently become immensely more complex. The capacity to place material on the World Wide Web has been acquired by a very large number of people. As evolving software has gently hidden the complexities and frustrations involved in writing HTML, more and more web sites are being created by people with relatively modest computer literacy. At the same time, once the initial reluctance to use the Internet and the World Wide Web for commercial purposes had been overcome, sites devoted to doing business on the Internet mushroomed, and e-commerce became a permanent part of common usage. The assimilation of new technology is almost never smooth. As the Internet begins to grow out of its abbreviated infancy, a multitude of new issues surface continually, and a large proportion of them remain unresolved. Many of these issues have strong ethical content. As the ability to reach millions of people instantly and simultaneously has passed into the hands of the average person, the rapid emergence of thorny ethical issues is likely to continue unabated.


Author(s):  
Kai-Hsiang Yang

This chapter addresses Uniform Resource Locator (URL) correction techniques in proxy servers. Proxy servers are increasingly important on the World Wide Web (WWW): they provide Web page caches that speed up browsing and reduce unnecessary network traffic. Traditional proxy servers use the URL to identify entries in their cache, and a request whose URL is absent from the cache results in a cache miss. For most users, however, browsing follows certain regularities and stays within a limited scope. It would be very convenient if users did not need to enter a whole long URL, or could still see the Web content even after forgetting part of the URL, especially for personal favorite Web sites. We introduce a URL correction mechanism into the personal proxy server to achieve this goal.
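The abstract does not reproduce the chapter's mechanism, so the following is only a minimal sketch of one plausible approach: on a cache miss, compare the requested URL against cached URLs with a string-similarity measure and serve the closest match above a threshold. The cache contents, the use of difflib, and the 0.8 cutoff are all illustrative assumptions.

```python
import difflib

# Assumed cache representation: URL -> cached response body.
cache = {
    "http://example.org/research/index.html": "<html>...</html>",
    "http://example.org/teaching/notes.html": "<html>...</html>",
}

def lookup(url, threshold=0.8):
    """Return cached content for url, correcting near-miss URLs."""
    if url in cache:                      # exact hit
        return cache[url]
    # On a miss, look for the most similar cached URL; difflib's
    # similarity ratio stands in for whatever measure the chapter uses.
    matches = difflib.get_close_matches(url, cache, n=1, cutoff=threshold)
    if matches:
        return cache[matches[0]]          # serve the corrected URL's content
    return None                           # genuine miss: fetch from origin

# A user who forgets the file extension still gets the cached page:
print(lookup("http://example.org/research/index") is not None)  # True
```

The threshold trades convenience against the risk of serving the wrong page; a personal proxy, whose cache reflects one user's regular browsing, can afford a looser cutoff than a shared one.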


Author(s):  
Carmine Sellitto

This chapter provides an overview of some of the criteria that are currently being used to assess medical information found on the World Wide Web (WWW). Drawing from the evaluation frameworks discussed, a simple set of easy-to-apply criteria is proposed for evaluating on-line medical information. The criteria cover the categories of information accuracy, objectivity, privacy, currency, and authority. A checklist for web page assessment and scoring is also proposed, providing an easy-to-use tool for medical professionals, health consumers, and medical web editors.
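The chapter's actual checklist and weights are not reproduced in the abstract, so the following is only a minimal sketch of how such a criteria-based score might be computed. The five category names come from the abstract; the yes/no questions and the equal weighting are illustrative assumptions.

```python
# Hypothetical checklist keyed by the abstract's five criteria.
CHECKLIST = {
    "accuracy":    "Is the information consistent with peer-reviewed sources?",
    "objectivity": "Is the content free of advertising or sponsor influence?",
    "privacy":     "Does the site state how visitor data is handled?",
    "currency":    "Has the page been updated within a stated time frame?",
    "authority":   "Are the authors and their credentials identified?",
}

def score_page(answers):
    """answers maps each criterion to True/False; returns a 0-100 score."""
    met = sum(1 for criterion in CHECKLIST if answers.get(criterion))
    return round(100 * met / len(CHECKLIST))

example = {"accuracy": True, "objectivity": True, "privacy": False,
           "currency": True, "authority": True}
print(score_page(example))   # 80
```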


Author(s):  
Américo Sampaio

The growth of the Internet and the World Wide Web has contributed to significant changes in many areas of our society. The Web has provided new ways of doing business, and many companies have been offering new services as well as migrating their systems to the Web. The main goal of the first Web sites was to facilitate the sharing of information between computers around the world. These Web sites were mainly composed of simple hypertext documents containing information in text format and links to other documents that could be spread all over the world. The first users of this new technology were university researchers interested in an easier way of publishing their work and in searching for interesting research from other universities.


Author(s):  
Samantha Bax

In recent times, “portal technologies” have become a catchphrase within information technology circles. The concept of the “portal” (more commonly termed the Internet portal) was initially used to refer to Web sites that presented the user with the ability to access rich content, resources, and services on the World Wide Web (Kakumanu & Mezzacca, 2005; Smith, 2004; White, 2000). As such, the Internet portal provides its users with a one-stop entry point to the resources of the World Wide Web.


1998 ◽  
Vol 112 (9) ◽  
pp. 854-859 ◽  
Author(s):  
Ahmed A. Saada

Advances in telecommunications technology in the last decade have fostered the development of computer networks that allow access to vast amounts of information and services. The most prominent is the Internet (Glowniak, 1995). Medical information is increasingly available on such computer networks. The purpose of the present article is to provide an update to previously published otolaryngology sites (Johns, 1996; Burton and Johns, 1996) available on the World Wide Web, and to provide the otolaryngologist with details of resources that are accessible via the Internet. However, the reader should also be aware that the uniform resource locator (URL) addresses of Web sites can change without warning.


Author(s):  
Brian M. Katt ◽  
Ludovico Lucenti ◽  
Nailah F. Mubin ◽  
Michael Nakashian ◽  
Daniel Fletcher ◽  
...  

Introduction: The use of the internet for health-related information continues to increase. Because of its decentralized structure, information contained within the World Wide Web is not regulated. The purpose of the present study is to evaluate the type and quality of information on the internet regarding Kienböck’s disease. We hypothesized that the information available on the World Wide Web would be of good informational value.

Materials and Methods: The search phrase “Kienböck’s disease” was entered into the five most commonly used internet search engines. The top 49 nonsponsored Web sites identified by each search engine were collected. Each unique Web site was evaluated for authorship and content, and an informational score ranging from 0 to 100 points was assigned. Each site was reviewed by two fellowship-trained hand surgeons.

Results: The mean informational score for the sites was 45.5 out of a maximum of 100 points. Thirty-one (63%) of the Web sites evaluated were authored by an academic institution or a physician. Twelve (24%) of the sites were commercial sites or sold commercial products. The remaining six Web sites (12%) were noninformational, provided unconventional information, or had lay authorship. The average informational score of the academic- or physician-authored Web sites was 54 out of 100 points, compared with 38 out of 100 for the remainder of the sites. This difference was statistically significant.

Conclusion: While the majority of the Web sites evaluated were authored by academic institutions or physicians, the information contained within them is of limited completeness. Roughly one quarter of the Web sites were commercial in nature. There remains significant room for improvement in the completeness of information available for common hand conditions on the internet.
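The abstract reports a comparison of group means with a significance test but does not name the test or publish the raw scores. The following is only a minimal sketch of how such a comparison might be run; the score lists are hypothetical placeholders and Welch's t-test is an assumed choice of test.

```python
from statistics import mean
from scipy.stats import ttest_ind

academic_scores = [54, 61, 48, 57, 50]   # hypothetical 0-100 scores
other_scores = [38, 35, 42, 33, 40]      # hypothetical 0-100 scores

print(f"academic mean: {mean(academic_scores):.1f}")
print(f"other mean:    {mean(other_scores):.1f}")

# Welch's t-test (no equal-variance assumption); the abstract does not
# say which test the authors actually used.
t_stat, p_value = ttest_ind(academic_scores, other_scores, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```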

