Personalized Search Engine Using Binary Tree Traversal (BTT) - A Survey

Author(s):  
Jose Triny K, et al.

Web pages are increasingly used as the user interface of many software systems. The simplicity of interacting with web pages is a key benefit of using them. However, the user interface can also become more complicated when more complex web pages are used to build it. Understanding the complexity of web pages as perceived subjectively by users is therefore crucial to better design such user interfaces. Searching is one of the most common tasks performed on the Internet. Search engines are the essential tools of the web, through which related information can be gathered and retrieved according to the keywords supplied by the user. The amount of information on the web is growing dramatically, so users must spend more time online to find the exact information they are interested in. Existing search engines do not take the specific needs of individual users into account and serve every user in the same way. For an ambiguous query, search engines return documents on many different topics, which makes it difficult for the user to find the required content and adds to the time spent locating relevant material. In this paper, we survey various algorithms for reducing the complexity of web page navigation.
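Since the abstract does not describe the binary tree traversal itself, the following is only a minimal sketch of the general idea, assuming an ordinary binary search tree keyed on query keywords; the `Node` layout and the `insert`/`search` helpers are illustrative assumptions, not the surveyed authors' design.

```python
# Minimal sketch (illustrative only): keywords stored in a binary search
# tree, so a lookup follows at most one root-to-leaf path instead of
# scanning every indexed keyword.
class Node:
    def __init__(self, keyword, pages):
        self.keyword = keyword          # keyword stored at this node
        self.pages = list(pages)        # URLs associated with the keyword
        self.left = None
        self.right = None

def insert(root, keyword, pages):
    """Insert a keyword node, keeping the tree ordered alphabetically."""
    if root is None:
        return Node(keyword, pages)
    if keyword < root.keyword:
        root.left = insert(root.left, keyword, pages)
    elif keyword > root.keyword:
        root.right = insert(root.right, keyword, pages)
    else:
        root.pages.extend(pages)        # merge pages for a repeated keyword
    return root

def search(root, keyword):
    """Binary tree traversal: follow left/right links until the keyword matches."""
    while root is not None:
        if keyword == root.keyword:
            return root.pages
        root = root.left if keyword < root.keyword else root.right
    return []

index = None
for kw, urls in [("python", ["a.example"]), ("java", ["b.example"]), ("search", ["c.example"])]:
    index = insert(index, kw, urls)
print(search(index, "java"))            # ['b.example']
```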

2018 ◽  
Author(s):  
James Grimmelmann

98 Minnesota Law Review 868 (2014)

Academic and regulatory debates about Google are dominated by two opposing theories of what search engines are and how law should treat them. Some describe search engines as passive, neutral conduits for websites’ speech; others describe them as active, opinionated editors: speakers in their own right. The conduit and editor theories give dramatically different policy prescriptions in areas ranging from antitrust to copyright. But they both systematically discount search users’ agency, regarding users merely as passive audiences.

A better theory is that search engines are not primarily conduits or editors, but advisors. They help users achieve their diverse and individualized information goals by sorting through the unimaginable scale and chaos of the Internet. Search users are active listeners, affirmatively seeking out the speech they wish to receive. Search engine law can help them by ensuring two things: access to high-quality search engines, and loyalty from those search engines.

The advisor theory yields fresh insights into long-running disputes about Google. It suggests, for example, a new approach to deciding when Google should be liable for giving a website the “wrong” ranking. Users’ goals are too subjective for there to be an absolute standard of correct and incorrect rankings; different search engines necessarily assess relevance differently. But users are also entitled to complain when a search engine deliberately misleads them about its own relevance assessments. The result is a sensible, workable compromise between the conduit and editor theories.


A web crawler, also called a spider, automatically traverses the WWW for the purpose of web indexing. As the Web grows day by day, the number of web pages worldwide has increased massively. To make search practical for users, search engines are essential: they are the means by which particular data is found on the WWW. It would be almost impossible for a person to find anything on the web without a search engine unless they already knew the exact URL address. Every search engine maintains a central repository of HTML documents in indexed form. Whenever a user submits a query, the search is performed against this database of indexed web pages. The size of each search engine's database depends on the pages existing on the internet, so to increase the efficiency of search engines, only the most relevant and significant pages should be stored in the database.
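The crawl-then-index loop described above can be made concrete with a short sketch. Everything here is an assumption for illustration (the seed URL, the same-host restriction, and plain regular-expression parsing); a real crawler would also honour robots.txt, politeness delays, and duplicate detection.

```python
# Minimal sketch of a breadth-first crawler that builds an inverted index
# (word -> set of URLs), i.e. the central repository that queries run against.
import re
import urllib.request
from collections import defaultdict, deque
from urllib.parse import urljoin, urlparse

def crawl(seed, max_pages=20):
    index = defaultdict(set)
    seen, queue = {seed}, deque([seed])
    while queue and len(seen) <= max_pages:
        url = queue.popleft()
        try:
            html = urllib.request.urlopen(url, timeout=5).read().decode("utf-8", "ignore")
        except Exception:
            continue                        # skip pages that fail to download
        for word in re.findall(r"[a-z0-9]+", html.lower()):
            index[word].add(url)            # record that this page contains the word
        for link in re.findall(r'href="(http[^"]+)"', html):
            link = urljoin(url, link)
            if urlparse(link).netloc == urlparse(seed).netloc and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

# A query is then answered from the indexed copy, not the live web:
# crawl("https://example.com")["example"]  -> set of URLs containing the word
```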


Author(s):  
Vijay Kasi ◽  
Radhika Jain

In the context of the Internet, a search engine can be defined as a software program designed to help one access information, documents, and other content on the World Wide Web. The adoption and growth of the Internet in the last decade has been unprecedented. The World Wide Web has always been applauded for its simplicity and ease of use, which is evident in how little knowledge one requires to build a Web page. The flexible nature of the Internet has enabled its rapid growth and adoption, but has also made it hard to search for relevant information on the Web. The number of Web pages has been increasing at an astronomical pace, from around 2 million registered domains in 1995 to 233 million registered domains in 2004 (Consortium, 2004).

The Internet, considered a distributed database of information, has the CRUD (create, retrieve, update, and delete) rule applied to it. While the Internet has been effective at creating, updating, and deleting content, it has considerably lacked in enabling the retrieval of relevant information. After all, there is no point in having a Web page that has little or no visibility on the Web. Since the 1990s, when the first search program was released, we have come a long way in terms of searching for information. Although we are currently witnessing a tremendous growth in search engine technology, the growth of the Internet has overtaken it, leading to a state in which the existing search engine technology is falling short. When we apply the metrics of relevance, rigor, efficiency, and effectiveness to the search domain, it becomes very clear that we have progressed on the rigor and efficiency metrics by utilizing abundant computing power to produce faster searches with a lot of information. Rigor and efficiency are evident in the large number of pages indexed by the leading search engines (Barroso, Dean, & Holzle, 2003). However, more research needs to be done to address the relevance and effectiveness metrics. Users typically type in two to three keywords when searching, only to end up with a search result having thousands of Web pages, which has made it increasingly hard to effectively find any useful, relevant information.

Search engines today face a number of challenges requiring them to perform rigorous searches with relevant results efficiently so that they are effective. These challenges include the following (“Search Engines,” 2004):

1. The Web is growing at a much faster rate than any present search engine technology can index.
2. Web pages are updated frequently, forcing search engines to revisit them periodically (see the revisit-scheduling sketch after this abstract).
3. Dynamically generated Web sites may be slow or difficult to index, or may result in excessive results from a single Web site.
4. Many dynamically generated Web sites are not able to be indexed by search engines.
5. The commercial interests of a search engine can interfere with the order of relevant results the search engine shows.
6. Content that is behind a firewall or that is password protected is not accessible to search engines (such as content found in several digital libraries).
7. Some Web sites have started using tricks such as spamdexing and cloaking to manipulate search engines into displaying them as the top results for a set of keywords. This can pollute the search results, with more relevant links being pushed down the result list; it is a consequence of the popularity of Web searches and the business potential search engines can generate today.
8. Search engines index all the content of the Web without any bounds on the sensitivity of information, which has raised a few security and privacy flags.

With the above background and challenges in mind, we lay out the article as follows. In the next section, we begin with a discussion of search engine evolution. To facilitate the examination and discussion of the progress of search engine development, we break this discussion down into the three generations of search engines. Figure 1 depicts this evolution pictorially and highlights the need for better search engine technologies. Next, we present a brief discussion of the contemporary state of search engine technology and the various types of content searches available today. With this background, the following section documents various concerns about existing search engines, setting the stage for better search engine technology; these concerns include information overload, relevance, representation, and categorization. Finally, we briefly address the research efforts under way to alleviate these concerns and then present our conclusion.
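Challenge 2 above (frequent page updates forcing periodic revisits) is commonly handled with an adaptive revisit policy. The sketch below is only an illustration under stated assumptions (a per-page change rate estimated from past crawls and a simple inverse-proportional interval); it is not any particular engine's scheduler.

```python
# Minimal sketch: revisit pages that change often more frequently.
import time

def next_visit(last_visit, changes_per_day, base_interval_days=7.0):
    """Shrink the revisit interval as the page's observed change rate grows."""
    interval_days = base_interval_days / (1.0 + changes_per_day)
    return last_visit + interval_days * 86400          # 86400 seconds per day

now = time.time()
print((next_visit(now, changes_per_day=0.0) - now) / 86400)   # ~7 days for a static page
print((next_visit(now, changes_per_day=6.0) - now) / 86400)   # ~1 day for a fast-changing page
```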


2014 ◽  
Vol 2 (2) ◽  
pp. 103-112 ◽  
Author(s):  
Taposh Kumar Neogy ◽  
Harish Paruchuri

The essence of a web page is an inherently subjective matter, one shaped by behaviors, interests, and intelligence. There are many reasons web pages are critical to the modern world, and the point can hardly be overemphasized. The meteoric growth of the internet is one of the most potent factors making it hard for search engines to provide actionable results. Search engines store web pages in classified directories; to organize these pages, some engines rely on the expertise of real people, while most are populated and classified by automated means, although the human factor remains dominant in their success. From experimental results, we can deduce that the most effective way to automate web page classification for search engines is to integrate machine learning.
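The abstract does not name a specific model, so the following is only a minimal sketch of the kind of machine-learning integration it argues for, assuming a standard TF-IDF plus Naive Bayes text classifier and invented toy pages and directory labels.

```python
# Minimal sketch: classify web pages into directory categories with a
# standard text-classification pipeline (toy data, illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

pages = [
    "live scores and football transfer news",             # sports
    "stock markets rally as earnings beat forecasts",      # finance
    "new graphics card benchmarks and driver updates",     # technology
]
labels = ["sports", "finance", "technology"]

# TF-IDF turns each page into a weighted term vector; Naive Bayes then
# learns which terms are typical of each directory category.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(pages, labels)

print(model.predict(["quarterly earnings and bond yields"]))
# Expected (given the shared term "earnings"): ['finance']
```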


Cephalalgia ◽  
2001 ◽  
Vol 21 (1) ◽  
pp. 20-24 ◽  
Author(s):  
SJ Peroutka

The Internet is capable of providing an unprecedented amount of information to both physicians and patients interested in headache. To assess the status of headache information on the Internet (as of January 2000), a search for ‘headache’ was performed using 10 leading Internet search engines. The number of web pages identified ranged from 4419 (WebCrawler) to 506 426 (Northern Light). The ‘average’ search yielded nearly 150 000 web page listings for ‘headache’. The content of the top 10 listed web pages for each search was then reviewed (i.e. a total of 100 page listings). The results demonstrate that, at the present time, Internet-based information on headache is extensive but poorly organized. Editorial review of this potentially valuable resource is required in order to maximize its utility in headache education and management.


Author(s):  
Rung Ching Chen ◽  
Ming Yung Tsai ◽  
Chung Hsun Hsieh

In recent years, due to the fast growth of the Internet, the services and information it provides are constantly expanding. Madria and Bhowmick (1999) and Baeza-Yates (2003) indicated that most large search engines need to handle, on average, at least millions of hits daily in order to satisfy users' needs for information. Each search engine has its own ranking policy and keyword format for query terms, but some critical problems remain: a search may return either too much or too little information. In the former case, the user is buried in results and, needing only a little information, typically selects only the first few items from the large amount returned. In the latter case, the user must re-query with different keywords, and the re-query in turn retrieves a great amount of information, much of it useless. This is a vicious cycle of information retrieval. Similarity Web page retrieval can help users avoid browsing useless information: the user indicates a Web page, and that page is then compared with the other Web pages in the search engine's results. Similarity Web page retrieval saves users time by not browsing unrelated Web pages, rejects non-similar Web pages, ranks Web pages in order of similarity, and clusters similar Web pages into the same class.
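The chapter's own similarity measure is not given in the abstract, so the following is only a minimal sketch of similarity Web page retrieval under common assumptions: TF-IDF term vectors, cosine similarity against the one page the user indicates, and placeholder result snippets.

```python
# Minimal sketch: rank returned pages by cosine similarity to an
# indicated reference page and drop clearly unrelated ones.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

reference = "installing python packages with pip and virtual environments"
results = [
    "a guide to pip, virtualenv and python packaging",
    "best hiking trails and camping gear reviews",
    "managing python dependencies in production deployments",
]

vectors = TfidfVectorizer().fit_transform([reference] + results)
scores = cosine_similarity(vectors[0], vectors[1:])[0]

# Similar pages float to the top; pages below a small threshold are
# rejected so the user never has to browse them.
for score, page in sorted(zip(scores, results), reverse=True):
    if score > 0.05:
        print(f"{score:.2f}  {page}")
```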


2019 ◽  
Vol 1 (2) ◽  
Author(s):  
Yu Hou ◽  
Lixin Tao

As the tsunami of data has emerged, search engines have become the most powerful tool for obtaining scattered information on the internet. Traditional search engines return organized results by using ranking algorithms such as term frequency and link analysis (the PageRank and HITS algorithms). However, these algorithms must combine keyword frequency to determine the relevance between a user's query and the data on a computer system or the internet. We would rather expect search engines to understand a user's search by the meaning of its content instead of its literal strings. The Semantic Web is an intelligent network that can understand human language more semantically and make communication between humans and computers easier. But current semantic search technology is hard to apply, because metadata must be annotated to each web page before the search engine can understand the user's intent, and annotating every web page is very time-consuming and inefficient. This study therefore designed an ontology-based approach to improve traditional keyword-based search and emulate the effects of semantic search, letting the search engine understand users more semantically once it is given that knowledge.
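The study's actual ontology is not described in the abstract, so the sketch below only illustrates the general idea of ontology-based query expansion with a hand-written toy ontology; the concept names and relations are assumptions, not the authors' knowledge base.

```python
# Minimal sketch: expand a keyword query with synonyms and narrower
# concepts from a (toy) ontology before handing it to a keyword engine,
# so matching is closer to meaning than to literal strings.
TOY_ONTOLOGY = {
    "car": {"synonyms": ["automobile", "vehicle"], "narrower": ["sedan", "suv"]},
    "doctor": {"synonyms": ["physician"], "narrower": ["cardiologist", "surgeon"]},
}

def expand_query(query):
    terms = []
    for word in query.lower().split():
        terms.append(word)
        concept = TOY_ONTOLOGY.get(word, {})
        terms += concept.get("synonyms", [])   # same meaning, different wording
        terms += concept.get("narrower", [])   # more specific concepts
    return terms

print(expand_query("car repair"))
# ['car', 'automobile', 'vehicle', 'sedan', 'suv', 'repair']
```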


Author(s):  
Novario Jaya Perdana

The accuracy of a search engine's results depends on the keywords that are used; a lack of information in the keywords can reduce the accuracy of the results, which makes searching for information on the internet hard work. In this research, software has been built to create document keyword sequences. The software uses Google Latent Semantic Distance, which can extract relevant information from a document. The information is expressed in the form of specific word sequences that can be used as keyword recommendations in search engines. The results show that the implementation of the method for creating document keyword recommendations achieved high accuracy and found the most relevant information in the top search results.
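The abstract does not define Google Latent Semantic Distance itself, so the sketch below instead uses the related, published normalized Google distance (Cilibrasi and Vitányi), which also scores term relatedness from search hit counts; the hit counts and the index size `N` are hypothetical.

```python
# Minimal sketch: score how related two candidate keywords are from
# single and joint hit counts (smaller distance = more related).
import math

N = 5e10   # assumed number of indexed pages (hypothetical)

def ngd(fx, fy, fxy):
    """Normalized Google distance from hit counts f(x), f(y), f(x, y)."""
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(N) - min(lx, ly))

# Hypothetical counts for two keyword pairs extracted from a document:
print(ngd(fx=46_700_000, fy=12_200_000, fxy=2_630_000))   # ~0.35, closely related
print(ngd(fx=46_700_000, fy=15_000_000, fxy=150_000))     # ~0.71, weakly related
```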


2013 ◽  
Vol 25 ◽  
pp. 189-203 ◽  
Author(s):  
Dominik Schlosser

This paper attempts to give an overview of the different representations of the pilgrimage to Mecca found in the ‘liminal space’ of the internet. For that purpose, it examines a handful of emblematic examples of how the hajj is being presented and discussed in cyberspace. Special attention is paid to the question of how far issues of religious authority are manifest on these websites: whether the content providers of web pages appoint themselves as authorities by scrutinizing established views of the fifth pillar of Islam, whether they upload already printed texts onto their sites in order to reiterate normative notions of the pilgrimage to Mecca, or whether they make use of search engine optimisation techniques, thus heightening the visibility of their online presence and increasing the possibility of becoming authoritative in shaping internet surfers’ perceptions of the hajj.

