Enhanced Utility in Search Log Publishing of Infrequent Items

Author(s):  
S. Belinsha ◽  
A.P.V. Raghavendra

Search engines are widely used by web users, and search engine companies strive to produce the best search results. Search logs are records of the interactions between users and the search engine. Various search patterns and user behaviors can be analyzed from these logs, which helps enhance search results. Publishing these search logs to third parties for analysis, however, raises privacy issues. The Zealous algorithm, which filters for frequent search items in the search log, loses utility in the course of providing privacy. The proposed Confess algorithm extends this work by qualifying infrequent search items in the log, which increases the utility of the published search log while preserving privacy. The Confess algorithm involves qualifying the infrequent keywords and URL clicks in the search log and publishing them along with the frequent items.
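
For illustration, here is a minimal Python sketch of the frequency-threshold publishing idea described above, assuming Laplace-noised counts. The abstract does not specify Confess's qualification criterion for infrequent items, so the `qualify_infrequent` hook and the example criterion below are placeholders rather than the authors' method.

```python
import numpy as np
from collections import Counter

def publish_search_log(items, threshold=10, noise_scale=2.0, qualify_infrequent=None):
    """Sketch of threshold-based search-log publishing.

    Frequent items (Zealous-style): release an item only if its noisy count
    clears the threshold. Infrequent items (Confess idea): instead of being
    dropped outright, they pass through an extra qualification predicate;
    the concrete criterion is not given in the abstract, so
    `qualify_infrequent` is a placeholder hook.
    """
    counts = Counter(items)
    published = {}
    for item, count in counts.items():
        noisy = count + np.random.laplace(0.0, noise_scale)
        if noisy >= threshold:
            published[item] = count          # frequent: publish
        elif qualify_infrequent and qualify_infrequent(item, count):
            published[item] = count          # infrequent but qualified
    return published

# Purely illustrative: qualify single-word infrequent keywords only.
log = ["weather", "weather", "weather", "python tutorial", "john doe address"]
print(publish_search_log(log, threshold=2, noise_scale=0.5,
                         qualify_infrequent=lambda kw, c: " " not in kw))
```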

Author(s):  
Pratik C. Jambhale

Search engine optimization (SEO) is a technique for bringing a web document into the top search results of a search engine. A web presence is not only an easy way for a company to reach its target users; it can also be profitable, because users usually search with keywords describing what they need rather than with the company's name, so a company page that appears in the top positions is more likely to win business. This work describes the tweaks involved in moving a page to a top position in Google by increasing its PageRank, which can result in improved visibility and profitable deals for a business. Google is a user-friendly search engine that gives user-oriented results, and most other search engines follow Google's search patterns, so we have concentrated on it: if a page is registered (indexed) in Google, it is displayed on most other search engines as well.
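
Since the abstract attributes top positions to an increased PageRank, a minimal power-iteration sketch of the classic PageRank computation is shown below for reference. It illustrates the ranking signal the paper targets, not the paper's own SEO technique, and the tiny link graph is invented.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Minimal power-iteration PageRank over a dict: page -> list of outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                      # dangling page: spread rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical three-page site: pages that attract more inbound links rank higher.
graph = {"home": ["blog", "about"], "blog": ["home"], "about": ["home"]}
print(pagerank(graph))
```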


Author(s):  
Shanfeng Zhu ◽  
Xiaotie Deng ◽  
Qizhi Fang ◽  
Weimin Zhang

Web search engines are one of the most popular services for helping users find useful information on the Web. Although many studies have been carried out to estimate the size and overlap of general web search engines, these may not benefit ordinary web searchers, who care more about the overlap of the top N (N = 10, 20, or 50) search results for concrete queries than about the overlap of the total index databases. In this study, we present experimental results comparing the overlap of the top N (N = 10, 20, or 50) search results from AlltheWeb, Google, AltaVista, and WiseNut for the 58 most popular queries, as well as the distance between the overlapping results. These 58 queries were chosen from the WordTracker service, which records the most popular queries submitted to well-known metasearch engines such as MetaCrawler and Dogpile. We divide these 58 queries into three categories for further investigation. Through this in-depth study, we observe a number of interesting results: the overlap of the top N results retrieved by different search engines is very small; the search results of queries in different categories behave in dramatically different ways; Google, on average, has the highest overlap among these four search engines; and each search engine tends to adopt a different ranking algorithm independently.
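
As a rough illustration of the overlap measure studied here, the sketch below computes the fraction of URLs shared between two engines' top-N lists; the engine names and URLs are placeholders, and the study's distance measure for overlapping results is not reproduced.

```python
def top_n_overlap(results_a, results_b, n=10):
    """Fraction of URLs shared between two engines' top-n result lists."""
    shared = set(results_a[:n]) & set(results_b[:n])
    return len(shared) / n

# Hypothetical top-5 lists for the same query on two engines.
engine_a = ["u1", "u2", "u3", "u4", "u5"]
engine_b = ["u3", "u9", "u1", "u8", "u7"]
print(top_n_overlap(engine_a, engine_b, n=5))   # 0.4 -> two shared URLs out of five
```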


2017 ◽  
Vol 33 (06) ◽  
pp. 665-669 ◽  
Author(s):  
Amar Gupta ◽  
Michael Nissan ◽  
Michael Carron ◽  
Giancarlo Zuliani ◽  
Hani Rayess

The Internet is the primary source of information for facial plastic surgery patients. Most patients only analyze information in the first 10 Web sites retrieved. The aim of this study was to determine factors critical for improving Web site traffic and search engine optimization. A Google search of “rhinoplasty” was performed in Michigan. The first 20 distinct Web sites originating from private sources were included. Private was defined as personal Web sites for private practice physicians. The Web sites were evaluated using SEOquake and WooRANK, publicly available programs that analyze Web sites. Factors examined included the presence of social media, the number of distinct pages on the Web site, the traffic to the Web site, use of keywords, such as rhinoplasty in the heading and meta description, average visit duration, traffic coming from search, bounce rate, and the number of advertisements. Readability and Web site quality were also analyzed using the DISCERN and Health on the Net Foundation code principles. The first 10 Web sites were compared with the latter 10 Web sites using Student's t-tests. The first 10 Web sites received a significantly lower portion of traffic from search engines than the second 10 Web sites. The first 10 Web sites also had significantly fewer tags of the keyword “nose” in the meta description of the Web site. The first 10 Web sites were significantly more reliable according to the DISCERN instrument, scoring an average of 2.42 compared with 2.05 for the second 10 Web sites (p = 0.029). Search engine optimization is critical for facial plastic surgeons as it improves online presence. This may potentially result in increased traffic and an increase in patient visits. However, Web sites that rely too heavily on search engines for traffic are less likely to be in the top 10 search results. Web site curators should maintain a wide focus for obtaining Web site traffic, possibly including advertising and publishing information in third party sources such as “RealSelf.”
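
The comparison of the first 10 and second 10 sites described above relies on Student's t-tests; a minimal sketch of such a comparison using SciPy is shown below, with invented DISCERN-style scores rather than the study's data.

```python
from scipy import stats

# Hypothetical DISCERN reliability scores (not the study's data) for the
# first 10 and the second 10 ranked Web sites.
first_ten = [2.6, 2.4, 2.5, 2.3, 2.5, 2.4, 2.3, 2.4, 2.5, 2.3]
second_ten = [2.1, 2.0, 2.2, 1.9, 2.1, 2.0, 2.1, 2.0, 2.2, 1.9]

t_stat, p_value = stats.ttest_ind(first_ten, second_ten)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```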


2008 ◽  
pp. 1926-1937
Author(s):  
Shanfeng Zhu ◽  
Xiaotie Deng ◽  
Qizhi Fang ◽  
Weimin Zhang

Web search engines are one of the most popular services for helping users find useful information on the Web. Although many studies have been carried out to estimate the size and overlap of general web search engines, these may not benefit ordinary web searchers, who care more about the overlap of the top N (N = 10, 20, or 50) search results for concrete queries than about the overlap of the total index databases. In this study, we present experimental results comparing the overlap of the top N (N = 10, 20, or 50) search results from AlltheWeb, Google, AltaVista, and WiseNut for the 58 most popular queries, as well as the distance between the overlapping results. These 58 queries were chosen from the WordTracker service, which records the most popular queries submitted to well-known metasearch engines such as MetaCrawler and Dogpile. We divide these 58 queries into three categories for further investigation. Through this in-depth study, we observe a number of interesting results: the overlap of the top N results retrieved by different search engines is very small; the search results of queries in different categories behave in dramatically different ways; Google, on average, has the highest overlap among these four search engines; and each search engine tends to adopt a different ranking algorithm independently.


2021 ◽  
pp. 089443932110068
Author(s):  
Aleksandra Urman ◽  
Mykola Makhortykh ◽  
Roberto Ulloa

We examine how six search engines filter and rank information in relation to queries on the 2020 U.S. presidential primary elections under default (that is, nonpersonalized) conditions. To do so, we use an algorithmic auditing methodology that deploys virtual agents to conduct a large-scale analysis of algorithmic information curation in a controlled environment. Specifically, we look at the text search results for the queries "us elections," "donald trump," "joe biden," and "bernie sanders" on Google, Baidu, Bing, DuckDuckGo, Yahoo, and Yandex during the 2020 primaries. Our findings indicate substantial differences in the search results between search engines and multiple discrepancies within the results generated for different agents using the same search engine. This highlights that whether users see certain information is decided partly by chance, owing to the inherent randomization of search results. We also find that some search engines prioritize different categories of information sources with respect to specific candidates. These observations demonstrate that algorithmic curation of political information can create information inequalities between search engine users even under nonpersonalized conditions. Such inequalities are particularly troubling given that search results are highly trusted by the public and, as previous research has demonstrated, can shift the opinions of undecided voters.
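
One way to quantify the within-engine discrepancies reported above is the average pairwise overlap of the result sets returned to different agents for the same query. The sketch below uses Jaccard similarity and invented result lists; it is an illustration, not the authors' exact metric.

```python
from itertools import combinations

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def mean_pairwise_agreement(agent_results):
    """Average Jaccard similarity over all pairs of agents issuing the same
    query to the same engine; 1.0 means every agent saw identical results,
    lower values indicate randomized or diverging results."""
    pairs = list(combinations(agent_results, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Hypothetical top-5 result sets for three agents querying "joe biden" on one engine.
agents = [
    ["cnn.com", "nytimes.com", "foxnews.com", "wikipedia.org", "wsj.com"],
    ["cnn.com", "wikipedia.org", "nbcnews.com", "nytimes.com", "wsj.com"],
    ["wikipedia.org", "cnn.com", "nytimes.com", "foxnews.com", "reuters.com"],
]
print(mean_pairwise_agreement(agents))
```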


2001 ◽  
Vol 1 (3) ◽  
pp. 28-31 ◽  
Author(s):  
Valerie Stevenson

Looking back to 1999, there were a number of search engines which performed equally well. I recommended defining the search strategy very carefully, using Boolean logic and field search techniques, and always running the search in more than one search engine. Numerous articles and Web columns comparing the performance of different search engines came to different conclusions on the ‘best’ search engines. Over the last year, however, all the speakers at conferences and seminars I have attended have recommended Google as their preferred tool for locating all kinds of information on the Web. I confess that I have now abandoned most of my carefully worked out search strategies and comparison tests, and use Google for most of my own Web searches.


2019 ◽  
Vol 71 (1) ◽  
pp. 54-71 ◽  
Author(s):  
Artur Strzelecki

Purpose: The purpose of this paper is to clarify how many removal requests are made, how often, and who makes these requests, as well as which websites are reported to search engines so they can be removed from the search results.
Design/methodology/approach: Undertakes a deep analysis of more than 3.2bn pages removed from Google's search results at the request of reporting organizations from 2011 to 2018, and over 460m pages removed from Bing's search results at the request of reporting organizations from 2015 to 2017. The paper focuses on pages that belong to the .pl country-code top-level domain (ccTLD).
Findings: Although the number of requests to remove data from search results has been growing year on year, fewer URLs have been reported in recent years. Some of the requests are, however, unjustified and are rejected by the teams representing the search engines. In terms of reporting copyright violations, one company in particular stands out (AudioLock.Net), accounting for 28.1 percent of all reports sent to Google (the top ten companies combined were responsible for 61.3 percent of the total number of reports).
Research limitations/implications: As not every request can be published, the study is based only on what is publicly available. Also, the data assigned to Poland is based only on the ccTLD domain name (.pl); other domain extensions used by Polish internet users were not considered.
Originality/value: This is the first global analysis of data from the transparency reports published by search engine companies, as prior research has been based on specific notices.


Author(s):  
Novario Jaya Perdana

The accuracy of search results from a search engine depends on the keywords that are used. A lack of information in the keywords can reduce the accuracy of the search results, which makes searching for information on the internet hard work. In this research, software has been built to create document keyword sequences. The software uses Google Latent Semantic Distance, which can extract relevant information from a document. The information is expressed in the form of specific word sequences that can be used as keyword recommendations in search engines. The results show that the implemented method for creating document keyword recommendations achieves high accuracy and finds the most relevant information in the top search results.
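
The abstract does not detail how Google Latent Semantic Distance is computed; as a related, well-known search-based measure, the normalized Google distance is sketched below from hypothetical hit counts. The paper's actual formulation may differ, so this is illustrative only.

```python
import math

def normalized_google_distance(fx, fy, fxy, total_pages):
    """Normalized Google distance from hit counts f(x), f(y), f(x AND y)
    and the index size N; smaller values mean the two terms co-occur in
    search results more often than chance."""
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(total_pages) - min(lx, ly))

# Hypothetical hit counts, used purely for illustration.
N = 25_000_000_000
print(normalized_google_distance(fx=9_000_000, fy=7_500_000,
                                 fxy=2_300_000, total_pages=N))
```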


Author(s):  
R. Subhashini ◽  
V. Jawahar Senthil Kumar

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines, and today's most advanced engines use the keyword-based ("bag of words") paradigm, which has inherent disadvantages. Organizing web search results into clusters facilitates users' quick browsing of the results, but traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach for clustering web search results based on a phrase-based clustering algorithm, as an alternative to the single ordered result list of search engines. The approach presents a list of clusters to the user, and experimental results verify the method's feasibility and effectiveness.
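
As a simplified illustration of phrase-based clustering (in the spirit of suffix-tree-style approaches, not necessarily the authors' exact algorithm), the sketch below groups result snippets that share an n-word phrase and uses that phrase as a readable cluster label.

```python
from collections import defaultdict

def phrases(text, n=2):
    """All contiguous n-word phrases in a snippet, lowercased."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def cluster_by_phrase(snippets, n=2, min_docs=2):
    """Group snippets that share an n-word phrase; the shared phrase
    becomes the (human-readable) cluster label."""
    by_phrase = defaultdict(list)
    for idx, snippet in enumerate(snippets):
        for phrase in phrases(snippet, n):
            by_phrase[phrase].append(idx)
    return {p: docs for p, docs in by_phrase.items() if len(docs) >= min_docs}

snippets = [
    "apple releases new iphone model",
    "new iphone model reviewed by critics",
    "apple stock rises after earnings report",
]
print(cluster_by_phrase(snippets))   # e.g. {'new iphone': [0, 1], 'iphone model': [0, 1]}
```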


Author(s):  
Suely Fragoso

This chapter proposes that search engines apply a verticalizing pressure on the WWW's many-to-many information distribution model, forcing it to revert to a distribution model similar to that of the mass media. The argument starts with a critical descriptive examination of the history of search mechanisms for the Internet, in parallel with a discussion of the increasing ties between search engines and the advertising market. The chapter then raises questions concerning the concentration of Web traffic around a small number of search engines, which are in the hands of an equally limited number of enterprises. This concentration is accentuated by the confidence that users place in search engines and by the large search engines' ongoing acquisition of collaborative systems and smaller players. This scenario demonstrates the verticalizing pressure that search engines apply to the majority of WWW users, pushing the Web back toward a mass-distribution mode.

