Human-Centred Web Search

Data Mining ◽  
2013 ◽  
pp. 1852-1872
Author(s):  
Orland Hoeber

People commonly experience difficulties when searching the Web, arising from an incomplete knowledge regarding their information needs, an inability to formulate accurate queries, and a low tolerance for considering the relevance of the search results. While simple and easy to use interfaces have made Web search universally accessible, they provide little assistance for people to overcome the difficulties they experience when their information needs are more complex than simple fact-verification. In human-centred Web search, the purpose of the search engine expands from a simple information retrieval engine to a decision support system. People are empowered to take an active role in the search process, with the search engine supporting them in developing a deeper understanding of their information needs, assisting them in crafting and refining their queries, and aiding them in evaluating and exploring the search results. In this chapter, recent research in this domain is outlined and discussed.

2012 ◽  
pp. 217-238 ◽  
Author(s):  
Orland Hoeber

People commonly experience difficulties when searching the Web, arising from an incomplete knowledge regarding their information needs, an inability to formulate accurate queries, and a low tolerance for considering the relevance of the search results. While simple and easy to use interfaces have made Web search universally accessible, they provide little assistance for people to overcome the difficulties they experience when their information needs are more complex than simple fact-verification. In human-centred Web search, the purpose of the search engine expands from a simple information retrieval engine to a decision support system. People are empowered to take an active role in the search process, with the search engine supporting them in developing a deeper understanding of their information needs, assisting them in crafting and refining their queries, and aiding them in evaluating and exploring the search results. In this chapter, recent research in this domain is outlined and discussed.


Author(s):  
R. Subhashini ◽  
V.Jawahar Senthil Kumar

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines. Today’s most advanced engines use the keyword-based (“bag of words”) paradigm, which has inherent disadvantages. Organizing web search results into clusters lets the user browse search results quickly. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach to clustering web search results based on a phrase-based clustering algorithm. As an alternative to the single ordered result list of search engines, this approach presents a list of clusters to the user. Experimental results verify the method’s feasibility and effectiveness.


Author(s):  
Ji-Rong Wen

A Web query log is a type of file that keeps track of the activities of the users of a search engine. Compared to the traditional information retrieval setting, in which documents are the only information source available, query logs are an additional information source in the Web search setting. Based on query logs, a set of Web mining techniques, such as log-based query clustering, log-based query expansion, collaborative filtering and personalized search, could be employed to improve the performance of Web search.


Author(s):  
Hengki Tamando Sihotang

Online information needs have evolved in the real direction. These needs include the latest information, government services, and commercial products. The research question is how to describe and optimize keyword research with the allintitle technique on the Google search engine. The development method used in this research is the prototype method, because it can be evaluated directly with users. System testing was carried out over 3 months by placing keywords on several websites on Google. The conclusion is that, with the allintitle technique, search results for the web are easier to find. This web-based allintitle technique can also overcome the challenges of captcha verification from the Google search engine. Keywords: Allintitle, Google's Search Engine, Keyword competition.


Author(s):  
Shanfeng Zhu ◽  
Xiaotie Deng ◽  
Qizhi Fang ◽  
Weimin Zhang

Web search engines are one of the most popular services to help users find useful information on the Web. Although many studies have been carried out to estimate the size and overlap of the general web search engines, these may not benefit ordinary web search users, who care more about the overlap of the top N (N=10, 20 or 50) search results on concrete queries than about the overlap of the total index database. In this study, we present experimental results comparing the overlap of the top N (N=10, 20 or 50) search results from AlltheWeb, Google, AltaVista and WiseNut for the 58 most popular queries, as well as the distance of the overlapped results. These 58 queries are chosen from the WordTracker service, which records the most popular queries submitted to some famous metasearch engines, such as MetaCrawler and Dogpile. We divide these 58 queries into three categories for further investigation. Through in-depth study, we observe a number of interesting results: the overlap of the top N results retrieved by different search engines is very small; the search results of the queries in different categories behave in dramatically different ways; Google, on average, has the highest overlap among these four search engines; each search engine tends to adopt a different ranking algorithm independently.
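The top-N overlap comparison described above can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions used in the study, so the overlap fraction and the rank-distance measure below are assumptions, and the function names and result lists are hypothetical.

```python
def top_n_overlap(results_a, results_b, n=10):
    """Fraction of the top-n positions whose URLs appear in both lists."""
    if n <= 0:
        return 0.0
    shared = set(results_a[:n]) & set(results_b[:n])
    return len(shared) / n


def mean_rank_distance(results_a, results_b, n=10):
    """Mean absolute rank difference of URLs shared by both top-n lists.

    This is one possible reading of "distance of the overlapped results";
    it is an assumption, not the measure defined in the paper.
    """
    top_a, top_b = results_a[:n], results_b[:n]
    shared = set(top_a) & set(top_b)
    if not shared:
        return 0.0
    return sum(abs(top_a.index(u) - top_b.index(u)) for u in shared) / len(shared)


# Hypothetical ranked results from two engines for the same query.
engine_a = ["u1", "u2", "u3", "u4", "u5"]
engine_b = ["u3", "u9", "u1", "u8", "u7"]

overlap = top_n_overlap(engine_a, engine_b, n=5)
distance = mean_rank_distance(engine_a, engine_b, n=5)
```

A real comparison would first normalise URLs (scheme, trailing slashes, redirects) so that equivalent pages are counted as the same result.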


Author(s):  
Yasufumi Takama ◽  
Takuya Tezuka ◽  
Hiroki Shibata ◽  
Lieu-Hen Chen ◽  
◽  
...  

This paper estimates users’ search intents when using the context search engine (CSE) by analyzing submitted queries. Recently, due to the increase in the amount of information on the Web and the diversification of information needs, the gap between users’ information needs and the basic search function provided by existing web search engines has grown. As a solution to this problem, the CSE, which limits its task to answering questions about temporal trends, has been proposed. It provides three primitive search functions, which users can use in accordance with their purposes. Furthermore, if the system can estimate users’ search intents, it can provide more user-friendly services that contribute to the improvement of search efficiency. Aiming at estimating users’ search intents only from submitted queries, this paper analyzes the characteristics of queries in terms of typical search intents when using CSE, and defines classification rules. To show the potential use of the estimated search intents, this paper introduces learning to rank into CSE. Experimental results show that MAP (mean average precision) is improved by learning rank models separately for different search intents.
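For readers unfamiliar with the MAP metric mentioned above, the following Python sketch shows how it is commonly computed. It assumes binary relevance labels and that every relevant document appears somewhere in the ranked list; the function names and sample data are hypothetical and are not taken from the paper.

```python
def average_precision(ranked_relevance):
    """Average precision for one query, given relevance flags in rank order.

    Assumes all relevant documents appear in the ranked list, so the
    number of hits equals the total number of relevant documents.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)


# Hypothetical relevance judgements for two queries.
judgements = [
    [True, False, True],  # relevant documents at ranks 1 and 3
    [False, True],        # relevant document at rank 2
]
map_score = mean_average_precision(judgements)
```

Under this definition, comparing a single rank model against intent-specific rank models amounts to computing `mean_average_precision` over the same query set for each configuration.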


2017 ◽  
Vol 26 (06) ◽  
pp. 1730002 ◽  
Author(s):  
T. Dhiliphan Rajkumar ◽  
S. P. Raja ◽  
A. Suruliandi

Short and ambiguous queries are a major problem for search engines, leading to the retrieval of information irrelevant to the user’s input. The growing volume of information on the web also makes it difficult for a search engine to provide the results users need. Web search engines suffer from ambiguity because queries are looked at on a rational level rather than the semantic level. In this paper, to improve search engine performance with respect to users’ interests, personalization based on users’ clicks and bookmarking is proposed. Modified agglomerative clustering is used to cluster the results. The experimental results show that the proposed work achieves better precision, recall and F-score.
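The three evaluation measures named above can be sketched in Python as follows. This is the standard set-based formulation with the balanced F-score (F1); the abstract does not say which F-score variant the authors used, and the document identifiers are hypothetical.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and balanced F-score (F1)."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical result set and relevance judgements for one query.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d5"]
precision, recall, f1 = precision_recall_f1(retrieved_docs, relevant_docs)
```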


2016 ◽  
Vol 43 (3) ◽  
pp. 316-327 ◽  
Author(s):  
Mohammad Sadeghi ◽  
Jesús Vegas

The performance evaluation of an information retrieval system is a decisive aspect of measuring improvements in search technology. The Google search engine, as a tool for retrieving information on the Web, is used by almost 92% of Iranian users. The purpose of this paper is to study Google’s performance in retrieving relevant information from Persian documents. The information retrieval effectiveness is based on precision measures of the search results obtained on a website that we built with the documents of a TREC standard corpus. We queried Google with 100 topics available in the corpus and compared the retrieved webpages with the relevant documents. The results indicate that the morphological analysis of the Persian language is not fully taken into account by the Google search engine. Incorrect text tokenisation, treating stop words as content keywords of a document, and the wrong ‘variants encountered’ of words found by Google are the main factors that affect the relevance of Persian information retrieval on the Web for this search engine.


2012 ◽  
Vol 532-533 ◽  
pp. 1282-1286
Author(s):  
Zhi Chao Lin ◽  
Lei Sun ◽  
Xiao Liu

There is a lot of information contained in the World Wide Web. Obtaining the required resources quickly and accurately from the web through content-based search engines has become a research focus. Most current full-text web search tools, such as Lucene, a widely used open source retrieval library in the information retrieval field, are purely keyword based. This may not be sufficient for users retrieving information from the web. In this paper, we employ a method to overcome the limitations of current full-text search engines, as represented by Lucene. We propose a Query Expansion and Information Retrieval approach which can help users to acquire more accurate contents from the web. The Query Expansion component finds expanded candidate words for the query word through WordNet, which contains synonyms in several different senses. In the Information Retrieval component, the query word and its candidate words are used together as the input of the search module to get the result items. Furthermore, we can put the result items into different classes based on the expansion. Some experiments and their results are described in the latter part of this paper.
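The two-component idea described above (expand the query word, then search with all candidate words and class the results by expansion term) can be sketched in Python. This is only an illustration under stated assumptions: a small hand-written synonym table stands in for WordNet, a simple token match stands in for Lucene, and all names and documents are hypothetical.

```python
# Hypothetical stand-in for a WordNet synonym lookup.
SYNONYMS = {"car": ["automobile", "auto"]}


def expand_query(word, synonyms=SYNONYMS):
    """Return the query word followed by its candidate expansion words."""
    return [word] + [s for s in synonyms.get(word, []) if s != word]


def search_and_group(word, documents, synonyms=SYNONYMS):
    """Search with the query word and its expansions, grouping hits by term.

    A simple whitespace token match stands in for a full-text engine.
    """
    groups = {}
    for term in expand_query(word, synonyms):
        matches = [doc for doc in documents if term in doc.lower().split()]
        if matches:
            groups[term] = matches
    return groups


docs = [
    "Used car prices fall",
    "The automobile industry in 2012",
    "Bicycle lanes expand",
]
groups = search_and_group("car", docs)
```

Grouping by the matching term gives one class per expansion word, which loosely mirrors the classification of result items by expansion that the abstract mentions.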

