Overviewing the Knowledge of a Query Keyword by Clustering Viewpoints of Web Search Information Needs

Author(s):  
Ichiro Moriya ◽  
Yusuke Inoue ◽  
Takakazu Imada ◽  
Takehito Utsuro ◽  
Yasuhide Kawada ◽  
...  
2017 ◽  
pp. 030-050
Author(s):  
J.V. Rogushina ◽  

Problems associated with the improvement of information retrieval in open environments are considered, and the need for its semantization is substantiated. The current state and development prospects of semantic search engines focused on processing Web information resources are analyzed, and the criteria for classifying such systems are reviewed. In this analysis, significant attention is paid to the use in semantic search of ontologies that contain knowledge about the subject area and the search users. The sources of ontological knowledge and the methods of processing them to improve search procedures are considered. Examples are given of semantic search systems that use structured query languages (e.g., SPARQL), lists of keywords, and natural-language queries. Classification criteria for semantic search engines such as architecture, coupling, transparency, user context, query modification, ontology structure, etc. are considered. Different ways of supporting semantic and ontology-based modification of user queries that improve the completeness and accuracy of the search are analyzed. Based on an analysis of the properties of existing semantic search engines in terms of these criteria, areas for further improvement of these systems are identified: the development of metasearch systems, semantic modification of user queries, the determination of a user-acceptable transparency level of the search procedures, flexibility of domain knowledge management tools, and increased productivity and scalability. In addition, developing means of semantic Web search requires the use of an external knowledge base that contains knowledge about the domain of the user's information needs, and requires giving users the ability to select independently the knowledge that is used in the search process.
It is necessary to take into account the history of user interaction with the retrieval system and the search context in order to personalize the query results and order them according to the user's information needs. All these aspects were taken into account in the design and implementation of the semantic search engine "MAIPS", which is based on an ontological model of cooperation between users and resources on the Web.


2020 ◽  
Vol 54 (1) ◽  
pp. 1-12
Author(s):  
Martin Potthast ◽  
Matthias Hagen ◽  
Benno Stein

No Web technology has undergone as impressive an evolution as Web search engines have, and they continue to evolve. Starting with the promise of "Bringing order to the Web" by compiling information sources matching a query, retrieval technology has been evolving into a kind of "oracle machinery", able to recommend a single source and even to provide direct answers extracted from that source. Notwithstanding the remarkable progress made and the apparent user preference for direct answers, this paradigm shift comes at a price that is higher than one might expect at first sight, affecting both users and search engine developers in their own way. We call this tradeoff "the dilemma of the direct answer"; it deserves an analysis that goes beyond system-oriented aspects and scrutinizes the way our society deals with both its information needs and its means of information access. The paper at hand contributes to this analysis by putting the evolution of retrieval technology and the expectations placed on it in the context of information retrieval history. Moreover, we discuss the tradeoffs in information behavior and information system design that users and developers may face in the future.


Author(s):  
Adan Ortiz-Cordova ◽  
Bernard J. Jansen

In this research study, the authors investigate the association between external searching, which is searching on a web search engine, and internal searching, which is searching on a website. They classify 295,571 external-to-internal searches, where each search is composed of a query submitted to a web search engine followed by one or more subsequent queries submitted to a commercial website by the same user. The authors examine 891,453 queries from all searches, of which 295,571 were external search queries and 595,882 were internal search queries. They algorithmically classify all queries into states, then cluster the searching episodes into major searching configurations and identify the most commonly occurring search patterns for external, internal, and external-to-internal searching episodes. The research implications of this study are that external sessions and internal sessions must be considered as part of a continuous search episode and that online businesses can leverage external search information to target potential consumers more effectively.


2018 ◽  
Vol 6 (3) ◽  
pp. 67-78
Author(s):  
Tian Nie ◽  
Yi Ding ◽  
Chen Zhao ◽  
Youchao Lin ◽  
Takehito Utsuro

The background of this article is the issue of how to overview the knowledge of a given query keyword. In particular, the authors focus on the concerns of those who search for web pages with a given query keyword. The Web search information needs for a given query keyword are collected through search engine suggests. Given a query keyword, the authors collect up to around 1,000 suggests, many of which are redundant. They classify redundant search engine suggests based on a topic model. However, one limitation of the topic model based classification of search engine suggests is that the granularity of the topics, i.e., the clusters of search engine suggests, is too coarse. In order to overcome the problem of the coarse-grained classification of search engine suggests, this article further applies the word embedding technique to the webpages used during the training of the topic model, in addition to the text data of the whole Japanese version of Wikipedia. Then, the authors examine the word embedding based similarity between search engine suggests and further classify search engine suggests within a single topic into finer-grained subtopics based on the similarity of word embeddings. Evaluation results show that the proposed approach performs well in the task of subtopic classification of search engine suggests.
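
The second stage described in this abstract, splitting the suggests of one topic into finer subtopics by embedding similarity, can be illustrated with a small sketch. It is not the authors' implementation: the 3-dimensional toy vectors, the greedy single-pass grouping, the function names, and the 0.8 threshold are all assumptions made for illustration, and a real system would use embeddings trained on a large corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def split_into_subtopics(suggest_vectors, threshold=0.8):
    """Greedy grouping (an assumed, simplified strategy).

    A suggest joins the first subtopic whose seed vector (the vector of
    its first member) is at least `threshold` similar; otherwise it
    starts a new subtopic.
    """
    subtopics = []  # list of (seed_vector, [suggest, ...])
    for suggest, vec in suggest_vectors.items():
        for seed, members in subtopics:
            if cosine(seed, vec) >= threshold:
                members.append(suggest)
                break
        else:
            subtopics.append((vec, [suggest]))
    return [members for _, members in subtopics]

# Hypothetical embeddings for four suggests that share one coarse topic.
toy_vectors = {
    "job hunting interview": [0.9, 0.1, 0.0],
    "job hunting interview questions": [0.85, 0.2, 0.0],
    "job hunting resume": [0.1, 0.9, 0.1],
    "job hunting resume template": [0.15, 0.95, 0.05],
}
subtopics = split_into_subtopics(toy_vectors)
for members in subtopics:
    print(members)
```

With these toy vectors the two interview-related suggests end up in one subtopic and the two resume-related suggests in another. The result of a greedy pass depends on input order, so a production version would more likely use a standard clustering algorithm over the similarity matrix.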


2014 ◽  
Vol 3 (3) ◽  
pp. e12 ◽  
Author(s):  
Cinzia Colombo ◽  
Paola Mosconi ◽  
Paolo Confalonieri ◽  
Isabella Baroni ◽  
Silvia Traversa ◽  
...  

2017 ◽  
Vol 13 (3) ◽  
pp. 37-56 ◽  
Author(s):  
Abdelkrim Bouramoul

Users of Web search engines are generally confronted with numerous responses that are rarely structured, making it difficult to analyze the available results. Indeed, the linear results displayed as lists ordered according to a relevance criterion, although still widely used, often seem limitless. A solution to this problem is to improve the interfaces for better visualization of a large number of results. In this paper, we propose the modeling and implementation of a tool for graphical visualization and manipulation of results returned by search engines. The goal is to facilitate the analysis, the interpretation and the supervision of users' information needs. The architecture of the 'Gravisor' tool is based on the Multi-Agent paradigm. It is composed of four agents working in full cooperation and coordination. We hope that, beyond the web information retrieval field, the three graphical visualization modes offered by the 'Gravisor' tool will be a promising alternative for better information visualization in other areas.


Author(s):  
Nils Pharo

Several studies of Web information searching (Agosto, 2002; Pharo & Järvelin, 2006; Prabha et al., 2007) have pointed out that searchers tend to satisfice. This means that, instead of planning for optimal search outcomes based on the best available knowledge and choosing the best information sources for their purpose, they aim to obtain satisfactory results with a minimum of effort. Thus it is necessary to study factors other than information needs and sources to explain Web search behaviour. Web information search processes are influenced by the interplay of factors at the micro level, and we need to understand how search-process-related factors, such as the actions performed by the searcher on the system, are influenced by other factors, e.g., those related to the searcher’s work task, search task, and knowledge about the work task or searching. The Search Situation Transition (SST) method schema provides a framework for such analysis.


Author(s):  
Jon Atle Gulla ◽  
Hans Olaf Borch ◽  
Jon Espen Ingvaldsen

Due to the large amount of information on the web and the difficulty of relating users’ expressed information needs to document content, large-scale web search engines tend to return thousands of ranked documents. This chapter discusses the use of clustering to help users navigate through the result sets and explore the domain. A newly developed system, HOBSearch, makes use of suffix tree clustering to overcome many of the weaknesses of traditional clustering approaches. By using result snippets rather than full documents, HOBSearch both speeds up clustering substantially and manages to tailor the clustering to the topics indicated in the user’s query. An inherent problem with clustering, though, is the choice of cluster labels. Our experiments with HOBSearch show that cluster labels of an acceptable quality can be generated with no supervision or predefined structures and within the constraints given by large-scale web search.
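
The idea of grouping result snippets by the phrases they share, and reusing those phrases as cluster labels, can be sketched as follows. This is a simplified stand-in and not HOBSearch itself: it enumerates word n-grams with a dictionary instead of building a suffix tree, it skips the merging of overlapping clusters, and the scoring rule (number of snippets times phrase length in words), the function names, and the sample snippets are all assumptions.

```python
from collections import defaultdict

def phrases(snippet, max_len=3):
    """Yield every word n-gram (1..max_len words) found in a snippet."""
    words = snippet.lower().split()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])

def base_clusters(snippets, min_docs=2, max_len=3):
    """Map each phrase shared by at least `min_docs` snippets to the
    set of indices of the snippets that contain it."""
    index = defaultdict(set)
    for doc_id, snippet in enumerate(snippets):
        for phrase in phrases(snippet, max_len):
            index[phrase].add(doc_id)
    return {p: docs for p, docs in index.items() if len(docs) >= min_docs}

def rank_clusters(clusters):
    """Order clusters by an assumed score: snippet count times phrase
    length in words, with ties broken alphabetically by phrase."""
    def sort_key(item):
        phrase, docs = item
        return (-len(docs) * len(phrase.split()), phrase)
    return sorted(clusters.items(), key=sort_key)

# Hypothetical result snippets for an ambiguous query.
snippets = [
    "suffix tree clustering of search results",
    "fast suffix tree construction",
    "jaguar car dealers",
    "jaguar car reviews",
]
ranked = rank_clusters(base_clusters(snippets))
for label, docs in ranked[:2]:
    print(label, sorted(docs))
```

Because the shared phrase doubles as the label, no predefined category structure is needed, which matches the unsupervised labeling the abstract describes. Longer shared phrases score higher here so that "jaguar car" is preferred over "jaguar" alone as a label.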

