Improving Information Retrieval Precision by Finding Related Queries with Similar Information Need Using Information Scent

Author(s):  
Suruchi Chawla ◽  
Punam Bedi
Author(s):  
Suruchi Chawla

This chapter explains a multi-agent system for effective information retrieval that uses information scent in query log mining. The precision of search results is low because it is difficult to infer the information need behind a short search query, so the user's information need is not satisfied effectively. Information scent is used to model the information need of a user's web search session, and clustering is performed to identify sessions with similar information needs. Hyperlink-Induced Topic Search (HITS) is executed on the clusters to generate hubs and authorities for web page recommendations to users who search with similar intents. The multi-agent system, based on clustered query sessions, uses query operations such as expansion and recommendation to infer the information need of user search queries and recommends hubs and authorities for effective web search.
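
The HITS step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a cluster has already been reduced to a small link graph; the page names and the `cluster_links` graph are hypothetical, and the chapter's own clustering and information scent computation are not reproduced here.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (no-op if all scores are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Run HITS on a graph given as {page: [pages it links to]}.

    Returns two dicts: hub scores and authority scores.
    """
    pages = set(links) | {dst for targets in links.values() for dst in targets}
    hubs = {page: 1.0 for page in pages}
    auths = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority update: sum the hub scores of pages that link to this page.
        auths = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in targets:
                auths[dst] += hubs[src]
        auths = normalize(auths)
        # Hub update: sum the authority scores of the pages this page links to.
        hubs = normalize(
            {page: sum(auths[dst] for dst in links.get(page, [])) for page in pages}
        )
    return hubs, auths


# Hypothetical link graph for one cluster of similar search sessions.
cluster_links = {
    "portal": ["page_a", "page_b"],
    "blog": ["page_a"],
    "page_a": [],
    "page_b": [],
}
hubs, auths = hits(cluster_links)
best_hub = max(hubs, key=hubs.get)
best_authority = max(auths, key=auths.get)
```

In this toy graph the page linked from both sources ends up as the strongest authority, and the page linking to both targets ends up as the strongest hub, which is the kind of ranking a recommender could draw on.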


2021 ◽  
Vol 55 (1) ◽  
pp. 1-2
Author(s):  
Bhaskar Mitra

Neural networks with deep architectures have demonstrated significant performance improvements in computer vision, speech recognition, and natural language processing. The challenges in information retrieval (IR), however, are different from these other application areas. A common form of IR involves ranking of documents (or short passages) in response to keyword-based queries. Effective IR systems must deal with the query-document vocabulary mismatch problem by modeling relationships between different query and document terms and how they indicate relevance. Models should also consider lexical matches when the query contains rare terms (such as a person's name or a product model number) not seen during training, and should avoid retrieving semantically related but irrelevant results. In many real-life IR tasks, the retrieval involves extremely large collections (such as the document index of a commercial Web search engine) containing billions of documents. Efficient IR methods should take advantage of specialized IR data structures, such as the inverted index, to efficiently retrieve from large collections. Given an information need, the IR system also mediates how much exposure an information artifact receives by deciding whether it should be displayed, and where it should be positioned, among other results. Exposure-aware IR systems may optimize for additional objectives, besides relevance, such as parity of exposure for retrieved items and content publishers. In this thesis, we present novel neural architectures and methods motivated by the specific needs and challenges of IR tasks. We ground our contributions with a detailed survey of the growing body of neural IR literature [Mitra and Craswell, 2018].
Our key contribution towards improving the effectiveness of deep ranking models is developing the Duet principle [Mitra et al., 2017] which emphasizes the importance of incorporating evidence based on both patterns of exact term matches and similarities between learned latent representations of query and document. To efficiently retrieve from large collections, we develop a framework to incorporate query term independence [Mitra et al., 2019] into any arbitrary deep model that enables large-scale precomputation and the use of inverted index for fast retrieval. In the context of stochastic ranking, we further develop optimization strategies for exposure-based objectives [Diaz et al., 2020]. Finally, this dissertation also summarizes our contributions towards benchmarking neural IR models in the presence of large training datasets [Craswell et al., 2019] and explores the application of neural methods to other IR tasks, such as query auto-completion.
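
The idea of query term independence described above can be sketched as follows: if a document's score is the sum of per-term scores, those per-term scores can be precomputed offline and stored in an inverted index. This is only an illustration under stated assumptions. The function `term_doc_score` is a hypothetical stand-in (normalized term frequency) for a learned model, and the documents are invented; none of this is the thesis's actual architecture.

```python
from collections import defaultdict


def term_doc_score(term, doc_tokens):
    """Hypothetical per-term scorer; a learned model would go here."""
    return doc_tokens.count(term) / len(doc_tokens)


def build_index(docs):
    """Precompute term-document scores into an inverted index.

    docs: {doc_id: text}. Returns {term: {doc_id: score}}.
    """
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    vocabulary = {tok for tokens in tokenized.values() for tok in tokens}
    index = defaultdict(dict)
    for doc_id, tokens in tokenized.items():
        for term in vocabulary:
            score = term_doc_score(term, tokens)
            if score > 0:  # store only non-zero postings
                index[term][doc_id] = score
    return index


def search(index, query, k=3):
    """Score documents as the sum of precomputed per-term scores."""
    totals = defaultdict(float)
    for term in query.lower().split():
        for doc_id, score in index.get(term, {}).items():
            totals[doc_id] += score
    # Highest score first; ties broken by document id for determinism.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:k]


# Invented example collection.
docs = {
    "d1": "neural ranking models for search",
    "d2": "inverted index for fast search",
    "d3": "cooking pasta at home",
}
index = build_index(docs)
results = search(index, "inverted index search")
```

Because each query term contributes independently, query-time work reduces to posting-list lookups and additions, so the expensive scorer never runs at query time.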


2021 ◽  
Vol 20 (4) ◽  
pp. 50-64
Author(s):  
Bissan Audeh ◽  
Michel Beigbeder ◽  
Christine Largeron ◽  
Diana Ramírez-Cifuentes

Digital libraries have become an essential tool for researchers in all scientific domains. With almost unlimited storage capacities, current digital libraries hold a tremendous number of documents. Though some efforts have been made to facilitate access to documents relevant to a specific information need, such a task remains a real challenge for a new researcher. Indeed neophytes do not necessarily use appropriate keywords to express their information need and they might not be qualified enough to evaluate correctly the relevance of documents retrieved by the system. In this study, we suppose that to better meet the needs of neophytes, the information retrieval system in a digital library should take into consideration features other than content-based relevance. To test this hypothesis, we use machine learning methods and build new features from several metadata related to documents. More precisely, we propose to consider as features for machine learning: content-based scores, scores based on the citation graph and scores based on metadata extracted from external resources. As acquiring such features is not a trivial task, we analyze their usefulness and their capacity to detect relevant documents. Our analysis concludes that the use of these additional features improves the performance of the system for a neophyte. In fact, by adding the new features we find more documents suitable for neophytes within the results returned by the system than when using content-based features alone.


Author(s):  
Qiaozhu Mei ◽  
Dragomir Radev

This chapter is a basic introduction to text information retrieval. Information Retrieval (IR) refers to the activities of obtaining information resources (usually in the form of textual documents) from a much larger collection, which are relevant to an information need of the user (usually expressed as a query). Practical instances of an IR system include digital libraries and Web search engines. This chapter presents the typical architecture of an IR system, an overview of the methods corresponding to the design and the implementation of each major component of an information retrieval system, a discussion of evaluation methods for an IR system, and finally a summary of recent developments and research trends in the field of information retrieval.


2017 ◽  
Vol 10 (2) ◽  
pp. 311-325
Author(s):  
Suruchi Chawla

The main challenge for effective web Information Retrieval (IR) is to infer the information need from the user’s query and retrieve relevant documents. The precision of search results is low because vague and imprecise user queries fail to retrieve sufficient relevant documents. Fuzzy set based query expansion deals with imprecise and vague queries to infer the user’s information need. Trust based web page recommendations retrieve search results according to the user’s information need. In this paper, an algorithm is designed for Intelligent Information Retrieval using a hybrid of fuzzy sets and trust in web query session mining: fuzzy query expansion infers the user’s information need, and trust is used to recommend web pages according to that need. An experiment was performed on a data set collected in the domains Academics, Entertainment and Sports, and the search results confirm the improvement in precision.
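
Precision, the measure this abstract reports on, is the fraction of retrieved results that are relevant. A minimal sketch of precision at a cutoff k follows; the result list and relevance judgments are hypothetical and are not taken from the paper's data set.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked results that are in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    return hits / k


# Hypothetical ranked results and relevance judgments for one query.
ranked = ["p1", "p2", "p3", "p4"]
relevant = {"p1", "p3"}
p_at_1 = precision_at_k(ranked, relevant, 1)
p_at_4 = precision_at_k(ranked, relevant, 4)
```

Comparing this value before and after query expansion, averaged over many queries, is one common way to quantify an improvement in precision.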


Author(s):  
Kodai Tsukahara et al.

Current information recommendation systems obtain users’ preferences from Web browsing histories and activities such as product purchases, and efficiently provide users with the information they prefer. In such cases, however, the same or similar information is always recommended, a phenomenon called the filter bubble, which decreases users’ satisfaction with the systems. If information recommendation systems could provide users with something surprising and useful, users’ satisfaction with the systems would increase drastically. This research therefore focuses on “serendipity”. In this paper, a new information recommendation system using concept-based information retrieval is proposed to provide users with serendipitous information. In this system, concepts that describe features or roles of items are input instead of the items themselves, and information that meets the concepts is output as candidate serendipitous information. The serendipitous information is extracted from this output using criteria that serve as indexes of serendipity defined in this research. The evaluation experiment shows that the proposed system achieves an accuracy of 70% for serendipitous information determination and an accuracy of 100% for information retrieval, which is satisfactory for the purpose of this research.


Author(s):  
Iris Xie

The nature of information retrieval (IR) is interaction. However, the traditional IR model only focuses on the comparison between user input and system output. It does not illustrate the changeable interaction process (Saracevic, 1997). Human involvement in IR makes the process complicated and dynamic. Belkin (1993) further identified the two underlying assumptions of the traditional IR view: (1) the information need is static and can be specified; and (2) there is only one form of information-seeking behavior. The limitations of the traditional IR model are becoming more evident. In the 1990s, researchers started to develop interactive IR models. Among them, Ingwersen’s cognitive model (1992, 1996), Belkin’s episode model of interaction with texts (1996), and Saracevic’s stratified model (1996a, 1997) are the most cited.

