Why-type Question to Query Reformulation for efficient Document Retrieval

2022 ◽  
Vol 12 (1) ◽  
pp. 0-0

Understanding what the user actually needs from a question is crucial in non-factoid why-question answering, because why-questions are complex and their interpretation involves ambiguity and redundancy. The task is to determine the focus of the question and reformulate it accordingly so that the expected answers can be retrieved. The paper analyzes different types of why-questions and proposes an algorithm for each class that determines the focus and reformulates the question into a query by appending the focal terms and the cue phrase ‘because’. Further, a user interface is implemented that accepts a why-question as input, applies the different components of question processing, reformulates the question, and finally retrieves web pages by posing the query to the Google search engine. To measure the accuracy of the process, user feedback is collected: users assign a score from 1 to 10 for how relevant the retrieved web pages are according to their understanding. The results show a maximum precision of 89% for informational why-questions and a minimum of 48% for opinionated why-questions.
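The reformulation step described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the per-class focus rules, so a simple stopword filter stands in for focus detection here, and the names `STOPWORDS` and `reformulate` are hypothetical.

```python
import re

# Hypothetical stopword list; the abstract does not specify how focal
# terms are chosen, so this filter is only a stand-in for focus detection.
STOPWORDS = {
    "why", "do", "does", "did", "is", "are", "was", "were",
    "the", "a", "an", "of", "to", "in", "it",
}


def reformulate(question: str) -> str:
    """Turn a why-question into a keyword query ending with the cue phrase 'because'."""
    # Lowercase the question and split it into simple word tokens.
    tokens = re.findall(r"[a-z0-9']+", question.lower())
    # Keep the tokens that are not stopwords as assumed focal terms.
    focal_terms = [t for t in tokens if t not in STOPWORDS]
    # Append the cue phrase so that explanatory pages are favoured.
    return " ".join(focal_terms + ["because"])


print(reformulate("Why is the sky blue?"))
```

The resulting string would then be posed to a search engine as the query; that retrieval step is left out of the sketch.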

2014 ◽  
Vol 23 (04) ◽  
pp. 1460014 ◽  
Author(s):  
Georgios Stratogiannis ◽  
Georgios Siolas ◽  
Andreas Stafylopatis

We describe a system that performs semantic Question Answering based on the combination of classic Information Retrieval methods with semantic ones. First, we use a search engine to gather web pages and then apply a noun phrase extractor to extract all the candidate answer entities from them. Candidate entities are ranked using a linear combination of two IR measures to pick the most relevant ones. For each one of the top ranked candidate entities we find the corresponding Wikipedia page. We then propose a novel way to exploit Semantic Information contained in the structure of Wikipedia. A vector is built for every entity from Wikipedia category names by splitting and lemmatizing the words that form them. These vectors maintain Semantic Information in the sense that we are given the ability to measure semantic closeness between the entities. Based on this, we apply an intelligent clustering method to the candidate entities and show that candidate entities in the biggest cluster are the most semantically related to the ideal answers to the query. Results on the topics of the TREC 2009 Related Entity Finding task dataset show promising performance.
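As a rough illustration of the category-based vectors mentioned above, the following Python sketch builds a bag-of-words vector from category names and compares two entities with cosine similarity. It rests on several assumptions: the category names are invented for the example, lemmatization is omitted (words are only lowercased and split), and cosine similarity is one plausible closeness measure, not necessarily the one the authors use.

```python
import math
import re
from collections import Counter


def category_vector(categories):
    """Build a word-count vector from a list of category names."""
    words = []
    for name in categories:
        # Split each category name into lowercase words (no lemmatization here).
        words.extend(re.findall(r"[a-z]+", name.lower()))
    return Counter(words)


def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical category names for three candidate entities.
band_a = category_vector(["American rock bands", "Rock music groups"])
band_b = category_vector(["British rock bands"])
river = category_vector(["Rivers of France"])

print(cosine(band_a, band_b) > cosine(band_a, river))
```

Pairwise similarities of this kind could feed a clustering step, so that entities sharing category vocabulary end up in the same cluster.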


Author(s):  
Thomas Nicolai ◽  
Lars Kirchhof ◽  
Axel Bruns ◽  
Jason Wilson ◽  
Barry Saunders

This paper investigates self-Googling through the monitoring of search engine activities of users and adds to the few quantitative studies on this topic already in existence. We explore this phenomenon by answering the following questions: To what extent is self-Googling visible in the usage of search engines; is any significant difference measurable between queries related to self-Googling and generic search queries; to what extent do self-Googling search requests match the selected personalised Web pages? To address these questions we explore the theory of narcissism in order to help define self-Googling and present the results from a 14-month online experiment using Google search engine usage data.


2014 ◽  
Vol 4 (2) ◽  
pp. 19-40
Author(s):  
Rosy Madaan ◽  
A.K. Sharma ◽  
Ashutosh Dixit

Question answering offers a more intuitive approach to information processing. A number of approaches have been used for answering questions. In this paper, we propose a question answering system that uses blogs as its source of information. The system crawls blog pages, summarizes them, and then indexes and ranks the summarized content. The user asks a question and gets one or more answers in response. The answers obtained are better than those provided by existing QA systems that use general web pages for answering. The experimental results are promising and show that the responses given by the system are better than those given by existing QA systems.


Author(s):  
CHARLES X. LING ◽  
JIANFENG GAO ◽  
HUAJIE ZHANG ◽  
WEINING QIAN ◽  
HONGJIANG ZHANG

We propose a data-mining approach that produces generalized query patterns (with generalized keywords) from the raw user logs of the Microsoft Encarta search engine. Those query patterns can act as a cache for the search engine, improving its performance. A cache of generalized query patterns is more advantageous than a cache of the most frequent user queries, since our patterns are generalized and cover more queries, including future queries that have not previously been asked. Our method is unique in that the query patterns discovered reflect the actual dynamic usage and user feedback of the search engine, rather than the syntactic linkage structure of web pages (as Google does). Simulation shows that such generalized query patterns improve the search engine's overall speed considerably. The generalized query patterns, when viewed with a graphical user interface, are also helpful to web editors, who can easily discover the topics in which users are most interested.
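To illustrate why a cache keyed on generalized patterns can cover queries that were never asked, here is a minimal Python sketch. Everything in it is an assumption made for the example: the keyword-to-concept table `GENERALIZE`, the functions `generalize`, `learn`, and `lookup`, and the cached value, which is only a placeholder label. The abstract does not describe how patterns are mined from the logs, and the sketch does not attempt that step.

```python
# Hypothetical table mapping specific keywords to a generalized concept.
GENERALIZE = {"paris": "<city>", "london": "<city>", "tokyo": "<city>"}

# Cache from generalized pattern to a placeholder result label.
cache = {}


def generalize(query):
    """Replace known keywords by their concept and normalize word order."""
    words = query.lower().split()
    return tuple(sorted(GENERALIZE.get(w, w) for w in words))


def learn(query, result):
    """Store a result under the generalized pattern of a logged query."""
    cache[generalize(query)] = result


def lookup(query):
    """Return the cached result for the query's pattern, or None on a miss."""
    return cache.get(generalize(query))


learn("population of Paris", "city-population handler")
# This query was never logged, but it shares the generalized pattern.
print(lookup("population of Tokyo"))
print(lookup("history of jazz"))
```

A cache of exact query strings would miss the second query entirely; the generalized key is what allows the hit.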


2017 ◽  
Vol 2 (3) ◽  
pp. 170
Author(s):  
Faranak Salman Mohajer

In this paper, our purpose is to build a large collection of vocabulary and concepts related to a user’s field of interest (articles, people, conferences, books, etc.) from the information available on the vast source that is the web, expressed in the form of a social network. In other words, we introduce a way to help researchers specify their topic of interest in a particular field and thereby observe and extract the social network of concepts related to that topic. To extract the nodes of this network, we used sampling of web pages through the Google search engine, text processing techniques, and information retrieval. The topic of the extracted social network in this research is scientific conferences in the field of computer science. To evaluate the effectiveness of this method, the network extracted from the search engine results is compared with the scientific conferences available in the DBLP (Digital Bibliography and Library Project) database. The results of the social network analysis showed that the extracted network has very high accuracy.


Author(s):  
Marc L. Resnick ◽  
Jennifer Bandos

The Internet has become a powerful tool for information search and ecommerce. Millions of people use the World Wide Web on a regular basis and the number is increasing rapidly. For many common tasks, users first need to locate a Web site(s) containing needed information from among the estimated 4 trillion existing web pages. The most common method used to search for information is the search engine. However, even sophisticated users often have difficulty navigating through the complexity of search engine interfaces. Designing more effective and efficient search engines is contingent upon a significant improvement in the search user interface.


2003 ◽  
Vol 8 (4) ◽  
pp. 897-902 ◽  
Author(s):  
Carlos Eduardo Siqueira ◽  
Fernando Carvalho

This article reviews the scope of several Observatories found through an Internet search using the Google search engine. After examining these observatories, it describes the aims and initial accomplishments of the Observatory of the Americas as a network of professionals and activists from different countries in the Americas. The article concludes with a discussion of the pattern identified among these observatories: they may be clearinghouses or networks, or both.


2013 ◽  
Vol 25 ◽  
pp. 189-203 ◽  
Author(s):  
Dominik Schlosser

This paper attempts to give an overview of the different representations of the pilgrimage to Mecca found in the ‘liminal space’ of the internet. For that purpose, it examines a handful of emblematic examples of how the hajj is being presented and discussed in cyberspace. Special attention is paid to the question of how far issues of religious authority are manifest on these websites: whether the content providers of web pages appoint themselves as authorities by scrutinizing established views of the fifth pillar of Islam, or if they upload already printed texts onto their sites in order to reiterate normative notions of the pilgrimage to Mecca, or if they make use of search engine optimisation techniques, thus heightening the visibility of their online presence and increasing the possibility of becoming authoritative in shaping internet surfers’ perceptions of the hajj.

