Passage Retrieval vs. Document Retrieval in the CLEF 2006 Ad Hoc Monolingual Tasks with the IR-n System

Utilizing passage-based language models for ad hoc document retrieval

Information Retrieval ◽

10.1007/s10791-009-9118-8 ◽

2009 ◽

Vol 13 (2) ◽

pp. 157-187 ◽

Cited By ~ 6

Author(s):

Michael Bendersky ◽

Oren Kurland

Keyword(s):

Ad Hoc ◽

Document Retrieval ◽

Language Models

A new Passage Retrieval Method in Arabic Question Answering Systems

10.21203/rs.3.rs-119562/v1 ◽

2020 ◽

Author(s):

Lana Alsabbagh ◽

Oumayma AlDakkak ◽

Nada Ghneim

Keyword(s):

Query Expansion ◽

Question Answering ◽

Document Retrieval ◽

Open Domain ◽

Retrieval Method ◽

Passage Retrieval ◽

Question Analysis ◽

The Core ◽

Question Answering Systems ◽

Retrieval Phase

In this paper, we present an approach to improving the performance of open-domain Arabic Question Answering systems. We focus on the passage retrieval phase, which aims to retrieve the passages most closely related to the correct answer. To extract passages related to the question, the system passes through three phases: Question Analysis, Document Retrieval, and Passage Retrieval. We define a passage as a sentence that ends with a period ("."). In the Question Analysis phase, we apply the standard NLP steps of tokenization, removal of stop words and irrelevant symbols, and replacement of the question words with their stems. We also apply Query Expansion by adding synonyms of the question words. In the Document Retrieval phase, we use the Vector Space Model (VSM) with a TF-IDF vectorizer and cosine similarity. In the Passage Retrieval phase, which is the core of our system, we measure the similarity between each passage and the question using a combination of the BM25 ranker and a word-embedding approach. We tested our system on the ACRD dataset, which contains 1395 questions in different domains; the system achieved a precision of 92.2% and a recall of 79.9% in finding the top-3 related passages for the query.
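The passage-ranking step described in this abstract can be illustrated with a minimal sketch. It covers only the BM25 part and splits passages on the period, as the abstract defines them. The word-embedding component and the way the two scores are combined are not given in the abstract, so they are left out. The function names, the parameter values `k1=1.5` and `b=0.75`, and the whitespace tokenizer are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; the paper also
    # applies stemming and stop-word removal, which are omitted here).
    return text.lower().replace(".", " ").replace("?", " ").split()


def split_passages(document):
    # The abstract defines a passage as a sentence ending with a period.
    return [p.strip() for p in document.split(".") if p.strip()]


def bm25_scores(question, passages, k1=1.5, b=0.75):
    # Standard Okapi BM25 with a smoothed, always-positive IDF.
    docs = [tokenize(p) for p in passages]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(question))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_passages(question, document, k=3):
    # Return the k highest-scoring passages, best first.
    passages = split_passages(document)
    if not passages:
        return []
    scores = bm25_scores(question, passages)
    ranked = sorted(zip(scores, passages), key=lambda pair: -pair[0])
    return [passage for _, passage in ranked[:k]]
```

Calling `top_passages(question, document, k=3)` mirrors the top-3 setting reported in the abstract; in the full system a second, embedding-based similarity would be mixed into the score before ranking.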

Retrieving Relevant Passages Using N-grams for Open-Domain Question Answering

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213019500210 ◽

2019 ◽

Vol 28 (07) ◽

pp. 1950021

Author(s):

Rim Faiz ◽

Nouha Othman

Keyword(s):

Language Processing ◽

Question Answering ◽

Ad Hoc ◽

Direct Access ◽

Digital Information ◽

Open Domain ◽

Passage Retrieval ◽

The Subject ◽

Measure Of Similarity ◽

N Gram

Question Answering is arguably one of the toughest tasks in Natural Language Processing. It aims to return short, accurate answers directly to questions that users ask in natural language over a huge collection of documents or a database. Recently, the rapid growth of digital information has created a need for more direct access to relevant answers. Question answering has therefore attracted widespread attention and has been explored extensively over the last few years. Passage retrieval remains a crucial but challenging task in question answering. Although there has been an abundance of work on this task, it still requires non-trivial effort. In this paper, we propose an ad hoc passage retrieval approach for Question Answering using n-grams. The approach relies on a new measure of similarity between a passage and a question, based on n-gram overlap, for extracting and ranking passages. More concretely, our measure is based on the dependency degree of the question's n-gram words in the passage. We validate our approach by developing the “SysPex” system, which automatically returns the most relevant passages for a given question.
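The idea of scoring a passage by n-gram overlap with the question can be sketched as follows. The abstract does not define the "dependency degree" measure, so this is a generic stand-in rather than the SysPex measure: it counts the question n-grams (n = 1 to 3) that also occur in the passage, weights each match by its length n, and normalizes by the total weight of the question's n-grams. The function names and the `max_n=3` limit are assumptions.

```python
def ngrams(tokens, n):
    # Set of contiguous n-token tuples; empty when the text is shorter than n.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_overlap(question, passage, max_n=3):
    # Length-weighted share of question n-grams that appear in the passage.
    q_tokens = question.lower().split()
    p_tokens = passage.lower().split()
    matched = 0
    total = 0
    for n in range(1, max_n + 1):
        q_grams = ngrams(q_tokens, n)
        total += n * len(q_grams)
        matched += n * len(q_grams & ngrams(p_tokens, n))
    return matched / total if total else 0.0


def rank_passages(question, passages):
    # Order passages from most to least similar to the question.
    return sorted(passages, key=lambda p: ngram_overlap(question, p), reverse=True)
```

Weighting by n rewards passages that preserve longer word sequences from the question, which is one simple way to favor passages where the question terms appear together rather than scattered.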

Passage retrieval vs. document retrieval for factoid question answering

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '03 ◽

10.1145/860435.860534 ◽

2003 ◽

Cited By ~ 4

Author(s):

Charles L. A. Clarke ◽

Egidio L. Terra

Keyword(s):

Question Answering ◽

Document Retrieval ◽

Passage Retrieval

A Neural Passage Model for Ad-hoc Document Retrieval

Lecture Notes in Computer Science - Advances in Information Retrieval ◽

10.1007/978-3-319-76941-7_41 ◽

2018 ◽

pp. 537-543 ◽

Cited By ~ 2

Author(s):

Qingyao Ai ◽

Brendan O’Connor ◽

W. Bruce Croft

Keyword(s):

Ad Hoc ◽

Document Retrieval

Ad-hoc Document Retrieval using Weak-Supervision with BERT and GPT2

10.18653/v1/2020.emnlp-main.343 ◽

2020 ◽

Author(s):

Yosi Mass ◽

Haggai Roitman

Keyword(s):

Ad Hoc ◽

Document Retrieval ◽

Weak Supervision

Wikipedia Ad Hoc Passage Retrieval and Wikipedia Document Linking

Focused Access to XML Documents - Lecture Notes in Computer Science ◽

10.1007/978-3-540-85902-4_36 ◽

2008 ◽

pp. 426-439 ◽

Cited By ~ 2

Author(s):

Dylan Jenkinson ◽

Andrew Trotman

Keyword(s):

Ad Hoc ◽

Passage Retrieval

Attentive Neural Architecture for Ad-hoc Structured Document Retrieval

Proceedings of the 27th ACM International Conference on Information and Knowledge Management - CIKM '18 ◽

10.1145/3269206.3271801 ◽

2018 ◽

Cited By ~ 2

Author(s):

Saeid Balaneshinkordan ◽

Alexander Kotov ◽

Fedor Nikolaev

Keyword(s):

Ad Hoc ◽

Document Retrieval ◽

Neural Architecture ◽

Structured Document ◽

Structured Document Retrieval

Passage Retrieval Based on Density Distributions of Terms and Its Applications to Document Retrieval and Question Answering

Reading and Learning - Lecture Notes in Computer Science ◽

10.1007/978-3-540-24642-8_17 ◽

2004 ◽

pp. 306-327 ◽

Cited By ~ 7

Author(s):

Koichi Kise ◽

Markus Junker ◽

Andreas Dengel ◽

Keinosuke Matsumoto

Keyword(s):

Question Answering ◽

Document Retrieval ◽

Passage Retrieval ◽

Density Distributions

A Game Theoretic Analysis of the Adversarial Retrieval Setting

Journal of Artificial Intelligence Research ◽

10.1613/jair.5547 ◽

2017 ◽

Vol 60 ◽

pp. 1127-1164 ◽

Cited By ~ 3

Author(s):

Ran Ben Basat ◽

Moshe Tennenholtz ◽

Oren Kurland

Keyword(s):

Ad Hoc ◽

Document Retrieval ◽

Theoretic Analysis ◽

Information Need ◽

Document Ranking ◽

Game Theoretic Analysis ◽

Different Types ◽

Ad Hoc Retrieval ◽

Game Theoretic ◽

Do So

The main goal of search engines is ad hoc retrieval: ranking documents in a corpus by their relevance to the information need expressed by a query. The Probability Ranking Principle (PRP), which ranks documents by their relevance probabilities, is the theoretical foundation of most existing ad hoc document retrieval methods. A key observation that motivates our work is that the PRP does not account for potential post-ranking effects; specifically, changes to documents that result from a given ranking. Yet in adversarial retrieval settings such as the Web, authors may consistently try to promote their documents in rankings by changing them. We prove that the PRP can indeed be sub-optimal in adversarial retrieval settings. We do so by presenting a novel game theoretic analysis of the adversarial setting. The analysis is performed for different types of documents (single-topic and multi-topic) and is based on different assumptions about the writing qualities of the documents' authors. We show that in some cases, introducing randomization into the document ranking function yields an overall user utility that exceeds that of applying the PRP.
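The contrast between a PRP ranking and a randomized ranking can be sketched as follows. The abstract does not say how randomization is introduced, so the sampling scheme here (drawing each next position with probability proportional to the remaining documents' relevance probabilities) is purely an illustrative assumption and not the paper's ranking function. The function names are hypothetical, and all relevance probabilities are assumed to be positive.

```python
import random


def prp_rank(relevance):
    # PRP: order documents by descending relevance probability.
    return sorted(relevance, key=relevance.get, reverse=True)


def randomized_rank(relevance, rng):
    # Illustrative randomized ranking: fill positions top-down, sampling each
    # document with probability proportional to its relevance probability.
    # Assumes every probability is strictly positive.
    remaining = dict(relevance)
    ranking = []
    while remaining:
        docs = list(remaining)
        weights = [remaining[d] for d in docs]
        pick = rng.choices(docs, weights=weights, k=1)[0]
        ranking.append(pick)
        del remaining[pick]
    return ranking
```

For example, `prp_rank({"a": 0.2, "b": 0.7, "c": 0.1})` always puts `"b"` first, whereas `randomized_rank` with a `random.Random` instance puts `"b"` first most of the time but not always, so an author cannot guarantee the top position by a small edit.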
