Question Answering in Restricted Domains: An Overview

2007 ◽  
Vol 33 (1) ◽  
pp. 41-61 ◽  
Author(s):  
Diego Mollá ◽  
José Luis Vicedo

Automated question answering has been a topic of research and development since the earliest AI applications. Computing power has increased since the first such systems were developed, and the general methodology has changed from the use of hand-encoded knowledge bases about simple domains to the use of text collections as the main knowledge source over more complex domains. Still, many research issues remain. The focus of this article is on the use of restricted domains for automated question answering. The article contains a historical perspective on question answering over restricted domains and an overview of the current methods and applications used in restricted domains. A main characteristic of question answering in restricted domains is the integration of domain-specific information that is either developed for question answering or that has been developed for other purposes. We explore the main methods developed to leverage this domain-specific information.

2020 ◽  
Vol 34 (05) ◽  
pp. 9370-9377
Author(s):  
Zihan Xu ◽  
Hai-Tao Zheng ◽  
Shaopeng Zhai ◽  
Dong Wang

Semantic matching is a basic problem in natural language processing, but it is far from solved because of the differences between the pairs to be matched. In question answering (QA), answer selection (AS) is a popular semantic matching task, usually reformulated as a paraphrase identification (PI) problem. However, QA differs from PI because the question and the answer are not synonymous sentences and are not strictly comparable. In this work, a novel knowledge and cross-pair pattern guided semantic matching system (KCG) is proposed, which considers both knowledge and pattern conditions for QA. We apply explicit cross-pair matching based on a Graph Convolutional Network (GCN) to help KCG better recognize general domain-independent Q-to-A patterns. With the incorporation of domain-specific information from knowledge bases (KB), KCG is able to capture and explore various relations within Q-A pairs. Experiments show that KCG is robust against the diversity of Q-A pairs and outperforms the state-of-the-art systems on different answer selection tasks.


Author(s):  
John Kontos ◽  
Ioanna Malagardi

Question Answering (QA) is one of the branches of Artificial Intelligence (AI) that involves the processing of human language by computer. QA systems accept questions in natural language and generate answers, often in natural language. The answers are derived from databases, text collections, and knowledge bases. The main aim of QA systems is to generate a short answer to a question rather than a list of possibly relevant documents. As it becomes more and more difficult to find answers on the World Wide Web (WWW) using standard search engines, the technology of QA systems will become increasingly important. A series of systems that can answer questions from various data or knowledge sources are briefly described. These systems provide a friendly interface to the user of information systems, which is particularly important for users who are not computer experts. The line of development of ideas starts with procedural semantics and leads to interfaces that support researchers in discovering parameter values of causal models of systems under scientific study. QA systems historically developed roughly during the 1960-1970 decade (Simmons, 1970). A few of the QA systems that were implemented during this decade are:
• The BASEBALL system (Green et al., 1961)
• The FACT RETRIEVAL System (Cooper, 1964)
• The DELFI systems (Kontos & Kossidas, 1971; Kontos & Papakontantinou, 1970)


2015 ◽  
Vol 24 (02) ◽  
pp. 1540012 ◽  
Author(s):  
Pavlos Fafalios ◽  
Manolis Baritakis ◽  
Yannis Tzitzikas

Named Entity Extraction (NEE) is the process of identifying entities in texts and, very commonly, linking them to related (Web) resources. This task is useful in several applications, e.g. question answering, document annotation, and post-processing of search results. However, existing NEE tools lack open or easy configuration, although this is very important for building domain-specific applications. For example, supporting a new category of entities, or specifying how to link the detected entities with online resources, is either impossible or very laborious. In this paper, we show how semantic information (Linked Data) can be exploited in real time to configure a NEE system conveniently, and we propose a generic model for configuring such services. To explicitly define the semantics of the proposed model, we introduce an RDF/S vocabulary, called “Open NEE Configuration Model”, which allows a NEE service to describe (and publish as Linked Data) its entity mining capabilities, and also to be dynamically configured. To allow relating the output of a NEE process to the applied configuration, we propose an extension of the Open Annotation Data Model, which also enables an application to run advanced queries over the annotated data. As a proof of concept, we present X-Link, a fully configurable NEE framework that realizes this approach. In contrast to existing tools, X-Link allows the user to easily define the categories of entities that are of interest for the application at hand by exploiting one or more semantic Knowledge Bases. The user is also able to update a category and specify how to semantically link and enrich the identified entities. This enhanced configurability allows X-Link to be easily adapted to different contexts for building domain-specific applications. To test the approach, we conducted a task-based evaluation with users that demonstrates its usability, and a case study that demonstrates its feasibility.



Author(s):  
Yufei Li ◽  
Xiaoyong Ma ◽  
Xiangyu Zhou ◽  
Pengzhen Cheng ◽  
Kai He ◽  
...  

Motivation: Bio-entity coreference resolution focuses on identifying the coreferential links in biomedical texts, which is crucial for completing the attributes of bio-events and interconnecting events into bio-networks. Previously, deep neural network-based general-domain systems, among the most powerful tools available, have been applied to the biomedical domain with domain-specific information integrated. However, such methods may introduce considerable noise because they do not adequately combine context with complex domain-specific information.
Results: In this paper, we explore how to leverage an external knowledge base in a fine-grained way to better resolve coreference by introducing a knowledge-enhanced Long Short-Term Memory (LSTM) network, which encodes knowledge information inside the LSTM more flexibly. We further propose a knowledge attention module to extract informative knowledge effectively based on context. Experiments on the BioNLP and CRAFT datasets show state-of-the-art performance, with a gain of 7.5 F1 on BioNLP and 10.6 F1 on CRAFT. Additional experiments also demonstrate superior performance on cross-sentence coreferences.
Supplementary information: Supplementary data are available at Bioinformatics online.


2004 ◽  
Vol 02 (01) ◽  
pp. 215-239 ◽  
Author(s):  
TOLGA CAN ◽  
YUAN-FANG WANG

We present a new method for conducting protein structure similarity searches, which improves on the efficiency of some existing techniques. Our method is grounded in the theory of differential geometry on 3D space curve matching. We generate shape signatures for proteins that are invariant, localized, robust, compact, and biologically meaningful. The invariance of the shape signatures allows us to improve similarity searching efficiency by adopting a hierarchical coarse-to-fine strategy. We index the shape signatures using an efficient hashing-based technique. With the help of this technique we screen out unlikely candidates and perform detailed pairwise alignments only for the small number of candidates that survive the screening process. Unlike other hashing-based techniques, our technique employs domain-specific information (not just geometric information) in constructing the hash key, and hence is better tuned to the domain of biology. Furthermore, the invariance, localization, and compactness of the shape signatures allow us to utilize a well-known local sequence alignment algorithm for aligning two protein structures. One measure of the efficacy of the proposed technique is that we were able to perform structure alignment queries 36 times faster (on average) than a well-known method while keeping the quality of the query results at an approximately similar level.


Author(s):  
Uga Sproģis ◽  
Matīss Rikters

We present the Latvian Twitter Eater Corpus, a set of tweets in the narrow domain of food, drinks, eating and drinking. The corpus has been collected over a time span of more than 8 years and includes over 2 million tweets accompanied by additional useful data. We also separate out two sub-corpora: question and answer tweets, and sentiment-annotated tweets. We analyse the contents of the corpus and demonstrate use cases for the sub-corpora by training domain-specific question-answering and sentiment-analysis models using the data from the corpus.

