Differentiable Reasoning on Large Knowledge Bases and Natural Language

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5962 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5182-5190

Author(s):

Pasquale Minervini ◽

Matko Bošnjak ◽

Tim Rocktäschel ◽

Sebastian Riedel ◽

Edward Grefenstette

Keyword(s):

Natural Language ◽

Link Prediction ◽

Question Answering ◽

Knowledge Bases ◽

Small Scale ◽

Reasoning Systems ◽

Novel Approach ◽

Real World Datasets ◽

Interpretable Models ◽

Machine Reading

Reasoning with knowledge expressed in natural language and Knowledge Bases (KBs) is a major challenge for Artificial Intelligence, with applications in machine reading, dialogue, and question answering. General neural architectures that jointly learn representations and transformations of text are very data-inefficient, and it is hard to analyse their reasoning process. These issues are addressed by end-to-end differentiable reasoning systems such as Neural Theorem Provers (NTPs), although they can only be used with small-scale symbolic KBs. In this paper we first propose Greedy NTPs (GNTPs), an extension to NTPs addressing their complexity and scalability limitations, thus making them applicable to real-world datasets. This result is achieved by dynamically constructing the computation graph of NTPs and including only the most promising proof paths during inference, thus obtaining orders of magnitude more efficient models. Then, we propose a novel approach for jointly reasoning over KBs and textual mentions, by embedding logic facts and natural language sentences in a shared embedding space. We show that GNTPs perform on par with NTPs at a fraction of their cost while achieving competitive link prediction results on large datasets, providing explanations for predictions, and inducing interpretable models.
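
The idea of keeping only the most promising proof paths can be illustrated with a minimal sketch. It assumes that a goal and the KB facts are already embedded as vectors and that closeness is measured with a Gaussian (RBF) kernel; the abstract does not specify either choice, and the names `rbf_similarity` and `top_k_facts` are hypothetical, not taken from the paper.

```python
import heapq
import math


def rbf_similarity(u, v, mu=1.0):
    # Gaussian (RBF) kernel on squared Euclidean distance; value lies in (0, 1].
    # The kernel and the width mu are assumptions made for this sketch.
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2 * mu ** 2))


def top_k_facts(goal, facts, k):
    """Return the k (score, name) pairs whose embeddings are closest to the goal.

    `facts` maps a fact name to its embedding. Only these k candidates would be
    expanded further, instead of unifying the goal with every fact in the KB.
    """
    scored = [(rbf_similarity(goal, emb), name) for name, emb in facts.items()]
    return heapq.nlargest(k, scored)


# Toy embeddings, invented for illustration only.
facts = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [5.0, 5.0]}
selected = top_k_facts([0.0, 0.0], facts, k=2)
```

With `k` fixed, the number of candidates expanded per goal no longer grows with the size of the KB, which is the kind of saving the abstract describes; a real system would likely replace the linear scan with a nearest-neighbour index.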

Data mining for building knowledge bases: techniques, architectures and applications

The Knowledge Engineering Review ◽

10.1017/s0269888916000047 ◽

2016 ◽

Vol 31 (2) ◽

pp. 97-123 ◽

Cited By ~ 4

Author(s):

Alfred Krzywicki ◽

Wayne Wobcke ◽

Michael Bain ◽

John Calvo Martinez ◽

Paul Compton

Keyword(s):

Data Mining ◽

Knowledge Base ◽

Question Answering ◽

Knowledge Bases ◽

Event Extraction ◽

Data Sources ◽

Small Scale ◽

Knowledge Mining ◽

Practical Applications ◽

Unstructured Text

Data mining techniques for extracting knowledge from text have been applied extensively to applications including question answering, document summarisation, event extraction and trend monitoring. However, current methods have mainly been tested on small-scale customised data sets for specific purposes. The availability of large volumes of data and high-velocity data streams (such as social media feeds) motivates the need to automatically extract knowledge from such data sources and to generalise existing approaches to more practical applications. Recently, several architectures have been proposed for what we call knowledge mining: integrating data mining for knowledge extraction from unstructured text (possibly making use of a knowledge base), and at the same time, consistently incorporating this new information into the knowledge base. After describing a number of existing knowledge mining systems, we review the state-of-the-art literature on both current text mining methods (emphasising stream mining) and techniques for the construction and maintenance of knowledge bases. In particular, we focus on mining entities and relations from unstructured text data sources, entity disambiguation, entity linking and question answering. We conclude by highlighting general trends in knowledge mining research and identifying problems that require further research to enable more extensive use of knowledge bases.

Computational construction grammar for visual question answering

Linguistics Vanguard ◽

10.1515/lingvan-2018-0070 ◽

2019 ◽

Vol 5 (1) ◽

Author(s):

Jens Nevens ◽

Paul Van Eecke ◽

Katrien Beuls

Keyword(s):

Natural Language ◽

Question Answering ◽

Semantic Representation ◽

Construction Grammar ◽

Training Data ◽

Knowledge Sources ◽

Visual Question Answering ◽

Novel Approach ◽

Natural Language Question ◽

Grammar Model

To answer a natural language question, a computational system needs three main capabilities. First, the system needs to be able to analyze the question into a structured query, revealing its component parts and how these are combined. Second, it needs to have access to relevant knowledge sources, such as databases, texts or images. Third, it needs to be able to execute the query on these knowledge sources. This paper focuses on the first capability, presenting a novel approach to semantically parsing questions expressed in natural language. The method makes use of a computational construction grammar model for mapping questions onto their executable semantic representations. We demonstrate and evaluate the methodology on the CLEVR visual question answering benchmark task. Our system achieves 100% accuracy, effectively solving the language understanding part of the benchmark task. Additionally, we demonstrate how this solution can be embedded in a full visual question answering system, in which a question is answered by executing its semantic representation on an image. The main advantages of the approach include (i) its transparent and interpretable properties, (ii) its extensibility, and (iii) the fact that the method does not rely on any annotated training data.
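
What "executing a semantic representation" on a knowledge source can look like is sketched below. This is a toy illustration only: the operation names (`filter`, `count`), the scene format and the function `execute` are assumptions made here and do not reproduce the paper's grammar, its representation language, or the CLEVR program format.

```python
def execute(program, scene):
    """Run a list of (operation, argument) steps over a list of scene objects."""
    result = scene
    for op, arg in program:
        if op == "filter":
            # Keep only the objects whose attribute has the requested value.
            attribute, value = arg
            result = [obj for obj in result if obj.get(attribute) == value]
        elif op == "count":
            # Reduce the current set of objects to its size.
            result = len(result)
        else:
            raise ValueError(f"unknown operation: {op}")
    return result


# Hypothetical scene and a hand-written query for "How many red cubes are there?"
scene = [
    {"shape": "cube", "color": "red"},
    {"shape": "sphere", "color": "red"},
    {"shape": "cube", "color": "blue"},
]
program = [("filter", ("color", "red")), ("filter", ("shape", "cube")), ("count", None)]
answer = execute(program, scene)
```

The parsing step that the paper focuses on would produce something like `program` from the question text; here it is written by hand so that only the execution step is shown.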

Introducing External Knowledge to Answer Questions with Implicit Temporal Constraints over Knowledge Base

Future Internet ◽

10.3390/fi12030045 ◽

2020 ◽

Vol 12 (3) ◽

pp. 45

Author(s):

Wenqing Wu ◽

Zhenfang Zhu ◽

Qiang Lu ◽

Dianyuan Zhang ◽

Qiangqiang Guo

Keyword(s):

Natural Language ◽

Knowledge Base ◽

Question Answering ◽

Knowledge Bases ◽

Temporal Information ◽

Temporal Constraints ◽

External Knowledge ◽

Question Answering Systems ◽

Natural Language Question ◽

Applied Knowledge

Knowledge base question answering (KBQA) aims to analyze the semantics of natural language questions and return accurate answers from the knowledge base (KB). More and more studies have applied knowledge bases to question answering systems, and when using a KB to answer a natural language question, there are some words that imply the tense (e.g., original and previous) and play a limiting role in questions. However, most existing methods for KBQA cannot model a question with implicit temporal constraints. In this work, we propose a model based on a bidirectional attentive memory network, which obtains the temporal information in the question through attention mechanisms and external knowledge. Specifically, we encode the external knowledge as vectors, and use additive attention between the question and external knowledge to obtain the temporal information, then further enhance the question vector to increase the accuracy. On the WebQuestions benchmark, our method not only performs better with the overall data, but also has excellent performance regarding questions with implicit temporal constraints, which are separate from the overall data. As we use attention mechanisms, our method also offers better interpretability.
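
Additive attention between a question vector and a set of external-knowledge vectors can be sketched as follows. This is the standard additive (tanh-based) scoring form written in plain Python; the parameter names `w_q`, `w_k`, `v`, the dimensions and the toy values are assumptions made for illustration and do not come from the paper.

```python
import math


def matvec(matrix, vector):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def additive_attention(question, knowledge, w_q, w_k, v):
    """Score each knowledge vector with v . tanh(W_q q + W_k k), then pool.

    Returns the attention weights and the weighted sum of knowledge vectors.
    """
    proj_q = matvec(w_q, question)
    scores = []
    for item in knowledge:
        proj_k = matvec(w_k, item)
        hidden = [math.tanh(a + b) for a, b in zip(proj_q, proj_k)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(knowledge[0])
    context = [sum(w * item[d] for w, item in zip(weights, knowledge)) for d in range(dim)]
    return weights, context


# Toy parameters: identity projections and an all-ones scoring vector.
identity = [[1.0, 0.0], [0.0, 1.0]]
weights, context = additive_attention(
    [1.0, 0.0], [[1.0, 0.0], [-1.0, 0.0]], identity, identity, [1.0, 1.0]
)
```

The resulting `context` vector is what would be combined with the question representation; how the paper performs that combination is not stated in the abstract, so it is left out here.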

A Composite Natural Language Processing and Information Retrieval Approach to Question Answering Using a Structured Knowledge Base

International Journal of Semantic Computing ◽

10.1142/s1793351x17400141 ◽

2017 ◽

Vol 11 (03) ◽

pp. 345-371

Author(s):

Avani Chandurkar ◽

Ajay Bansal

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Knowledge Base ◽

Language Processing ◽

Question Answering ◽

Automated System ◽

Free Form ◽

Question Answering System ◽

Novel Approach

Since the inception of the World Wide Web, the amount of data present on the Internet has become tremendous, which makes navigating it quite difficult for the user. As users struggle to navigate through this wealth of information, the need for an automated system that can extract the required information becomes urgent. This paper presents a Question Answering system to ease the process of information retrieval. Question Answering systems have been around for quite some time and are a sub-field of information retrieval and natural language processing. The task of any Question Answering system is to seek an answer to a free-form factual question. The difficulty of pinpointing and verifying the precise answer makes question answering more challenging than the simple information retrieval done by search engines. The research objective of this paper is to develop a novel approach to Question Answering based on a composition of conventional approaches from Information Retrieval (IR) and Natural Language Processing (NLP). The focus is on using a structured and annotated knowledge base instead of an unstructured one. The knowledge base used here is DBpedia and the final system is evaluated on the Text REtrieval Conference (TREC) 2004 questions dataset.

Knowledge bases and description logics applications to natural language texts analysis

PROBLEMS IN PROGRAMMING ◽

10.15407/pp2020.02-03.259 ◽

2020 ◽

pp. 259-269

Author(s):

H.I. Hoherchak

Keyword(s):

Natural Language ◽

Language Processing ◽

Question Answering ◽

Semantic Analysis ◽

Description Logics ◽

Knowledge Bases ◽

Parts Of Speech ◽

Resolution Problem ◽

Concepts Of Knowledge ◽

Speech Tagging

The article describes some ways of applying knowledge bases to the analysis of natural language texts and to solving some of their processing tasks. The basic problems of natural language processing that underlie semantic analysis are considered: tokenization, part-of-speech tagging, dependency parsing, and coreference resolution. The basic concepts of knowledge base theory are presented, and an approach to populating knowledge bases based on the Universal Dependencies framework and the coreference resolution problem is proposed. Examples are given of applying knowledge bases populated from natural language texts to practical problems, including checking constructed syntactic and semantic models for consistency and question answering.

Evaluation of Single-Span Models on Extractive Multi-Span Question-Answering

International journal of Web & Semantic Technology ◽

10.5121/ijwest.2021.12102 ◽

2021 ◽

Vol 12 (1) ◽

pp. 19-29

Author(s):

Marie-Anne Xu ◽

Rahul Khanna

Keyword(s):

Reading Comprehension ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Future Development ◽

Question Answering ◽

Consistent Performance ◽

Machine Reading ◽

Entire Dataset

Machine Reading Comprehension (MRC), particularly extractive closed-domain question-answering, is a prominent field in Natural Language Processing (NLP). Given a question and a passage or set of passages, a machine must be able to extract the appropriate answer from the passage(s). However, the majority of these existing questions have only one answer, and more substantial testing on questions with multiple answers, or multi-span questions, has not yet been applied. Thus, we introduce a newly compiled dataset consisting of questions with multiple answers that originate from previously existing datasets. In addition, we run BERT-based models pre-trained for question-answering on our constructed dataset to evaluate their reading comprehension abilities. Runtime of base models on the entire dataset is approximately one day while the runtime for all models on a third of the dataset is a little over two days. Among the three BERT-based models we ran, RoBERTa exhibits the highest consistent performance, regardless of size. We find that all our models perform similarly on this new, multi-span dataset compared to the single-span source datasets. While the models tested on the source datasets were slightly fine-tuned in order to return multiple answers, performance is similar enough to judge that task formulation does not drastically affect question-answering abilities. Our evaluations indicate that these models are indeed capable of adjusting to answer questions that require multiple answers. We hope that our findings will assist future development in question-answering and improve existing question-answering products and methods.

Special Thematic Section on Semantic Models for Natural Language Processing (Preface)

Cybernetics and Information Technologies ◽

10.2478/cait-2018-0008 ◽

2018 ◽

Vol 18 (1) ◽

pp. 93-94

Author(s):

Kiril Simov ◽

Petya Osenova

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Question Answering ◽

Open Data ◽

Knowledge Bases ◽

Lexical Resources ◽

Text Simplification ◽

Semantic Models ◽

Semantic Resources

With the availability of large language data online, cross-linked lexical resources (such as BabelNet, Predicate Matrix and UBY) and semantically annotated corpora (SemCor, OntoNotes, etc.), more and more applications in Natural Language Processing (NLP) have started to exploit various semantic models. The semantic models have been created on the basis of LSA, clustering, word embeddings, deep learning, neural networks, etc., and abstract logical forms, such as Minimal Recursion Semantics (MRS) or Abstract Meaning Representation (AMR). Additionally, the Linguistic Linked Open Data Cloud (LLOD Cloud) has been initiated, which interlinks linguistic data for improving the tasks of NLP. This cloud has been expanding enormously for the last four to five years. It includes corpora, lexicons, thesauri, and knowledge bases of various kinds, organized around appropriate ontologies, such as LEMON. The semantic models behind the data organization as well as the representation of the semantic resources themselves are a challenge to the NLP community. The NLP applications that rely extensively on the models discussed above include Machine Translation, Information Extraction, Question Answering, Text Simplification, etc.

A Knowledge-Based Sense Disambiguation Method to Semantically Enhanced NL Question for Restricted Domain

Information ◽

10.3390/info12110452 ◽

2021 ◽

Vol 12 (11) ◽

pp. 452

Author(s):

Ammar Arbaaeen ◽

Asadullah Shah

Keyword(s):

Natural Language ◽

Language Processing ◽

Question Answering ◽

Word Sense Disambiguation ◽

Knowledge Bases ◽

Word Sense ◽

Intended Meaning ◽

Lexical Semantic ◽

Knowledge Based ◽

Sense Disambiguation

Within the space of question answering (QA) systems, the most critical module for improving overall performance is question analysis. Extracting the lexical semantics of a Natural Language (NL) question presents challenges at syntactic and semantic levels for most QA systems. This is due to the difference between the words posed by a user and the terms presently stored in the knowledge bases. Many studies have achieved encouraging results in lexical semantic resolution on the topic of word sense disambiguation (WSD), and several other works consider these challenges in the context of QA applications. Additionally, few scholars have examined the role of WSD in returning potential answers corresponding to particular questions. However, natural language processing (NLP) still faces several challenges in determining the precise meaning of various ambiguities. Therefore, the motivation of this work is to propose a novel knowledge-based sense disambiguation (KSD) method for resolving the problem of lexical ambiguity associated with questions posed in QA systems. The major contribution is the proposed method, which incorporates multiple knowledge sources, namely the question’s metadata (date/GPS), context knowledge, and a domain ontology, into shallow NLP. The proposed KSD method is developed into a tool for a mobile QA application that aims to determine the intended meaning of questions expressed by pilgrims. The experimental results reveal that our method obtained comparable or better accuracy than the baselines in the context of the pilgrimage domain.

Software architecture of the question-answering subsystem with elements of self-learning

Artificial Intelligence ◽

10.15407/jai2021.02.088 ◽

2021 ◽

Vol 26 (2) ◽

pp. 88-95

Author(s):

Hlybovets A ◽

Tsaruk A

Keyword(s):

Natural Language ◽

Language Processing ◽

Speech Synthesis ◽

Question Answering ◽

Building Blocks ◽

Knowledge Bases ◽

Learning Technologies ◽

Software Systems ◽

Question Answering Systems ◽

Self Learning

This paper analyses question-answering software systems and their basic architectures. With the development of machine learning technologies, the creation of natural language processing (NLP) engines, and the rising popularity of virtual personal assistants that use speech synthesis (text-to-speech), there is a growing need for question-answering systems that can provide personalized answers to users' questions. All modern cloud providers offer frameworks for building question-answering systems, but personalized dialogs remain a problem. Personalization is very important: it places additional demands on a question-answering system’s ability to take this information into account while processing users’ questions. Traditionally, a question-answering system (QAS) is developed as an application that contains a knowledge base and a user interface, which provides a user with answers to questions, and a means of interaction with an expert. In this article we analyze modern approaches to architecture development and try to build a system from building blocks that already exist on the market. The main criteria for the NLP modules were: support for the Ukrainian language, natural language understanding, automatic identification of entities (attributes), the ability to construct a dialogue flow, quality and completeness of documentation, API capabilities and integration with external systems, and the possibility of integrating external knowledge bases. Based on this analysis, the article proposes a detailed architecture for a question-answering subsystem with elements of self-learning in the Ukrainian language. The work gives a detailed description of the main semantic components of the system (architecture components).

Question Answering from Procedural Semantics to Model Discovery

Encyclopedia of Human Computer Interaction ◽

10.4018/978-1-59140-562-7.ch072 ◽

2006 ◽

pp. 479-485

Author(s):

John Kontos ◽

Ioanna Malagardi

Keyword(s):

Natural Language ◽

World Wide ◽

Question Answering ◽

Knowledge Bases ◽

Procedural Semantics ◽

Short Answer ◽

Text Collections ◽

The World ◽

Fact Retrieval ◽

Parameter Values

Question Answering (QA) is one of the branches of Artificial Intelligence (AI) that involves the processing of human language by computer. QA systems accept questions in natural language and generate answers often in natural language. The answers are derived from databases, text collections, and knowledge bases. The main aim of QA systems is to generate a short answer to a question rather than a list of possibly relevant documents. As it becomes more and more difficult to find answers on the World Wide Web (WWW) using standard search engines, the technology of QA systems will become increasingly important. A series of systems that can answer questions from various data or knowledge sources are briefly described. These systems provide a friendly interface to the user of information systems that is particularly important for users who are not computer experts. The line of development of ideas starts with procedural semantics and leads to interfaces that support researchers for the discovery of parameter values of causal models of systems under scientific study. QA systems historically developed roughly during the 1960-1970 decade (Simmons, 1970). A few of the QA systems that were implemented during this decade are:

• The BASEBALL system (Green et al., 1961)
• The FACT RETRIEVAL System (Cooper, 1964)
• The DELFI systems (Kontos & Kossidas, 1971; Kontos & Papakontantinou, 1970)
