Inherent Disagreements in Human Textual Inferences

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00293 ◽

2019 ◽

Vol 7 ◽

pp. 677-694

Author(s):

Ellie Pavlick ◽

Tom Kwiatkowski

Keyword(s):

Natural Language ◽

State Of The Art ◽

Current State ◽

Textual Entailment ◽

Recognizing Textual Entailment

We analyze humans’ disagreements about the validity of natural language inferences. We show that, very often, disagreements are not dismissible as annotation “noise”, but rather persist as we collect more ratings and as we vary the amount of context provided to raters. We further show that the type of uncertainty captured by current state-of-the-art models for natural language inference is not reflective of the type of uncertainty present in human disagreements. We discuss implications of our results in relation to the recognizing textual entailment (RTE)/natural language inference (NLI) task. We argue for a refined evaluation objective that requires models to explicitly capture the full distribution of plausible human judgments.
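As a rough illustration of what scoring a model against a distribution of human judgments (rather than a single gold label) might look like, here is a minimal sketch. It is not the authors' method: the three-way label set, the example ratings, the model outputs, and the helper names `human_distribution` and `kl_divergence` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical three-way label set; the paper's own rating scheme may differ.
LABELS = ("entailment", "neutral", "contradiction")


def human_distribution(ratings, labels=LABELS):
    """Turn a list of per-rater labels into an empirical distribution."""
    counts = Counter(ratings)
    total = len(ratings)
    return [counts[label] / total for label in labels]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Made-up ratings from five raters for one premise/hypothesis pair.
ratings = ["entailment", "entailment", "neutral", "neutral", "contradiction"]
human = human_distribution(ratings)

# Two made-up model outputs over the same label order.
confident_model = [0.98, 0.01, 0.01]   # puts nearly all mass on one label
matching_model = [0.4, 0.4, 0.2]       # mirrors the spread of human ratings

print(kl_divergence(human, confident_model))
print(kl_divergence(human, matching_model))
```

Under this kind of objective, a model that is confidently "correct" by majority vote can still score worse than one whose predicted probabilities mirror the spread of rater opinions.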

Combining Axiom Injection and Knowledge Base Completion for Efficient Natural Language Inference

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33017410 ◽

2019 ◽

Vol 33 ◽

pp. 7410-7417 ◽

Cited By ~ 1

Author(s):

Masashi Yoshikawa ◽

Koji Mineshima ◽

Hiroshi Noji ◽

Daisuke Bekki

Keyword(s):

Natural Language ◽

Knowledge Base ◽

Processing Speed ◽

Processing Time ◽

State Of The Art ◽

Proof Automation ◽

New Knowledge ◽

Textual Entailment ◽

Amount Of Knowledge ◽

Recognizing Textual Entailment

In logic-based approaches to reasoning tasks such as Recognizing Textual Entailment (RTE), it is important for a system to have a large amount of knowledge data. However, there is a tradeoff between adding more knowledge data for improved RTE performance and maintaining an efficient RTE system, as such a big database is problematic in terms of the memory usage and computational complexity. In this work, we show the processing time of a state-of-the-art logic-based RTE system can be significantly reduced by replacing its search-based axiom injection (abduction) mechanism by that based on Knowledge Base Completion (KBC). We integrate this mechanism in a Coq plugin that provides a proof automation tactic for natural language inference. Additionally, we show empirically that adding new knowledge data contributes to better RTE performance while not harming the processing speed in this framework.

SANTM: Efficient Self-attention-driven Network for Text Matching

ACM Transactions on Internet Technology ◽

10.1145/3426971 ◽

2022 ◽

Vol 22 (3) ◽

pp. 1-21

Author(s):

Prayag Tiwari ◽

Amit Kumar Jaiswal ◽

Sahil Garg ◽

Ilsun You

Keyword(s):

Natural Language ◽

State Of The Art ◽

The State ◽

Attention Mechanism ◽

Matching Problems ◽

Attention Model ◽

Extra Information ◽

Textual Entailment ◽

Benchmark Datasets ◽

Text Matching

Self-attention mechanisms have recently been embraced for a broad range of text-matching applications. A self-attention model takes only one sentence as input with no extra information, i.e., one can utilize the final hidden state or pooling. However, text-matching problems can be interpreted in either symmetrical or asymmetrical scopes. For instance, paraphrase detection is a symmetrical task, while textual entailment classification and question-answer matching are considered asymmetrical tasks. In this article, we leverage attractive properties of the self-attention mechanism and propose an attention-based network that incorporates three key components for inter-sequence attention: global pointwise features, preceding attentive features, and contextual features while updating the rest of the components. We evaluate our model on two benchmark datasets covering the tasks of textual entailment and question-answer matching. The proposed efficient Self-attention-driven Network for Text Matching outperforms the state of the art on the Stanford Natural Language Inference and WikiQA datasets with far fewer parameters.

Textual entailment graphs

Natural Language Engineering ◽

10.1017/s1351324915000108 ◽

2015 ◽

Vol 21 (5) ◽

pp. 699-724 ◽

Cited By ~ 6

Author(s):

LILI KOTLERMAN ◽

IDO DAGAN ◽

BERNARDO MAGNINI ◽

LUISA BENTIVOGLI

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Gold Standard ◽

State Of The Art ◽

Text Analytics ◽

Joint Work ◽

Gold Standard Dataset ◽

Textual Entailment ◽

Interesting Task

In this work, we present a novel type of graphs for natural language processing (NLP), namely textual entailment graphs (TEGs). We describe the complete methodology we developed for the construction of such graphs and provide some baselines for this task by evaluating relevant state-of-the-art technology. We situate our research in the context of text exploration, since it was motivated by joint work with industrial partners in the text analytics area. Accordingly, we present our motivating scenario and the first gold-standard dataset of TEGs. However, while our own motivation and the dataset focus on the text exploration setting, we suggest that TEGs can have different usages and suggest that automatic creation of such graphs is an interesting task for the community.

Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

Journal of Artificial Intelligence Research ◽

10.1613/jair.5477 ◽

2018 ◽

Vol 61 ◽

pp. 65-170 ◽

Cited By ~ 68

Author(s):

Albert Gatt ◽

Emiel Krahmer

Keyword(s):

Natural Language ◽

State Of The Art ◽

Natural Language Generation ◽

Data Driven ◽

Research Topics ◽

Language Generation ◽

The Past ◽

Current State ◽

Linguistic Input ◽

New Applications

This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past two decades, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of NLP, with an emphasis on different evaluation methods and the relationships between them.

Natural language interfaces to databases – an introduction

Natural Language Engineering ◽

10.1017/s135132490000005x ◽

1995 ◽

Vol 1 (1) ◽

pp. 29-81 ◽

Cited By ~ 283

Author(s):

I. Androutsopoulos ◽

G.D. Ritchie ◽

P. Thanisch

Keyword(s):

Natural Language ◽

Computational Linguistics ◽

State Of The Art ◽

Query Languages ◽

Natural Language Interfaces ◽

Advantages And Disadvantages ◽

Current State ◽

Database Updates ◽

Graphical Interfaces ◽

History Of

This paper is an introduction to natural language interfaces to databases (NLIDBS). A brief overview of the history of NLIDBS is first given. Some advantages and disadvantages of NLIDBS are then discussed, comparing NLIDBS to formal query languages, form-based interfaces, and graphical interfaces. An introduction to some of the linguistic problems NLIDBS have to confront follows, for the benefit of readers less familiar with computational linguistics. The discussion then moves on to NLIDB architectures, portability issues, restricted natural language input systems (including menu-based NLIDBS), and NLIDBS with reasoning capabilities. Some less explored areas of NLIDB research are then presented, namely database updates, meta-knowledge questions, temporal questions, and multi-modal NLIDBS. The paper ends with reflections on the current state of the art.

Discourse structure and language technology

Natural Language Engineering ◽

10.1017/s1351324911000337 ◽

2011 ◽

Vol 18 (4) ◽

pp. 437-490 ◽

Cited By ~ 34

Author(s):

B. WEBBER ◽

M. EGG ◽

V. KORDONI

Keyword(s):

Natural Language ◽

State Of The Art ◽

Discourse Structure ◽

Algorithm Performance ◽

Language Engineering ◽

Language Technology ◽

Current State ◽

Technology Applications ◽

Formal Properties

An increasing number of researchers and practitioners in Natural Language Engineering face the prospect of having to work with entire texts, rather than individual sentences. While it is clear that text must have useful structure, its nature may be less clear, making it more difficult to exploit in applications. This survey of work on discourse structure thus provides a primer on the bases of which discourse is structured along with some of their formal properties. It then lays out the current state-of-the-art with respect to algorithms for recognizing these different structures, and how these algorithms are currently being used in Language Technology applications. After identifying resources that should prove useful in improving algorithm performance across a range of languages, we conclude by speculating on future discourse structure-enabled technology.

Natural language interfaces to databases

The Knowledge Engineering Review ◽

10.1017/s0269888900005476 ◽

1990 ◽

Vol 5 (4) ◽

pp. 225-249 ◽

Cited By ~ 52

Author(s):

Ann Copestake ◽

Karen Sparck Jones

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

State Of The Art ◽

Central Process ◽

Current State ◽

Natural Language Question ◽

The One ◽

Language Question ◽

And Task

This paper reviews the current state of the art in natural language access to databases. This has been a long-standing area of work in natural language processing. But though some commercial systems are now available, providing front ends has proved much harder than was expected, and the necessary limitations on front ends have to be recognized. The paper discusses the issues, both general to language and task-specific, involved in front end design, and the way these have been addressed, concentrating on the work of the last decade. The focus is on the central process of translating a natural language question into a database query, but other supporting functions are also covered. The points are illustrated by the use of a single example application. The paper concludes with an evaluation of the current state, indicating that future progress will depend on the one hand on general advances in natural language processing, and on the other on expanding the capabilities of traditional databases.

Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6311 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8018-8025 ◽

Cited By ~ 2

Author(s):

Di Jin ◽

Zhijing Jin ◽

Joey Tianyi Zhou ◽

Peter Szolovits

Keyword(s):

Machine Learning ◽

Natural Language ◽

Text Classification ◽

Recurrent Neural Networks ◽

State Of The Art ◽

Semantic Content ◽

Machine Learning Algorithms ◽

Textual Entailment ◽

Text Length ◽

Adversarial Examples

Machine learning algorithms are often vulnerable to adversarial examples that have imperceptible alterations from the original counterparts but can fool the state-of-the-art models. It is helpful to evaluate or even improve the robustness of these models by exposing the maliciously crafted adversarial examples. In this paper, we present TextFooler, a simple but strong baseline to generate adversarial text. By applying it to two fundamental natural language tasks, text classification and textual entailment, we successfully attacked three target models, including the powerful pre-trained BERT, and the widely used convolutional and recurrent neural networks. We demonstrate three advantages of this framework: (1) effective: it outperforms previous attacks by success rate and perturbation rate, (2) utility-preserving: it preserves semantic content, grammaticality, and correct types classified by humans, and (3) efficient: it generates adversarial text with computational complexity linear to the text length.

Recognizing textual entailment: Rational, evaluation and approaches – Erratum

Natural Language Engineering ◽

10.1017/s1351324909990234 ◽

2010 ◽

Vol 16 (1) ◽

pp. 105-105 ◽

Cited By ~ 27

Author(s):

IDO DAGAN ◽

BILL DOLAN ◽

BERNARDO MAGNINI ◽

DAN ROTH

Keyword(s):

Natural Language ◽

Language Engineering ◽

Textual Entailment ◽

Recognizing Textual Entailment ◽

Rational Evaluation

Due to publisher error, this article was omitted from the printed issue of Natural Language Engineering volume 15 issue 4. It is published online in the correct volume (journals.cambridge.org/nle) and also printed here in volume 16 issue 1. Sincere apologies are extended to the authors for this error.

Knowledge-Based Textual Inference via Parse-Tree Transformations

Journal of Artificial Intelligence Research ◽

10.1613/jair.4584 ◽

2015 ◽

Vol 54 ◽

pp. 1-57 ◽

Cited By ~ 2

Author(s):

Roy Bar-Haim ◽

Ido Dagan ◽

Jonathan Berant

Keyword(s):

Natural Language ◽

Relation Extraction ◽

Practical Applications ◽

Knowledge Based ◽

Tree Transformations ◽

Textual Entailment ◽

Automatic Methods ◽

Parse Trees ◽

Recognizing Textual Entailment ◽

Meaning Representation

Textual inference is an important component in many applications for understanding natural language. Classical approaches to textual inference rely on logical representations for meaning, which may be regarded as "external" to the natural language itself. However, practical applications usually adopt shallower lexical or lexical-syntactic representations, which correspond closely to language structure. In many cases, such approaches lack a principled meaning representation and inference framework. We describe an inference formalism that operates directly on language-based structures, particularly syntactic parse trees. New trees are generated by applying inference rules, which provide a unified representation for varying types of inferences. We use manual and automatic methods to generate these rules, which cover generic linguistic structures as well as specific lexical-based inferences. We also present a novel packed data-structure and a corresponding inference algorithm that allows efficient implementation of this formalism. We proved the correctness of the new algorithm and established its efficiency analytically and empirically. The utility of our approach was illustrated on two tasks: unsupervised relation extraction from a large corpus, and the Recognizing Textual Entailment (RTE) benchmarks.
