Chapter 13. Generalizable Neuro-Symbolic Systems for Commonsense Question Answering

This chapter illustrates how suitable neuro-symbolic models for language understanding can enable domain generalizability and robustness in downstream tasks. Different methods for integrating neural language models and knowledge graphs are discussed. The situations in which this combination is most appropriate are characterized, including quantitative evaluation and qualitative error analysis on a variety of commonsense question answering benchmark datasets.

QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

10.18653/v1/2021.naacl-main.45 ◽

2021 ◽

Author(s):

Michihiro Yasunaga ◽

Hongyu Ren ◽

Antoine Bosselut ◽

Percy Liang ◽

Jure Leskovec

Keyword(s):

Question Answering ◽

Language Models ◽

Knowledge Graphs

What is in the KGQA Benchmark Datasets? Survey on Challenges in Datasets for Question Answering on Knowledge Graphs

Journal on Data Semantics ◽

10.1007/s13740-021-00128-9 ◽

2021 ◽

Author(s):

Nadine Steinmetz ◽

Kai-Uwe Sattler

Keyword(s):

Natural Language ◽

Question Answering ◽

Research Evaluation ◽

Benchmark Dataset ◽

Complex Queries ◽

Benchmark Datasets ◽

Knowledge Graphs

Question Answering based on Knowledge Graphs (KGQA) still faces difficult challenges when transforming natural language (NL) to SPARQL queries. Simple questions only referring to one triple are answerable by most QA systems, but more complex questions requiring complex queries containing subqueries or several functions are still a tough challenge within this field of research. Evaluation results of QA systems therefore also might depend on the benchmark dataset the system has been tested on. For the purpose to give an overview and reveal specific characteristics, we examined currently available KGQA datasets regarding several challenging aspects. This paper presents a detailed look into the datasets and compares them in terms of challenges a KGQA system is facing.
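
The difference between a single-triple question and a question that needs aggregation can be illustrated with a minimal sketch. It is not taken from the paper: the triples, the `wrote` predicate, and the `match` helper are hypothetical, and plain Python pattern matching stands in for SPARQL.

```python
from collections import Counter

# Hypothetical toy knowledge graph of (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "wrote", "book1"),
    ("alice", "wrote", "book2"),
    ("bob", "wrote", "book3"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples that agree with the given pattern; None is a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Simple question, "Who wrote book3?": a single triple pattern is enough.
simple_answer = [s for s, _, _ in match(TRIPLES, p="wrote", o="book3")]

# More complex question, "Who wrote the most books?": the same pattern must be
# grouped by subject, counted, and ordered, which corresponds to aggregate
# functions in a query language.
books_per_author = Counter(s for s, _, _ in match(TRIPLES, p="wrote"))
most_prolific = books_per_author.most_common(1)[0][0]

print(simple_answer, most_prolific)
```

The second question needs a grouping step and an ordering step on top of the lookup, which is the kind of added query structure the abstract describes as hard for KGQA systems.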

Natural language understanding and the perspectives of question answering

10.3115/991813.991871 ◽

1982 ◽

Author(s):

Petr Sgall

Keyword(s):

Natural Language ◽

Question Answering ◽

Natural Language Understanding ◽

Language Understanding

Towards corpus and model: Hierarchical structured-attention-based features for Indonesian named entity recognition

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202286 ◽

2021 ◽

pp. 1-12

Author(s):

Yingwen Fu ◽

Nankai Lin ◽

Xiaotian Lin ◽

Shengyi Jiang

Keyword(s):

Language Processing ◽

State Of The Art ◽

Named Entity Recognition ◽

Entity Recognition ◽

Language Models ◽

Neural Models ◽

Performance Models ◽

Named Entity ◽

High Resource ◽

Benchmark Datasets

Named entity recognition (NER) is fundamental to natural language processing (NLP). Most state-of-the-art researches on NER are based on pre-trained language models (PLMs) or classic neural models. However, these researches are mainly oriented to high-resource languages such as English. While for Indonesian, related resources (both in dataset and technology) are not yet well-developed. Besides, affix is an important word composition for Indonesian language, indicating the essentiality of character and token features for token-wise Indonesian NLP tasks. However, features extracted by currently top-performance models are insufficient. Aiming at Indonesian NER task, in this paper, we build an Indonesian NER dataset (IDNER) comprising over 50 thousand sentences (over 670 thousand tokens) to alleviate the shortage of labeled resources in Indonesian. Furthermore, we construct a hierarchical structured-attention-based model (HSA) for Indonesian NER to extract sequence features from different perspectives. Specifically, we use an enhanced convolutional structure as well as an enhanced attention structure to extract deeper features from characters and tokens. Experimental results show that HSA establishes competitive performance on IDNER and three benchmark datasets.

Intent Detection and Slot Filling with Capsule Net Architectures for a Romanian Home Assistant

Sensors ◽

10.3390/s21041230 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1230

Author(s):

Anda Stoica ◽

Tibor Kadar ◽

Camelia Lemnaru ◽

Rodica Potolea ◽

Mihaela Dînşoreanu

Keyword(s):

Neural Network ◽

Error Analysis ◽

Natural Language ◽

Network Model ◽

Network Architecture ◽

Natural Language Understanding ◽

Wide Spread ◽

Neural Network Architecture ◽

Language Understanding ◽

Slot Filling

As virtual home assistants are becoming more popular, there is an emerging need for supporting languages other than English. While more wide-spread or popular languages such as Spanish, French or Hindi are already integrated into existing home assistants like Google Home or Alexa, integration of other less-known languages such as Romanian is still missing. This paper explores the problem of Natural Language Understanding (NLU) applied to a Romanian home assistant. We propose a customized capsule neural network architecture that performs intent detection and slot filling in a joint manner and we evaluate how well it handles utterances containing various levels of complexity. The capsule network model shows a significant improvement in intent detection when compared to models built using the well-known Rasa NLU tool. Through error analysis, we observe clear error patterns that occur systematically. Variability in language when expressing one intent proves to be the biggest challenge encountered by the model.

A model for quantitative evaluation of an end-to-end question-answering system

Journal of the American Society for Information Science and Technology ◽

10.1002/asi.20560 ◽

2007 ◽

Vol 58 (8) ◽

pp. 1082-1099 ◽

Cited By ~ 17

Author(s):

Nina Wacholder ◽

Diane Kelly ◽

Paul Kantor ◽

Robert Rittman ◽

Ying Sun ◽

...

Keyword(s):

Quantitative Evaluation ◽

Question Answering ◽

Question Answering System ◽

End To End

Astrid

Proceedings of the VLDB Endowment ◽

10.14778/3436905.3436907 ◽

2020 ◽

Vol 14 (4) ◽

pp. 471-484

Author(s):

Suraj Shetiya ◽

Saravanan Thirumuruganathan ◽

Nick Koudas ◽

Gautam Das

Keyword(s):

Deep Learning ◽

Objective Function ◽

Pattern Matching ◽

Language Processing ◽

Language Model ◽

Language Models ◽

Selectivity Estimation ◽

Statistical Correlations ◽

Benchmark Datasets ◽

Traditional Approaches

Accurate selectivity estimation for string predicates is a long-standing research challenge in databases. Supporting pattern matching on strings (such as prefix, substring, and suffix) makes this problem much more challenging, thereby necessitating a dedicated study. Traditional approaches often build pruned summary data structures such as tries followed by selectivity estimation using statistical correlations. However, this produces insufficiently accurate cardinality estimates resulting in the selection of sub-optimal plans by the query optimizer. Recently proposed deep learning based approaches leverage techniques from natural language processing such as embeddings to encode the strings and use it to train a model. While this is an improvement over traditional approaches, there is a large scope for improvement. We propose Astrid, a framework for string selectivity estimation that synthesizes ideas from traditional and deep learning based approaches. We make two complementary contributions. First, we propose an embedding algorithm that is query-type (prefix, substring, and suffix) and selectivity aware. Consider three strings 'ab', 'abc' and 'abd' whose prefix frequencies are 1000, 800 and 100 respectively. Our approach would ensure that the embedding for 'ab' is closer to 'abc' than 'abd'. Second, we describe how neural language models could be used for selectivity estimation. While they work well for prefix queries, their performance for substring queries is sub-optimal. We modify the objective function of the neural language model so that it could be used for estimating selectivities of pattern matching queries. We also propose a novel and efficient algorithm for optimizing the new objective function. We conduct extensive experiments over benchmark datasets and show that our proposed approaches achieve state-of-the-art results.
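
The prefix-frequency example in the abstract can be reproduced with exact counting. The sketch below is only an illustration of the quantities being estimated: it is not Astrid's learned estimator or embedding algorithm, and the toy collection and the `pattern_counts` helper are assumptions chosen so that the counts match the numbers quoted above.

```python
from collections import Counter


def pattern_counts(strings, kind):
    """Count how many strings contain each pattern of the given kind.

    kind is one of "prefix", "suffix", or "substring". Each string
    contributes at most once per pattern.
    """
    counts = Counter()
    for s in strings:
        n = len(s)
        if kind == "prefix":
            patterns = {s[:i] for i in range(1, n + 1)}
        elif kind == "suffix":
            patterns = {s[i:] for i in range(n)}
        elif kind == "substring":
            patterns = {s[i:j] for i in range(n) for j in range(i + 1, n + 1)}
        else:
            raise ValueError(f"unknown pattern kind: {kind}")
        counts.update(patterns)
    return counts


# Hypothetical collection built to match the abstract's numbers.
collection = ["abc"] * 800 + ["abd"] * 100 + ["abe"] * 100
prefix = pattern_counts(collection, "prefix")

freq_ab, freq_abc, freq_abd = prefix["ab"], prefix["abc"], prefix["abd"]

# 'ab' is much closer in frequency to 'abc' than to 'abd'; a
# selectivity-aware embedding is meant to reflect this kind of gap.
gap_to_abc = abs(freq_ab - freq_abc)
gap_to_abd = abs(freq_ab - freq_abd)

print(freq_ab, freq_abc, freq_abd, gap_to_abc, gap_to_abd)
```

Exact counting like this enumerates every substring of every string, so its cost grows quickly with string length and collection size. That cost is one reason to prefer the summary structures and learned models the abstract discusses.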

Audio-aware Spoken Multiple-choice Question Answering with Pre-trained Language Models

IEEE/ACM Transactions on Audio Speech and Language Processing ◽

10.1109/taslp.2021.3120638 ◽

2021 ◽

pp. 1-1

Author(s):

Chia-Chih Kuo ◽

Kuan-Yu Chen ◽

Shang-Bao Luo

Keyword(s):

Question Answering ◽

Multiple Choice ◽

Multiple Choice Question ◽

Language Models

A Survey of Techniques for Constructing Chinese Knowledge Graphs and Their Applications

Sustainability ◽

10.3390/su10093245 ◽

2018 ◽

Vol 10 (9) ◽

pp. 3245 ◽

Cited By ~ 7

Author(s):

Tianxing Wu ◽

Guilin Qi ◽

Cheng Li ◽

Meng Wang

Keyword(s):

Artificial Intelligence ◽

Question Answering ◽

Knowledge Representation And Reasoning ◽

Knowledge Graph ◽

Development History ◽

One Belt One Road ◽

History Of ◽

Knowledge Graphs ◽

The One ◽

The Impact

With the continuous development of intelligent technologies, knowledge graph, the backbone of artificial intelligence, has attracted much attention from both academic and industrial communities due to its powerful capability of knowledge representation and reasoning. In recent years, knowledge graph has been widely applied in different kinds of applications, such as semantic search, question answering, knowledge management and so on. Techniques for building Chinese knowledge graphs are also developing rapidly and different Chinese knowledge graphs have been constructed to support various applications. Under the background of the “One Belt One Road (OBOR)” initiative, cooperating with the countries along OBOR on studying knowledge graph techniques and applications will greatly promote the development of artificial intelligence. At the same time, the accumulated experience of China in developing knowledge graphs is also a good reference to develop non-English knowledge graphs. In this paper, we aim to introduce the techniques of constructing Chinese knowledge graphs and their applications, as well as analyse the impact of knowledge graph on OBOR. We first describe the background of OBOR, and then introduce the concept and development history of knowledge graph and typical Chinese knowledge graphs. Afterwards, we present the details of techniques for constructing Chinese knowledge graphs, and demonstrate several applications of Chinese knowledge graphs. Finally, we list some examples to explain the potential impacts of knowledge graph on OBOR.

Building Hybrid Representations from Text Corpora, Knowledge Graphs, and Language Models

A Practical Guide to Hybrid Natural Language Processing ◽

10.1007/978-3-030-44830-1_6 ◽

2020 ◽

pp. 57-89

Author(s):

Jose Manuel Gomez-Perez ◽

Ronald Denaux ◽

Andres Garcia-Silva

Keyword(s):

Language Models ◽

Text Corpora ◽

Knowledge Graphs
