Contextualized Knowledge-aware Attentive Neural Network: Enhancing Answer Selection with Knowledge

2022
Vol 40 (1)
pp. 1-33
Author(s):
Yang Deng
Yuexiang Xie
Yaliang Li
Min Yang
Wai Lam
...

Answer selection, which is involved in many natural language processing applications such as dialog systems and question answering (QA), is an important yet challenging task in practice, since conventional methods typically ignore the diverse real-world background knowledge relevant to the task. In this article, we extensively investigate approaches to enhancing the answer selection model with external knowledge from a knowledge graph (KG). First, we present a context-knowledge interaction learning framework, the Knowledge-aware Neural Network, which learns QA sentence representations through a tight interaction between the textual information and the external knowledge from the KG. Then, we develop two kinds of knowledge-aware attention mechanisms to summarize both the context-based and knowledge-based interactions between questions and answers. To handle the diversity and complexity of KG information, we further propose a Contextualized Knowledge-aware Attentive Neural Network, which improves knowledge representation learning with structure information via a customized Graph Convolutional Network and comprehensively learns context-based and knowledge-based sentence representations via a multi-view knowledge-aware attention mechanism. We evaluate our method on four widely used benchmark QA datasets: WikiQA, TREC QA, InsuranceQA, and Yahoo QA. Results verify the benefits of incorporating external knowledge from the KG and show the robust superiority and broad applicability of our method.
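
A minimal sketch of the knowledge-aware attention idea, assuming contextual QA encodings and pre-trained KG entity embeddings aligned to tokens are already available; the module and parameter names (KnowledgeAwareAttention, kg_dim, etc.) are illustrative, not the authors' released code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KnowledgeAwareAttention(nn.Module):
    """Combine context-based and knowledge-based QA interactions."""
    def __init__(self, hidden_dim: int, kg_dim: int):
        super().__init__()
        # project KG entity embeddings into the textual hidden space
        self.proj = nn.Linear(kg_dim, hidden_dim)

    def forward(self, q_ctx, a_ctx, q_kg, a_kg):
        # q_ctx: (B, Lq, H) question encoding; q_kg: (B, Lq, K) entity embeddings
        q_know, a_know = self.proj(q_kg), self.proj(a_kg)
        ctx_att = torch.bmm(q_ctx, a_ctx.transpose(1, 2))   # (B, Lq, La)
        kg_att = torch.bmm(q_know, a_know.transpose(1, 2))  # (B, Lq, La)
        att = ctx_att + kg_att                              # joint interaction matrix
        # attention-weighted summaries of each side w.r.t. the other
        q_summary = torch.bmm(F.softmax(att, dim=1).transpose(1, 2), q_ctx)  # (B, La, H)
        a_summary = torch.bmm(F.softmax(att, dim=2), a_ctx)                  # (B, Lq, H)
        return q_summary, a_summary
```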

2020
Vol 34 (05)
pp. 8449-8456
Author(s):
Shangwen Lv
Daya Guo
Jingjing Xu
Duyu Tang
Nan Duan
...

Commonsense question answering aims to answer questions that require background knowledge not explicitly expressed in the question. The key challenge is how to obtain evidence from external knowledge and make predictions based on that evidence. Recent studies either learn to generate evidence from human-annotated evidence, which is expensive to collect, or extract evidence from either structured or unstructured knowledge bases, failing to take advantage of both sources simultaneously. In this work, we propose to automatically extract evidence from heterogeneous knowledge sources and answer questions based on the extracted evidence. Specifically, we extract evidence from both a structured knowledge base (i.e., ConceptNet) and Wikipedia plain text. We construct graphs for both sources to obtain the relational structure of the evidence. Based on these graphs, we propose a graph-based approach consisting of a graph-based contextual word representation learning module and a graph-based inference module. The first module utilizes graph structural information to redefine the distance between words for learning better contextual word representations. The second module adopts a graph convolutional network to encode neighbor information into node representations and aggregates evidence with a graph attention mechanism to predict the final answer. Experimental results show that our graph-based approach over both knowledge sources improves over strong baselines and achieves state-of-the-art accuracy (75.3%) on the CommonsenseQA dataset.
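
A hedged sketch of the inference module's two ingredients: a graph-convolution step that encodes neighbor information into node representations, and a question-conditioned attention that aggregates evidence nodes into one vector. Class names and the bilinear scoring choice are assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLayer(nn.Module):
    """One graph-convolution step over a row-normalized adjacency matrix."""
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, x, adj):
        # x: (N, D) node features; adj: (N, N) normalized adjacency with self-loops
        return F.relu(self.linear(adj @ x))

class EvidenceAggregator(nn.Module):
    """Graph attention over evidence nodes, conditioned on the question."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Bilinear(dim, dim, 1)

    def forward(self, nodes, question):
        # nodes: (N, D); question: (D,)
        q = question.expand(nodes.size(0), -1)
        weights = F.softmax(self.score(nodes, q), dim=0)  # (N, 1) attention weights
        return (weights * nodes).sum(dim=0)               # (D,) graph-level evidence vector
```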


Author(s):  
Seonhoon Kim
Inho Kang
Nojun Kwak

Sentence matching is widely used in various natural language tasks such as natural language inference, paraphrase identification, and question answering. These tasks require understanding the logical and semantic relationship between two sentences, which remains challenging. Although the attention mechanism is useful for capturing the semantic relationship and for properly aligning the elements of two sentences, previous attention methods simply use a summation operation, which does not sufficiently retain the original features. Inspired by DenseNet, a densely connected convolutional network, we propose a densely-connected co-attentive recurrent neural network, each layer of which uses the concatenated information of attentive features as well as the hidden features of all preceding recurrent layers. This preserves the original and co-attentive feature information from the bottommost word embedding layer to the uppermost recurrent layer. To alleviate the ever-increasing size of the feature vectors caused by dense concatenation, we also propose to use an autoencoder after dense concatenation. We evaluate the proposed architecture on highly competitive benchmark datasets for sentence matching. Experimental results show that our architecture, which retains recurrent and attentive features, achieves state-of-the-art performance on most of the tasks.
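
A rough sketch of one densely-connected co-attentive layer plus the autoencoder bottleneck, assuming both sentences share a feature dimension; names and sizes are illustrative only, not the authors' implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseCoAttentiveLayer(nn.Module):
    """Co-attention to the other sentence, then a BiLSTM over the
    concatenation; the layer input is carried forward densely."""
    def __init__(self, feat_dim: int, hidden_dim: int):
        super().__init__()
        # the RNN sees the layer input concatenated with attended features
        self.rnn = nn.LSTM(2 * feat_dim, hidden_dim,
                           batch_first=True, bidirectional=True)

    def co_attention(self, a, b):
        # soft alignment of sentence a against sentence b (shared dim D)
        scores = torch.bmm(a, b.transpose(1, 2))       # (B, La, Lb)
        return torch.bmm(F.softmax(scores, dim=2), b)  # (B, La, D)

    def forward(self, a_feats, b_feats):
        attended = self.co_attention(a_feats, b_feats)
        out, _ = self.rnn(torch.cat([a_feats, attended], dim=-1))
        # dense connection: keep the input alongside the new features
        return torch.cat([a_feats, out], dim=-1)       # (B, La, D + 2*hidden)

class Bottleneck(nn.Module):
    """Autoencoder-style compression to curb the growing feature size."""
    def __init__(self, in_dim: int, code_dim: int):
        super().__init__()
        self.enc = nn.Linear(in_dim, code_dim)
        self.dec = nn.Linear(code_dim, in_dim)  # used for a reconstruction loss

    def forward(self, x):
        return F.relu(self.enc(x))
```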


2020
Vol 34 (05)
pp. 9217-9224
Author(s):
Tianyi Wang
Yating Zhang
Xiaozhong Liu
Changlong Sun
Qiong Zhang

Multi-role dialogue understanding comprises a wide range of tasks such as question answering, act classification, and dialogue summarization. While dialogue corpora are abundantly available, labeled data for specific learning tasks can be highly scarce and expensive. In this work, we investigate dialogue context representation learning with various types of unsupervised pretraining tasks, where the training objectives arise naturally from the nature of the utterances and the structure of the multi-role conversation. Meanwhile, to locate essential information for dialogue summarization/extraction, the pretraining process enables external knowledge integration. The proposed fine-tuned pretraining mechanism is comprehensively evaluated on three dialogue datasets and a number of downstream dialogue-mining tasks. Results show that the proposed pretraining mechanism contributes significantly to all the downstream tasks, regardless of the choice of encoder.
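
As one concrete illustration of a pretraining objective that comes for free from dialogue structure, the sketch below trains an arbitrary encoder to predict each utterance's speaker role; this is an assumed example of the kind of task described, not the paper's exact task set:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RolePredictionPretrainer(nn.Module):
    """Naturally supervised pretraining: predict the speaker role of each
    utterance; labels come for free from the multi-role conversation."""
    def __init__(self, encoder: nn.Module, hidden_dim: int, num_roles: int):
        super().__init__()
        self.encoder = encoder                     # any utterance/context encoder
        self.role_head = nn.Linear(hidden_dim, num_roles)

    def forward(self, utterance_batch, role_labels):
        reps = self.encoder(utterance_batch)       # (B, H) utterance representations
        logits = self.role_head(reps)              # (B, num_roles)
        return F.cross_entropy(logits, role_labels)
```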


Author(s):  
Saravanakumar Kandasamy
Aswani Kumar Cherukuri

Quantifying semantic similarity between concepts is an essential component of domains such as Natural Language Processing, Information Retrieval, and Question Answering, helping systems understand text and its relationships better. Over the last few decades, many measures have been proposed that incorporate various corpus-based and knowledge-based resources. WordNet and Wikipedia are two such knowledge-based resources. WordNet's contribution to these domains is enormous due to its richness in defining a word and all of its relationships with other words. In this paper, we propose an approach to quantifying the similarity between concepts that exploits the synsets and gloss definitions of different concepts using WordNet. Our method considers the gloss definitions, the contextual words that help define a word, the synsets of those contextual words, and the confidence of a word's occurrence in another word's definition when calculating similarity. Evaluation on different gold-standard benchmark datasets shows the efficiency of our system in comparison with existing taxonomical and definitional measures.
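
A minimal gloss-based sketch using NLTK's WordNet interface; the paper's contextual-word and confidence weighting is simplified here to a plain Jaccard overlap between gloss vocabularies:

```python
# requires: pip install nltk; then nltk.download("wordnet") once
from nltk.corpus import wordnet as wn

def gloss_tokens(word: str) -> set:
    # collect tokens from the gloss definitions of all synsets of the word
    tokens = set()
    for synset in wn.synsets(word):
        tokens.update(synset.definition().lower().split())
    return tokens

def gloss_similarity(word1: str, word2: str) -> float:
    g1, g2 = gloss_tokens(word1), gloss_tokens(word2)
    if not g1 or not g2:
        return 0.0
    # Jaccard overlap between the two gloss vocabularies
    return len(g1 & g2) / len(g1 | g2)

print(gloss_similarity("car", "automobile"))
```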


2021
pp. 153-165
Author(s):
Saman Qureshi
Sri Khetwat Saritha
D. Kishan

Websites like Quora, Yahoo! Answers, and Reddit are examples of community question answering (CQA) systems that enable users both to ask and to answer questions. Answer selection is the most challenging task in CQA systems: retrieving good, relevant answers to user questions. The shortcomings of current approaches are the lexical gap between text pairs, dependency on external sources, and hand-crafted features, which lead to a lack of generalization ability. Prior work has addressed these shortcomings, but it still generalizes poorly and its performance is not satisfactory. To focus on rich, high-quality answers, an attention mechanism can instead be integrated with the neural network. This chapter proposes two models: a BLSTM and a BLSTM with an attention mechanism. The attention mechanism aligns the question with the more informative parts of the answer, and when it is applied, the BLSTM-with-attention model surpasses the top approaches.
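
A compact sketch of the two models' core difference: a BLSTM encoder whose answer representation is re-weighted by question-conditioned attention. Layer sizes and the cosine matching score are assumptions for illustration, not the chapter's exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BLSTMAttention(nn.Module):
    """BLSTM answer encoder with question-conditioned attention, so the
    more informative parts of the answer dominate its representation."""
    def __init__(self, emb_dim: int, hidden_dim: int):
        super().__init__()
        self.q_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.a_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, q_emb, a_emb):
        q_out, _ = self.q_lstm(q_emb)                      # (B, Lq, 2H)
        a_out, _ = self.a_lstm(a_emb)                      # (B, La, 2H)
        q_vec = q_out.mean(dim=1, keepdim=True)            # (B, 1, 2H) question summary
        scores = torch.bmm(a_out, q_vec.transpose(1, 2))   # (B, La, 1)
        alpha = F.softmax(scores, dim=1)                   # attention over answer positions
        a_vec = (alpha * a_out).sum(dim=1)                 # (B, 2H) attended answer
        # cosine similarity as the question-answer matching score
        return F.cosine_similarity(a_vec, q_vec.squeeze(1), dim=-1)
```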


Author(s):  
Sanda Harabagiu
Dan Moldovan

Textual Question Answering (QA) identifies the answer to a question in large collections of online documents. By providing a small set of exact answers to questions, QA takes a step closer to information retrieval rather than document retrieval. A QA system comprises three modules: a question-processing module, a document-processing module, and an answer extraction and formulation module. Questions may be asked about any topic, in contrast with Information Extraction (IE), which identifies textual information relevant only to a predefined set of events and entities. The natural language processing (NLP) techniques used in open-domain QA systems range from simple lexical and semantic disambiguation of question stems to complex processing that combines syntactic and semantic features of the questions with pragmatic information derived from the context of candidate answers. This article reviews current research in integrating knowledge-based NLP methods with shallow processing techniques for QA.
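
Schematically, the three-module pipeline might be wired as below; the function bodies are deliberate placeholders standing in for real question-processing, retrieval, and extraction logic, not any particular system:

```python
from dataclasses import dataclass

@dataclass
class Question:
    text: str
    answer_type: str = ""   # expected answer type, derived from the question stem

def process_question(text: str) -> Question:
    # question-processing module: lexical/semantic disambiguation of the
    # question stem would go here; "who" -> PERSON is a toy rule
    answer_type = "PERSON" if text.lower().startswith("who") else "OTHER"
    return Question(text=text, answer_type=answer_type)

def retrieve_passages(question: Question, collection: list) -> list:
    # document-processing module: naive keyword overlap as a stand-in
    keywords = set(question.text.lower().rstrip("?").split())
    return [doc for doc in collection if keywords & set(doc.lower().split())]

def extract_answer(question: Question, passages: list) -> str:
    # answer extraction and formulation module: a real system would pick an
    # exact answer span matching question.answer_type
    return passages[0] if passages else ""
```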


2020
Vol 2020
pp. 1-11
Author(s):
Jun Chen
Shize Guo
Xin Ma
Haiying Li
Jinhong Guo
...  

Since the number of malware samples is increasing rapidly, malware continuously poses a risk to network security. The attention mechanism has made great progress in the field of natural language processing, and many studies are based on malicious-code API sequences, which resemble semantic information; applying the attention mechanism to API semantics is therefore worth studying. In this paper, we first study the characteristics of API execution sequences and classify the APIs into 17 categories. Second, we propose a novel feature extraction method for API execution sequences based on their semantic and structural information. Third, drawing on the characteristics of the API data and of the attention mechanism, we construct a detection framework, SLAM, based on a local attention mechanism and a sliding-window method. Experiments show that our model achieves better performance, with an accuracy of 0.9723.
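
A sketch of a sliding-window local attention layer of the kind the framework's description suggests, where each position in the embedded API-call sequence attends only to neighbors within a fixed window; the window size and projection layout are assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalAttention(nn.Module):
    """Sliding-window local attention over an embedded API call sequence."""
    def __init__(self, dim: int, window: int = 5):
        super().__init__()
        self.window = window
        self.scale = dim ** -0.5
        self.qkv = nn.Linear(dim, 3 * dim)   # joint query/key/value projection

    def forward(self, x):
        # x: (B, L, D) embedded API execution sequence
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = torch.bmm(q, k.transpose(1, 2)) * self.scale   # (B, L, L)
        # mask out positions outside the sliding window
        L = x.size(1)
        idx = torch.arange(L, device=x.device)
        mask = (idx[None, :] - idx[:, None]).abs() > self.window
        scores = scores.masked_fill(mask, float("-inf"))
        return torch.bmm(F.softmax(scores, dim=-1), v)          # (B, L, D)
```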


Author(s):  
Tianqi Zhu
Wei Luo
Feng Yu

Analyzing polysomnography (PSG) is an effective method for evaluating sleep health; however, the sleep stage scoring required for PSG analysis is a time-consuming task even for an experienced medical expert. When scoring sleep epochs, experts look for specific signal characteristics (e.g., K-complexes and spindles), and sometimes need to integrate information from preceding and subsequent epochs in order to make a decision. To imitate this process and to build a more interpretable deep learning model, we propose a neural network based on a convolutional neural network (CNN) and an attention mechanism to perform automatic sleep staging. The CNN learns local signal characteristics, and the attention mechanism excels at learning inter- and intra-epoch features. In experiments on the public sleep-edf and sleep-edfx databases with different training/testing set partitioning methods, our model achieved overall accuracies of 93.7% and 82.8%, and macro-average F1-scores of 84.5 and 77.8, respectively, outperforming recently reported machine-learning-based methods.
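
A simplified sketch of the CNN-plus-attention design: a 1-D CNN summarizes each raw 30-second epoch, and multi-head attention over a window of consecutive epochs supplies inter-epoch context. Layer shapes and kernel sizes are illustrative, not the paper's architecture:

```python
import torch
import torch.nn as nn

class SleepStager(nn.Module):
    """CNN features per epoch + attention across neighboring epochs."""
    def __init__(self, channels: int = 1, feat_dim: int = 64, num_stages: int = 5):
        super().__init__()
        # per-epoch feature extractor over raw samples (assumes >= 50 samples)
        self.cnn = nn.Sequential(
            nn.Conv1d(channels, feat_dim, kernel_size=50, stride=6), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(feat_dim, num_stages)

    def forward(self, epochs):
        # epochs: (B, T, C, S) -- T consecutive epochs of S raw samples each
        B, T, C, S = epochs.shape
        feats = self.cnn(epochs.view(B * T, C, S)).squeeze(-1).view(B, T, -1)
        ctx, _ = self.attn(feats, feats, feats)   # inter-epoch attention
        return self.head(ctx)                     # (B, T, num_stages) stage logits
```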


Author(s):  
Muhammad Zulqarnain
Rozaida Ghazali
Yana Mazwin Mohmad Hassim
Muhammad Rehan

Text classification is a fundamental task in several areas of natural language processing (NLP), including word semantic classification, sentiment analysis, question answering, and dialog management. This paper investigates three basic deep learning architectures for text classification: the Deep Belief Network (DBN), the Convolutional Neural Network (CNN), and the Recurrent Neural Network (RNN); these three main types of deep learning architectures are widely explored for various classification tasks. DBNs have excellent learning capabilities, extract highly distinguishable features, and are good for general-purpose tasks. CNNs are believed to be better at extracting the positions of related features, while RNNs model sequential long-term dependencies. This work presents a systematic comparison of DBN, CNN, and RNN on text classification tasks and reports experimental results for the deep models. The aim of this paper is to provide basic guidance on which deep learning models are best suited to text classification.
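
For reference, minimal CNN and RNN text classifiers of the kind compared here are sketched below (the DBN is omitted since it requires layer-wise pretraining); all sizes are illustrative, not the paper's settings:

```python
import torch
import torch.nn as nn

class CNNTextClassifier(nn.Module):
    """Position-sensitive feature extraction with 1-D convolutions."""
    def __init__(self, vocab: int, emb: int = 128, classes: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, 64, kernel_size=3, padding=1)
        self.fc = nn.Linear(64, classes)

    def forward(self, tokens):                      # tokens: (B, L)
        x = self.embed(tokens).transpose(1, 2)      # (B, emb, L)
        x = torch.relu(self.conv(x)).max(dim=-1).values  # global max pooling
        return self.fc(x)

class RNNTextClassifier(nn.Module):
    """Sequential modeling of long-term dependencies with an LSTM."""
    def __init__(self, vocab: int, emb: int = 128, classes: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, 64, batch_first=True)
        self.fc = nn.Linear(64, classes)

    def forward(self, tokens):
        _, (h, _) = self.lstm(self.embed(tokens))   # final hidden state
        return self.fc(h[-1])
```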

