Semantic Sentence Matching with Densely-Connected Recurrent and Co-Attentive Information

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016586 ◽

2019 ◽

Vol 33 ◽

pp. 6586-6593 ◽

Cited By ~ 20

Author(s):

Seonhoon Kim ◽

Inho Kang ◽

Nojun Kwak

Keyword(s):

Neural Network ◽

Natural Language ◽

Question Answering ◽

State Of The Art ◽

Attention Mechanism ◽

Semantic Relationship ◽

Convolutional Network ◽

Benchmark Datasets ◽

Feature Information ◽

Sentence Matching

Sentence matching is widely used in various natural language tasks such as natural language inference, paraphrase identification, and question answering. For these tasks, understanding logical and semantic relationship between two sentences is required but it is yet challenging. Although attention mechanism is useful to capture the semantic relationship and to properly align the elements of two sentences, previous methods of attention mechanism simply use a summation operation which does not retain original features enough. Inspired by DenseNet, a densely connected convolutional network, we propose a densely-connected co-attentive recurrent neural network, each layer of which uses concatenated information of attentive features as well as hidden features of all the preceding recurrent layers. It enables preserving the original and the co-attentive feature information from the bottommost word embedding layer to the uppermost recurrent layer. To alleviate the problem of an ever-increasing size of feature vectors due to dense concatenation operations, we also propose to use an autoencoder after dense concatenation. We evaluate our proposed architecture on highly competitive benchmark datasets related to sentence matching. Experimental results show that our architecture, which retains recurrent and attentive features, achieves state-of-the-art performances for most of the tasks.

Download Full-text

DRr-Net: Dynamic Re-Read Network for Sentence Semantic Matching

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33017442 ◽

2019 ◽

Vol 33 ◽

pp. 7442-7449 ◽

Cited By ~ 3

Author(s):

Kun Zhang ◽

Guangyi Lv ◽

Linyuan Wang ◽

Le Wu ◽

Enhong Chen ◽

...

Keyword(s):

Natural Language ◽

Small Region ◽

Psychological Research ◽

Attention Mechanism ◽

Semantic Relations ◽

Semantic Matching ◽

Close Attention ◽

Original Sentence ◽

Benchmark Datasets ◽

Sentence Matching

Sentence semantic matching requires an agent to determine the semantic relation between two sentences, which is widely used in various natural language tasks such as Natural Language Inference (NLI) and Paraphrase Identification (PI). Among all matching methods, attention mechanism plays an important role in capturing the semantic relations and properly aligning the elements of two sentences. Previous methods utilized attention mechanism to select important parts of sentences at one time. However, the important parts of the sentence during semantic matching are dynamically changing with the degree of sentence understanding. Selecting the important parts at one time may be insufficient for semantic understanding. To this end, we propose a Dynamic Re-read Network (DRr-Net) approach for sentence semantic matching, which is able to pay close attention to a small region of sentences at each step and re-read the important words for better sentence semantic understanding. To be specific, we first employ Attention Stack-GRU (ASG) unit to model the original sentence repeatedly and preserve all the information from bottom-most word embedding input to up-most recurrent output. Second, we utilize Dynamic Re-read (DRr) unit to pay close attention to one important word at one time with the consideration of learned information and re-read the important words for better sentence semantic understanding. Extensive experiments on three sentence matching benchmark datasets demonstrate that DRr-Net has the ability to model sentence semantic more precisely and significantly improve the performance of sentence semantic matching. In addition, it is very interesting that some of finding in our experiments are consistent with the findings of psychological research.

Download Full-text

SANTM: Efficient Self-attention-driven Network for Text Matching

ACM Transactions on Internet Technology ◽

10.1145/3426971 ◽

2022 ◽

Vol 22 (3) ◽

pp. 1-21

Author(s):

Prayag Tiwari ◽

Amit Kumar Jaiswal ◽

Sahil Garg ◽

Ilsun You

Keyword(s):

Natural Language ◽

State Of The Art ◽

The State ◽

Attention Mechanism ◽

Matching Problems ◽

Attention Model ◽

Extra Information ◽

Textual Entailment ◽

Benchmark Datasets ◽

Text Matching

Self-attention mechanisms have recently been embraced for a broad range of text-matching applications. Self-attention model takes only one sentence as an input with no extra information, i.e., one can utilize the final hidden state or pooling. However, text-matching problems can be interpreted either in symmetrical or asymmetrical scopes. For instance, paraphrase detection is an asymmetrical task, while textual entailment classification and question-answer matching are considered asymmetrical tasks. In this article, we leverage attractive properties of self-attention mechanism and proposes an attention-based network that incorporates three key components for inter-sequence attention: global pointwise features, preceding attentive features, and contextual features while updating the rest of the components. Our model follows evaluation on two benchmark datasets cover tasks of textual entailment and question-answer matching. The proposed efficient Self-attention-driven Network for Text Matching outperforms the state of the art on the Stanford Natural Language Inference and WikiQA datasets with much fewer parameters.

Download Full-text

Bilateral Multi-Perspective Matching for Natural Language Sentences

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/579 ◽

2017 ◽

Cited By ~ 92

Author(s):

Zhiguo Wang ◽

Wael Hamza ◽

Radu Florian

Keyword(s):

Natural Language ◽

State Of The Art ◽

The State ◽

Experimental Results ◽

The Other ◽

Multiple Perspectives ◽

Time Step ◽

Benchmark Datasets ◽

Sentence Matching ◽

Fully Connected

Natural language sentence matching is a fundamental technology for a variety of tasks. Previous approaches either match sentences from a single direction or only apply single granular (word-by-word or sentence-by-sentence) matching. In this work, we propose a bilateral multi-perspective matching (BiMPM) model. Given two sentences P and Q, our model first encodes them with a BiLSTM encoder. Next, we match the two encoded sentences in two directions P against Q and P against Q. In each matching direction, each time step of one sentence is matched against all time-steps of the other sentence from multiple perspectives. Then, another BiLSTM layer is utilized to aggregate the matching results into a fix-length matching vector. Finally, based on the matching vector, a decision is made through a fully connected layer. We evaluate our model on three tasks: paraphrase identification, natural language inference and answer sentence selection. Experimental results on standard benchmark datasets show that our model achieves the state-of-the-art performance on all tasks.

Download Full-text

Contextualized Knowledge-aware Attentive Neural Network: Enhancing Answer Selection with Knowledge

ACM Transactions on Information Systems ◽

10.1145/3457533 ◽

2022 ◽

Vol 40 (1) ◽

pp. 1-33

Author(s):

Yang Deng ◽

Yuexiang Xie ◽

Yaliang Li ◽

Min Yang ◽

Wai Lam ◽

...

Keyword(s):

Neural Network ◽

Language Processing ◽

Question Answering ◽

Representation Learning ◽

Attention Mechanism ◽

Convolutional Network ◽

Structure Information ◽

Dialog Systems ◽

External Knowledge ◽

Knowledge Based

Answer selection, which is involved in many natural language processing applications, such as dialog systems and question answering (QA), is an important yet challenging task in practice, since conventional methods typically suffer from the issues of ignoring diverse real-world background knowledge. In this article, we extensively investigate approaches to enhancing the answer selection model with external knowledge from knowledge graph (KG). First, we present a context-knowledge interaction learning framework, Knowledge-aware Neural Network, which learns the QA sentence representations by considering a tight interaction with the external knowledge from KG and the textual information. Then, we develop two kinds of knowledge-aware attention mechanism to summarize both the context-based and knowledge-based interactions between questions and answers. To handle the diversity and complexity of KG information, we further propose a Contextualized Knowledge-aware Attentive Neural Network, which improves the knowledge representation learning with structure information via a customized Graph Convolutional Network and comprehensively learns context-based and knowledge-based sentence representation via the multi-view knowledge-aware attention mechanism. We evaluate our method on four widely used benchmark QA datasets, including WikiQA, TREC QA, InsuranceQA, and Yahoo QA. Results verify the benefits of incorporating external knowledge from KG and show the robust superiority and extensive applicability of our method.

Download Full-text

Attention as Relation: Learning Supervised Multi-head Self-Attention for Relation Extraction

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/524 ◽

2020 ◽

Author(s):

Jie Liu ◽

Shaowei Chen ◽

Bingquan Wang ◽

Jiaxin Zhang ◽

Na Li ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

State Of The Art ◽

Relation Extraction ◽

Attention Mechanism ◽

Entity Extraction ◽

Relation Type ◽

Benchmark Datasets ◽

Relation Learning

Joint entity and relation extraction is critical for many natural language processing (NLP) tasks, which has attracted increasing research interest. However, it is still faced with the challenges of identifying the overlapping relation triplets along with the entire entity boundary and detecting the multi-type relations. In this paper, we propose an attention-based joint model, which mainly contains an entity extraction module and a relation detection module, to address the challenges. The key of our model is devising a supervised multi-head self-attention mechanism as the relation detection module to learn the token-level correlation for each relation type separately. With the attention mechanism, our model can effectively identify overlapping relations and flexibly predict the relation type with its corresponding intensity. To verify the effectiveness of our model, we conduct comprehensive experiments on two benchmark datasets. The experimental results demonstrate that our model achieves state-of-the-art performances.

Download Full-text

Graph Contextualized Self-Attention Network for Session-based Recommendation

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/547 ◽

2019 ◽

Cited By ~ 20

Author(s):

Chengfeng Xu ◽

Pengpeng Zhao ◽

Yanchi Liu ◽

Victor S. Sheng ◽

Jiajie Xu ◽

...

Keyword(s):

Neural Network ◽

State Of The Art ◽

Attention Mechanism ◽

The Self ◽

Graph Structure ◽

Convolutional Network ◽

Attention Network ◽

Attention Model ◽

Sequence Modeling ◽

Real World Datasets

Session-based recommendation, which aims to predict the user's immediate next action based on anonymous sessions, is a key task in many online services (e.g., e-commerce, media streaming). Recently, Self-Attention Network (SAN) has achieved significant success in various sequence modeling tasks without using either recurrent or convolutional network. However, SAN lacks local dependencies that exist over adjacent items and limits its capacity for learning contextualized representations of items in sequences. In this paper, we propose a graph contextualized self-attention model (GC-SAN), which utilizes both graph neural network and self-attention mechanism, for session-based recommendation. In GC-SAN, we dynamically construct a graph structure for session sequences and capture rich local dependencies via graph neural network (GNN). Then each session learns long-range dependencies by applying the self-attention mechanism. Finally, each session is represented as a linear combination of the global preference and the current interest of that session. Extensive experiments on two real-world datasets show that GC-SAN outperforms state-of-the-art methods consistently.

Download Full-text

MKPM: Multi keyword-pair matching for natural language sentences

Applied Intelligence ◽

10.1007/s10489-021-02306-5 ◽

2021 ◽

Author(s):

Xin Lu ◽

Yao Deng ◽

Ting Sun ◽

Yi Gao ◽

Jun Feng ◽

...

Keyword(s):

Natural Language ◽

Word Pair ◽

Question Answering ◽

Semantic Relationship ◽

Sentence Pair ◽

Word Level ◽

Sentence Level ◽

Pair Matching ◽

Task Architecture ◽

Sentence Matching

AbstractSentence matching is widely used in various natural language tasks, such as natural language inference, paraphrase identification and question answering. For these tasks, we need to understand the logical and semantic relationship between two sentences. Most current methods use all information within a sentence to build a model and hence determine its relationship to another sentence. However, the information contained in some sentences may cause redundancy or introduce noise, impeding the performance of the model. Therefore, we propose a sentence matching method based on multi keyword-pair matching (MKPM), which uses keyword pairs in two sentences to represent the semantic relationship between them, avoiding the interference of redundancy and noise. Specifically, we first propose a sentence-pair-based attention mechanism sp-attention to select the most important word pair from the two sentences as a keyword pair, and then propose a Bi-task architecture to model the semantic information of these keyword pairs. The Bi-task architecture is as follows: 1. In order to understand the semantic relationship at the word level between two sentences, we design a word-pair task (WP-Task), which uses these keyword pairs to complete sentence matching independently. 2. We design a sentence-pair task (SP-Task) to understand the sentence level semantic relationship between the two sentences by sentence denoising. Through the integration of the two tasks, our model can understand sentences more accurately from the two granularities of word and sentence. Experimental results show that our model can achieve state-of-the-art performance in several tasks. Our source code is publicly available1.

Download Full-text

SE4ExSum: An Integrated Semantic-aware Neural Approach with Graph Convolutional Network for Extractive Text Summarization

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3464426 ◽

2021 ◽

Vol 20 (6) ◽

pp. 1-22

Author(s):

Tham Vo

Keyword(s):

Neural Network ◽

Deep Learning ◽

State Of The Art ◽

Text Summarization ◽

Text Representation ◽

Convolutional Network ◽

Sentence Level ◽

Proposed Model ◽

Benchmark Datasets ◽

Common Problems

Recently, advanced techniques in deep learning such as recurrent neural network (GRU, LSTM and Bi-LSTM) and auto-encoding (attention-based transformer and BERT) have achieved great successes in multiple application domains including text summarization. Recent state-of-the-art encoding-based text summarization models such as BertSum, PreSum and DiscoBert have demonstrated significant improvements on extractive text summarization tasks. However, recent models still encounter common problems related to the language-specific dependency which requires the supports of the external NLP tools. Besides that, recent advanced text representation methods, such as BERT as the sentence-level textual encoder, also fail to fully capture the representation of a full-length document. To address these challenges, in this paper we proposed a novel s emantic-ware e mbedding approach for ex tractive text sum marization , called as: SE4ExSum. Our proposed SE4ExSum is an integration between the use of feature graph-of-words (FGOW) with BERT-based encoder for effectively learning the word/sentence-level representations of a given document. Then, the g raph c onvolutional n etwork (GCN) based encoder is applied to learn the global document's representation which is then used to facilitate the text summarization task. Extensive experiments on benchmark datasets show the effectiveness of our proposed model in comparing with recent state-of-the-art text summarization models.

Download Full-text

Large-scale Semantic Parsing without Question-Answer Pairs

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00190 ◽

2014 ◽

Vol 2 ◽

pp. 377-392 ◽

Cited By ~ 40

Author(s):

Siva Reddy ◽

Mirella Lapata ◽

Mark Steedman

Keyword(s):

Natural Language ◽

Large Scale ◽

Graph Matching ◽

State Of The Art ◽

The State ◽

Semantic Parsing ◽

Matching Problem ◽

Weak Supervision ◽

Benchmark Datasets

In this paper we introduce a novel semantic parsing approach to query Freebase in natural language without requiring manual annotations or question-answer pairs. Our key insight is to represent natural language via semantic graphs whose topology shares many commonalities with Freebase. Given this representation, we conceptualize semantic parsing as a graph matching problem. Our model converts sentences to semantic graphs using CCG and subsequently grounds them to Freebase guided by denotations as a form of weak supervision. Evaluation experiments on a subset of the Free917 and WebQuestions benchmark datasets show our semantic parser improves over the state of the art.

Download Full-text

A Novel Bidirectional LSTM and Attention Mechanism based Neural Network for Answer Selection in Community Question Answering

Computers Materials & Continua ◽

10.32604/cmc.2020.07269 ◽

2020 ◽

Vol 62 (3) ◽

pp. 1273-1288

Author(s):

Zhang Bo ◽

Wang Haowen ◽

Jiang Longquan ◽

Yuan Shuhan ◽

Li Meizi

Keyword(s):

Neural Network ◽

Question Answering ◽

Attention Mechanism ◽

Community Question Answering ◽

Bidirectional Lstm

Download Full-text