Asynchronous Multi-grained Graph Network For Interpretable Multi-hop Reading Comprehension

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/531 ◽

2021 ◽

Author(s):

Ronghan Li ◽

Lifang Wang ◽

Shengli Wang ◽

Zejun Jiang

Keyword(s):

Neural Networks ◽

Reading Comprehension ◽

Competitive Performance ◽

Proposed Model ◽

Logical Order ◽

Graph Neural Networks ◽

Machine Reading ◽

Independent Parameters ◽

Multiple Granularities ◽

Graph Nodes

The multi-hop machine reading comprehension (MRC) task aims to enable models to answer a compound question according to bridging information. Existing methods that use graph neural networks to represent multiple granularities, such as entities and sentences in documents, update all nodes synchronously, ignoring the fact that multi-hop reasoning follows a certain logical order across granularity levels. In this paper, we introduce an Asynchronous Multi-grained Graph Network (AMGN) for multi-hop MRC. First, we construct a multi-grained graph containing entity and sentence nodes. In particular, we use independent parameters to represent relationship groups defined according to the level of granularity. Second, an asynchronous update mechanism based on multi-grained relationships is proposed to mimic human multi-hop reading logic. In addition, we present a question reformulation mechanism to update the latent representation of the compound question with updated graph nodes. We evaluate the proposed model on the HotpotQA dataset and achieve highly competitive performance in the distractor setting compared with other published models. Further analysis shows that the asynchronous update mechanism can effectively form interpretable reasoning chains at different granularity levels.
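The contrast between synchronous and asynchronous node updates can be illustrated with a toy sketch. This is not the AMGN implementation: the mean aggregation, the scalar node states, the node names, and the update order (sentence nodes before entity nodes) are all assumptions made for illustration, since the abstract does not specify them.

```python
def aggregate(node, states, neighbors):
    """New state = mean of the node's own state and its neighbors' states.

    Mean aggregation is an assumption; the paper's update rule is not given here.
    """
    values = [states[node]] + [states[n] for n in neighbors[node]]
    return sum(values) / len(values)


def synchronous_step(states, neighbors):
    # Every node reads the same snapshot of the old states.
    return {n: aggregate(n, states, neighbors) for n in states}


def asynchronous_step(states, neighbors, groups):
    # Groups are updated one after another, so a later group
    # already sees the refreshed states of the earlier groups.
    states = dict(states)
    for group in groups:
        updated = {n: aggregate(n, states, neighbors) for n in group}
        states.update(updated)
    return states


# Hypothetical two-node graph: one sentence node and one entity node.
states = {"s1": 2.0, "e1": 0.0}
neighbors = {"s1": ["e1"], "e1": ["s1"]}

sync_result = synchronous_step(states, neighbors)
# Assumed order: sentence group first, then entity group.
async_result = asynchronous_step(states, neighbors, [["s1"], ["e1"]])
```

In the synchronous step both nodes end at 1.0, while in the asynchronous step the entity node ends at 0.5 because it reads the sentence node's already updated state. That dependence on update order is what allows a reasoning chain to be traced across granularity levels.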


Towards Reading Comprehension for Long Documents

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/638 ◽

2018 ◽

Author(s):

Yuanxing Zhang ◽

Yangbin Zhang ◽

Kaigui Bian ◽

Xiaoming Li

Keyword(s):

Reading Comprehension ◽

Language Comprehension ◽

Exact Match ◽

Attention Model ◽

Comprehension Task ◽

Proposed Model ◽

Short Span ◽

Machine Reading ◽

The Relationship

Machine reading comprehension has gained attention from both industry and academia. It is a very challenging task that involves various domains such as language comprehension, knowledge inference, and summarization. Previous studies mainly focus on reading comprehension on short paragraphs, and these approaches fail to perform well on long documents. In this paper, we propose a hierarchical match attention model to instruct the machine to extract answers from a specific short span of passages for the long document reading comprehension (LDRC) task. The model takes advantage of a hierarchical LSTM to learn the paragraph-level representation, and implements the match mechanism (i.e., quantifying the relationship between two contexts) to find the most appropriate paragraph that includes the hint of answers. The task can then be decoupled into a reading comprehension task on a short paragraph, from which the answer can be produced. Experiments on the modified SQuAD dataset show that our proposed model outperforms existing reading comprehension models by at least 20% regarding exact match (EM), F1, and the proportion of identified paragraphs that are exactly the short paragraphs where the original answers are located.
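The exact match (EM) and F1 metrics mentioned above are commonly computed per answer at the token level. The following is a minimal sketch under a simplifying assumption: normalization here is only lowercasing and whitespace splitting, whereas SQuAD-style evaluation also strips punctuation and articles. The function names are illustrative and not taken from the paper.

```python
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a simplified normalization)."""
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both sequences.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, a prediction of "the city of Paris" against the reference "Paris" scores 0 on EM but 0.4 on F1 (precision 0.25, recall 1.0), which is why the two metrics are usually reported together.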


Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016276 ◽

2019 ◽

Vol 33 ◽

pp. 6276-6283 ◽

Cited By ~ 6

Author(s):

Zhipeng Chen ◽

Yiming Cui ◽

Wentao Ma ◽

Shijin Wang ◽

Guoping Hu

Keyword(s):

Reading Comprehension ◽

Mutual Information ◽

Spatial Attention ◽

State Of The Art ◽

Multiple Choice ◽

Multiple Choice Questions ◽

Attention Model ◽

Novel Approach ◽

Proposed Model ◽

Machine Reading

Machine Reading Comprehension (MRC) with multiple-choice questions requires the machine to read a given passage and select the correct answer among several candidates. In this paper, we propose a novel approach called the Convolutional Spatial Attention (CSA) model, which can better handle MRC with multiple-choice questions. The proposed model can fully extract the mutual information among the passage, the question, and the candidates to form enriched representations. Furthermore, to merge various attention results, we propose to use a convolutional operation to dynamically summarize the attention values within regions of different sizes. Experimental results show that the proposed model gives substantial improvements over various state-of-the-art systems on both the RACE and SemEval-2018 Task 11 datasets.


Rception: Wide and Deep Interaction Networks for Machine Reading Comprehension (Student Abstract)

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i10.7266 ◽

2020 ◽

Vol 34 (10) ◽

pp. 13987-13988

Author(s):

Xuanyu Zhang ◽

Zhichun Wang

Keyword(s):

Neural Networks ◽

Reading Comprehension ◽

Convolutional Neural Networks ◽

Recurrent Neural Networks ◽

Question Answering ◽

Local Information ◽

Interaction Networks ◽

Layer By Layer ◽

Global Context ◽

Machine Reading

Most models for machine reading comprehension (MRC) focus on recurrent neural networks (RNNs) and attention mechanisms, though convolutional neural networks (CNNs) are also involved for time efficiency. However, little attention has been paid to leveraging both CNNs and RNNs in MRC. For a deeper understanding, humans sometimes need local information for short phrases and sometimes need global context for long passages. In this paper, we propose a novel architecture, i.e., Rception, to capture and leverage both local deep information and global wide context. It fuses different kinds of networks and hyper-parameters horizontally rather than simply stacking them layer by layer vertically. Experiments on the Stanford Question Answering Dataset (SQuAD) show that our proposed architecture achieves good performance.


Graph Few-Shot Learning via Knowledge Transfer

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6142 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6656-6663 ◽

Cited By ~ 4

Author(s):

Huaxiu Yao ◽

Chuxu Zhang ◽

Ying Wei ◽

Meng Jiang ◽

Suhang Wang ◽

...

Keyword(s):

Neural Networks ◽

Receptive Field ◽

Classification Accuracy ◽

Structural Knowledge ◽

Challenging Problem ◽

Satisfactory Performance ◽

Proposed Model ◽

Node Classification ◽

Graph Neural Networks ◽

Embedding Function

There have been extensive studies on the challenging problem of semi-supervised node classification. As a frontier, Graph Neural Networks (GNNs), which update the representation of each node by aggregating information from its neighbors, have aroused great interest recently. However, most GNNs have shallow layers with a limited receptive field and may not achieve satisfactory performance, especially when the number of labeled nodes is quite small. To address this challenge, we propose a graph few-shot learning (GFL) algorithm that incorporates prior knowledge learned from auxiliary graphs to improve classification accuracy on the target graph. Specifically, a transferable metric space characterized by a node embedding and a graph-specific prototype embedding function is shared between auxiliary graphs and the target, facilitating the transfer of structural knowledge. Extensive experiments and ablation studies on four real-world graph datasets demonstrate the effectiveness of our proposed model and the contribution of each component.
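The idea of classifying nodes in a metric space with class prototypes can be sketched as plain nearest-prototype classification. This is a generic prototypical-network-style illustration, not the GFL algorithm: here a prototype is simply the mean embedding of a class's labeled nodes, whereas the paper describes a graph-specific prototype embedding function. The embeddings are assumed to be given, and all names are hypothetical.

```python
def class_prototypes(embeddings, labels):
    """Average the embeddings of the labeled nodes of each class."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {c: [x / counts[c] for x in s] for c, s in sums.items()}


def nearest_prototype(query, prototypes):
    """Return the class whose prototype is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda c: sq_dist(query, prototypes[c]))


# Hypothetical 2-D embeddings of four labeled nodes from two classes.
support = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
labels = ["a", "a", "b", "b"]
prototypes = class_prototypes(support, labels)
predicted = nearest_prototype([1.0, 1.0], prototypes)
```

With only a few labeled nodes per class, averaging gives a more stable class representative than any single node, which is the usual motivation for prototype-based few-shot classification.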


Enhancement of Target-Oriented Opinion Words Extraction with Multiview-Trained Machine Reading Comprehension Model

Computational Intelligence and Neuroscience ◽

10.1155/2021/6645871 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Jingyuan Zhang ◽

Zequn Zhang ◽

Zhi Guo ◽

Li Jin ◽

Kang Liu ◽

...

Keyword(s):

Reading Comprehension ◽

Question Answering ◽

Opinion Mining ◽

Common Knowledge ◽

Multiple Perspectives ◽

Fine Grained ◽

Proposed Model ◽

Meta Learning ◽

Benchmark Datasets ◽

Machine Reading

Target-oriented opinion words extraction (TOWE) seeks to identify opinion expressions oriented to a specific target, and it is a crucial step toward fine-grained opinion mining. Recent neural networks have achieved significant success in this task by building target-aware representations. However, there are still two limitations of these methods that hinder the progress of TOWE. Mainstream approaches typically utilize position indicators to mark the given target, which is a naive strategy and lacks task-specific semantic meaning. Meanwhile, the annotated target-opinion pairs contain rich latent structural knowledge from multiple perspectives, but existing methods only exploit the TOWE view. To tackle these issues, we formulate the TOWE task as a question answering (QA) problem and leverage a machine reading comprehension (MRC) model trained with a multiview paradigm to extract targeted opinions. Specifically, we introduce a template-based pseudo-question generation method and utilize deep attention interaction to build target-aware context representations and extract related opinion words. To take advantage of latent structural correlations, we further cast the opinion-target structure into three distinct yet correlated views and leverage meta-learning to aggregate common knowledge among them to enhance the TOWE task. We evaluate the proposed model on four benchmark datasets, and our method achieves new state-of-the-art results. Extensional experiments have shown that the pipeline method with our approach could surpass existing opinion pair extraction models, including joint methods that are usually believed to work better.


Graph Transformer for Graph-to-Sequence Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6243 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7464-7471

Author(s):

Deng Cai ◽

Wai Lam

Keyword(s):

Neural Networks ◽

Information Exchange ◽

State Of The Art ◽

Structural Information ◽

Representation Learning ◽

Graph Representation ◽

Text Generation ◽

Proposed Model ◽

Graph Neural Networks ◽

Meaning Representation

The dominant graph-to-sequence transduction models employ graph neural networks for graph representation learning, where the structural information is reflected by the receptive field of neurons. Unlike graph neural networks, which restrict information exchange to the immediate neighborhood, we propose a new model, known as Graph Transformer, that uses explicit relation encoding and allows direct communication between two distant nodes. It provides a more efficient way to model global graph structure. Experiments on the applications of text generation from Abstract Meaning Representation (AMR) and syntax-based neural machine translation show the superiority of our proposed model. Specifically, our model achieves 27.4 BLEU on LDC2015E86 and 29.7 BLEU on LDC2017T10 for AMR-to-text generation, outperforming the state-of-the-art results by up to 2.2 points. On the syntax-based translation tasks, our model establishes new single-model state-of-the-art BLEU scores, 21.3 for English-to-German and 14.1 for English-to-Czech, improving over the existing best results, including ensembles, by over 1 BLEU.


Teaching Machines to Extract Main Content for Machine Reading Comprehension

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019973 ◽

2019 ◽

Vol 33 ◽

pp. 9973-9974

Author(s):

Zhaohui Li ◽

Yue Feng ◽

Jun Xu ◽

Jiafeng Guo ◽

Yanyan Lan ◽

...

Keyword(s):

Reading Comprehension ◽

Sequential Decision Making ◽

Model Parameters ◽

Sequential Decision ◽

Proposed Model ◽

Policy Gradient ◽

Markov Decision ◽

Content Identification ◽

Machine Reading ◽

Teaching Machines

Machine reading comprehension, whose goal is to find answers from the candidate passages for a given question, has attracted a lot of research effort in recent years. One of the key challenges in machine reading comprehension is how to identify the main content from a large, redundant, and overlapping set of candidate sentences. In this paper, we propose to tackle the challenge with a Markov Decision Process in which main content identification is formalized as sequential decision making and each action corresponds to selecting a sentence. Policy gradient is used to learn the model parameters. Experimental results based on MSMARCO show that the proposed model, called MC-MDP, can select high-quality main content and significantly improve the performance of answer span prediction.


Multi-Matching Network for Multiple Choice Reading Comprehension

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33017088 ◽

2019 ◽

Vol 33 ◽

pp. 7088-7095 ◽

Cited By ~ 2

Author(s):

Min Tang ◽

Jiaran Cai ◽

Hankz Hankui Zhuo

Keyword(s):

Reading Comprehension ◽

Correct Answer ◽

Large Scale ◽

State Of The Art ◽

Multiple Choice ◽

Semantic Relationship ◽

Matching Network ◽

Empirical Results ◽

Proposed Model ◽

Machine Reading

Multiple-choice machine reading comprehension is an important and challenging task where the machine is required to select the correct answer from a set of candidate answers given a passage and a question. Existing approaches either match extracted evidence with candidate answers shallowly or model the passage, question, and candidate answers with a single paradigm of matching. In this paper, we propose the Multi-Matching Network (MMN), which models the semantic relationship among the passage, question, and candidate answers from multiple different paradigms of matching. In our MMN model, each paradigm is inspired by how humans think and is designed under a unified compose-match framework. To demonstrate the effectiveness of our model, we evaluate MMN on a large-scale multiple-choice machine reading comprehension dataset (i.e., RACE). Empirical results show that our proposed model achieves a significant improvement compared to strong baselines and obtains state-of-the-art results.


A Multiple-Choice Machine Reading Comprehension Model with Multi-Granularity Semantic Reasoning

Applied Sciences ◽

10.3390/app11177945 ◽

2021 ◽

Vol 11 (17) ◽

pp. 7945

Author(s):

Yu Dai ◽

Yufan Fu ◽

Lei Yang

Keyword(s):

Reading Comprehension ◽

Semantic Information ◽

Multiple Choice ◽

Global Information ◽

Reasoning Ability ◽

Semantic Reasoning ◽

Benchmark Model ◽

Proposed Model ◽

Convolution Kernels ◽

Machine Reading

To address the problem of poor semantic reasoning of models in multiple-choice Chinese machine reading comprehension (MRC), this paper proposes an MRC model incorporating multi-granularity semantic reasoning. In this work, we first encode articles, questions, and candidates to extract global reasoning information; second, we apply multiple convolution kernels of different sizes, followed by max pooling, to the BERT-encoded articles, questions, and candidates to extract local semantic reasoning information of different granularities; we then fuse the global information with the local multi-granularity information and use the result to select an answer. The proposed model can combine the learned multi-granularity semantic information for reasoning, which addresses the poor semantic reasoning ability of the model and thus improves the reasoning ability of machine reading comprehension. The experiments show that the proposed model achieves better performance on the C3 dataset than the benchmark model in semantic reasoning, which verifies the effectiveness of the proposed model in semantic reasoning.
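The step of applying convolution kernels of several sizes followed by max pooling can be sketched on a toy input. This is a simplified illustration, not the paper's model: each token is represented by a single scalar rather than a BERT vector, the kernel weights are fixed rather than learned, and the function names are hypothetical.

```python
def conv1d_max_pool(sequence, kernel):
    """Slide `kernel` over `sequence` (no padding) and keep the largest response."""
    k = len(kernel)
    responses = [
        sum(w * x for w, x in zip(kernel, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    ]
    return max(responses)


def multi_granularity_features(sequence, kernels):
    """One pooled feature per kernel; each kernel size is one granularity."""
    return [conv1d_max_pool(sequence, kernel) for kernel in kernels]


# Hypothetical per-token scores and all-ones kernels of width 1, 2, and 3.
tokens = [1.0, 3.0, 2.0, 5.0]
kernels = [[1.0], [1.0, 1.0], [1.0, 1.0, 1.0]]
features = multi_granularity_features(tokens, kernels)  # [5.0, 7.0, 10.0]
```

Wider kernels respond to longer spans, so the pooled features summarize word-level, phrase-level, and longer local patterns side by side before they are fused with a global representation.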


Document Re-Ranking Model for Machine-Reading and Comprehension

Applied Sciences ◽

10.3390/app10217547 ◽

2020 ◽

Vol 10 (21) ◽

pp. 7547

Author(s):

Youngjin Jang ◽

Harksoo Kim

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

High Performance ◽

Language Model ◽

Text Retrieval ◽

Retrieval Models ◽

Similarity Network ◽

Proposed Model ◽

Ranking Model ◽

Machine Reading

Recently, the performance of machine-reading and comprehension (MRC) systems has been significantly enhanced. However, MRC systems require high-performance text retrieval models because text passages containing answer phrases should be prepared in advance. To improve the performance of text retrieval models underlying MRC systems, we propose a re-ranking model, based on artificial neural networks, that is composed of a query encoder, a passage encoder, a phrase modeling layer, an attention layer, and a similarity network. The proposed model learns degrees of association between queries and text passages through dot products between the phrases that constitute questions and passages. In experiments with the MS-MARCO dataset, the proposed model demonstrated mean reciprocal ranks (MRRs) 0.8%p–13.2%p higher than most of the previous models, except for the models based on BERT (a pre-trained language model). Although the proposed model demonstrated lower MRRs than the BERT-based models, it was approximately 8 times lighter and 3.7 times faster than the BERT-based models.
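The mean reciprocal rank (MRR) reported above averages, over all queries, the reciprocal of the rank at which the first relevant passage appears. A minimal sketch follows; the function name, the dictionary-based input format, and the example identifiers are assumptions for illustration, and queries with no relevant passage in their ranking contribute zero.

```python
def mean_reciprocal_rank(rankings, relevant):
    """Average of 1/rank of the first relevant passage per query.

    rankings: query id -> list of passage ids, best first.
    relevant: query id -> set of relevant passage ids.
    """
    total = 0.0
    for query_id, ranking in rankings.items():
        for rank, passage_id in enumerate(ranking, start=1):
            if passage_id in relevant[query_id]:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings)


# Hypothetical rankings for three queries.
rankings = {"q1": ["p3", "p1", "p2"], "q2": ["p5", "p4"], "q3": ["p9"]}
relevant = {"q1": {"p1"}, "q2": {"p5"}, "q3": {"p0"}}
mrr = mean_reciprocal_rank(rankings, relevant)  # (1/2 + 1 + 0) / 3 = 0.5
```

Because only the first relevant hit matters, a re-ranking model improves MRR by moving one correct passage toward the top of the list, even if the rest of the ordering is unchanged.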
