Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016276 ◽

2019 ◽

Vol 33 ◽

pp. 6276-6283 ◽

Cited By ~ 6

Author(s):

Zhipeng Chen ◽

Yiming Cui ◽

Wentao Ma ◽

Shijin Wang ◽

Guoping Hu

Keyword(s):

Reading Comprehension ◽

Mutual Information ◽

Spatial Attention ◽

State Of The Art ◽

Multiple Choice ◽

Multiple Choice Questions ◽

Attention Model ◽

Novel Approach ◽

Proposed Model ◽

Machine Reading

Machine Reading Comprehension (MRC) with multiplechoice questions requires the machine to read given passage and select the correct answer among several candidates. In this paper, we propose a novel approach called Convolutional Spatial Attention (CSA) model which can better handle the MRC with multiple-choice questions. The proposed model could fully extract the mutual information among the passage, question, and the candidates, to form the enriched representations. Furthermore, to merge various attention results, we propose to use convolutional operation to dynamically summarize the attention values within the different size of regions. Experimental results show that the proposed model could give substantial improvements over various state-of- the-art systems on both RACE and SemEval-2018 Task11 datasets.

Download Full-text

Multi-Matching Network for Multiple Choice Reading Comprehension

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33017088 ◽

2019 ◽

Vol 33 ◽

pp. 7088-7095 ◽

Cited By ~ 2

Author(s):

Min Tang ◽

Jiaran Cai ◽

Hankz Hankui Zhuo

Keyword(s):

Reading Comprehension ◽

Correct Answer ◽

Large Scale ◽

State Of The Art ◽

Multiple Choice ◽

Semantic Relationship ◽

Matching Network ◽

Empirical Results ◽

Proposed Model ◽

Machine Reading

Multiple-choice machine reading comprehension is an important and challenging task where the machine is required to select the correct answer from a set of candidate answers given passage and question. Existing approaches either match extracted evidence with candidate answers shallowly or model passage, question and candidate answers with a single paradigm of matching. In this paper, we propose Multi-Matching Network (MMN) which models the semantic relationship among passage, question and candidate answers from multiple different paradigms of matching. In our MMN model, each paradigm is inspired by how human think and designed under a unified compose-match framework. To demonstrate the effectiveness of our model, we evaluate MMN on a large-scale multiple choice machine reading comprehension dataset (i.e. RACE). Empirical results show that our proposed model achieves a significant improvement compared to strong baselines and obtains state-of-the-art results.

Download Full-text

Towards Reading Comprehension for Long Documents

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/638 ◽

2018 ◽

Author(s):

Yuanxing Zhang ◽

Yangbin Zhang ◽

Kaigui Bian ◽

Xiaoming Li

Keyword(s):

Reading Comprehension ◽

Language Comprehension ◽

Exact Match ◽

Attention Model ◽

Comprehension Task ◽

Proposed Model ◽

Short Span ◽

Machine Reading ◽

The Relationship

Machine reading comprehension has gained attention from both industry and academia. It is a very challenging task that involves various domains such as language comprehension, knowledge inference, summarization, etc. Previous studies mainly focus on reading comprehension on short paragraphs, and these approaches fail to perform well on the documents. In this paper, we propose a hierarchical match attention model to instruct the machine to extract answers from a specific short span of passages for the long document reading comprehension (LDRC) task. The model takes advantages from hierarchical-LSTM to learn the paragraph-level representation, and implements the match mechanism (i.e., quantifying the relationship between two contexts) to find the most appropriate paragraph that includes the hint of answers. Then the task can be decoupled into reading comprehension task for short paragraph, such that the answer can be produced. Experiments on the modified SQuAD dataset show that our proposed model outperforms existing reading comprehension models by at least 20% regarding exact match (EM), F1 and the proportion of identified paragraphs which are exactly the short paragraphs where the original answers locate.

Download Full-text

The Image Annotation Refinement in Embedding Feature Space based on Mutual Information

International Journal of Circuits, Systems and Signal Processing ◽

10.46300/9106.2022.16.23 ◽

2022 ◽

Vol 16 ◽

pp. 191-201

Author(s):

Wei Li ◽

Haiyu Song ◽

Hongda Zhang ◽

Houjie Li ◽

Pengjie Wang

Keyword(s):

Mutual Information ◽

Image Annotation ◽

State Of The Art ◽

Feature Space ◽

Semantic Space ◽

Visual Features ◽

Novel Approach ◽

Proposed Model ◽

Information Method ◽

Available Information

The ever-increasing size of images has made automatic image annotation one of the most important tasks in the fields of machine learning and computer vision. Despite continuous efforts in inventing new annotation algorithms and new models, results of the state-of-the-art image annotation methods are often unsatisfactory. In this paper, to further improve annotation refinement performance, a novel approach based on weighted mutual information to automatically refine the original annotations of images is proposed. Unlike the traditional refinement model using only visual feature, the proposed model use semantic embedding to properly map labels and visual features to a meaningful semantic space. To accurately measure the relevance between the particular image and its original annotations, the proposed model utilize all available information including image-to-image, label-to-label and image-to-label. Experimental results conducted on three typical datasets show not only the validity of the refinement, but also the superiority of the proposed algorithm over existing ones. The improvement largely benefits from our proposed mutual information method and utilizing all available information.

Download Full-text

A Multiple-Choice Machine Reading Comprehension Model with Multi-Granularity Semantic Reasoning

Applied Sciences ◽

10.3390/app11177945 ◽

2021 ◽

Vol 11 (17) ◽

pp. 7945

Author(s):

Yu Dai ◽

Yufan Fu ◽

Lei Yang

Keyword(s):

Reading Comprehension ◽

Semantic Information ◽

Multiple Choice ◽

Global Information ◽

Reasoning Ability ◽

Semantic Reasoning ◽

Benchmark Model ◽

Proposed Model ◽

Convolution Kernels ◽

Machine Reading

To address the problem of poor semantic reasoning of models in multiple-choice Chinese machine reading comprehension (MRC), this paper proposes an MRC model incorporating multi-granularity semantic reasoning. In this work, we firstly encode articles, questions and candidates to extract global reasoning information; secondly, we use multiple convolution kernels of different sizes to convolve and maximize pooling of the BERT-encoded articles, questions and candidates to extract local semantic reasoning information of different granularities; we then fuse the global information with the local multi-granularity information and use it to make an answer selection. The proposed model can combine the learned multi-granularity semantic information for reasoning, solving the problem of poor semantic reasoning ability of the model, and thus can improve the reasoning ability of machine reading comprehension. The experiments show that the proposed model achieves better performance on the C3 dataset than the benchmark model in semantic reasoning, which verifies the effectiveness of the proposed model in semantic reasoning.

Download Full-text

A Gated Dilated Convolution with Attention Model for Clinical Cloze-Style Reading Comprehension

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph17041323 ◽

2020 ◽

Vol 17 (4) ◽

pp. 1323 ◽

Cited By ~ 3

Author(s):

Bin Wang ◽

Xuejie Zhang ◽

Xiaobing Zhou ◽

Junyi Li

Keyword(s):

Reading Comprehension ◽

Clinical Medicine ◽

State Of The Art ◽

Practical Application ◽

Medical Field ◽

Long Distance ◽

Data Set ◽

Attention Model ◽

Dilated Convolution ◽

Machine Reading

The machine comprehension research of clinical medicine has great potential value in practical application, but it has not received sufficient attention and many existing models are very time consuming for the cloze-style machine reading comprehension. In this paper, we study the cloze-style machine reading comprehension in the clinical medical field and propose a Gated Dilated Convolution with Attention (GDCA) model, which consists of a gated dilated convolution module and an attention mechanism. Our model has high parallelism and is capable of capturing long-distance dependencies. On the CliCR data set, our model surpasses the present best model on several metrics and obtains state-of-the-art result, and the training speed is 8 times faster than that of the best model.

Download Full-text

Multi-layer attention for person re-identification

MATEC Web of Conferences ◽

10.1051/matecconf/201927702025 ◽

2019 ◽

Vol 277 ◽

pp. 02025

Author(s):

Yuele Zhang ◽

Jie Guo ◽

Zheng Huang ◽

Weidong Qiu ◽

Hexiaohui Fan

Keyword(s):

Spatial Attention ◽

State Of The Art ◽

Metric Learning ◽

Feature Maps ◽

Factors Affecting ◽

Attention Model ◽

Proposed Model ◽

Learning Functions ◽

Surveillance Analysis ◽

Different Levels

Person re-identification has been a significant application in the field of video surveillance analysis, yet it remains a challenging work to recognize the person of interest across disjoint cameras of different viewpoints. The factors affecting the identification results include the variation in background, different illumination conditions and the changes of human body poses. Existing person re-identification methods mainly focus on the feature extraction of the whole frame and metric learning functions. However, most of those algorithms treat different areas without distinction. It is worth emphasizing that different local regions make different contributions to image representaion, which exactly conforms to the attention mechanism. In this paper, we introduce a novel attention network which explores spatial attention in a convolutional neural network. Our algorithm learns the visual attention in multi-layer feature maps. The proposed model not only pays attention to the spatial probabilities of local regions, but also takes the features in different levels into consideration. We evaluate this multi-layer spatial attention model on three benchmark person re-identification datasets: Market-1501, CUHK03, and DukeMTMC-reID. The experiment results validate the advances of our adopted network by comparing with state-of-the-art baselines.

Download Full-text

ElimiNet: A Model for Eliminating Options for Reading Comprehension with Multiple Choice Questions

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/594 ◽

2018 ◽

Cited By ~ 5

Author(s):

Soham Parikh ◽

Ananya Sai ◽

Preksha Nema ◽

Mitesh Khapra

Keyword(s):

Reading Comprehension ◽

Large Scale ◽

State Of The Art ◽

Multiple Choice ◽

Multiple Choice Questions ◽

Current State ◽

New Information ◽

Partial Elimination ◽

Maximum Similarity ◽

Option Selection

The task of Reading Comprehension with Multiple Choice Questions, requires a human (or machine) to read a given {passage, question} pair and select one of the n given options. The current state of the art model for this task first computes a question-aware representation for the passage and then selects the option which has the maximum similarity with this representation. However, when humans perform this task they do not just focus on option selection but use a combination of elimination and selection. Specifically, a human would first try to eliminate the most irrelevant option and then read the passage again in the light of this new information (and perhaps ignore portions corresponding to the eliminated option). This process could be repeated multiple times till the reader is finally ready to select the correct option. We propose ElimiNet, a neural network-based model which tries to mimic this process. Specifically, it has gates which decide whether an option can be eliminated given the {passage, question} pair and if so it tries to make the passage representation orthogonal to this eliminated option (akin to ignoring portions of the passage corresponding to the eliminated option). The model makes multiple rounds of partial elimination to refine the passage representation and finally uses a selection module to pick the best option. We evaluate our model on the recently released large scale RACE dataset and show that it outperforms the current state of the art model on 7 out of the 13 question types in this dataset. Further, we show that taking an ensemble of our elimination-selection based method with a selection based method gives us an improvement of 3.1% over the best-reported performance on this dataset.

Download Full-text

Multi-agent Attentional Activity Recognition

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/186 ◽

2019 ◽

Cited By ~ 3

Author(s):

Kaixuan Chen ◽

Lina Yao ◽

Dalin Zhang ◽

Bin Guo ◽

Zhiwen Yu

Keyword(s):

Activity Recognition ◽

State Of The Art ◽

Body Part ◽

Body Parts ◽

Temporal Attention ◽

Attention Model ◽

Proposed Model ◽

Collective Motions ◽

Multi Agent ◽

Real World Datasets

Multi-modality is an important feature of sensor based activity recognition. In this work, we consider two inherent characteristics of human activities, the spatially-temporally varying salience of features and the relations between activities and corresponding body part motions. Based on these, we propose a multi-agent spatial-temporal attention model. The spatial-temporal attention mechanism helps intelligently select informative modalities and their active periods. And the multiple agents in the proposed model represent activities with collective motions across body parts by independently selecting modalities associated with single motions. With a joint recognition goal, the agents share gained information and coordinate their selection policies to learn the optimal recognition model. The experimental results on four real-world datasets demonstrate that the proposed model outperforms the state-of-the-art methods.

Download Full-text

Analyzing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets

Wireless Communications and Mobile Computing ◽

10.1155/2021/5375334 ◽

2021 ◽

Vol 2021 ◽

pp. 1-17

Author(s):

Changchang Zeng ◽

Shaobo Li

Keyword(s):

Reading Comprehension ◽

Language Processing ◽

Question Answering ◽

Multiple Choice ◽

Length Distribution ◽

Research Field ◽

Evaluation Framework ◽

Language Models ◽

Training Objective ◽

Machine Reading

Machine reading comprehension (MRC) is a challenging natural language processing (NLP) task. It has a wide application potential in the fields of question answering robots, human-computer interactions in mobile virtual reality systems, etc. Recently, the emergence of pretrained models (PTMs) has brought this research field into a new era, in which the training objective plays a key role. The masked language model (MLM) is a self-supervised training objective widely used in various PTMs. With the development of training objectives, many variants of MLM have been proposed, such as whole word masking, entity masking, phrase masking, and span masking. In different MLMs, the length of the masked tokens is different. Similarly, in different machine reading comprehension tasks, the length of the answer is also different, and the answer is often a word, phrase, or sentence. Thus, in MRC tasks with different answer lengths, whether the length of MLM is related to performance is a question worth studying. If this hypothesis is true, it can guide us on how to pretrain the MLM with a relatively suitable mask length distribution for MRC tasks. In this paper, we try to uncover how much of MLM’s success in the machine reading comprehension tasks comes from the correlation between masking length distribution and answer length in the MRC dataset. In order to address this issue, herein, (1) we propose four MRC tasks with different answer length distributions, namely, the short span extraction task, long span extraction task, short multiple-choice cloze task, and long multiple-choice cloze task; (2) four Chinese MRC datasets are created for these tasks; (3) we also have pretrained four masked language models according to the answer length distributions of these datasets; and (4) ablation experiments are conducted on the datasets to verify our hypothesis. The experimental results demonstrate that our hypothesis is true. On four different machine reading comprehension datasets, the performance of the model with correlation length distribution surpasses the model without correlation.

Download Full-text

Multi-Task Learning with Generative Adversarial Training for Multi-Passage Machine Reading Comprehension

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6396 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8705-8712

Author(s):

Qiyu Ren ◽

Xiang Cheng ◽

Sen Su

Keyword(s):

Reading Comprehension ◽

Open Domain ◽

Training Process ◽

Task Learning ◽

Novel Approach ◽

Adversarial Training ◽

Model Training ◽

Final Answer ◽

Machine Reading ◽

Candidate Set

Multi-passage machine reading comprehension (MRC) aims to answer a question by multiple passages. Existing multi-passage MRC approaches have shown that employing passages with and without golden answers (i.e. labeled and unlabeled passages) for model training can improve prediction accuracy. In this paper, we present MG-MRC, a novel approach for multi-passage MRC via multi-task learning with generative adversarial training. MG-MRC adopts the extract-then-select framework, where an extractor is first used to predict answer candidates, then a selector is used to choose the final answer. In MG-MRC, we adopt multi-task learning to train the extractor by using both labeled and unlabeled passages. In particular, we use labeled passages to train the extractor by supervised learning, while using unlabeled passages to train the extractor by generative adversarial training, where the extractor is regarded as the generator and a discriminator is introduced to evaluate the generated answer candidates. Moreover, to train the extractor by backpropagation in the generative adversarial training process, we propose a hybrid method which combines boundary-based and content-based extracting methods to produce the answer candidate set and its representation. The experimental results on three open-domain QA datasets confirm the effectiveness of our approach.

Download Full-text