A Gated Dilated Convolution with Attention Model for Clinical Cloze-Style Reading Comprehension

Author(s):  
Bin Wang ◽  
Xuejie Zhang ◽  
Xiaobing Zhou ◽  
Junyi Li

Machine comprehension research in clinical medicine has great potential value in practical applications, but it has not received sufficient attention, and many existing models for cloze-style machine reading comprehension are very time-consuming. In this paper, we study cloze-style machine reading comprehension in the clinical medical field and propose a Gated Dilated Convolution with Attention (GDCA) model, which consists of a gated dilated convolution module and an attention mechanism. Our model has high parallelism and is capable of capturing long-distance dependencies. On the CliCR data set, our model surpasses the present best model on several metrics and obtains state-of-the-art results, and its training speed is 8 times faster than that of the best model.
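As a rough illustration of the gated dilated convolution idea named in the abstract above, the following pure-Python sketch applies a 1D dilated convolution and mixes its output with the input through a sigmoid gate. The residual-style gating form, the function names, and the kernels are assumptions made for illustration; the abstract does not specify the authors' exact formulation.

```python
import math


def dilated_conv1d(x, kernel, dilation):
    """Zero-padded ('same' length) 1D convolution with the given dilation."""
    center = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - center) * dilation
            if 0 <= idx < len(x):  # positions outside the sequence count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_dilated_conv1d(x, value_kernel, gate_kernel, dilation):
    """Assumed gating form: y = x * (1 - g) + conv(x) * g, with g = sigmoid(conv_gate(x))."""
    values = dilated_conv1d(x, value_kernel, dilation)
    gates = [sigmoid(g) for g in dilated_conv1d(x, gate_kernel, dilation)]
    return [xi * (1.0 - g) + v * g for xi, v, g in zip(x, values, gates)]


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of dilated convolution layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(gated_dilated_conv1d(x, [1.0, 0.0, 1.0], [0.0, 0.0, 0.0], dilation=2))
    print(receptive_field(3, [1, 2, 4]))
```

Every output position is computed independently of the others, which is what makes convolutional encoders easy to parallelize, and stacking layers with growing dilation widens the receptive field quickly, which is one common way to reach long-distance context.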

Author(s):  
Zhipeng Chen ◽  
Yiming Cui ◽  
Wentao Ma ◽  
Shijin Wang ◽  
Guoping Hu

Machine Reading Comprehension (MRC) with multiple-choice questions requires the machine to read a given passage and select the correct answer among several candidates. In this paper, we propose a novel approach called the Convolutional Spatial Attention (CSA) model, which can better handle MRC with multiple-choice questions. The proposed model can fully extract the mutual information among the passage, the question, and the candidates to form enriched representations. Furthermore, to merge the various attention results, we propose using convolutional operations to dynamically summarize the attention values within regions of different sizes. Experimental results show that the proposed model gives substantial improvements over various state-of-the-art systems on both the RACE and SemEval-2018 Task 11 datasets.


2020 ◽  
Vol 34 (05) ◽  
pp. 8783-8790 ◽  
Author(s):  
Ling Min Serena Khoo ◽  
Hai Leong Chieu ◽  
Zhong Qian ◽  
Jing Jiang

We address rumor detection by learning to differentiate between the community's response to real and fake claims in microblogs. Existing state-of-the-art models are based on tree models that model conversational trees. However, in social media, a user posting a reply might be replying to the entire thread rather than to a specific user. We propose a post-level attention model (PLAN) to model long distance interactions between tweets with the multi-head attention mechanism in a transformer network. We investigated variants of this model: (1) a structure aware self-attention model (StA-PLAN) that incorporates tree structure information in the transformer network, and (2) a hierarchical token and post-level attention model (StA-HiTPLAN) that learns a sentence representation with token-level self-attention. To the best of our knowledge, we are the first to evaluate our models on two rumor detection data sets: the PHEME data set as well as the Twitter15 and Twitter16 data sets. We show that our best models outperform current state-of-the-art models for both data sets. Moreover, the attention mechanism allows us to explain rumor detection predictions at both token-level and post-level.


2020 ◽  
Vol 29 (01n02) ◽  
pp. 1950010
Author(s):  
Liguo Duan ◽  
Jianying Gao ◽  
Aiping Li

Multi-choice machine reading comprehension, which selects the correct answer from the candidate answers, requires capturing the interaction semantics between the given passage and the question. In this paper, we propose an end-to-end deep learning model. It employs a Bi-GRU to contextually encode the passage and the question, and specifically models complex interactions between the given passage and the question with six kinds of attention functions: concatenated attention, bilinear attention, element-wise dot attention, minus attention, and the bi-directional Query2Context and Context2Query attentions. Then, we use a multi-level attention transfer reasoning mechanism to obtain more accurate comprehensive semantics. To demonstrate the validity of our model, we performed experiments on the large reading comprehension data set RACE. The experimental results show that our model surpasses many state-of-the-art systems on the RACE data set and has good reasoning ability.
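To make four of the attention functions listed in the abstract above more concrete, here is a minimal pure-Python sketch of simplified scoring functions between one passage vector and one question vector. The formulas are common textbook simplifications (with no nonlinearity or learned projections beyond a weight vector or matrix) and are assumptions, not the authors' exact definitions; all names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def concat_score(p, q, v):
    """Concatenated attention (simplified): v . [p; q]."""
    return dot(v, p + q)


def bilinear_score(p, q, W):
    """Bilinear attention: p^T W q."""
    Wq = [dot(row, q) for row in W]
    return dot(p, Wq)


def elementwise_dot_score(p, q, v):
    """Element-wise dot attention (simplified): v . (p * q)."""
    return dot(v, [a * b for a, b in zip(p, q)])


def minus_score(p, q, v):
    """Minus attention (simplified): v . (p - q)."""
    return dot(v, [a - b for a, b in zip(p, q)])


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(scores, vectors):
    """Weighted sum of vectors using softmax-normalized scores."""
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]


if __name__ == "__main__":
    p, q = [1.0, 2.0], [3.0, 4.0]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(bilinear_score(p, q, identity), minus_score(p, q, [1.0, 1.0]))
```

In a full model each score would be computed for every passage-question token pair, and `attend` would turn a row of scores into a question-aware summary for each passage position.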


Author(s):  
Yuanxing Zhang ◽  
Yangbin Zhang ◽  
Kaigui Bian ◽  
Xiaoming Li

Machine reading comprehension has gained attention from both industry and academia. It is a very challenging task that involves various domains such as language comprehension, knowledge inference, and summarization. Previous studies mainly focus on reading comprehension over short paragraphs, and these approaches fail to perform well on long documents. In this paper, we propose a hierarchical match attention model that instructs the machine to extract answers from a specific short span of passages for the long document reading comprehension (LDRC) task. The model takes advantage of a hierarchical LSTM to learn paragraph-level representations, and implements a match mechanism (i.e., quantifying the relationship between two contexts) to find the most appropriate paragraph, the one that contains the hint of the answer. The task can then be reduced to a reading comprehension task over a short paragraph, from which the answer can be produced. Experiments on the modified SQuAD dataset show that our proposed model outperforms existing reading comprehension models by at least 20% in terms of exact match (EM), F1, and the proportion of identified paragraphs that are exactly the short paragraphs in which the original answers are located.


Author(s):  
Minghao Hu ◽  
Furu Wei ◽  
Yuxing Peng ◽  
Zhen Huang ◽  
Nan Yang ◽  
...  

Machine reading comprehension with unanswerable questions aims to abstain from answering when no answer can be inferred. In addition to extracting answers, previous works usually predict an additional “no-answer” probability to detect unanswerable cases. However, they fail to validate the answerability of the question by verifying the legitimacy of the predicted answer. To address this problem, we propose a novel read-then-verify system, which not only utilizes a neural reader to extract candidate answers and produce no-answer probabilities, but also leverages an answer verifier to decide whether the predicted answer is entailed by the input snippets. Moreover, we introduce two auxiliary losses to help the reader better handle answer extraction as well as no-answer detection, and investigate three different architectures for the answer verifier. Our experiments on the SQuAD 2.0 dataset show that our system obtains a score of 74.2 F1 on the test set, achieving state-of-the-art results at the time of submission (Aug. 28th, 2018).


2018 ◽  
Vol 232 ◽  
pp. 02047
Author(s):  
Hui Xu ◽  
Shichang Zhang ◽  
Jie Jiang

Machine Reading Comprehension (MRC) is the task of having a machine read a context and answer questions about the original text, which requires modeling the interaction between the context and the question. Recently, attention mechanisms in deep learning have been successfully extended to MRC tasks. In general, attention-based approaches focus attention on a small part of the context and summarize it with a fixed-size vector. This paper introduces a coarse-to-fine attention network, which is a multi-stage hierarchical process. First, the context and the question are encoded by a bi-directional LSTM RNN; then, more accurate interaction information is obtained through multiple iterations of the attention mechanism; finally, a cursor-based approach is used to predict the beginning and end of the answer in the original text. Experimental evaluation shows that the BiDMF (Bi-Directional Multi-Attention Flow) model designed in this paper achieves a BLEU-4 score of 34.1% and a Rouge-L score of 39.5% on the test set.


Author(s):  
Shuangzhi Wu ◽  
Ming Zhou ◽  
Dongdong Zhang

Neural Machine Translation (NMT) based on the encoder-decoder architecture has recently achieved state-of-the-art performance. Researchers have shown that extending word-level attention to phrase-level attention by incorporating source-side phrase structure can enhance the attention model and achieve promising improvements. However, word dependencies that can be crucial to correctly understanding a source sentence are not always consecutive (i.e., part of a phrase structure); sometimes they span long distances. Phrase structures are not the best way to explicitly model long-distance dependencies. In this paper, we propose a simple but effective method to incorporate source-side long-distance dependencies into NMT. Our method, based on dependency trees, enriches each source state with global dependency structures, which can better capture the inherent syntactic structure of source sentences. Experiments on Chinese-English and English-Japanese translation tasks show that our proposed method outperforms state-of-the-art SMT and NMT baselines.


Author(s):  
Min Tang ◽  
Jiaran Cai ◽  
Hankz Hankui Zhuo

Multiple-choice machine reading comprehension is an important and challenging task in which the machine is required to select the correct answer from a set of candidate answers, given a passage and a question. Existing approaches either match extracted evidence with candidate answers shallowly or model the passage, question, and candidate answers with a single matching paradigm. In this paper, we propose the Multi-Matching Network (MMN), which models the semantic relationship among the passage, question, and candidate answers through multiple different matching paradigms. In our MMN model, each paradigm is inspired by how humans think and is designed under a unified compose-match framework. To demonstrate the effectiveness of our model, we evaluate MMN on a large-scale multiple-choice machine reading comprehension dataset (i.e., RACE). Empirical results show that our proposed model achieves a significant improvement over strong baselines and obtains state-of-the-art results.


Author(s):  
Minghao Hu ◽  
Yuxing Peng ◽  
Zhen Huang ◽  
Xipeng Qiu ◽  
Furu Wei ◽  
...  

In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages the model to predict a more acceptable answer, so as to address the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.


2021 ◽  
pp. 1-19
Author(s):  
Kiet Van Nguyen ◽  
Nhat Duy Nguyen ◽  
Phong Nguyen-Thuan Do ◽  
Anh Gia-Tuan Nguyen ◽  
Ngan Luu-Thuy Nguyen

Machine Reading Comprehension has attracted significant interest in research on natural language understanding, and large-scale datasets and neural network-based methods have been developed for this task. However, most resources and methods in machine reading comprehension have been developed for two resource-rich languages, English and Chinese. This article proposes a system called ViReader for open-domain machine reading comprehension in Vietnamese, using Wikipedia as the textual knowledge source, where the answer to any particular question is a textual span derived directly from texts on Vietnamese Wikipedia. Our system combines a sentence retriever component, based on information retrieval techniques for extracting the relevant sentences, with a transfer learning-based answer extractor trained to predict answers from Wikipedia texts. Experiments on multiple datasets for machine reading comprehension in Vietnamese and other languages demonstrate that (1) our ViReader system is highly competitive with prevalent machine learning-based systems, and (2) combining the sentence retriever and the answer extractor through multi-task learning yields an end-to-end reading comprehension system. The sentence retriever component of our proposed system retrieves the sentences that are most likely to provide the answer to the given question. The transfer learning-based answer extractor then reads the document from which the sentences have been retrieved, predicts the answer, and returns it to the user. The ViReader system achieves new state-of-the-art performance, with 70.83% EM (exact match) and 89.54% F1, outperforming the BERT-based system by 11.55% and 9.54%, respectively. It also obtains state-of-the-art performance on UIT-ViNewsQA (another Vietnamese dataset consisting of online health-domain news) and BiPaR (a bilingual dataset of English and Chinese novel texts). Compared with the BERT-based system, our system achieves significant improvements (in terms of F1) of 7.65% for English and 6.13% for Chinese on the BiPaR dataset. Furthermore, we build a ViReader application programming interface that programmers can employ in Artificial Intelligence applications.
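Several abstracts in this listing report exact match (EM) and F1. The sketch below shows how these two span-level metrics are commonly computed for a single prediction, using whitespace tokenization and lowercasing only. This is a simplified assumption: official evaluation scripts typically apply further normalization (for example stripping punctuation), and the scripts used in the papers above may differ. The function names are hypothetical.

```python
from collections import Counter


def normalize(text):
    """Simplified normalization: lowercase and split on whitespace."""
    return text.lower().split()


def exact_match(prediction, gold):
    """1.0 if the normalized prediction equals the normalized gold answer, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction, gold):
    """Token-level F1 between a predicted span and a gold span."""
    pred_tokens, gold_tokens = normalize(prediction), normalize(gold)
    if not pred_tokens or not gold_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("Ha Noi", "ha noi"))
    print(f1_score("Ha Noi city", "Ha Noi"))
```

Dataset-level scores are usually the average of these per-question values, taking the maximum over gold answers when a question has more than one.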

