Superimposed Attention Mechanism-Based CNN Network for Reading Comprehension and Question Answering

Automatic question generation can benefit many applications ranging from dialogue systems to reading comprehension. While questions are often asked with respect to long documents, there are many challenges with modeling such long documents. Many existing techniques generate questions by effectively looking at one sentence at a time, leading to questions that are easy and not reflective of the human process of question generation. Our goal is to incorporate interactions across multiple sentences to generate realistic questions for long documents. In order to link a broad document context to the target answer, we represent the relevant context via a multi-stage attention mechanism, which forms the foundation of a sequence to sequence model. We outperform state-of-the-art methods on question generation on three question-answering datasets - SQuAD, MS MARCO and NewsQA. 1

Download Full-text

Extraction of Question-related Sentences for Reading Comprehension Tests via Attention Mechanism

2020 International Conference on Technologies and Applications of Artificial Intelligence (TAAI) ◽

10.1109/taai51410.2020.00013 ◽

2020 ◽

Author(s):

Junjie Shan ◽

Yoko Nishihara ◽

Akira Maeda ◽

Ryosuke Yamanishi

Keyword(s):

Reading Comprehension ◽

Attention Mechanism ◽

Comprehension Tests ◽

Reading Comprehension Tests

Download Full-text

A Method Based on Attention Mechanism using Bidirectional Long-Short Term Memory(BLSTM) for Question Answering

10.1109/icee52715.2021.9544258 ◽

2021 ◽

Author(s):

Seyed Vahid Moravvej ◽

Mohammad Javad Maleki Kahaki ◽

Moein Salimi Sartakhti ◽

Abdolreza Mirzaei

Keyword(s):

Question Answering ◽

Short Term Memory ◽

Attention Mechanism ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory

Download Full-text

Translucent Answer Predictions in Multi-Hop Reading Comprehension

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6272 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7700-7707

Author(s):

G P Shrivatsa Bhargav ◽

Michael Glass ◽

Dinesh Garg ◽

Shirish Shevade ◽

Saswati Dana ◽

...

Keyword(s):

Reading Comprehension ◽

Question Answering ◽

State Of The Art ◽

Local Context ◽

The Novel ◽

Loose Coupling ◽

Loosely Coupled ◽

Neural Architecture ◽

Coupled Networks ◽

Novel Design

Research on the task of Reading Comprehension style Question Answering (RCQA) has gained momentum in recent years due to the emergence of human annotated datasets and associated leaderboards, for example CoQA, HotpotQA, SQuAD, TriviaQA, etc. While state-of-the-art has advanced considerably, there is still ample opportunity to advance it further on some important variants of the RCQA task. In this paper, we propose a novel deep neural architecture, called TAP (Translucent Answer Prediction), to identify answers and evidence (in the form of supporting facts) in an RCQA task requiring multi-hop reasoning. TAP comprises two loosely coupled networks – Local and Global Interaction eXtractor (LoGIX) and Answer Predictor (AP). LoGIX predicts supporting facts, whereas AP consumes these predicted supporting facts to predict the answer span. The novel design of LoGIX is inspired by two key design desiderata – local context and global interaction– that we identified by analyzing examples of multi-hop RCQA task. The loose coupling between LoGIX and the AP reveals the set of sentences used by the AP in predicting an answer. Therefore, answer predictions of TAP can be interpreted in a translucent manner. TAP offers state-of-the-art performance on the HotpotQA (Yang et al. 2018) dataset – an apt dataset for multi-hop RCQA task – as it occupies Rank-1 on its leaderboard (https://hotpotqa.github.io/) at the time of submission.

Download Full-text

A Novel Bidirectional LSTM and Attention Mechanism based Neural Network for Answer Selection in Community Question Answering

Computers Materials & Continua ◽

10.32604/cmc.2020.07269 ◽

2020 ◽

Vol 62 (3) ◽

pp. 1273-1288

Author(s):

Zhang Bo ◽

Wang Haowen ◽

Jiang Longquan ◽

Yuan Shuhan ◽

Li Meizi

Keyword(s):

Neural Network ◽

Question Answering ◽

Attention Mechanism ◽

Community Question Answering ◽

Bidirectional Lstm

Download Full-text

Evaluating Commonsense in Pre-Trained Language Models

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6523 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9733-9740 ◽

Cited By ~ 1

Author(s):

Xuhui Zhou ◽

Yue Zhang ◽

Leyang Cui ◽

Dandan Huang

Keyword(s):

Reading Comprehension ◽

Question Answering ◽

Deep Level ◽

Language Models ◽

Future Research ◽

Correct Prediction ◽

Test Cases ◽

Word Sense ◽

Training Set ◽

Text Data

Contextualized representations trained over large raw text data have given remarkable improvements for NLP tasks including question answering and reading comprehension. There have been works showing that syntactic, semantic and word sense knowledge are contained in such representations, which explains why they benefit such tasks. However, relatively little work has been done investigating commonsense knowledge contained in contextualized representations, which is crucial for human question answering and reading comprehension. We study the commonsense ability of GPT, BERT, XLNet, and RoBERTa by testing them on seven challenging benchmarks, finding that language modeling and its variants are effective objectives for promoting models' commonsense ability while bi-directional context and larger training set are bonuses. We additionally find that current models do poorly on tasks require more necessary inference steps. Finally, we test the robustness of models by making dual test cases, which are correlated so that the correct prediction of one sample should lead to correct prediction of the other. Interestingly, the models show confusion on these test cases, which suggests that they learn commonsense at the surface rather than the deep level. We release a test set, named CATs publicly, for future research.

Download Full-text

Analyzing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets

Wireless Communications and Mobile Computing ◽

10.1155/2021/5375334 ◽

2021 ◽

Vol 2021 ◽

pp. 1-17

Author(s):

Changchang Zeng ◽

Shaobo Li

Keyword(s):

Reading Comprehension ◽

Language Processing ◽

Question Answering ◽

Multiple Choice ◽

Length Distribution ◽

Research Field ◽

Evaluation Framework ◽

Language Models ◽

Training Objective ◽

Machine Reading

Machine reading comprehension (MRC) is a challenging natural language processing (NLP) task. It has a wide application potential in the fields of question answering robots, human-computer interactions in mobile virtual reality systems, etc. Recently, the emergence of pretrained models (PTMs) has brought this research field into a new era, in which the training objective plays a key role. The masked language model (MLM) is a self-supervised training objective widely used in various PTMs. With the development of training objectives, many variants of MLM have been proposed, such as whole word masking, entity masking, phrase masking, and span masking. In different MLMs, the length of the masked tokens is different. Similarly, in different machine reading comprehension tasks, the length of the answer is also different, and the answer is often a word, phrase, or sentence. Thus, in MRC tasks with different answer lengths, whether the length of MLM is related to performance is a question worth studying. If this hypothesis is true, it can guide us on how to pretrain the MLM with a relatively suitable mask length distribution for MRC tasks. In this paper, we try to uncover how much of MLM’s success in the machine reading comprehension tasks comes from the correlation between masking length distribution and answer length in the MRC dataset. In order to address this issue, herein, (1) we propose four MRC tasks with different answer length distributions, namely, the short span extraction task, long span extraction task, short multiple-choice cloze task, and long multiple-choice cloze task; (2) four Chinese MRC datasets are created for these tasks; (3) we also have pretrained four masked language models according to the answer length distributions of these datasets; and (4) ablation experiments are conducted on the datasets to verify our hypothesis. The experimental results demonstrate that our hypothesis is true. On four different machine reading comprehension datasets, the performance of the model with correlation length distribution surpasses the model without correlation.

Download Full-text

Importance of the Single-Span Task Formulation to Extractive Question-answering

10.5121/csit.2020.101809 ◽

2020 ◽

Author(s):

Marie-Anne Xu ◽

Rahul Khanna

Keyword(s):

Reading Comprehension ◽

Future Development ◽

Recent Progress ◽

Question Answering ◽

Span Task ◽

Consistent Performance ◽

Machine Reading

Recent progress in machine reading comprehension and question-answering has allowed machines to reach and even surpass human question-answering. However, the majority of these questions have only one answer, and more substantial testing on questions with multiple answers, or multi-span questions, has not yet been applied. Thus, we introduce a newly compiled dataset consisting of questions with multiple answers that originate from previously existing datasets. In addition, we run BERT-based models pre-trained for question-answering on our constructed dataset to evaluate their reading comprehension abilities. Among the three of BERT-based models we ran, RoBERTa exhibits the highest consistent performance, regardless of size. We find that all our models perform similarly on this new, multi-span dataset (21.492% F1) compared to the single-span source datasets (~33.36% F1). While the models tested on the source datasets were slightly fine-tuned, performance is similar enough to judge that task formulation does not drastically affect question-answering abilities. Our evaluations indicate that these models are indeed capable of adjusting to answer questions that require multiple answers. We hope that our findings will assist future development in questionanswering and improve existing question-answering products and methods.

Download Full-text

Quantum-inspired Hierarchical Attention Mechanism for Question Answering

10.1109/ictai52525.2021.00151 ◽

2021 ◽

Author(s):

Peng Guo

Keyword(s):

Question Answering ◽

Attention Mechanism

Download Full-text

Superimposed Attention Mechanism-Based CNN Network for Reading Comprehension and Question Answering

A Reading Comprehension Style Question Answering Model Based On Attention Mechanism

Capturing Greater Context for Question Generation

Extraction of Question-related Sentences for Reading Comprehension Tests via Attention Mechanism

A Method Based on Attention Mechanism using Bidirectional Long-Short Term Memory(BLSTM) for Question Answering

Translucent Answer Predictions in Multi-Hop Reading Comprehension

A Novel Bidirectional LSTM and Attention Mechanism based Neural Network for Answer Selection in Community Question Answering

Evaluating Commonsense in Pre-Trained Language Models

Analyzing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets

Importance of the Single-Span Task Formulation to Extractive Question-answering

Quantum-inspired Hierarchical Attention Mechanism for Question Answering

Export Citation Format