Rception: Wide and Deep Interaction Networks for Machine Reading Comprehension (Student Abstract)

Xuanyu Zhang; Zhichun Wang

doi:10.1609/aaai.v34i10.7266

Proceedings of the AAAI Conference on Artificial Intelligence ◽

2020 ◽

Vol 34 (10) ◽

pp. 13987-13988

Author(s):

Xuanyu Zhang ◽

Zhichun Wang

Keyword(s):

Neural Networks ◽

Reading Comprehension ◽

Convolutional Neural Networks ◽

Recurrent Neural Networks ◽

Question Answering ◽

Local Information ◽

Interaction Networks ◽

Layer By Layer ◽

Global Context ◽

Machine Reading

Most models for machine reading comprehension (MRC) focus on recurrent neural networks (RNNs) and attention mechanisms, though convolutional neural networks (CNNs) are also used for time efficiency. However, little attention has been paid to leveraging CNNs and RNNs together in MRC. For a deeper understanding, humans sometimes need local information for short phrases and sometimes need global context for long passages. In this paper, we propose a novel architecture, Rception, to capture and leverage both local deep information and global wide context. It fuses different kinds of networks and hyper-parameters horizontally rather than simply stacking them layer by layer vertically. Experiments on the Stanford Question Answering Dataset (SQuAD) show that our proposed architecture achieves good performance.
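The abstract does not give the layer details, so the following is only a toy sketch of what "horizontal" fusion could look like, not the authors' Rception architecture. It assumes scalar token features, uses a fixed moving average as a stand-in for a convolution and a simple decaying running state as a stand-in for an RNN; the names `conv_branch`, `recurrent_branch`, and `fuse_horizontally` are hypothetical.

```python
def conv_branch(seq, width):
    """Local view: average over a centered window of `width` tokens
    (a fixed, untrained 1-D convolution)."""
    half = width // 2
    out = []
    for i in range(len(seq)):
        window = seq[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def recurrent_branch(seq, decay=0.5):
    """Global view: a running state that carries information
    from all earlier tokens (a stand-in for an RNN)."""
    state, out = 0.0, []
    for x in seq:
        state = decay * state + (1.0 - decay) * x
        out.append(state)
    return out


def fuse_horizontally(seq, widths=(1, 3, 5)):
    """Run every branch on the same input and concatenate the
    per-position outputs, instead of feeding one branch into the next."""
    branches = [conv_branch(seq, w) for w in widths] + [recurrent_branch(seq)]
    return [list(features) for features in zip(*branches)]


fused = fuse_horizontally([1.0, 2.0, 3.0, 4.0])
print(fused[0])  # one feature per branch for the first token
```

The point of the sketch is only the wiring: branches with different kernel widths and a recurrent branch all see the same input, and their outputs sit side by side in the fused representation.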

Comparing Attention-Based Convolutional and Recurrent Neural Networks: Success and Limitations in Machine Reading Comprehension

10.18653/v1/k18-1011 ◽

2018 ◽

Cited By ~ 5

Author(s):

Matthias Blohm ◽

Glorianna Jagfeld ◽

Ekta Sood ◽

Xiang Yu ◽

Ngoc Thang Vu

Keyword(s):

Neural Networks ◽

Reading Comprehension ◽

Recurrent Neural Networks ◽

Machine Reading

Direction Finding Using Convolutional Neural Networks and Convolutional Recurrent Neural Networks

2020 28th Signal Processing and Communications Applications Conference (SIU) ◽

10.1109/siu49456.2020.9302448 ◽

2020 ◽

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Recurrent Neural Networks ◽

Direction Finding

A Study of OWA Operators Learned in Convolutional Neural Networks

Applied Sciences ◽

10.3390/app11167195 ◽

2021 ◽

Vol 11 (16) ◽

pp. 7195

Author(s):

Iris Dominguez-Catena ◽

Daniel Paternain ◽

Mikel Galar

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Practical Method ◽

Local Information ◽

Weighted Averaging ◽

Global Information ◽

Owa Operator ◽

Owa Operators ◽

Ordered Weighted Averaging

Ordered Weighted Averaging (OWA) operators have been integrated in Convolutional Neural Networks (CNNs) for image classification through the OWA layer. This layer lets the CNN integrate global information about the image in the early stages, where most CNN architectures only allow for the exploitation of local information. As a side effect of this integration, the OWA layer becomes a practical method for the determination of OWA operator weights, which is usually a difficult task that complicates the integration of these operators in other fields. In this paper, we explore the weights learned for the OWA operators inside the OWA layer, characterizing them through their basic properties of orness and dispersion. We also compare them to some families of OWA operators, namely the Binomial OWA operator, the Stancu OWA operator and the exponential RIM OWA operator, finding examples that are currently impossible to generalize through these parameterizations.
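For readers unfamiliar with the two properties named in this abstract, the sketch below computes an OWA aggregation together with the standard orness and dispersion (entropy) measures of its weight vector. It assumes the weights are non-negative and sum to 1; the function names are illustrative and are not taken from the paper.

```python
import math


def owa(weights, values):
    """Ordered weighted average: the i-th weight multiplies the
    i-th largest value, not the i-th input."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def orness(weights):
    """1.0 for the max operator, 0.0 for the min operator,
    0.5 for the plain arithmetic mean. Requires at least two weights."""
    n = len(weights)
    return sum((n - i) * w for i, w in enumerate(weights, start=1)) / (n - 1)


def dispersion(weights):
    """Shannon entropy of the weights: 0 when a single input is used,
    ln(n) when all inputs are used equally."""
    return -sum(w * math.log(w) for w in weights if w > 0)


uniform = [1 / 3, 1 / 3, 1 / 3]
print(owa(uniform, [2.0, 5.0, 3.0]), orness(uniform), dispersion(uniform))
```

With weights `[1, 0, 0]` the operator reduces to the maximum and with `[0, 0, 1]` to the minimum, which is why orness is read as how "or-like" a learned operator is.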

Real-time classification of hand movements as a basis for intuitive control of grasp neuroprostheses

Current Directions in Biomedical Engineering ◽

10.1515/cdbme-2020-2011 ◽

2020 ◽

Vol 6 (2) ◽

Author(s):

Dmitry Amelin ◽

Ivan Potapov ◽

Josep Cardona Audí ◽

Andreas Kogut ◽

Rüdiger Rupp ◽

...

Keyword(s):

Neural Networks ◽

Standard Deviation ◽

Real Time ◽

Convolutional Neural Networks ◽

Recurrent Neural Networks ◽

Healthy Subjects ◽

Hand Movements ◽

Cord Injury ◽

Field Programmable

This paper reports on the evaluation of recurrent and convolutional neural networks as real-time grasp phase classifiers for future control of neuroprostheses for people with high spinal cord injury. A field-programmable gate array has been chosen as an implementation platform due to its form factor and ability to perform parallel computations, which are specific for the selected neural networks. Three different phases of two grasp patterns and the additional open hand pattern were predicted by means of surface electromyography (EMG) signals (i.e., seven classes in total). Across seven healthy subjects, the convolutional neural network (CNN) had a mean accuracy of 85.23% with a standard deviation of 4.77% at 112 µs per prediction, and the recurrent neural network (RNN) had a mean accuracy of 83.30% with a standard deviation of 4.36% at 40 µs per prediction.

Analyzing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets

Wireless Communications and Mobile Computing ◽

10.1155/2021/5375334 ◽

2021 ◽

Vol 2021 ◽

pp. 1-17

Author(s):

Changchang Zeng ◽

Shaobo Li

Keyword(s):

Reading Comprehension ◽

Language Processing ◽

Question Answering ◽

Multiple Choice ◽

Length Distribution ◽

Research Field ◽

Evaluation Framework ◽

Language Models ◽

Training Objective ◽

Machine Reading

Machine reading comprehension (MRC) is a challenging natural language processing (NLP) task. It has a wide application potential in the fields of question answering robots, human-computer interactions in mobile virtual reality systems, etc. Recently, the emergence of pretrained models (PTMs) has brought this research field into a new era, in which the training objective plays a key role. The masked language model (MLM) is a self-supervised training objective widely used in various PTMs. With the development of training objectives, many variants of MLM have been proposed, such as whole word masking, entity masking, phrase masking, and span masking. In different MLMs, the length of the masked tokens is different. Similarly, in different machine reading comprehension tasks, the length of the answer is also different, and the answer is often a word, phrase, or sentence. Thus, in MRC tasks with different answer lengths, whether the length of MLM is related to performance is a question worth studying. If this hypothesis is true, it can guide us on how to pretrain the MLM with a relatively suitable mask length distribution for MRC tasks. In this paper, we try to uncover how much of MLM’s success in the machine reading comprehension tasks comes from the correlation between masking length distribution and answer length in the MRC dataset. In order to address this issue, herein, (1) we propose four MRC tasks with different answer length distributions, namely, the short span extraction task, long span extraction task, short multiple-choice cloze task, and long multiple-choice cloze task; (2) four Chinese MRC datasets are created for these tasks; (3) we also have pretrained four masked language models according to the answer length distributions of these datasets; and (4) ablation experiments are conducted on the datasets to verify our hypothesis. The experimental results demonstrate that our hypothesis is true. On four different machine reading comprehension datasets, the performance of the model with correlation length distribution surpasses the model without correlation.
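To make the idea of a "masking length distribution" concrete, here is a minimal sketch of span masking in which span lengths are drawn from a configurable distribution. It is an illustration only and not the authors' pretraining procedure: the `length_weights` mapping, the 15% default masking ratio, the `[MASK]` token string, and the function name `mask_spans` are all assumptions.

```python
import random


def mask_spans(tokens, length_weights, mask_ratio=0.15, seed=0,
               mask_token="[MASK]"):
    """Mask contiguous, non-overlapping spans whose lengths are sampled
    from `length_weights`, a {span_length: weight} mapping."""
    rng = random.Random(seed)
    lengths = list(length_weights)
    weights = [length_weights[n] for n in lengths]
    budget = max(1, round(mask_ratio * len(tokens)))  # tokens to mask
    masked = list(tokens)
    covered = set()
    attempts = 0
    # The attempt cap guarantees termination even if spans keep colliding.
    while len(covered) < budget and attempts < 10 * len(tokens):
        attempts += 1
        span = rng.choices(lengths, weights=weights)[0]
        span = min(span, budget - len(covered), len(tokens))
        start = rng.randrange(0, len(tokens) - span + 1)
        positions = range(start, start + span)
        if any(p in covered for p in positions):
            continue  # overlapping span: resample
        covered.update(positions)
        for p in positions:
            masked[p] = mask_token
    return masked, sorted(covered)


tokens = "the quick brown fox jumps over the lazy dog again and again".split()
masked, positions = mask_spans(tokens, {1: 1, 3: 2}, mask_ratio=0.3)
print(masked, positions)
```

Shifting the weights toward longer spans, for example `{1: 1, 5: 4}`, is the kind of change one would make to match a dataset whose answers are mostly long phrases or sentences.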

Importance of the Single-Span Task Formulation to Extractive Question-answering

10.5121/csit.2020.101809 ◽

2020 ◽

Author(s):

Marie-Anne Xu ◽

Rahul Khanna

Keyword(s):

Reading Comprehension ◽

Future Development ◽

Recent Progress ◽

Question Answering ◽

Span Task ◽

Consistent Performance ◽

Machine Reading

Recent progress in machine reading comprehension and question-answering has allowed machines to reach and even surpass human question-answering. However, the majority of these questions have only one answer, and more substantial testing on questions with multiple answers, or multi-span questions, has not yet been applied. Thus, we introduce a newly compiled dataset consisting of questions with multiple answers that originate from previously existing datasets. In addition, we run BERT-based models pre-trained for question-answering on our constructed dataset to evaluate their reading comprehension abilities. Among the three BERT-based models we ran, RoBERTa exhibits the highest consistent performance, regardless of size. We find that all our models perform similarly on this new, multi-span dataset (21.492% F1) compared to the single-span source datasets (~33.36% F1). While the models tested on the source datasets were slightly fine-tuned, performance is similar enough to judge that task formulation does not drastically affect question-answering abilities. Our evaluations indicate that these models are indeed capable of adjusting to answer questions that require multiple answers. We hope that our findings will assist future development in question-answering and improve existing question-answering products and methods.

Incorporating Statistical Features in Convolutional Neural Networks for Question Answering with Financial Data

Companion of The Web Conference 2018 (WWW '18) ◽

10.1145/3184558.3191826 ◽

2018 ◽

Author(s):

Shijia E. ◽

Shiyao Xu ◽

Yang Xiang

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Question Answering ◽

Financial Data ◽

Statistical Features

Robust Image Classification with Cognitive-Driven Color Priors

Electronics ◽

10.3390/electronics9111837 ◽

2020 ◽

Vol 9 (11) ◽

pp. 1837

Author(s):

Peng Gu ◽

Chengfei Zhu ◽

Xiaosong Lan ◽

Jie Wang ◽

Shuxiao Li

Keyword(s):

Neural Networks ◽

Image Classification ◽

Convolutional Neural Networks ◽

Human Memory ◽

Layer By Layer ◽

Training Methods ◽

Prior Model ◽

Benchmark Datasets ◽

Robust Image ◽

Classification Probability

Existing image classification methods based on convolutional neural networks usually use a large number of samples to learn classification features hierarchically, causing the problems of over-fitting and error propagation layer by layer. Thus, they are vulnerable to adversarial samples generated by adding imperceptible disturbances to input samples. To address this issue, we propose a cognitive-driven color prior model that memorizes the color attributes of target samples, inspired by the characteristics of human memory. At the inference stage, color priors are indexed from the memory and fused with the features of convolutional neural networks to achieve robust image classification. The proposed color prior model is cognitive-driven and has no training parameters, so it generalizes well and can effectively defend against adversarial samples. In addition, our method directly combines the features of the prior model with the classification probability of the convolutional neural network, without changing the network structure or parameters of the existing algorithm. It can be combined with other adversarial attack defense methods, such as preprocessing modules (e.g., PixelDefense) or adversarial training methods, to improve the robustness of image classification. Experiments on several benchmark datasets show that the proposed method improves the anti-interference ability of image classification algorithms.

UQuAD1.0: Development of an Urdu Question Answering Training Data for Machine Reading Comprehension

10.36227/techrxiv.16924255 ◽

2021 ◽

Author(s):

Samreen Ahmed ◽

Shakeel Khoja

Keyword(s):

Reading Comprehension ◽

Machine Translation ◽

Large Scale ◽

Question Answering ◽

Training Data ◽

Significant Progress ◽

Rule Based ◽

Low Resource ◽

Machine Reading ◽

Answer Format

In recent years, low-resource Machine Reading Comprehension (MRC) has made significant progress, with models achieving remarkable performance on various language datasets. However, none of these models have been customized for the Urdu language. This work explores the semi-automated creation of the Urdu Question Answering Dataset (UQuAD1.0) by combining machine-translated SQuAD with human-generated samples derived from Wikipedia articles and Urdu RC worksheets from Cambridge O-level books. UQuAD1.0 is a large-scale Urdu dataset intended for extractive machine reading comprehension tasks, consisting of 49k question-answer pairs in question, passage, and answer format. In UQuAD1.0, 45,000 QA pairs were generated by machine translation of the original SQuAD1.0 and approximately 4,000 pairs via crowdsourcing. In this study, we used two types of MRC models: a rule-based baseline and advanced Transformer-based models. However, we have discovered that the latter outperforms the others; thus, we have decided to concentrate solely on Transformer-based architectures. Using XLMRoBERTa and multi-lingual BERT, we obtain F1 scores of 0.66 and 0.63, respectively.
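The abstract does not say exactly how F1 was computed. For extractive MRC it is commonly a token-overlap F1 between the predicted and reference answer strings, as in SQuAD-style evaluation. The sketch below shows that computation under a simplifying assumption: plain whitespace tokenization with no text normalization. The function name `token_f1` is illustrative.

```python
from collections import Counter


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall between two answers."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        # Two empty answers match; an empty vs. non-empty answer does not.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(token_f1("the red car", "red car"))
```

A dataset-level score is then typically the average of this value over all questions, taking the best match when a question has several reference answers.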

MMM: Multi-Stage Multi-Task Learning for Multi-Choice Reading Comprehension

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6310 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8010-8017 ◽

Cited By ~ 1

Author(s):

Di Jin ◽

Shuyang Gao ◽

Jiun-Yu Kao ◽

Tagyoung Chung ◽

Dilek Hakkani-Tur

Keyword(s):

Reading Comprehension ◽

Question Answering ◽

Limited Data ◽

Learning Stage ◽

Learning Framework ◽

Task Learning ◽

Multi Stage ◽

Comprehension Skills ◽

Machine Reading ◽

Reading Comprehension Skills

Machine Reading Comprehension (MRC) for question answering (QA), which aims to answer a question given the relevant context passages, is an important way to test the ability of intelligence systems to understand human language. Multiple-Choice QA (MCQA) is one of the most difficult tasks in MRC because it often requires more advanced reading comprehension skills such as logical reasoning, summarization, and arithmetic operations, compared to the extractive counterpart where answers are usually spans of text within given passages. Moreover, most existing MCQA datasets are small in size, making the task even harder. We introduce MMM, a Multi-stage Multi-task learning framework for Multi-choice reading comprehension. Our method involves two sequential stages: a coarse-tuning stage using out-of-domain datasets and a multi-task learning stage using a larger in-domain dataset, to help the model generalize better with limited data. Furthermore, we propose a novel multi-step attention network (MAN) as the top-level classifier for this task. We demonstrate that MMM significantly advances the state of the art on four representative MCQA datasets.
