Multi-Task Learning with Generative Adversarial Training for Multi-Passage Machine Reading Comprehension

Qiyu Ren; Xiang Cheng; Sen Su

doi:10.1609/aaai.v34i05.6396

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6396 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8705-8712

Author(s):

Qiyu Ren ◽

Xiang Cheng ◽

Sen Su

Keyword(s):

Reading Comprehension ◽

Open Domain ◽

Training Process ◽

Task Learning ◽

Novel Approach ◽

Adversarial Training ◽

Model Training ◽

Final Answer ◽

Machine Reading ◽

Candidate Set

Multi-passage machine reading comprehension (MRC) aims to answer a question using multiple passages. Existing multi-passage MRC approaches have shown that training on passages both with and without golden answers (i.e., labeled and unlabeled passages) can improve prediction accuracy. In this paper, we present MG-MRC, a novel approach for multi-passage MRC via multi-task learning with generative adversarial training. MG-MRC adopts the extract-then-select framework, where an extractor first predicts answer candidates and a selector then chooses the final answer. In MG-MRC, we adopt multi-task learning to train the extractor on both labeled and unlabeled passages. In particular, we train the extractor on labeled passages by supervised learning and on unlabeled passages by generative adversarial training, where the extractor is regarded as the generator and a discriminator is introduced to evaluate the generated answer candidates. Moreover, to train the extractor by backpropagation in the generative adversarial training process, we propose a hybrid method that combines boundary-based and content-based extraction methods to produce the answer candidate set and its representation. The experimental results on three open-domain QA datasets confirm the effectiveness of our approach.
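
The extract-then-select framework described in this abstract can be illustrated with a minimal sketch. This is not the MG-MRC implementation: the function names `extract_candidates` and `select_answer`, the span scoring by start score plus end score, and the selector that simply sums scores of identical candidate strings across passages are all assumptions made for illustration. The adversarial training and the learned selector of the paper are not modeled here.

```python
from collections import defaultdict

def extract_candidates(tokens, start_scores, end_scores, top_k=2, max_len=3):
    """Boundary-based extraction (illustrative): score every span of at most
    max_len tokens by start score + end score and keep the top_k spans."""
    spans = []
    for i, start in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(tokens))):
            spans.append((start + end_scores[j], " ".join(tokens[i:j + 1])))
    # Highest-scoring spans first; ties keep their original order.
    spans.sort(key=lambda item: item[0], reverse=True)
    return spans[:top_k]

def select_answer(candidates_per_passage):
    """Toy selector (an assumption, not the paper's selector): sum the scores
    of identical candidate strings across passages and return the best one."""
    totals = defaultdict(float)
    for candidates in candidates_per_passage:
        for score, text in candidates:
            totals[text] += score
    return max(totals, key=totals.get)

# Hypothetical per-token scores for two passages answering the same question.
passage_1 = extract_candidates(
    ["paris", "is", "the", "capital"],
    start_scores=[2.0, 0.1, 0.1, 0.5],
    end_scores=[2.0, 0.1, 0.1, 0.4],
)
passage_2 = extract_candidates(
    ["capital", "paris"],
    start_scores=[1.0, 1.5],
    end_scores=[0.5, 1.5],
)
final_answer = select_answer([passage_1, passage_2])
```

Summing scores across passages is one simple way to let evidence from several passages reinforce the same candidate; a trained selector would replace this step in practice.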

Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016276 ◽

2019 ◽

Vol 33 ◽

pp. 6276-6283 ◽

Cited By ~ 6

Author(s):

Zhipeng Chen ◽

Yiming Cui ◽

Wentao Ma ◽

Shijin Wang ◽

Guoping Hu

Keyword(s):

Reading Comprehension ◽

Mutual Information ◽

Spatial Attention ◽

State Of The Art ◽

Multiple Choice ◽

Multiple Choice Questions ◽

Attention Model ◽

Novel Approach ◽

Proposed Model ◽

Machine Reading

Machine Reading Comprehension (MRC) with multiple-choice questions requires the machine to read a given passage and select the correct answer from several candidates. In this paper, we propose a novel approach called the Convolutional Spatial Attention (CSA) model, which can better handle MRC with multiple-choice questions. The proposed model can fully extract the mutual information among the passage, the question, and the candidates to form enriched representations. Furthermore, to merge the various attention results, we propose using convolutional operations to dynamically summarize the attention values within regions of different sizes. Experimental results show that the proposed model gives substantial improvements over various state-of-the-art systems on both the RACE and SemEval-2018 Task 11 datasets.

A Multi-Task Learning Machine Reading Comprehension Model for Noisy Document (Student Abstract)

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i10.7254 ◽

2020 ◽

Vol 34 (10) ◽

pp. 13963-13964

Author(s):

Zhijing Wu ◽

Hua Xu

Keyword(s):

Reading Comprehension ◽

Neural Models ◽

Successful Performance ◽

Task Learning ◽

Benchmark Datasets ◽

Robustness To Noise ◽

Learning Machine ◽

Machine Reading ◽

Coarse To Fine ◽

Noise Experiment

Current neural models for Machine Reading Comprehension (MRC) have achieved successful performance in recent years. However, these models are fragile and lack robustness against imperceptible adversarial perturbations of the input. In this work, we propose a multi-task learning MRC model with hierarchical knowledge enrichment to further improve robustness on noisy documents. Our model follows a typical encode-align-decode framework. Additionally, we apply a hierarchical, coarse-to-fine method of adding background knowledge to the model to enhance the language representations. We also optimize our model by jointly training answer span prediction and unanswerability prediction, aiming to improve robustness to noise. Experimental results on benchmark datasets confirm the superiority of our method, which achieves competitive performance compared with other strong baselines.
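
The joint training of answer span and unanswerability prediction mentioned in this abstract can be sketched as a combined loss. The abstract does not give the loss, so everything below is an assumption for illustration: the names `span_loss`, `answerability_loss`, and `joint_loss`, the use of cross-entropy for both tasks, and the `weight` parameter that balances them are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def span_loss(start_logits, end_logits, start_idx, end_idx):
    """Cross-entropy of the gold start and end positions."""
    return (-math.log(softmax(start_logits)[start_idx])
            - math.log(softmax(end_logits)[end_idx]))

def answerability_loss(logit, is_answerable):
    """Binary cross-entropy on a single answerability logit."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -math.log(p) if is_answerable else -math.log(1.0 - p)

def joint_loss(start_logits, end_logits, answerable_logit, gold_span, weight=1.0):
    """Assumed multi-task objective: span loss plus weighted answerability loss.
    gold_span is (start, end), or None when the question is unanswerable,
    in which case only the answerability term contributes."""
    if gold_span is None:
        return weight * answerability_loss(answerable_logit, False)
    start_idx, end_idx = gold_span
    return (span_loss(start_logits, end_logits, start_idx, end_idx)
            + weight * answerability_loss(answerable_logit, True))

# Uniform logits over two tokens and an undecided answerability logit.
answerable_case = joint_loss([0.0, 0.0], [0.0, 0.0], 0.0, gold_span=(0, 1))
unanswerable_case = joint_loss([0.0, 0.0], [0.0, 0.0], 0.0, gold_span=None)
```

Skipping the span term for unanswerable questions is one common design choice; other formulations point the span at a special "no answer" position instead.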

MMM: Multi-Stage Multi-Task Learning for Multi-Choice Reading Comprehension

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6310 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8010-8017 ◽

Cited By ~ 1

Author(s):

Di Jin ◽

Shuyang Gao ◽

Jiun-Yu Kao ◽

Tagyoung Chung ◽

Dilek Hakkani-tur

Keyword(s):

Reading Comprehension ◽

Question Answering ◽

Limited Data ◽

Learning Stage ◽

Learning Framework ◽

Task Learning ◽

Multi Stage ◽

Comprehension Skills ◽

Machine Reading ◽

Reading Comprehension Skills

Machine Reading Comprehension (MRC) for question answering (QA), which aims to answer a question given the relevant context passages, is an important way to test the ability of intelligent systems to understand human language. Multiple-Choice QA (MCQA) is one of the most difficult tasks in MRC because it often requires more advanced reading comprehension skills, such as logical reasoning, summarization, and arithmetic operations, than its extractive counterpart, where answers are usually spans of text within the given passages. Moreover, most existing MCQA datasets are small, making the task even harder. We introduce MMM, a Multi-stage Multi-task learning framework for Multi-choice reading comprehension. Our method involves two sequential stages: a coarse-tuning stage using out-of-domain datasets and a multi-task learning stage using a larger in-domain dataset, which together help the model generalize better with limited data. Furthermore, we propose a novel multi-step attention network (MAN) as the top-level classifier for this task. We demonstrate that MMM significantly advances the state of the art on four representative MCQA datasets.

Semantics-Aware BERT for Language Understanding

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6510 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9628-9635

Author(s):

Zhuosheng Zhang ◽

Yuwei Wu ◽

Hai Zhao ◽

Zuchao Li ◽

Shuailiang Zhang ◽

...

Keyword(s):

Reading Comprehension ◽

Natural Language ◽

Language Model ◽

Fine Tuning ◽

Semantic Role Labeling ◽

Language Understanding ◽

Context Sensitive ◽

Language Representation ◽

Model Training ◽

Machine Reading

The latest work on language representations carefully integrates contextualized features into language model training, which has enabled a series of successes, especially in machine reading comprehension and natural language inference tasks. However, existing language representation models, including ELMo, GPT and BERT, exploit only plain context-sensitive features such as character or word embeddings. They rarely consider incorporating structured semantic information, which can provide rich semantics for language representation. To promote natural language understanding, we propose to incorporate explicit contextual semantics from pre-trained semantic role labeling, and we introduce an improved language representation model, Semantics-aware BERT (SemBERT), which is capable of explicitly absorbing contextual semantics over a BERT backbone. SemBERT keeps the convenient usability of its BERT precursor through light fine-tuning, without substantial task-specific modifications. Compared with BERT, SemBERT is as simple in concept but more powerful. It obtains new state-of-the-art or substantially improved results on ten reading comprehension and language inference tasks.

Adversarial Training for Machine Reading Comprehension with Virtual Embeddings

10.18653/v1/2021.starsem-1.30 ◽

2021 ◽

Author(s):

Ziqing Yang ◽

Yiming Cui ◽

Chenglei Si ◽

Wanxiang Che ◽

Ting Liu ◽

...

Keyword(s):

Reading Comprehension ◽

Adversarial Training ◽

Machine Reading

Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension

10.18653/v1/n19-1271 ◽

2019 ◽

Author(s):

Yichong Xu ◽

Xiaodong Liu ◽

Yelong Shen ◽

Jingjing Liu ◽

Jianfeng Gao

Keyword(s):

Reading Comprehension ◽

Task Learning ◽

Machine Reading

Evaluating of Korean Machine Reading Comprehension Generalization Performance via Cross-, Blind and Open-Domain QA Dataset Assessment

Journal of KIISE ◽

10.5626/jok.2021.48.3.275 ◽

2021 ◽

Vol 48 (3) ◽

pp. 275-283

Author(s):

Joon-Ho Lim ◽

Hyun-ki Kim

Keyword(s):

Reading Comprehension ◽

Open Domain ◽

Generalization Performance ◽

Machine Reading

A Robust Adversarial Training Approach to Machine Reading Comprehension

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6357 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8392-8400 ◽

Cited By ~ 1

Author(s):

Kai Liu ◽

Xin Liu ◽

An Yang ◽

Jing Liu ◽

Jinsong Su ◽

...

Keyword(s):

Reading Comprehension ◽

State Of The Art ◽

The State ◽

Training Data ◽

Training Dataset ◽

Original Dataset ◽

Training Approach ◽

Adversarial Examples ◽

Adversarial Training ◽

Machine Reading

Lack of robustness is a serious problem for Machine Reading Comprehension (MRC) models. One of the most promising ways to alleviate this problem is to augment the training dataset with carefully designed adversarial examples. Generally, those examples are created by rules based on the observed patterns of successful adversarial attacks. Since the types of adversarial examples are innumerable, manually designing and enriching training data is not adequate to defend against all types of adversarial attacks. In this paper, we propose a novel robust adversarial training approach to improve the robustness of MRC models in a more generic way. Given an MRC model well trained on the original dataset, our approach dynamically generates adversarial examples based on the parameters of the current model and further trains the model on the generated examples in an iterative schedule. When applied to state-of-the-art MRC models, including QANET, BERT and ERNIE2.0, our approach obtains significant and comprehensive improvements on 5 adversarial datasets constructed in different ways, without sacrificing performance on the original SQuAD development set. Moreover, when coupled with other data augmentation strategies, our approach further boosts the overall performance on adversarial datasets and outperforms the state-of-the-art methods.
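
The iterative schedule in this abstract, where adversarial examples are regenerated from the current model parameters and then used for further training, can be sketched on a toy problem. This is not the paper's method: the sketch swaps in a gradient-sign perturbation of the input features of a small logistic classifier, whereas the paper generates adversarial examples for MRC models. The names `adversarial_example` and `robust_train` and the values of `rounds`, `lr`, and `eps` are assumptions.

```python
import math

def predict(weights, features):
    """Logistic model: probability that the label is 1."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def sgd_step(weights, features, label, lr):
    """One gradient step on the logistic loss for a single example."""
    error = predict(weights, features) - label
    return [w - lr * error * x for w, x in zip(weights, features)]

def sign(value):
    return 0.0 if value == 0 else math.copysign(1.0, value)

def adversarial_example(weights, features, label, eps):
    """Shift each feature by eps in the direction that increases the loss
    under the current weights (gradient of the loss w.r.t. x is error * w)."""
    error = predict(weights, features) - label
    return [x + eps * sign(error * w) for w, x in zip(weights, features)]

def robust_train(data, rounds=20, lr=0.5, eps=0.1):
    """Each round: regenerate adversarial examples from the current model,
    then train on the original and the generated examples together."""
    weights = [0.0] * len(data[0][0])
    for _ in range(rounds):
        generated = [(adversarial_example(weights, x, y, eps), y) for x, y in data]
        for features, label in data + generated:
            weights = sgd_step(weights, features, label, lr)
    return weights

# Hypothetical two-example dataset: (features, label).
toy_data = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
trained = robust_train(toy_data)
```

The point of the sketch is only the loop structure: the adversarial set is rebuilt every round from the current weights rather than fixed in advance by hand-written rules.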

Privacy-preserving Collaborative Training for Medical Image Analysis Based on Multi-Blockchain

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207323666201022110616 ◽

2020 ◽

Vol 23 ◽

Author(s):

Wanlu Zhang ◽

Qigang Wang ◽

Mei Li

Keyword(s):

Medical Image ◽

Data Privacy ◽

Medical Image Analysis ◽

Auxiliary Information ◽

Training Process ◽

Private Data ◽

Medical Institutions ◽

Model Training ◽

Collaborative Training ◽

Similar Task

Background: As artificial intelligence and big data analysis develop rapidly, data privacy, especially the privacy of patient medical data, is receiving more and more attention. Objective: To strengthen the protection of private data while still supporting model training, this article introduces a multi-Blockchain-based decentralized collaborative machine learning training method for medical image analysis. In this way, researchers from different medical institutions can collaborate to train models without exchanging sensitive patient data. Method: A partial parameter update method is applied to prevent indirect privacy leakage during model propagation. With peer-to-peer communication in the multi-Blockchain system, a machine learning task can leverage auxiliary information from a similar task on another Blockchain. In addition, after the collaborative training process, personalized models are trained for the different medical institutions. Results: The experimental results show that our method achieves performance similar to that of the centralized training method, which collects the data sets of all participants, while preventing private data leakage. Transferring auxiliary information from a similar task on another Blockchain is also shown to accelerate model convergence and improve model accuracy, especially when data are scarce. The personalization training process further improves model performance. Conclusion: Our approach can effectively help researchers from different organizations to train collaboratively without disclosing their private data.

Keyword extraction method for machine reading comprehension based on natural language processing

Journal of Physics Conference Series ◽

10.1088/1742-6596/1955/1/012072 ◽

2021 ◽

Vol 1955 (1) ◽

pp. 012072

Author(s):

Ruiheng Li ◽

Xuan Zhang ◽

Chengdong Li ◽

Zhongju Zheng ◽

Zihang Zhou ◽

...

Keyword(s):

Reading Comprehension ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Extraction Method ◽

Keyword Extraction ◽

Machine Reading
