scholarly journals Neural Machine Reading Comprehension: Methods and Trends

2019 ◽  
Vol 9 (18) ◽  
pp. 3698 ◽  
Author(s):  
Shanshan Liu ◽  
Xin Zhang ◽  
Sheng Zhang ◽  
Hui Wang ◽  
Weiming Zhang

Machine reading comprehension (MRC), which requires a machine to answer questions based on a given context, has attracted increasing attention with the incorporation of various deep-learning techniques over the past few years. Although research on MRC based on deep learning is flourishing, there remains a lack of a comprehensive survey summarizing existing approaches and recent trends, which motivated the work presented in this article. Specifically, we give a thorough review of this research field, covering different aspects including (1) typical MRC tasks: their definitions, differences, and representative datasets; (2) the general architecture of neural MRC: the main modules and prevalent approaches to each; and (3) new trends: some emerging areas in neural MRC as well as the corresponding challenges. Finally, considering what has been achieved so far, the survey also envisages what the future may hold by discussing the open issues left to be addressed.

2020 ◽  
Vol 10 (21) ◽  
pp. 7640
Author(s):  
Changchang Zeng ◽  
Shaobo Li ◽  
Qin Li ◽  
Jie Hu ◽  
Jianjun Hu

Machine Reading Comprehension (MRC) is a challenging Natural Language Processing (NLP) research field with wide real-world applications. The great progress of this field in recent years is mainly due to the emergence of large-scale datasets and deep learning. At present, a lot of MRC models have already surpassed human performance on various benchmark datasets despite the obvious giant gap between existing MRC models and genuine human-level reading comprehension. This shows the need for improving existing datasets, evaluation metrics, and models to move current MRC models toward “real” understanding. To address the current lack of comprehensive survey of existing MRC tasks, evaluation metrics, and datasets, herein, (1) we analyze 57 MRC tasks and datasets and propose a more precise classification method of MRC tasks with 4 different attributes; (2) we summarized 9 evaluation metrics of MRC tasks, 7 attributes and 10 characteristics of MRC datasets; (3) We also discuss key open issues in MRC research and highlighted future research directions. In addition, we have collected, organized, and published our data on the companion website where MRC researchers could directly access each MRC dataset, papers, baseline projects, and the leaderboard.


2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Changchang Zeng ◽  
Shaobo Li

Machine reading comprehension (MRC) is a challenging natural language processing (NLP) task. It has a wide application potential in the fields of question answering robots, human-computer interactions in mobile virtual reality systems, etc. Recently, the emergence of pretrained models (PTMs) has brought this research field into a new era, in which the training objective plays a key role. The masked language model (MLM) is a self-supervised training objective widely used in various PTMs. With the development of training objectives, many variants of MLM have been proposed, such as whole word masking, entity masking, phrase masking, and span masking. In different MLMs, the length of the masked tokens is different. Similarly, in different machine reading comprehension tasks, the length of the answer is also different, and the answer is often a word, phrase, or sentence. Thus, in MRC tasks with different answer lengths, whether the length of MLM is related to performance is a question worth studying. If this hypothesis is true, it can guide us on how to pretrain the MLM with a relatively suitable mask length distribution for MRC tasks. In this paper, we try to uncover how much of MLM’s success in the machine reading comprehension tasks comes from the correlation between masking length distribution and answer length in the MRC dataset. In order to address this issue, herein, (1) we propose four MRC tasks with different answer length distributions, namely, the short span extraction task, long span extraction task, short multiple-choice cloze task, and long multiple-choice cloze task; (2) four Chinese MRC datasets are created for these tasks; (3) we also have pretrained four masked language models according to the answer length distributions of these datasets; and (4) ablation experiments are conducted on the datasets to verify our hypothesis. The experimental results demonstrate that our hypothesis is true. On four different machine reading comprehension datasets, the performance of the model with correlation length distribution surpasses the model without correlation.


Author(s):  
Hari Kishan Kondaveeti ◽  
Gonugunta Priyatham Brahma ◽  
Dandhibhotla Vijaya Sahithi

Deep learning (DL), a part of machine learning (ML), comprises a contemporary technique for processing the images and analyzing the big data with promising outcomes. Deep learning methods are successfully being used in various sectors to gain better results. Agriculture sector is one of the sectors that could be benefitted from the deep learning techniques since the current agriculture techniques cannot keep up with the rapid growth in population. In this chapter, the recent trends in the applications of deep learning techniques in the agricultural sector and the survey of the research efforts that employ deep learning techniques are going to be discussed. Also, the models that are implemented are going to be analyzed and compared with the other existing models.


2019 ◽  
Vol 128 (2) ◽  
pp. 261-318 ◽  
Author(s):  
Li Liu ◽  
Wanli Ouyang ◽  
Xiaogang Wang ◽  
Paul Fieguth ◽  
Jie Chen ◽  
...  

Abstract Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.


Computers ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 37 ◽  
Author(s):  
Luca Cappelletti ◽  
Tommaso Fontana ◽  
Guido Walter Di Donato ◽  
Lorenzo Di Tucci ◽  
Elena Casiraghi ◽  
...  

Missing data imputation has been a hot topic in the past decade, and many state-of-the-art works have been presented to propose novel, interesting solutions that have been applied in a variety of fields. In the past decade, the successful results achieved by deep learning techniques have opened the way to their application for solving difficult problems where human skill is not able to provide a reliable solution. Not surprisingly, some deep learners, mainly exploiting encoder-decoder architectures, have also been designed and applied to the task of missing data imputation. However, most of the proposed imputation techniques have not been designed to tackle “complex data”, that is high dimensional data belonging to datasets with huge cardinality and describing complex problems. Precisely, they often need critical parameters to be manually set or exploit complex architecture and/or training phases that make their computational load impracticable. In this paper, after clustering the state-of-the-art imputation techniques into three broad categories, we briefly review the most representative methods and then describe our data imputation proposals, which exploit deep learning techniques specifically designed to handle complex data. Comparative tests on genome sequences show that our deep learning imputers outperform the state-of-the-art KNN-imputation method when filling gaps in human genome sequences.


2021 ◽  
Author(s):  
Hajer Ghodhbani ◽  
Adel Alimi ◽  
Mohamed Neji ◽  
Imran Razzak

<p>Our work aims to conduct a comprehensive literature review of deep learning methods applied in the fashion industry and, especially, the image-based virtual fitting task by citing research works published in the last years. We have summarized their challenges, their main frameworks, the popular benchmark datasets, and the different evaluation metrics. Also, some promising future research directions are discussed to propose improvements in this research field.</p>


2021 ◽  
Author(s):  
vinayakumar R ◽  
Mamoun Alazab ◽  
Soman KP ◽  
Sriram Srinivasan ◽  
Sitalakshmi Venkatraman ◽  
...  

Deep Learning (DL), a novel form of machine learning (ML) is gaining much research interest due to its successful application in many classical artificial intelligence (AI) tasks as compared to classical ML algorithms (CMLAs). Recently, DL architectures are being innovatively modelled for diverse applications in the area of cyber security. The literature is now growing with DL architectures and their variations for exploring different innovative DL models and prototypes that can be tailored to suit specific cyber security applications. However, there is a gap in literature for a comprehensive survey reporting on such research studies. Many of the survey-based research have a focus on specific DL architectures and certain types of malicious attacks within a limited cyber security problem scenario of the past and lack futuristic review. This paper aims at providing a well-rounded and thorough survey of the past, present, and future DL architectures including next-generation cyber security scenarios related to intelligent automation, Internet of Things (IoT), Big Data (BD), Blockchain, cloud and edge technologies. <br>This paper presents a tutorial-style comprehensive review of the state-of-the-art DL architectures for diverse applications in cyber security by comparing and analysing the contributions and challenges from various recent research papers. Firstly, the uniqueness of the survey is in reporting the use of DL architectures for an extensive set of cybercrime detection approaches such as intrusion detection, malware and botnet detection, spam and phishing detection, network traffic analysis, binary analysis, insider threat detection, CAPTCHA analysis, and steganography. Secondly, the survey covers key DL architectures in cyber security application domains such as cryptography, cloud security, biometric security, IoT and edge computing. Thirdly, the need for DL based research is discussed for the next generation cyber security applications in cyber physical systems (CPS) that leverage on BD analytics, natural language processing (NLP), signal and image processing and blockchain technology for smart cities and Industry 4.0 of the future. Finally, a critical discussion on open challenges and new proposed DL architecture contributes towards future research directions.


2021 ◽  
Author(s):  
vinayakumar R ◽  
Mamoun Alazab ◽  
Soman KP ◽  
Sriram Srinivasan ◽  
Sitalakshmi Venkatraman ◽  
...  

Deep Learning (DL), a novel form of machine learning (ML) is gaining much research interest due to its successful application in many classical artificial intelligence (AI) tasks as compared to classical ML algorithms (CMLAs). Recently, DL architectures are being innovatively modelled for diverse applications in the area of cyber security. The literature is now growing with DL architectures and their variations for exploring different innovative DL models and prototypes that can be tailored to suit specific cyber security applications. However, there is a gap in literature for a comprehensive survey reporting on such research studies. Many of the survey-based research have a focus on specific DL architectures and certain types of malicious attacks within a limited cyber security problem scenario of the past and lack futuristic review. This paper aims at providing a well-rounded and thorough survey of the past, present, and future DL architectures including next-generation cyber security scenarios related to intelligent automation, Internet of Things (IoT), Big Data (BD), Blockchain, cloud and edge technologies. <br>This paper presents a tutorial-style comprehensive review of the state-of-the-art DL architectures for diverse applications in cyber security by comparing and analysing the contributions and challenges from various recent research papers. Firstly, the uniqueness of the survey is in reporting the use of DL architectures for an extensive set of cybercrime detection approaches such as intrusion detection, malware and botnet detection, spam and phishing detection, network traffic analysis, binary analysis, insider threat detection, CAPTCHA analysis, and steganography. Secondly, the survey covers key DL architectures in cyber security application domains such as cryptography, cloud security, biometric security, IoT and edge computing. Thirdly, the need for DL based research is discussed for the next generation cyber security applications in cyber physical systems (CPS) that leverage on BD analytics, natural language processing (NLP), signal and image processing and blockchain technology for smart cities and Industry 4.0 of the future. Finally, a critical discussion on open challenges and new proposed DL architecture contributes towards future research directions.


Sign in / Sign up

Export Citation Format

Share Document