Neural Machine Reading Comprehension: Methods and Trends

Machine Reading Comprehension (MRC) is a challenging Natural Language Processing (NLP) research field with wide real-world applications. The great progress of this field in recent years is mainly due to the emergence of large-scale datasets and deep learning. At present, many MRC models have already surpassed human performance on various benchmark datasets, despite the obvious gap between existing MRC models and genuine human-level reading comprehension. This shows the need to improve existing datasets, evaluation metrics, and models in order to move current MRC models toward “real” understanding. To address the current lack of a comprehensive survey of existing MRC tasks, evaluation metrics, and datasets, herein, (1) we analyze 57 MRC tasks and datasets and propose a more precise classification method for MRC tasks with 4 different attributes; (2) we summarize 9 evaluation metrics of MRC tasks, and 7 attributes and 10 characteristics of MRC datasets; and (3) we discuss key open issues in MRC research and highlight future research directions. In addition, we have collected, organized, and published our data on the companion website, where MRC researchers can directly access each MRC dataset, along with its papers, baseline projects, and leaderboard.
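The abstract above does not name the nine evaluation metrics it surveys. As an illustration only, the following sketch implements two metrics that are widely used for span-extraction MRC, exact match and token-level F1. That these two are among the nine is an assumption, and the function names (`normalize`, `exact_match`, `token_f1`) and the normalization rules are hypothetical choices made for this example.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation and English articles, squeeze spaces.

    These normalization rules are an assumption for this sketch; real
    benchmarks define their own.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall after normalization."""
    pred = normalize(prediction).split()
    ref = normalize(reference).split()
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, `exact_match("The Eiffel Tower.", "eiffel tower")` counts as a match after normalization, while `token_f1("in Paris France", "Paris")` gives an F1 of about 0.5, because only one of the three predicted tokens appears in the reference.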

Review of Deep Learning Techniques for Improving the Performance of Machine Reading Comprehension Problem

2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS) ◽ 10.1109/iciccs48265.2020.9121015 ◽ 2020

Author(s): Henna Farooq ◽ Baijnath Kaushik

Keyword(s): Reading Comprehension ◽ Deep Learning ◽ Learning Techniques ◽ Machine Reading

Deep Learning Techniques on Text Classification Using Natural Language Processing (NLP) In Social Healthcare Network: A Comprehensive Survey

2021 3rd International Conference on Signal Processing and Communication (ICSPC) ◽ 10.1109/icspc51351.2021.9451752 ◽ 2021

Author(s): PM. Lavanya ◽ E. Sasikala

Keyword(s): Deep Learning ◽ Natural Language Processing ◽ Natural Language ◽ Language Processing ◽ Text Classification ◽ Healthcare Network ◽ Learning Techniques ◽ Comprehensive Survey

Analyzing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets

Wireless Communications and Mobile Computing ◽ 10.1155/2021/5375334 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-17

Author(s): Changchang Zeng ◽ Shaobo Li

Keyword(s): Reading Comprehension ◽ Language Processing ◽ Question Answering ◽ Multiple Choice ◽ Length Distribution ◽ Research Field ◽ Evaluation Framework ◽ Language Models ◽ Training Objective ◽ Machine Reading

Machine reading comprehension (MRC) is a challenging natural language processing (NLP) task. It has wide application potential in fields such as question answering robots and human-computer interaction in mobile virtual reality systems. Recently, the emergence of pretrained models (PTMs) has brought this research field into a new era, in which the training objective plays a key role. The masked language model (MLM) is a self-supervised training objective widely used in various PTMs. With the development of training objectives, many variants of MLM have been proposed, such as whole word masking, entity masking, phrase masking, and span masking. Different MLMs mask token sequences of different lengths. Similarly, different machine reading comprehension tasks have answers of different lengths, and the answer is often a word, phrase, or sentence. Thus, in MRC tasks with different answer lengths, whether the masking length of the MLM is related to performance is a question worth studying. If this hypothesis is true, it can guide us on how to pretrain the MLM with a relatively suitable mask length distribution for MRC tasks. In this paper, we try to uncover how much of MLM’s success in machine reading comprehension tasks comes from the correlation between the masking length distribution and the answer length in the MRC dataset. To address this issue, herein, (1) we propose four MRC tasks with different answer length distributions, namely, the short span extraction task, long span extraction task, short multiple-choice cloze task, and long multiple-choice cloze task; (2) we create four Chinese MRC datasets for these tasks; (3) we pretrain four masked language models according to the answer length distributions of these datasets; and (4) we conduct ablation experiments on the datasets to verify our hypothesis. The experimental results demonstrate that our hypothesis is true. On four different machine reading comprehension datasets, the model whose masking length distribution correlates with the answer length distribution outperforms the model without such correlation.
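The idea of masking spans whose lengths follow a chosen distribution can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pretraining code: the function names (`sample_span_length`, `mask_spans`), the `[MASK]` placeholder string, the 15% masking budget, and the example length table are all hypothetical, and the abstract does not specify how the paper samples spans.

```python
import random

MASK = "[MASK]"  # placeholder string; an assumption for this sketch


def sample_span_length(length_probs, rng):
    """Draw one span length from a {length: probability} table."""
    lengths = list(length_probs)
    weights = [length_probs[n] for n in lengths]
    return rng.choices(lengths, weights=weights, k=1)[0]


def mask_spans(tokens, length_probs, mask_ratio=0.15, seed=0):
    """Mask non-overlapping contiguous spans until the budget is reached.

    Returns the masked token list and a {position: original token} dict
    that a model would be trained to predict.
    """
    if not tokens:
        return [], {}
    rng = random.Random(seed)
    budget = max(1, round(len(tokens) * mask_ratio))
    masked = list(tokens)
    covered = set()
    attempts = 0
    # The attempt cap guards against pathological inputs.
    while len(covered) < budget and attempts < 100 * len(tokens):
        attempts += 1
        # Never mask more tokens than the remaining budget allows.
        span = min(sample_span_length(length_probs, rng), budget - len(covered))
        start = rng.randrange(0, len(tokens) - span + 1)
        positions = range(start, start + span)
        if any(p in covered for p in positions):
            continue  # reject spans that overlap an earlier span
        covered.update(positions)
        for p in positions:
            masked[p] = MASK
    labels = {p: tokens[p] for p in sorted(covered)}
    return masked, labels
```

Under this sketch, matching the masking to a dataset would amount to estimating `length_probs` from the empirical answer lengths, for example a table weighted toward short spans for a short-answer task and toward long spans for a long-answer task.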

Deep Learning Applications in Agriculture

Artificial Intelligence and IoT-Based Technologies for Sustainable Farming and Smart Agriculture - Advances in Environmental Engineering and Green Technologies ◽ 10.4018/978-1-7998-1722-2.ch020 ◽ 2021 ◽ pp. 325-345

Author(s): Hari Kishan Kondaveeti ◽ Gonugunta Priyatham Brahma ◽ Dandhibhotla Vijaya Sahithi

Keyword(s): Machine Learning ◽ Big Data ◽ Deep Learning ◽ Rapid Growth ◽ Agricultural Sector ◽ The Other ◽ Learning Methods ◽ Agriculture Sector ◽ Learning Techniques ◽ Recent Trends

Deep learning (DL), a part of machine learning (ML), is a contemporary technique for processing images and analyzing big data with promising outcomes. Deep learning methods are being used successfully in various sectors to achieve better results. The agriculture sector is one that could benefit from deep learning techniques, since current agricultural techniques cannot keep up with the rapid growth in population. This chapter discusses recent trends in the application of deep learning techniques in the agricultural sector and surveys the research efforts that employ them. The implemented models are also analyzed and compared with other existing models.

Deep Learning for Generic Object Detection: A Survey

International Journal of Computer Vision ◽ 10.1007/s11263-019-01247-4 ◽ 2019 ◽ Vol 128 (2) ◽ pp. 261-318 ◽ Cited By ~ 154

Author(s): Li Liu ◽ Wanli Ouyang ◽ Xiaogang Wang ◽ Paul Fieguth ◽ Jie Chen ◽ ...

Keyword(s): Deep Learning ◽ Object Detection ◽ Rapid Evolution ◽ Feature Representation ◽ Future Research ◽ Context Modeling ◽ Feature Representations ◽ Learning Techniques ◽ Comprehensive Survey ◽ Object Feature

Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.

Complex Data Imputation by Auto-Encoders and Convolutional Neural Networks—A Case Study on Genome Gap-Filling

Computers ◽ 10.3390/computers9020037 ◽ 2020 ◽ Vol 9 (2) ◽ pp. 37 ◽ Cited By ~ 1

Author(s): Luca Cappelletti ◽ Tommaso Fontana ◽ Guido Walter Di Donato ◽ Lorenzo Di Tucci ◽ Elena Casiraghi ◽ ...

Keyword(s): Deep Learning ◽ Missing Data ◽ State Of The Art ◽ The State ◽ Complex Data ◽ Data Imputation ◽ Genome Sequences ◽ Missing Data Imputation ◽ The Past ◽ Learning Techniques

Missing data imputation has been a hot topic in the past decade, and many state-of-the-art works have proposed novel, interesting solutions that have been applied in a variety of fields. Over the same period, the successful results achieved by deep learning techniques have opened the way to their application to difficult problems where human skill cannot provide a reliable solution. Not surprisingly, some deep learners, mainly exploiting encoder-decoder architectures, have also been designed and applied to the task of missing data imputation. However, most of the proposed imputation techniques have not been designed to tackle “complex data”, that is, high-dimensional data belonging to datasets with huge cardinality and describing complex problems. Specifically, they often need critical parameters to be set manually, or they exploit complex architectures and/or training phases that make their computational load impractical. In this paper, after clustering the state-of-the-art imputation techniques into three broad categories, we briefly review the most representative methods and then describe our data imputation proposals, which exploit deep learning techniques specifically designed to handle complex data. Comparative tests on genome sequences show that our deep learning imputers outperform the state-of-the-art KNN-imputation method when filling gaps in human genome sequences.
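For readers unfamiliar with the KNN-imputation baseline mentioned above, the general idea can be sketched in a few lines. This is a generic numeric illustration, not the configuration used in the paper: the function name `knn_impute`, the use of `None` for missing values, the distance over shared columns, and the default `k` are all assumptions made for this example.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest rows.

    Distance is the root mean squared difference over the columns that
    both rows have observed. The input is not modified.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no common columns, so the rows are not comparable
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that observed column j.
            candidates = sorted(
                (distance(row, other), other[j])
                for n, other in enumerate(rows)
                if n != i and other[j] is not None
            )
            nearest = [v for d, v in candidates[:k] if d < math.inf]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled
```

With `rows = [[1.0, 2.0, None], [1.0, 2.0, 3.0], [1.1, 2.1, 5.0], [9.0, 9.0, 100.0]]` and `k=2`, the missing entry is filled with 4.0, the mean of the values from the two closest rows; the distant fourth row does not contribute.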

You can Try without Visiting: A Comprehensive Survey on Virtually Try-on Outfits

10.36227/techrxiv.13904099.v2 ◽ 2021

Author(s): Hajer Ghodhbani ◽ Adel Alimi ◽ Mohamed Neji ◽ Imran Razzak

Keyword(s): Deep Learning ◽ Literature Review ◽ Research Field ◽ Future Research ◽ Fashion Industry ◽ Research Directions ◽ Comprehensive Literature Review ◽ Benchmark Datasets ◽ Comprehensive Survey ◽ Future Research Directions

Our work aims to conduct a comprehensive literature review of deep learning methods applied in the fashion industry, especially to the image-based virtual fitting task, by citing research works published in recent years. We summarize their challenges, their main frameworks, the popular benchmark datasets, and the different evaluation metrics. We also discuss some promising future research directions to propose improvements in this research field.

Deep Learning for Cyber Security Applications: A Comprehensive Survey

10.36227/techrxiv.16748161 ◽ 2021

Author(s): Vinayakumar R ◽ Mamoun Alazab ◽ Soman KP ◽ Sriram Srinivasan ◽ Sitalakshmi Venkatraman ◽ ...

Keyword(s): Deep Learning ◽ Language Processing ◽ Cyber Security ◽ Smart Cities ◽ Critical Discussion ◽ Future Research ◽ Next Generation ◽ Security Applications ◽ The Past ◽ Comprehensive Survey

Deep Learning (DL), a novel form of machine learning (ML), is gaining much research interest due to its successful application in many classical artificial intelligence (AI) tasks as compared to classical ML algorithms (CMLAs). Recently, DL architectures have been innovatively modelled for diverse applications in the area of cyber security. The literature on DL architectures and their variations is growing, exploring innovative DL models and prototypes that can be tailored to suit specific cyber security applications. However, there is a gap in the literature for a comprehensive survey reporting on such research studies. Much of the survey-based research focuses on specific DL architectures and certain types of malicious attacks within a limited cyber security problem scenario of the past, and lacks a forward-looking review. This paper aims to provide a well-rounded and thorough survey of past, present, and future DL architectures, including next-generation cyber security scenarios related to intelligent automation, Internet of Things (IoT), Big Data (BD), Blockchain, cloud and edge technologies. This paper presents a tutorial-style comprehensive review of the state-of-the-art DL architectures for diverse applications in cyber security by comparing and analysing the contributions and challenges from various recent research papers. Firstly, the uniqueness of the survey is in reporting the use of DL architectures for an extensive set of cybercrime detection approaches such as intrusion detection, malware and botnet detection, spam and phishing detection, network traffic analysis, binary analysis, insider threat detection, CAPTCHA analysis, and steganography. Secondly, the survey covers key DL architectures in cyber security application domains such as cryptography, cloud security, biometric security, IoT and edge computing. Thirdly, the need for DL-based research is discussed for next-generation cyber security applications in cyber physical systems (CPS) that leverage BD analytics, natural language processing (NLP), signal and image processing, and blockchain technology for smart cities and Industry 4.0 of the future. Finally, a critical discussion of open challenges and a newly proposed DL architecture contributes towards future research directions.

Deep Learning for Cyber Security Applications: A Comprehensive Survey

10.36227/techrxiv.16748161.v1 ◽ 2021

Author(s): Vinayakumar R ◽ Mamoun Alazab ◽ Soman KP ◽ Sriram Srinivasan ◽ Sitalakshmi Venkatraman ◽ ...

Keyword(s): Deep Learning ◽ Language Processing ◽ Cyber Security ◽ Smart Cities ◽ Critical Discussion ◽ Future Research ◽ Next Generation ◽ Security Applications ◽ The Past ◽ Comprehensive Survey
