An Empirical Study of Information Retrieval and Machine Reading Comprehension Algorithms for an Online Education Platform

This paper provides an empirical study of various techniques for information retrieval and machine reading comprehension in the context of an online education platform. More specifically, our application deals with answering students' conceptual questions on technology courses. To that end, we explore a pipeline consisting of a document retriever and a document reader. We find that using TF-IDF document representations for retrieving documents and the RoBERTa deep learning model for reading documents and answering questions yields the best performance with respect to F-score. Overall, without a fine-tuning step, deep learning models show a significant performance gap compared with previously reported F-scores on other datasets.
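The retriever half of such a pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, and the function names (`build_index`, `retrieve`) are assumptions made for illustration. The reader stage (RoBERTa in the abstract) is left out; the retrieved passage would simply be handed to it.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms missing from the IDF table are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def build_index(docs):
    """Return one TF-IDF vector per document plus the shared IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumption): terms found in every document keep weight 1.0.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [tfidf_vector(toks, idf) for toks in tokenized], idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(question, docs, vectors, idf, k=1):
    """Return the k documents most similar to the question."""
    q = tfidf_vector(tokenize(question), idf)
    ranked = sorted(range(len(docs)), key=lambda i: cosine(q, vectors[i]), reverse=True)
    return [docs[i] for i in ranked[:k]]
```

A call such as `retrieve(question, docs, *build_index(docs))` returns the top-ranked passage, which a separate reader model would then scan for an answer span.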

A Pairwise Probe for Understanding BERT Fine-Tuning on Machine Reading Comprehension

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval ◽

10.1145/3397271.3401195 ◽

2020 ◽

Author(s):

Jie Cai ◽

Zhengzhou Zhu ◽

Ping Nie ◽

Qian Liu

Keyword(s):

Reading Comprehension ◽

Fine Tuning ◽

Machine Reading

Abstract P319: Can Deep Learning Find the Ischemic Core on CT? Transfer Learning From Pre-Trained MRI-Based Networks

Stroke ◽

10.1161/str.52.suppl_1.p319 ◽

2021 ◽

Vol 52 (Suppl_1) ◽

Author(s):

Yannan Yu ◽

Soren Christensen ◽

Yuan Xie ◽

Enhao Gong ◽

Maarten G Lansberg ◽

...

Keyword(s):

Deep Learning ◽

Ground Truth ◽

Learning Model ◽

Fine Tuning ◽

Learning Models ◽

Starting Point ◽

Stroke Lesion ◽

Ischemic Core ◽

Deep Learning Model

Objective: Ischemic core prediction from CT perfusion (CTP) remains inaccurate compared with gold standard diffusion-weighted imaging (DWI). We evaluated whether a deep learning model trained to predict the DWI lesion from MR perfusion (MRP) could facilitate ischemic core prediction on CTP. Method: Using the multi-center CRISP cohort of acute ischemic stroke patients with CTP before thrombectomy, we included patients with major reperfusion (TICI score≥2b), adequate image quality, and follow-up MRI at 3-7 days. Perfusion parameters including Tmax, mean transit time, cerebral blood flow (CBF), and cerebral blood volume were reconstructed by RAPID software. Core lab experts outlined the stroke lesion on the follow-up MRI. An MRI model previously trained in a separate group of patients was used as a starting point; it used MRP parameters as input and the RAPID ischemic core on DWI as ground truth. We fine-tuned this model, using CTP parameters as input and follow-up MRI as ground truth. Another model was trained from scratch with only CTP data. Five-fold cross-validation was used. Performance of the models in identifying the presence of a large infarct (volume >70 ml or >100 ml) was compared with the ischemic core (rCBF≤30%) from RAPID software. Results: 94 patients in the CRISP trial met the inclusion criteria (mean age 67±15 years, 52% male, median baseline NIHSS 18, median 90-day mRS 2). Without fine-tuning, the MRI model had an agreement of 73% for infarcts >70 ml and 69% for >100 ml; the MRI model fine-tuned on CT improved the agreement to 77% and 73%; the CT model trained from scratch had agreements of 73% and 71%. All of the deep learning models outperformed the rCBF segmentation from RAPID, which had agreements of 51% and 64%. See Table and figure. Conclusions: It is feasible to apply an MRP-based deep learning model to CT. Fine-tuning with CTP data further improves the predictions. All deep learning models predict the stroke lesion after major recanalization better than thresholding approaches based on rCBF.
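The abstract does not define how "agreement" is computed. One plausible reading, sketched below as an assumption, is simple percent agreement: the share of patients for whom the predicted and the reference lesion volumes fall on the same side of the large-infarct threshold. The function name and the volumes in the usage note are hypothetical.

```python
def large_infarct_agreement(predicted_ml, reference_ml, threshold_ml=70.0):
    """Fraction of patients where prediction and reference agree on 'volume > threshold'.

    Assumption: agreement means simple percent agreement on the binary
    large-infarct label; the source abstract does not spell this out.
    """
    if len(predicted_ml) != len(reference_ml) or not predicted_ml:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(
        (pred > threshold_ml) == (ref > threshold_ml)
        for pred, ref in zip(predicted_ml, reference_ml)
    )
    return matches / len(predicted_ml)
```

For example, with hypothetical predicted volumes `[80, 60, 120, 30]` and reference volumes `[90, 75, 110, 20]`, three of four patients agree at the 70 ml threshold and all four agree at 100 ml.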

Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00305 ◽

2020 ◽

Vol 8 ◽

pp. 141-155

Author(s):

Kai Sun ◽

Dian Yu ◽

Dong Yu ◽

Claire Cardie

Keyword(s):

Reading Comprehension ◽

Prior Knowledge ◽

Data Augmentation ◽

Multiple Choice ◽

Model Performance ◽

Free Form ◽

World Knowledge ◽

Domain Specific ◽

Significant Performance ◽

Machine Reading

Machine reading comprehension tasks require a machine reader to answer questions relevant to the given document. In this paper, we present the first free-form multiple-Choice Chinese machine reading Comprehension dataset (C3), containing 13,369 documents (dialogues or more formally written mixed-genre texts) and their associated 19,577 multiple-choice free-form questions collected from Chinese-as-a-second-language examinations. We present a comprehensive analysis of the prior knowledge (i.e., linguistic, domain-specific, and general world knowledge) needed for these real-world problems. We implement rule-based and popular neural methods and find that there is still a significant performance gap between the best performing model (68.5%) and human readers (96.0%), especially on problems that require prior knowledge. We further study the effects of distractor plausibility and data augmentation based on translated relevant datasets for English on model performance. We expect C3 to present great challenges to existing systems as answering 86.8% of questions requires both knowledge within and beyond the accompanying document, and we hope that C3 can serve as a platform to study how to leverage various kinds of prior knowledge to better understand a given written or orally oriented text. C3 is available at https://dataset.org/c3/ .

An Empirical Study on Users' Intention to Pay in B2C Online Education Platform

2020 International Symposium on Educational Technology (ISET) ◽

10.1109/iset49818.2020.00043 ◽

2020 ◽

Author(s):

Xiaodong Zhu ◽

Wenhui Cao ◽

Yafei Wang ◽

Runze Ouyang

Keyword(s):

Online Education ◽

Empirical Study ◽

Education Platform

Neural Machine Reading Comprehension: Methods and Trends

Applied Sciences ◽

10.3390/app9183698 ◽

2019 ◽

Vol 9 (18) ◽

pp. 3698 ◽

Cited By ~ 9

Author(s):

Shanshan Liu ◽

Xin Zhang ◽

Sheng Zhang ◽

Hui Wang ◽

Weiming Zhang

Keyword(s):

Reading Comprehension ◽

Deep Learning ◽

Research Field ◽

The Past ◽

Learning Techniques ◽

Comprehensive Survey ◽

Recent Trends ◽

General Architecture ◽

Machine Reading ◽

Open Issues

Machine reading comprehension (MRC), which requires a machine to answer questions based on a given context, has attracted increasing attention with the incorporation of various deep-learning techniques over the past few years. Although research on MRC based on deep learning is flourishing, there remains a lack of a comprehensive survey summarizing existing approaches and recent trends, which motivated the work presented in this article. Specifically, we give a thorough review of this research field, covering different aspects including (1) typical MRC tasks: their definitions, differences, and representative datasets; (2) the general architecture of neural MRC: the main modules and prevalent approaches to each; and (3) new trends: some emerging areas in neural MRC as well as the corresponding challenges. Finally, considering what has been achieved so far, the survey also envisages what the future may hold by discussing the open issues left to be addressed.

Semantics-Aware BERT for Language Understanding

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6510 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9628-9635

Author(s):

Zhuosheng Zhang ◽

Yuwei Wu ◽

Hai Zhao ◽

Zuchao Li ◽

Shuailiang Zhang ◽

...

Keyword(s):

Reading Comprehension ◽

Natural Language ◽

Language Model ◽

Fine Tuning ◽

Semantic Role Labeling ◽

Language Understanding ◽

Context Sensitive ◽

Language Representation ◽

Model Training ◽

Machine Reading

The latest work on language representations carefully integrates contextualized features into language model training, which has enabled a series of successes, especially in various machine reading comprehension and natural language inference tasks. However, the existing language representation models, including ELMo, GPT and BERT, only exploit plain context-sensitive features such as character or word embeddings. They rarely consider incorporating structured semantic information, which can provide rich semantics for language representation. To promote natural language understanding, we propose to incorporate explicit contextual semantics from pre-trained semantic role labeling, and introduce an improved language representation model, Semantics-aware BERT (SemBERT), which is capable of explicitly absorbing contextual semantics over a BERT backbone. SemBERT keeps the convenient usability of its BERT precursor in a light fine-tuning way without substantial task-specific modifications. Compared with BERT, semantics-aware BERT is as simple in concept but more powerful. It obtains new state-of-the-art or substantially improved results on ten reading comprehension and language inference tasks.

Medical Image Classification based on an Adaptive Size Deep Learning Model

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3465220 ◽

2021 ◽

Vol 17 (3s) ◽

pp. 1-18

Author(s):

Xiangbin Liu ◽

Jiesheng He ◽

Liping Song ◽

Shuai Liu ◽

Gautam Srivastava

Keyword(s):

Deep Learning ◽

Image Classification ◽

Medical Image ◽

Rapid Development ◽

Learning Model ◽

Fine Tuning ◽

Optimal Size ◽

Image Dataset ◽

Medical Image Classification ◽

Deep Learning Model

With the rapid development of Artificial Intelligence (AI), deep learning has increasingly become a research hotspot in various fields, such as medical image classification. Traditional deep learning models use bilinear interpolation when processing classification tasks on multi-size medical image datasets, which causes a loss of image information and in turn degrades classification performance. In response to this problem, this work proposes an adaptive size deep learning model. First, according to the characteristics of the multi-size medical image dataset, an optimal size set module is proposed in combination with the unpooling process. Next, an adaptive deep learning model module is proposed based on the existing deep learning model. Then, the model is fused with the size fine-tuning module used to process multi-size medical images to obtain the adaptive size deep learning model. Finally, the proposed model is applied to a pneumonia CT medical image dataset. Experiments show that the model has strong robustness and that classification performance improves by about 4% compared with traditional algorithms.

Optimization of deep network models through fine tuning

International Journal of Intelligent Computing and Cybernetics ◽

10.1108/ijicc-06-2017-0070 ◽

2018 ◽

Vol 11 (3) ◽

pp. 386-403 ◽

Cited By ~ 2

Author(s):

M. Arif Wani ◽

Saduf Afzal

Keyword(s):

Experimental Study ◽

Deep Learning ◽

Network Model ◽

Network Models ◽

Fine Tuning ◽

Data Sets ◽

Content Type ◽

Deep Network ◽

Benchmark Data ◽

Deep Learning Model

Purpose: Many strategies have been put forward for training deep network models; however, stacking several layers of non-linearities typically results in poor propagation of gradients and activations. The purpose of this paper is to explore a two-step strategy in which an initial deep learning model is first obtained by unsupervised learning and then optimized by fine tuning. A number of fine tuning algorithms are explored in this work for optimizing deep learning models. This includes a new algorithm in which Backpropagation with adaptive gain is integrated with the Dropout technique; the authors evaluate its performance in the fine tuning of the pretrained deep network. Design/methodology/approach: The parameters of deep neural networks are first learnt using greedy layer-wise unsupervised pretraining. The proposed technique is then used to perform supervised fine tuning of the deep neural network model. An extensive experimental study is performed to evaluate the performance of the proposed fine tuning technique on three benchmark data sets: USPS, Gisette and MNIST. The authors have tested the approach on data sets of varying size, which include randomly chosen training samples of 20, 50, 70 and 100 percent of the original data set. Findings: Through an extensive experimental study, it is concluded that the two-step strategy and the proposed fine tuning technique yield promising results in the optimization of deep network models. Originality/value: This paper proposes employing several algorithms for fine tuning of deep network models. A new approach that integrates the adaptive gain Backpropagation (BP) algorithm with the Dropout technique is proposed for fine tuning of deep networks. An evaluation and comparison of the various algorithms proposed for fine tuning on three benchmark data sets is presented in the paper.

Kudo’s Classification for Colon Polyps Assessment Using a Deep Learning Approach

Applied Sciences ◽

10.3390/app10020501 ◽

2020 ◽

Vol 10 (2) ◽

pp. 501 ◽

Cited By ~ 3

Author(s):

Sebastian Patino-Barrientos ◽

Daniel Sierra-Sosa ◽

Begonya Garcia-Zapirain ◽

Cristian Castillo-Olea ◽

Adel Elmaghraby

Keyword(s):

Deep Learning ◽

Fine Tuning ◽

Learning Approach ◽

Colon Polyps ◽

Cancer Death ◽

Timely Manner ◽

The World ◽

Feature Extractor ◽

The University ◽

Deep Learning Model

Colorectal cancer (CRC) is the second leading cause of cancer death in the world. The disease can begin as a non-cancerous polyp in the colon; when not treated in a timely manner, such polyps can become cancerous and, in turn, cause death. We propose a deep learning model for classifying colon polyps based on Kudo's classification schema, using basic colonoscopy equipment. We trained a deep convolutional model on a private dataset from the University of Deusto, with and without a VGG model as a feature extractor, and compared the results. We obtained an accuracy of 83% and an F1-score of 83% after fine tuning our model with the VGG filter. These results show that deep learning algorithms are useful for developing computer-aided tools for early CRC detection, and suggest combining them with a polyp segmentation model for use by specialists.
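For reference, the two reported metrics can be computed as in the sketch below. The abstract does not say how the F1-score is averaged across Kudo classes, so macro averaging is an assumption here, and the class labels in the usage note are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Share of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

With hypothetical labels `y_true = ["I", "II", "II", "III", "III", "III"]` and `y_pred = ["I", "II", "III", "III", "III", "II"]`, accuracy is 4/6 and the per-class F1 scores are 1, 1/2 and 2/3, which shows that the two metrics need not coincide even on a small sample.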

An Improved Deep Learning Model for Traffic Crash Prediction

Journal of Advanced Transportation ◽

10.1155/2018/3869106 ◽

2018 ◽

Vol 2018 ◽

pp. 1-13 ◽

Cited By ~ 14

Author(s):

Chunjiao Dong ◽

Chunfu Shao ◽

Juan Li ◽

Zhihua Xiong

Keyword(s):

Deep Learning ◽

Feature Learning ◽

Learning Model ◽

Fine Tuning ◽

Crash Prediction ◽

Traffic Crash ◽

Explanatory Variables ◽

Feature Representations ◽

Proposed Model ◽

Deep Learning Model

Machine-learning technology powers many aspects of modern society. Compared to conventional machine learning techniques, which were limited in processing natural data in raw form, deep learning allows computational models to learn representations of data with multiple levels of abstraction. In this study, an improved deep learning model is proposed to explore the complex interactions among roadways, traffic, environmental elements, and traffic crashes. The proposed model includes two modules: an unsupervised feature learning module to identify a functional network between the explanatory variables and the feature representations, and a supervised fine tuning module to perform traffic crash prediction. To address unobserved heterogeneity in traffic crash prediction, a multivariate negative binomial (MVNB) model is embedded into the supervised fine tuning module as a regression layer. The proposed model was applied to a dataset collected from Knox County in Tennessee to validate its performance. The results indicate that the feature learning module identifies relational information between the explanatory variables and the feature representations, which reduces the dimensionality of the input and preserves the original information. The proposed model that includes the MVNB regression layer in the supervised fine tuning module can better account for differential distribution patterns in traffic crashes across injury severities and provides superior traffic crash predictions. The findings suggest that the proposed model is a superior alternative for traffic crash predictions and that the average accuracy of the prediction, as measured by RMSD, can be improved by 84.58% and 158.27% compared to the deep learning model without the regression layer and the SVM model, respectively.
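The error measure named in the abstract, root-mean-square deviation (RMSD), follows the standard definition sketched below. How the authors turn RMSD values into the reported percentage improvements is not stated, so that step is deliberately left out; the crash counts in the usage note are hypothetical.

```python
import math


def rmsd(observed, predicted):
    """Root-mean-square deviation between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

For hypothetical observed crash counts `[2, 4, 6]` and predictions `[1, 4, 8]`, the squared errors are 1, 0 and 4, so the RMSD is the square root of 5/3.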
