A Quantum Expectation Value Based Language Model with Application to Question Answering

Quantum-inspired language models have been introduced to Information Retrieval due to their transparency and interpretability. While exciting progresses have been made, current studies mainly investigate the relationship between density matrices of difference sentence subspaces of a semantic Hilbert space. The Hilbert space as a whole which has a unique density matrix is lack of exploration. In this paper, we propose a novel Quantum Expectation Value based Language Model (QEV-LM). A unique shared density matrix is constructed for the Semantic Hilbert Space. Words and sentences are viewed as different observables in this quantum model. Under this background, a matching score describing the similarity between a question-answer pair is naturally explained as the quantum expectation value of a joint question-answer observable. In addition to the theoretical soundness, experiment results on the TREC-QA and WIKIQA datasets demonstrate the computational efficiency of our proposed model with excellent performance and low time consumption.

Download Full-text

Question Difficulty Estimation Based on Attention Model for Question Answering

Applied Sciences ◽

10.3390/app112412023 ◽

2021 ◽

Vol 11 (24) ◽

pp. 12023

Author(s):

Hyun-Je Song ◽

Su-Hwan Yoon ◽

Seong-Bae Park

Keyword(s):

Question Answering ◽

Language Model ◽

Difficulty Level ◽

Data Sets ◽

Simple Relationship ◽

Attention Model ◽

Proposed Model ◽

Key Factor ◽

Question Difficulty ◽

Information Components

This paper addresses a question difficulty estimation of which goal is to estimate the difficulty level of a given question in question-answering (QA) tasks. Since a question in the tasks is composed of a questionary sentence and a set of information components such as a description and candidate answers, it is important to model the relationship among the information components to estimate the difficulty level of the question. However, existing approaches to this task modeled a simple relationship such as a relationship between a questionary sentence and a description, but such simple relationships are insufficient to predict the difficulty level accurately. Therefore, this paper proposes an attention-based model to consider the complicated relationship among the information components. The proposed model first represents bi-directional relationships between a questionary sentence and each information component using a dual multi-head co-attention, since the questionary sentence is a key factor in the QA questions and it affects and is affected by information components. Then, the proposed model considers inter-information relationship over the bi-directional representations through a self-attention model. The inter-information relationship helps predict the difficulty of the questions accurately which require reasoning over multiple kinds of information components. The experimental results from three well-known and real-world QA data sets prove that the proposed model outperforms the previous state-of-the-art and pre-trained language model baselines. It is also shown that the proposed model is robust against the increase of the number of information components.

Download Full-text

RDMMFET: Representation of Dense Multimodality Fusion Encoder Based on Transformer

Mobile Information Systems ◽

10.1155/2021/2662064 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Xu Zhang ◽

DeZhi Han ◽

Chin-Chen Chang

Keyword(s):

Question Answering ◽

Language Model ◽

Visual Images ◽

Text And Image ◽

Data Set ◽

Fine Grained ◽

Natural Language Question ◽

The Relationship ◽

Language Question ◽

Better Than

Visual question answering (VQA) is the natural language question-answering of visual images. The model of VQA needs to make corresponding answers according to specific questions based on understanding images, the most important of which is to understand the relationship between images and language. Therefore, this paper proposes a new model, Representation of Dense Multimodality Fusion Encoder Based on Transformer, for short, RDMMFET, which can learn the related knowledge between vision and language. The RDMMFET model consists of three parts: dense language encoder, image encoder, and multimodality fusion encoder. In addition, we designed three types of pretraining tasks: masked language model, masked image model, and multimodality fusion task. These pretraining tasks can help to understand the fine-grained alignment between text and image regions. Simulation results on the VQA v2.0 data set show that the RDMMFET model can work better than the previous model. Finally, we conducted detailed ablation studies on the RDMMFET model and provided the results of attention visualization, which proves that the RDMMFET model can significantly improve the effect of VQA.

Download Full-text

BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection

PLoS ONE ◽

10.1371/journal.pone.0257130 ◽

2021 ◽

Vol 16 (9) ◽

pp. e0257130

Author(s):

Yang Li ◽

Yuqing Sun ◽

Nana Zhu

Keyword(s):

Language Model ◽

Language Models ◽

Limited Resources ◽

Teacher Language ◽

Proposed Model ◽

Knowledge Distillation ◽

Similarity Preserving ◽

The One ◽

Chinese And English ◽

Text Sentiment Analysis

In recent years, text sentiment analysis has attracted wide attention, and promoted the rise and development of stance detection research. The purpose of stance detection is to determine the author’s stance (favor or against) towards a specific target or proposition in the text. Pre-trained language models like BERT have been proven to perform well in this task. However, in many reality scenes, they are usually very expensive in computation, because such heavy models are difficult to implement with limited resources. To improve the efficiency while ensuring the performance, we propose a knowledge distillation model BERTtoCNN, which combines the classic distillation loss and similarity-preserving loss in a joint knowledge distillation framework. On the one hand, BERTtoCNN provides an efficient distillation process to train a novel ‘student’ CNN structure from a much larger ‘teacher’ language model BERT. On the other hand, based on the similarity-preserving loss function, BERTtoCNN guides the training of a student network, so that input pairs with similar (dissimilar) activation in the teacher network have similar (dissimilar) activation in the student network. We conduct experiments and test the proposed model on the open Chinese and English stance detection datasets. The experimental results show that our model outperforms the competitive baseline methods obviously.

Download Full-text

KM-BERT: A Pre-trained BERT for Korean Medical Natural Language Processing (Preprint)

10.2196/preprints.31223 ◽

2021 ◽

Author(s):

Yoojoong Kim ◽

Jeong Moon Lee ◽

Moon Joung Jang ◽

Yun Jin Yum ◽

Jong-Ho Kim ◽

...

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Pearson Correlation ◽

Language Model ◽

Language Models ◽

Korean Language ◽

Medical Texts ◽

Proposed Model

BACKGROUND With advances in deep learning and natural language processing, analyzing medical texts is becoming increasingly important. Nonetheless, a study on medical-specific language models has not yet been conducted given the importance of medical texts. OBJECTIVE Korean medical text is highly difficult to analyze because of the agglutinative characteristics of the language as well as the complex terminologies in the medical domain. To solve this problem, we collected a Korean medical corpus and used it to train language models. METHODS In this paper, we present a Korean medical language model based on deep learning natural language processing. The proposed model was trained using the pre-training framework of BERT for the medical context based on a state-of-the-art Korean language model. RESULTS After pre-training, the proposed method showed increased accuracies of 0.147 and 0.148 for the masked language model with next sentence prediction. In the intrinsic evaluation, the next sentence prediction accuracy improved by 0.258, which is a remarkable enhancement. In addition, the extrinsic evaluation of Korean medical semantic textual similarity data showed a 0.046 increase in the Pearson correlation. CONCLUSIONS The results demonstrated the superiority of the proposed model for Korean medical natural language processing. We expect that our proposed model can be extended for application to various languages and domains.

Download Full-text

You Don’t Need Labeled Data for Open-Book Question Answering

Applied Sciences ◽

10.3390/app12010111 ◽

2021 ◽

Vol 12 (1) ◽

pp. 111

Author(s):

Sia Gholami ◽

Mehdi Noori

Keyword(s):

Question Answering ◽

Common Knowledge ◽

Language Model ◽

Language Models ◽

Open Book ◽

Technical Documentation ◽

Domain Specific ◽

Information Retrieval Systems ◽

Amazon Web Services ◽

The Right

Open-book question answering is a subset of question answering (QA) tasks where the system aims to find answers in a given set of documents (open-book) and common knowledge about a topic. This article proposes a solution for answering natural language questions from a corpus of Amazon Web Services (AWS) technical documents with no domain-specific labeled data (zero-shot). These questions have a yes–no–none answer and a text answer which can be short (a few words) or long (a few sentences). We present a two-step, retriever–extractor architecture in which a retriever finds the right documents and an extractor finds the answers in the retrieved documents. To test our solution, we are introducing a new dataset for open-book QA based on real customer questions on AWS technical documentation. In this paper, we conducted experiments on several information retrieval systems and extractive language models, attempting to find the yes–no–none answers and text answers in the same pass. Our custom-built extractor model is created from a pretrained language model and fine-tuned on the the Stanford Question Answering Dataset—SQuAD and Natural Questions datasets. We were able to achieve 42% F1 and 39% exact match score (EM) end-to-end with no domain-specific training.

Download Full-text

The Influence of Cognitive and Affective Reactions to Feedback on Subsequent Goals

European Psychologist ◽

10.1027/1016-9040/a000011 ◽

2010 ◽

Vol 15 (2) ◽

pp. 121-131 ◽

Cited By ~ 20

Author(s):

Remus Ilies ◽

Timothy A. Judge ◽

David T. Wagner

Keyword(s):

Performance Feedback ◽

Business School ◽

Structural Equation ◽

Emotional Reactions ◽

Equation Model ◽

Proposed Model ◽

And Performance ◽

Dispositional Measure ◽

The Relationship ◽

Reactions To Feedback

This paper focuses on explaining how individuals set goals on multiple performance episodes, in the context of performance feedback comparing their performance on each episode with their respective goal. The proposed model was tested through a longitudinal study of 493 university students’ actual goals and performance on business school exams. Results of a structural equation model supported the proposed conceptual model in which self-efficacy and emotional reactions to feedback mediate the relationship between feedback and subsequent goals. In addition, as expected, participants’ standing on a dispositional measure of behavioral inhibition influenced the strength of their emotional reactions to negative feedback.

Download Full-text

Adolescent Language: Models, Assessment, and Links to Reading

10.35542/osf.io/pf5y8 ◽

2019 ◽

Cited By ~ 1

Author(s):

Amanda Goodwin ◽

Yaacov Petscher ◽

Jamie Tock

Keyword(s):

Reading Comprehension ◽

Bifactor Model ◽

Language Models ◽

Multiple Group ◽

Global Factor ◽

Eighth Grade Students ◽

Key Aspects ◽

Future Work ◽

The Relationship ◽

Best Fit

Various models have highlighted the complexity of language. Building on foundational ideas regarding three key aspects of language, our study contributes to the literature by 1) exploring broader conceptions of morphology, vocabulary, and syntax, 2) operationalizing this theoretical model into a gamified, standardized, computer-adaptive assessment of language for fifth to eighth grade students entitled Monster, PI, and 3) uncovering further evidence regarding the relationship between language and standardized reading comprehension via this assessment. Multiple-group item response theory (IRT) across grades show that morphology was best fit by a bifactor model of task specific factors along with a global factor related to each skill. Vocabulary was best fit by a bifactor model that identifies performance overall and on specific words. Syntax, though, was best fit by a unidimensional model. Next, Monster, PI produced reliable scores suggesting language can be assessed efficiently and precisely for students via this model. Lastly, performance on Monster, PI explained more than 50% of variance in standardized reading, suggesting operationalizing language via Monster, PI can provide meaningful understandings of the relationship between language and reading comprehension. Specifically, considering just a subset of a construct, like identification of units of meaning, explained significantly less variance in reading comprehension. This highlights the importance of considering these broader constructs. Implications indicate that future work should consider a model of language where component areas are considered broadly and contributions to reading comprehension are explored via general performance on components as well as skill level performance.

Download Full-text

Hybrid Deep Neural Model for Duplicate Question Detection in Transliterated Bi-lingual Data

Recent Patents on Computer Science ◽

10.2174/2213275912666190710152709 ◽

2019 ◽

Vol 12 ◽

Author(s):

Seema Rani ◽

Avadhesh Kumar ◽

Naresh Kumar

Keyword(s):

Question Answering ◽

Neural Model ◽

Manhattan Distance ◽

Semantic Matching ◽

English Only ◽

Detection Model ◽

Proposed Model ◽

Reported Study ◽

Social Media Platforms ◽

Efficient Information

Background: Duplicate content often corrupts the filtering mechanism in online question answering. Moreover, as users are usually more comfortable conversing in their native language questions, transliteration adds to the challenges in detecting duplicate questions. This compromises with the response time and increases the answer overload. Thus, it has now become crucial to build clever, intelligent and semantic filters which semantically match linguistically disparate questions. Objective: Most of the research on duplicate question detection has been done on mono-lingual, majorly English Q&A platforms. The aim is to build a model which extends the cognitive capabilities of machines to interpret, comprehend and learn features for semantic matching in transliterated bi-lingual Hinglish (Hindi + English) data acquired from different Q&A platforms. Method: In the proposed DQDHinglish (Duplicate Question Detection) Model, firstly language transformation (transliteration & translation) is done to convert the bi-lingual transliterated question into a mono-lingual English only text. Next a hybrid of Siamese neural network containing two identical Long-term-Short-memory (LSTM) models and Multi-layer perceptron network is proposed to detect semantically similar question pairs. Manhattan distance function is used as the similarity measure. Result: A dataset was prepared by scrapping 100 question pairs from various social media platforms, such as Quora and TripAdvisor. The performance of the proposed model on the basis of accuracy and F-score. The proposed DQDHinglish achieves a validation accuracy of 82.40%. Conclusion: A deep neural model was introduced to find semantic match between English question and a Hinglish (Hindi + English) question such that similar intent questions can be combined to enable fast and efficient information processing and delivery. A dataset was created and the proposed model was evaluated on the basis of performance accuracy. To the best of our knowledge, this work is the first reported study on transliterated Hinglish semantic question matching.

Download Full-text

Local Spectral Properties Under Conjugations

Mediterranean Journal of Mathematics ◽

10.1007/s00009-021-01731-7 ◽

2021 ◽

Vol 18 (3) ◽

Author(s):

Pietro Aiena ◽

Fabio Burderi ◽

Salvatore Triolo

Keyword(s):

Hilbert Space ◽

Spectral Properties ◽

Operator Matrices ◽

Weyl Type Theorems ◽

Triangular Operator ◽

Analytic Core ◽

Complex Symmetric ◽

Symmetric Operators ◽

The Relationship

AbstractIn this paper, we study some local spectral properties of operators having form JTJ, where J is a conjugation on a Hilbert space H and $$T\in L(H)$$ T ∈ L ( H ) . We also study the relationship between the quasi-nilpotent part of the adjoint $$T^*$$ T ∗ and the analytic core K(T) in the case of decomposable complex symmetric operators. In the last part we consider Weyl type theorems for triangular operator matrices for which one of the entries has form JTJ, or has form $$JT^*J$$ J T ∗ J . The theory is exemplified in some concrete cases.

Download Full-text

Metacognitive control processes in question answering: help seeking and withholding answers

Metacognition and Learning ◽

10.1007/s11409-021-09259-7 ◽

2021 ◽

Author(s):

Monika Undorf ◽

Iris Livneh ◽

Rakefet Ackerman

Keyword(s):

Help Seeking ◽

Question Answering ◽

Monetary Incentives ◽

Metacognitive Control ◽

Self Regulated Learning ◽

Response Strategies ◽

Seeking Help ◽

Subjective Confidence ◽

Different Populations ◽

The Relationship

AbstractWhen responding to knowledge questions, people monitor their confidence in the knowledge they retrieve from memory and strategically regulate their responses so as to provide answers that are both correct and informative. The current study investigated the association between subjective confidence and the use of two response strategies: seeking help and withholding answers by responding “I don’t know”. Seeking help has been extensively studied as a resource management strategy in self-regulated learning, but has been largely neglected in metacognition research. In contrast, withholding answers has received less attention in educational studies than in metacognition research. Across three experiments, we compared the relationship between subjective confidence and strategy use in conditions where participants could choose between submitting answers and seeking help, between submitting and withholding answers, or between submitting answers, seeking help, and withholding answers. Results consistently showed that the association between confidence and help seeking was weaker than that between confidence and withholding answers. This difference was found for participants from two different populations, remained when participants received monetary incentives for accurate answers, and replicated across two forms of help. Our findings suggest that seeking help is guided by a wider variety of considerations than withholding answers, with some considerations going beyond improving the immediate accuracy of one’s answers. We discuss implications for research on metacognition and regarding question answering in educational and other contexts.

Download Full-text