Multi-Point Semantic Representation for Intent Classification

2020 ◽  
Vol 34 (05) ◽  
pp. 9531-9538
Author(s):  
Jinghan Zhang ◽  
Yuxiao Ye ◽  
Yue Zhang ◽  
Likun Qiu ◽  
Bin Fu ◽  
...  

Detecting user intents from utterances is the basis of the natural language understanding (NLU) task. To understand the meaning of utterances, some work focuses on fully representing utterances via semantic parsing, whose annotation cost is labor-intensive. Other researchers simply treat the problem as intent classification or frequently asked question (FAQ) retrieval, and thus do not leverage the utterances shared among different intents. We propose a simple and novel multi-point semantic representation framework with relatively low annotation cost that leverages fine-grained factor information, decomposing queries into four factors: topic, predicate, object/condition, and query type. In addition, we propose a compositional intent bi-attention model under multi-task learning with three kinds of attention mechanisms among queries, labels, and factors, which jointly combines coarse-grained intent and fine-grained factor information. Extensive experiments show that our framework and model significantly outperform several state-of-the-art approaches, with an improvement of 1.35%-2.47% in accuracy.
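
A minimal sketch of the four-factor decomposition described above, in Python. The factor names follow the abstract (topic, predicate, object/condition, query type); the example query and the way an intent label is composed from the factors are illustrative assumptions, not the authors' annotation scheme.

```python
# Sketch only: illustrates how a query annotation could be decomposed into the
# four factors named in the abstract and recombined into a coarse intent label.
from dataclasses import dataclass

@dataclass
class FactorAnnotation:
    topic: str          # what the query is about, e.g. a product or service
    predicate: str      # the action or relation being asked about
    condition: str      # object / condition restricting the predicate
    query_type: str     # e.g. "how", "yes/no", "when"

def compose_intent(f: FactorAnnotation) -> str:
    """Compose a coarse-grained intent label from fine-grained factors,
    so that utterances sharing factors can share supervision."""
    return f"{f.topic}:{f.predicate}:{f.query_type}"

if __name__ == "__main__":
    ann = FactorAnnotation(topic="credit_card", predicate="apply",
                           condition="student", query_type="how")
    print(compose_intent(ann))   # credit_card:apply:how
```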

Author(s):  
Wentao Ding ◽  
Guanji Gao ◽  
Linfeng Shi ◽  
Yuzhong Qu

Recognizing time expressions is a fundamental and important task in many natural language understanding applications, such as reading comprehension and question answering. Several recent state-of-the-art approaches achieve good performance on recognizing time expressions, but they are black boxes or rely on heuristic rules, which makes the extracted temporal information hard to interpret. By contrast, classic rule-based or semantic parsing approaches can capture rich structural information, but their recognition performance is weaker. In this paper, we propose a pattern-based approach, called PTime, which automatically generates and selects patterns for recognizing time expressions. In this approach, time expressions in training text are abstracted into type sequences using fine-grained token types, so the problem is transformed into selecting an appropriate subset of the sequential patterns. We use the Extended Budgeted Maximum Coverage (EBMC) model to optimize the pattern selection: the main idea is to maximize the number of correct token sequences matched by the selected patterns while keeping the number of mistakes within an adjustable budget. The interpretability of the patterns and the adjustability of the permitted number of mistakes make PTime a very promising approach for many applications. Experimental results show that PTime achieves very competitive performance compared with existing state-of-the-art approaches.
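
A minimal greedy sketch of budgeted pattern selection in the spirit of the EBMC formulation above. The paper's exact optimization is not reproduced; here each candidate pattern is assumed to come with the set of token sequences it matches correctly and a count of its mistakes on training data, and selection proceeds greedily under the mistake budget.

```python
# Sketch only: greedy selection of patterns that maximizes newly covered
# correct matches while keeping total mistakes within an adjustable budget.
def select_patterns(candidates, mistake_budget):
    """candidates: dict pattern -> (correct_matches: set, mistakes: int)."""
    selected, covered, mistakes_used = [], set(), 0
    while True:
        best, best_gain = None, 0
        for p, (correct, cost) in candidates.items():
            if p in selected or mistakes_used + cost > mistake_budget:
                continue
            gain = len(correct - covered)   # newly covered correct sequences
            if gain > best_gain:
                best, best_gain = p, gain
        if best is None:
            return selected
        correct, cost = candidates[best]
        selected.append(best)
        covered |= correct
        mistakes_used += cost

if __name__ == "__main__":
    cands = {
        "DIGIT MONTH YEAR": ({"12 May 2020", "3 Jan 1999"}, 0),
        "DIGIT DIGIT":      ({"12 May 2020", "10 30"}, 2),
    }
    print(select_patterns(cands, mistake_budget=1))
```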


Author(s):  
Siva Reddy ◽  
Mirella Lapata ◽  
Mark Steedman

In this paper we introduce a novel semantic parsing approach to query Freebase in natural language without requiring manual annotations or question-answer pairs. Our key insight is to represent natural language via semantic graphs whose topology shares many commonalities with Freebase. Given this representation, we conceptualize semantic parsing as a graph matching problem. Our model converts sentences to semantic graphs using CCG and subsequently grounds them to Freebase guided by denotations as a form of weak supervision. Evaluation experiments on a subset of the Free917 and WebQuestions benchmark datasets show our semantic parser improves over the state of the art.
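
A minimal sketch of the ungrounded-to-grounded graph idea described above. Real CCG parsing and denotation-guided Freebase grounding are far richer; the tiny lexicon mapping natural-language edge labels to KB-style relations is an illustrative assumption.

```python
# Sketch only: represent a sentence as an ungrounded semantic graph (triples)
# and ground its edge labels to knowledge-base relations.
UNGROUNDED_GRAPH = [
    ("Cameron", "directed.arg1", "e1"),
    ("Titanic", "directed.arg2", "e1"),
]

# Hypothetical mapping from ungrounded predicates to Freebase-style relations.
LEXICON = {
    "directed.arg1": "film.director.film",
    "directed.arg2": "film.film.directed_by",
}

def ground(graph, lexicon):
    """Replace ungrounded edge labels with KB relations; in the paper this
    choice is guided by denotations (weak supervision), not a fixed lexicon."""
    return [(head, lexicon.get(rel, rel), tail) for head, rel, tail in graph]

if __name__ == "__main__":
    for triple in ground(UNGROUNDED_GRAPH, LEXICON):
        print(triple)
```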


2021 ◽  
Vol 9 ◽  
pp. 929-944
Author(s):  
Omar Khattab ◽  
Christopher Potts ◽  
Matei Zaharia

Systems for Open-Domain Question Answering (OpenQA) generally depend on a retriever for finding candidate passages in a large corpus and a reader for extracting answers from those passages. In much recent work, the retriever is a learned component that uses coarse-grained vector representations of questions and passages. We argue that this modeling choice is insufficiently expressive for dealing with the complexity of natural language questions. To address this, we define ColBERT-QA, which adapts the scalable neural retrieval model ColBERT to OpenQA. ColBERT creates fine-grained interactions between questions and passages. We propose an efficient weak supervision strategy that iteratively uses ColBERT to create its own training data. This greatly improves OpenQA retrieval on Natural Questions, SQuAD, and TriviaQA, and the resulting system attains state-of-the-art extractive OpenQA performance on all three datasets.
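
A minimal numpy sketch of ColBERT-style late interaction ("MaxSim") scoring, which is the fine-grained question-passage interaction referred to above. Real ColBERT-QA uses BERT encoders and efficient indexing; here the token embeddings are random stand-ins.

```python
# Sketch only: each query token is matched to its most similar passage token,
# and the per-token maxima are summed into a relevance score.
import numpy as np

def maxsim_score(q_emb: np.ndarray, d_emb: np.ndarray) -> float:
    """q_emb: (num_query_tokens, dim), d_emb: (num_doc_tokens, dim),
    both assumed L2-normalized."""
    sim = q_emb @ d_emb.T                 # (Q, D) token-level similarities
    return float(sim.max(axis=1).sum())   # max over passage tokens, sum over query tokens

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.normal(size=(4, 8));  q /= np.linalg.norm(q, axis=1, keepdims=True)
    d = rng.normal(size=(30, 8)); d /= np.linalg.norm(d, axis=1, keepdims=True)
    print(maxsim_score(q, d))
```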


Author(s):  
Siying Wu ◽  
Zheng-Jun Zha ◽  
Zilei Wang ◽  
Houqiang Li ◽  
Feng Wu

Image paragraph generation aims to describe an image with a paragraph in natural language. Compared to image captioning with a single sentence, paragraph generation provides a more expressive and fine-grained description for storytelling. Existing approaches mainly optimize the paragraph generator toward minimizing a word-wise cross-entropy loss, which neglects the linguistic hierarchy of a paragraph and results in "sparse" supervision for generator learning. In this paper, we propose a novel Densely Supervised Hierarchical Policy-Value (DHPV) network for effective paragraph generation. We design new hierarchical supervision consisting of hierarchical rewards and values at both the sentence and word levels. The joint exploration of hierarchical rewards and values provides dense supervision cues for learning an effective paragraph generator. We propose a new hierarchical policy-value architecture which exploits compositionality at the token-to-token and sentence-to-sentence levels simultaneously and can preserve semantic and syntactic constituent integrity. Extensive experiments on the Stanford image-paragraph benchmark demonstrate the effectiveness of the proposed DHPV approach, with performance improvements over multiple state-of-the-art methods.
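
A minimal sketch of turning a sentence-level reward and word-level rewards into one dense, per-token supervision signal, in the spirit of the hierarchical rewards described above. The reward functions and the mixing weight are illustrative assumptions, not the DHPV formulation.

```python
# Sketch only: blend word-level and sentence-level rewards so that every
# decoding step receives a supervision signal rather than a single sparse one.
def dense_reward(word_rewards, sentence_reward, alpha=0.5):
    """word_rewards: per-token scores for one generated sentence;
    sentence_reward: one score for the whole sentence (e.g. a CIDEr-style metric)."""
    return [alpha * w + (1 - alpha) * sentence_reward for w in word_rewards]

if __name__ == "__main__":
    print(dense_reward([0.2, 0.9, 0.4], sentence_reward=0.7))
```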


2022 ◽  
Vol 22 (3) ◽  
pp. 1-21
Author(s):  
Prayag Tiwari ◽  
Amit Kumar Jaiswal ◽  
Sahil Garg ◽  
Ilsun You

Self-attention mechanisms have recently been embraced for a broad range of text-matching applications. A self-attention model takes only one sentence as input, with no extra information; one can then use its final hidden state or a pooled representation. However, text-matching problems can be interpreted in either a symmetrical or an asymmetrical scope. For instance, paraphrase detection is a symmetrical task, while textual entailment classification and question-answer matching are considered asymmetrical tasks. In this article, we leverage the attractive properties of the self-attention mechanism and propose an attention-based network that incorporates three key components for inter-sequence attention: global pointwise features, preceding attentive features, and contextual features, while updating the remaining components. We evaluate our model on two benchmark datasets covering the tasks of textual entailment and question-answer matching. The proposed efficient Self-attention-driven Network for Text Matching outperforms the state of the art on the Stanford Natural Language Inference and WikiQA datasets with far fewer parameters.
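
A minimal numpy sketch of inter-sequence (cross) attention between two sentences, the core operation a text-matching network of the kind described above builds on. The paper's specific feature combination (global pointwise, preceding attentive, and contextual features) is not reproduced here.

```python
# Sketch only: for each token of sentence A, compute a softmax alignment over
# sentence B's tokens and return the attended view of B.
import numpy as np

def cross_attend(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """a: (La, d) token vectors of sentence A, b: (Lb, d) of sentence B."""
    scores = a @ b.T / np.sqrt(a.shape[1])          # (La, Lb)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ b                              # (La, d) attended view of B

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a, b = rng.normal(size=(5, 16)), rng.normal(size=(7, 16))
    print(cross_attend(a, b).shape)   # (5, 16)
```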


Author(s):  
Yu Gong ◽  
Xusheng Luo ◽  
Yu Zhu ◽  
Wenwu Ou ◽  
Zhao Li ◽  
...  

Slot filling is a critical task in natural language understanding (NLU) for dialog systems. State-of-the-art approaches treat it as a sequence labeling problem and adopt models such as BiLSTM-CRF. While these models work relatively well on standard benchmark datasets, they face challenges in the context of E-commerce, where the slot labels are more informative and carry richer expressions. In this work, inspired by the unique structure of the E-commerce knowledge base, we propose a novel multi-task model with cascade and residual connections, which jointly learns segment tagging, named entity tagging, and slot filling. Experiments show the effectiveness of the proposed cascade and residual structures. Our model has a 14.6% advantage in F1 score over strong baseline methods on a new Chinese E-commerce shopping assistant dataset, while achieving competitive accuracy on a standard dataset. Furthermore, an online test deployed on this dominant E-commerce platform shows a 130% improvement in the accuracy of understanding user utterances. Our model has already gone into production on the E-commerce platform.
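
A minimal PyTorch sketch of the cascade-and-residual multi-task idea described above: a shared encoder output feeds a segment-tagging head, whose predictions are fed forward (with residual connections) into the named-entity head and then the slot-filling head. The layer sizes and plain linear heads (instead of CRF layers) are illustrative assumptions.

```python
# Sketch only: three tagging heads in cascade over a shared encoder output,
# with residual connections carrying the encoder features forward.
import torch
import torch.nn as nn

class CascadeTagger(nn.Module):
    def __init__(self, hidden=128, n_seg=3, n_ner=9, n_slot=30):
        super().__init__()
        self.seg_head = nn.Linear(hidden, n_seg)
        self.seg_proj = nn.Linear(n_seg, hidden)
        self.ner_head = nn.Linear(hidden, n_ner)
        self.ner_proj = nn.Linear(n_ner, hidden)
        self.slot_head = nn.Linear(hidden, n_slot)

    def forward(self, enc):                # enc: (batch, seq_len, hidden)
        seg = self.seg_head(enc)           # segment tagging logits
        h = enc + self.seg_proj(seg)       # residual: keep encoder features
        ner = self.ner_head(h)             # named entity tagging logits
        h = h + self.ner_proj(ner)
        slot = self.slot_head(h)           # slot filling logits
        return seg, ner, slot

if __name__ == "__main__":
    model = CascadeTagger()
    outputs = model(torch.randn(2, 10, 128))
    print([o.shape for o in outputs])
```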


Author(s):  
Kyung-Min Kim ◽  
Min-Oh Heo ◽  
Seong-Ho Choi ◽  
Byoung-Tak Zhang

Question answering (QA) on video content is a significant challenge for achieving human-level intelligence, as it involves both vision and language in real-world settings. Here we demonstrate the possibility of an AI agent performing video story QA by learning from a large number of cartoon videos. We develop a video-story learning model, Deep Embedded Memory Networks (DEMN), to reconstruct stories from a joint scene-dialogue video stream using a latent embedding space of observed data. The video stories are stored in a long-term memory component. For a given question, an LSTM-based attention model uses the long-term memory to recall the best question-story-answer triplet by focusing on specific words containing key information. We trained the DEMN on a novel QA dataset of the children's cartoon video series Pororo. The dataset contains 16,066 scene-dialogue pairs from 20.5 hours of video, 27,328 fine-grained scene-description sentences, and 8,913 story-related QA pairs. Our experimental results show that the DEMN outperforms other QA models, mainly due to 1) the reconstruction of video stories in a combined scene-dialogue form that utilizes the latent embedding and 2) the attention mechanism. DEMN also achieved state-of-the-art results on the MovieQA benchmark.
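
A minimal sketch of attention-based recall from a long-term story memory, the retrieval step the DEMN description above relies on. Real DEMN embeds scene-dialogue pairs with learned encoders and an LSTM attention model; here the embeddings are random stand-ins and recall is a single dot-product attention step.

```python
# Sketch only: attend over stored story embeddings with a question vector and
# return the attention weights plus the best-matching story index.
import numpy as np

def recall(question_vec, memory):
    """memory: (num_stories, dim) embedded video stories."""
    scores = memory @ question_vec
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights, int(weights.argmax())

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    mem = rng.normal(size=(100, 32))
    q = rng.normal(size=32)
    w, best = recall(q, mem)
    print(best, w[best])
```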


2020 ◽  
Author(s):  
Yiqin Luo ◽  
Yanpeng Sun ◽  
Liang Chang ◽  
Tianlong Gu ◽  
Chenzhong Bin ◽  
...  

In context-aware recommendation systems, most existing methods encode users' preferences by mapping item and category information into the same space, which amounts to a simple stacking of information: the item and category information contained in interaction behaviours is not fully utilized. Moreover, since users' preferences for a candidate item are influenced by changes in temporal and historical behaviours, it is unreasonable to predict correlations between users and candidates using users' fixed features. In this paper, we propose a framework based on fine-grained and coarse-grained information, which considers multi-granularity information in users' historical behaviours. First, a parallel structure is provided to mine users' preference information under different granularities. Then, self-attention and attention mechanisms are used to capture dynamic preferences. Experimental results on two publicly available datasets show that our framework outperforms state-of-the-art methods across the evaluation metrics.
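
A minimal numpy sketch of the parallel fine-grained (item-level) and coarse-grained (category-level) branches described above: each branch attends over the user's history with the candidate as the query, and the two summaries are concatenated. The attention form and feature dimensions are illustrative assumptions.

```python
# Sketch only: two parallel attention branches over the behaviour history,
# one at item granularity and one at category granularity.
import numpy as np

def attend(history: np.ndarray, candidate: np.ndarray) -> np.ndarray:
    """history: (len, dim) behaviour embeddings; candidate: (dim,)."""
    scores = history @ candidate
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ history            # weighted summary of the history

def user_representation(item_hist, cat_hist, item_cand, cat_cand):
    fine = attend(item_hist, item_cand)     # item-level preference
    coarse = attend(cat_hist, cat_cand)     # category-level preference
    return np.concatenate([fine, coarse])

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    rep = user_representation(rng.normal(size=(20, 16)), rng.normal(size=(20, 8)),
                              rng.normal(size=16), rng.normal(size=8))
    print(rep.shape)   # (24,)
```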


Author(s):  
Ichiro Kobayashi

At the annual conference of the Japan Society for Artificial Intelligence (JSAI), a special survival session called "Challenge for Realizing Early Profits (CREP)" is organized to support and promote excellent ideas in new AI technologies expected to be realized and to contribute to society within five years. Every year at the session, researchers propose their ideas and compete in being evaluated by conference participants. The Everyday Language Computing (ELC) project, which started in 2000 at the Brain Science Institute, RIKEN, and ended in 2005, joined the CREP program in 2001 to have its work evaluated by third parties, and held an organized session every year in which those interested in language-based intelligence and personalization participated. The project competed with other candidates, survived the session, and achieved the session's final goal of surviving for five years. Papers in this special issue selected for presentation at the session include the following.

The first article, "Everyday-Language Computing Project Overview," by Ichiro Kobayashi et al., gives an overview of the ELC Project and its basic technologies. The second to sixth papers are related to the ELC Project. The second article, "Computational Models of Language Within Context and Context-Sensitive Language Understanding," by Noriko Ito et al., proposes a new database, called the "semiotic base," that compiles linguistic resources with contextual information, together with an algorithm for achieving natural language understanding with the semiotic base. The third article, "Systemic-Functional Context-Sensitive Text Generation in the Framework of Everyday Language Computing," by Yusuke Takahashi et al., proposes an algorithm to generate texts with the semiotic base. The fourth article, "Natural Language-Mediated Software Agentification," by Michiaki Iwazume et al., proposes a method for agentifying and verbalizing existing software applications, together with a scheme for operating and running them. The fifth article, "Smart Help for Novice Users Based on Application Software Manuals," by Shino Iwashita et al., proposes a new framework for reusing the electronic manuals bundled with application software to provide tailor-made operation instructions to users. The sixth article, "Programming in Everyday Language: A Case for Email Management," by Toru Sugimoto et al., describes writing computer programs in natural language; rhetorical structure analysis is used to translate the natural language command structure into the program structure. The seventh article, "Application of Paraphrasing to Programming with Linguistic Expressions," by Nozomu Kaneko et al., proposes a method for translating natural language commands into a computer program through a natural language paraphrasing mechanism. The eighth article, "A Human Interface Based on Linguistic Metaphor and Intention Reasoning," by Koichi Yamada et al., proposes a new human interface paradigm called Push Like Talking (PLT), which enables people to operate machines as they talk. The ninth article, "Automatic Metadata Annotation Based on User Preference Evaluation Patterns," by Mari Saito, proposes effective automatic metadata annotation for content recommendations matched to user preferences. The tenth article, "Dynamic Sense Representation Using Conceptual Fuzzy Sets," by Hiroshi Sekiya et al., proposes a method to represent word senses, which vary dynamically depending on context, using conceptual fuzzy sets. The eleventh article, "Common Sense from the Web? Naturalness of Everyday Knowledge Retrieved from WWW," by Rafal Rzepka et al., is a challenging attempt to acquire common-sense knowledge from information on the Web. The twelfth article, "Semantic Representation for Understanding Meaning Based on Correspondence Between Meanings," by Akira Takagi et al., proposes a new semantic representation for handling the Japanese language in natural language processing.

I thank the reviewers and contributors for their time and effort in making this special issue possible, and I wish to thank the JACIII editorial board, especially Professors Kaoru Hirota and Toshio Fukuda, the Editors-in-Chief, for inviting me to serve as Guest Editor of this Journal. Thanks also go to Kazuki Ohmori and Kenta Uchino of Fuji Technology Press for their sincere support.


2020 ◽  
Vol 34 (05) ◽  
pp. 8713-8721
Author(s):  
Kyle Richardson ◽  
Hai Hu ◽  
Lawrence Moss ◽  
Ashish Sabharwal

Do state-of-the-art models for language understanding already have, or can they easily learn, abilities such as boolean coordination, quantification, conditionals, comparatives, and monotonicity reasoning (i.e., reasoning about word substitutions in sentential contexts)? While such phenomena are involved in natural language inference (NLI) and go beyond basic linguistic understanding, the extent to which they are captured in existing NLI benchmarks and effectively learned by models is unclear. To investigate this, we propose the use of semantic fragments—systematically generated datasets that each target a different semantic phenomenon—for probing, and efficiently improving, such capabilities of linguistic models. This approach to creating challenge datasets allows direct control over the semantic diversity and complexity of the targeted linguistic phenomena, and results in a more precise characterization of a model's linguistic behavior. Our experiments, using a library of 8 such semantic fragments, reveal two remarkable findings: (a) state-of-the-art models, including BERT, that are pre-trained on existing NLI benchmark datasets perform poorly on these new fragments, even though the phenomena probed here are central to the NLI task; (b) on the other hand, with only a few minutes of additional fine-tuning—with a carefully selected learning rate and a novel variation of "inoculation"—a BERT-based model can master all of these logic and monotonicity fragments while retaining its performance on established NLI benchmarks.
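
A minimal sketch of generating one semantic fragment of the kind described above: templated NLI pairs probing boolean coordination. The templates, vocabulary, and label scheme are illustrative assumptions, not the paper's generation grammar.

```python
# Sketch only: systematically generate (premise, hypothesis, label) triples
# for a small boolean-coordination fragment.
import itertools

NAMES = ["Ann", "Bob"]
VERBS = ["sings", "dances"]

def boolean_fragment():
    """From 'X V1 and V2' one can infer 'X V1' (entailment) but the negation
    'X does not V1' is contradicted."""
    for name, (v1, v2) in itertools.product(NAMES, itertools.permutations(VERBS, 2)):
        premise = f"{name} {v1} and {v2}."
        yield premise, f"{name} {v1}.", "entailment"
        yield premise, f"{name} does not {v1[:-1]}.", "contradiction"

if __name__ == "__main__":
    for example in list(boolean_fragment())[:4]:
        print(example)
```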

