Learning Class-Transductive Intent Representations for Zero-shot Intent Detection

Author(s):  
Qingyi Si ◽  
Yuanxin Liu ◽  
Peng Fu ◽  
Zheng Lin ◽  
Jiangnan Li ◽  
...  

Zero-shot intent detection (ZSID) aims to deal with continuously emerging intents without annotated training data. However, existing ZSID systems suffer from two limitations: 1) they are not good at modeling the relationship between seen and unseen intents, and 2) they cannot effectively recognize unseen intents under the generalized intent detection (GZSID) setting. A critical problem behind these limitations is that the representations of unseen intents cannot be learned in the training stage. To address this problem, we propose a novel framework that utilizes unseen class labels to learn Class-Transductive Intent Representations (CTIR). Specifically, we allow the model to predict unseen intents during training, with the corresponding label names serving as input utterances. On this basis, we introduce a multi-task learning objective, which encourages the model to learn the distinctions among intents, and a similarity scorer, which estimates the connections among intents more accurately. CTIR is easy to implement and can be integrated with existing ZSID and GZSID methods. Experiments on two real-world datasets show that CTIR brings considerable improvement to the baseline systems.
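A minimal sketch of the class-transductive idea described above, assuming a toy bag-of-words encoder and a plain softmax classifier (both placeholders, not the authors' architecture): the unseen intents' label names are simply appended to the training stream as utterances of their own classes, so the classifier learns weights for unseen intents before any unseen utterance is observed. The multi-task objective and similarity scorer are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {}

def encode(text, dim=16):
    """Toy bag-of-words encoder; stands in for the real utterance encoder."""
    vec = np.zeros(dim)
    for tok in text.lower().split():
        idx = vocab.setdefault(tok, len(vocab))
        vec[idx % dim] += 1.0
    return vec

seen_intents = ["play music", "book flight"]
unseen_intents = ["transfer money"]                 # label name only, no utterances
all_intents = seen_intents + unseen_intents

train_utterances = [("put on some jazz", 0), ("get me a ticket to Paris", 1)]
# Class-transductive trick: every label name also serves as a training
# utterance for its own class, including the unseen one.
label_as_utterance = [(name, i) for i, name in enumerate(all_intents)]

W = rng.normal(scale=0.1, size=(16, len(all_intents)))   # shared softmax weights
for x, y in train_utterances + label_as_utterance:
    h = encode(x)
    logits = h @ W
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    W -= 0.1 * np.outer(h, probs - np.eye(len(all_intents))[y])  # cross-entropy step
```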

2021 ◽  
Vol 17 (2) ◽  
pp. 1-20
Author(s):  
Zheng Wang ◽  
Qiao Wang ◽  
Tingzhang Zhao ◽  
Chaokun Wang ◽  
Xiaojun Ye

Feature selection, an effective technique for dimensionality reduction, plays an important role in many machine learning systems. Supervised knowledge can significantly improve performance. However, faced with the rapid growth of newly emerging concepts, existing supervised methods can easily suffer from the scarcity of valid labeled training data. In this paper, the authors study the problem of zero-shot feature selection (i.e., building a feature selection model that generalizes well to “unseen” concepts with limited training data of “seen” concepts). Specifically, they adopt class-semantic descriptions (i.e., attributes) as supervision for feature selection, so as to utilize the supervised knowledge transferred from the seen concepts. To obtain more reliable discriminative features, they further propose the center-characteristic loss, which encourages the selected features to capture the central characteristics of seen concepts. Extensive experiments conducted on various real-world datasets demonstrate the effectiveness of the method.
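One plausible reading of a center-characteristic style loss, sketched below under the assumption that feature selection is relaxed to a soft weighting vector s (the variable names and exact formulation are illustrative, not taken from the paper): the weighted features of each instance are pulled toward the center of its seen concept, so the selected features capture what is central to each class.

```python
import numpy as np

def center_characteristic_loss(X, y, s):
    """X: (n, d) data, y: (n,) class ids, s: (d,) soft feature-selection weights."""
    loss = 0.0
    for c in np.unique(y):
        Xc = X[y == c] * s            # apply the soft feature weighting
        center = Xc.mean(axis=0)      # central characteristics of concept c
        loss += np.sum((Xc - center) ** 2)
    return loss / len(X)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = rng.integers(0, 4, size=100)
s = np.ones(20)                       # start from selecting all features equally
print(center_characteristic_loss(X, y, s))
```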


2020 ◽  
Vol 34 (01) ◽  
pp. 1153-1160 ◽  
Author(s):  
Xinshi Zang ◽  
Huaxiu Yao ◽  
Guanjie Zheng ◽  
Nan Xu ◽  
Kai Xu ◽  
...  

Using reinforcement learning for traffic signal control has attracted increasing interest recently. Various value-based reinforcement learning methods have been proposed to deal with this classical transportation problem and have achieved better performance than traditional transportation methods. However, current reinforcement learning models rely on tremendous amounts of training data and computational resources, which may have bad consequences (e.g., traffic jams or accidents) in the real world. In traffic signal control, some algorithms have been proposed to enable quick learning from scratch, but little attention has been paid to learning by transferring and reusing learned experience. In this paper, we propose a novel framework, named MetaLight, to speed up the learning process in new scenarios by leveraging the knowledge learned from existing scenarios. MetaLight is a value-based meta-reinforcement learning workflow built on the representative gradient-based meta-learning algorithm MAML, which periodically alternates between individual-level adaptation and global-level adaptation. Moreover, MetaLight improves the state-of-the-art reinforcement learning model FRAP for traffic signal control by optimizing its model structure and update paradigm. Experiments on four real-world datasets show that the proposed MetaLight not only adapts more quickly and stably in new traffic scenarios, but also achieves better performance.
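The alternation MetaLight builds on can be illustrated with a first-order MAML-style loop, sketched below on toy quadratic losses that stand in for the real value-based RL objectives (the learning rates, task construction, and the FRAP model itself are omitted or replaced with placeholders):

```python
import numpy as np

rng = np.random.default_rng(1)
scenario_targets = [rng.normal(size=4) for _ in range(5)]   # one "task" per traffic scenario

def loss_grad(theta, target):
    return theta - target            # gradient of 0.5 * ||theta - target||^2

theta = np.zeros(4)                  # shared initialization (meta-parameters)
alpha, beta = 0.5, 0.1               # inner / meta learning rates
for _ in range(100):
    meta_grad = np.zeros_like(theta)
    for target in scenario_targets:
        adapted = theta - alpha * loss_grad(theta, target)   # individual-level adaptation
        meta_grad += loss_grad(adapted, target)              # evaluate the adapted parameters
    theta -= beta * meta_grad / len(scenario_targets)        # global-level adaptation
```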


2007 ◽  
Vol 1 (1) ◽  
pp. 38-45
Author(s):  
Gysber J. Tamaela

Association is a data mining technique used to identify relationships between itemsets in a database (association rules). Research on association rules since the introduction of the AIS algorithm in 1993 has yielded several new algorithms. Some of these were evaluated on artificial (IBM) datasets and claimed by their authors to perform reliably in finding maximal frequent itemsets, but such datasets have different characteristics from real-world datasets. The goal of this research is to compare the performance of the Apriori and Cut Both Ways (CBW) algorithms on three real-world datasets. We used small and large minimum support thresholds as treatments for each algorithm and dataset. We find that the characteristics of the datasets have a significant effect on the performance of Apriori and CBW. The horizontal support-counting strategy showed better performance than vertical intersection, even though fewer candidate frequent itemsets were counted.
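A small illustration of the two support-counting strategies compared in the study, on a toy transaction database (the data and threshold are made up): horizontal counting scans the transactions for each candidate, while vertical counting intersects per-item TID lists; both yield the same supports.

```python
from itertools import combinations

transactions = [{"a", "b", "c"}, {"a", "c"}, {"a", "d"}, {"b", "c", "e"}]

def horizontal_support(itemset):
    """Apriori-style horizontal counting: scan every transaction."""
    return sum(1 for t in transactions if itemset <= t)

def vertical_support(itemset):
    """Vertical counting: intersect the TID lists of the items."""
    tidlists = [{i for i, t in enumerate(transactions) if item in t} for item in itemset]
    return len(set.intersection(*tidlists))

candidates = [frozenset(c) for c in combinations("abcde", 2)]
min_support = 2
frequent = [c for c in candidates if horizontal_support(c) >= min_support]
assert all(horizontal_support(c) == vertical_support(c) for c in candidates)
print(sorted(tuple(sorted(c)) for c in frequent))   # [('a', 'c'), ('b', 'c')]
```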


Author(s):  
Zhijun Chen ◽  
Huimin Wang ◽  
Hailong Sun ◽  
Pengpeng Chen ◽  
Tao Han ◽  
...  

End-to-end learning from crowds has recently been introduced as an EM-free approach to training deep neural networks directly from noisy crowdsourced annotations. It models the relationship between true labels and annotations with a specific type of neural layer, termed the crowd layer, which can be trained using pure backpropagation. Parameters of the crowd layer, however, can hardly be interpreted as annotator reliability, as compared with the more principled probabilistic approach. The lack of probabilistic interpretation further prevents extensions of the approach that account for important factors of the annotation process, e.g., instance difficulty. This paper presents SpeeLFC, a structured probabilistic model that incorporates the constraints of probability axioms into the parameters of the crowd layer, which allows annotator reliability to be modeled explicitly while benefiting from the end-to-end training of neural networks. Moreover, we propose SpeeLFC-D, which further takes instance difficulty into account. Extensive validation on real-world datasets shows that our methods improve the state-of-the-art.
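A hedged sketch of the probabilistic constraint at the heart of this idea, using toy shapes and names: each annotator's crowd-layer parameters are normalized with a softmax so their rows form valid confusion distributions p(annotation | true label), which is what permits reading them as annotator reliability. This shows only the forward computation, not the authors' full SpeeLFC model.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

n_classes, n_annotators = 3, 4
rng = np.random.default_rng(0)
raw = rng.normal(size=(n_annotators, n_classes, n_classes))
confusion = softmax(raw, axis=-1)        # rows sum to 1: the probability axioms hold

p_true = softmax(rng.normal(size=(8, n_classes)))      # classifier output for a batch of 8
# p(annotator a assigns class j | x) = sum_k p_true[:, k] * confusion[a, k, j]
p_annotation = np.einsum("bk,akj->baj", p_true, confusion)
print(p_annotation.sum(axis=-1))         # each annotator's predicted distribution sums to 1
```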


2022 ◽  
Vol 6 (GROUP) ◽  
pp. 1-25
Author(s):  
Ziyi Kou ◽  
Lanyu Shang ◽  
Yang Zhang ◽  
Dong Wang

The proliferation of social media has promoted the spread of misinformation that raises many concerns in our society. This paper focuses on the critical problem of explainable COVID-19 misinformation detection, which aims to accurately identify and explain misleading COVID-19 claims on social media. Motivated by the lack of COVID-19-relevant knowledge in existing solutions, we construct a novel approach based on a crowdsourced knowledge graph to incorporate COVID-19 knowledge facts by leveraging the collaborative efforts of expert and non-expert crowd workers. Two important challenges exist in developing our solution: i) how to effectively coordinate the crowd efforts of both expert and non-expert workers to generate the relevant knowledge facts for detecting COVID-19 misinformation; ii) how to leverage the knowledge facts from the constructed knowledge graph to accurately explain the detected COVID-19 misinformation. To address these challenges, we develop HC-COVID, a hierarchical crowdsourced knowledge graph based framework that explicitly models the COVID-19 knowledge facts contributed by crowd workers with different levels of expertise and accurately identifies the related knowledge facts to explain the detection results. We evaluate HC-COVID using two public real-world social media datasets. The evaluation results demonstrate that HC-COVID significantly outperforms state-of-the-art baselines in terms of the detection accuracy of misleading COVID-19 claims and the quality of the explanations.
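An illustrative data-structure sketch (not the authors' implementation) of the hierarchical idea: each knowledge fact records the expertise tier of the worker who contributed it, so retrieval for explaining a claim can prefer expert-verified facts. The facts, tiers, and weights below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Fact:
    head: str
    relation: str
    tail: str
    tier: str                     # "expert" or "non_expert" crowd contributor

kg = [
    Fact("COVID-19", "caused_by", "SARS-CoV-2", tier="expert"),
    Fact("masks", "reduce", "COVID-19 transmission", tier="non_expert"),
]

def relevant_facts(claim_entities, weights=None):
    """Return facts touching the claim, expert-contributed facts first."""
    weights = weights or {"expert": 1.0, "non_expert": 0.5}
    scored = [(weights[f.tier], f) for f in kg
              if f.head in claim_entities or f.tail in claim_entities]
    return sorted(scored, key=lambda pair: -pair[0])

print(relevant_facts({"COVID-19"}))
```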


Data Mining ◽  
2013 ◽  
pp. 125-141
Author(s):  
Fernando Benites ◽  
Elena Sapozhnikova

Methods for the automatic extraction of taxonomies and concept hierarchies from data have recently emerged as essential assistance for humans in ontology construction. The objective of this chapter is to show how the extraction of concept hierarchies and the discovery of relations between them can be effectively coupled with a multi-label classification task. The authors introduce a data mining system that performs classification and addresses both issues by means of association rule mining. The proposed system has been tested on two real-world datasets, with the class labels of each dataset coming from two different class hierarchies. Several experiments on hierarchy extraction and concept relation discovery were conducted to evaluate the system, and three different interestingness measures, one of which was developed by the authors, were applied to select the most important relations between concepts. The experimental results showed that the system is able to infer quite accurate concept hierarchies and associations among the concepts. It is therefore well suited for classification-based reasoning.
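A minimal sketch of how association rules over co-assigned labels can suggest hierarchy edges, in the spirit of coupling the two tasks (the labels, confidence threshold, and interestingness measure here are invented for illustration): a high-confidence rule A => B is read as evidence that B subsumes A.

```python
from collections import Counter
from itertools import permutations

label_sets = [{"mammal", "dog"}, {"mammal", "dog"}, {"mammal", "cat"},
              {"mammal"}, {"bird"}]

single = Counter(label for s in label_sets for label in s)
pair = Counter((a, b) for s in label_sets for a, b in permutations(s, 2))

def confidence(a, b):
    """Confidence of the rule a => b over the multi-label annotations."""
    return pair[(a, b)] / single[a]

edges = [(a, b) for (a, b) in pair if confidence(a, b) >= 0.9]
print(edges)      # e.g. ('dog', 'mammal') and ('cat', 'mammal') as candidate hierarchy edges
```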


Author(s):  
Zequn Sun ◽  
Wei Hu ◽  
Qingheng Zhang ◽  
Yuzhong Qu

Embedding-based entity alignment represents different knowledge graphs (KGs) as low-dimensional embeddings and finds entity alignment by measuring the similarities between entity embeddings. Existing approaches have achieved promising results; however, they are still challenged by the lack of enough prior alignments to serve as labeled training data. In this paper, we propose a bootstrapping approach to embedding-based entity alignment. It iteratively labels likely entity alignments as training data for learning alignment-oriented KG embeddings. Furthermore, it employs an alignment editing method to reduce error accumulation during iterations. Our experiments on real-world datasets showed that the proposed approach significantly outperformed state-of-the-art embedding-based entity alignment methods. The proposed alignment-oriented KG embedding, bootstrapping process, and alignment editing method all contributed to the performance improvement.
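A hedged sketch of the bootstrapping loop described above (the embeddings, similarity threshold, and edit rule are toy placeholders, and the embedding retraining step is only indicated): mutual nearest neighbours above a confidence threshold are labeled as alignments, and alignment editing drops entities that end up aligned to conflicting counterparts.

```python
import numpy as np

rng = np.random.default_rng(0)
emb1 = rng.normal(size=(50, 8))
emb2 = rng.normal(size=(50, 8))
emb1 /= np.linalg.norm(emb1, axis=1, keepdims=True)
emb2 /= np.linalg.norm(emb2, axis=1, keepdims=True)

labeled = {}                                        # entity in KG1 -> entity in KG2
for _ in range(3):                                  # bootstrapping iterations
    sim = emb1 @ emb2.T                             # cosine similarities
    nn12, nn21 = sim.argmax(axis=1), sim.argmax(axis=0)
    for i, j in enumerate(nn12):
        if nn21[j] == i and sim[i, j] > 0.5:        # mutual NN and confident enough
            labeled[i] = j
    # alignment editing: drop KG1 entities that conflict over the same KG2 entity
    counts = np.bincount(np.array(list(labeled.values()), dtype=int), minlength=50)
    labeled = {i: j for i, j in labeled.items() if counts[j] == 1}
    # (newly labeled pairs would be used here to retrain alignment-oriented embeddings)
```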


Author(s):  
Sharifah Sakinah Syed Ahmad ◽  
Ezzatul Farhain Azmi ◽  
Fauziah Kasmin ◽  
Zuraini Othman

Even though there are numerous more complex classification algorithms, k-Nearest Neighbour (k-NN) is regarded as one of the most successful approaches for solving real-world problems. The effectiveness of the classification process relies on the training data. However, when the k-NN classifier is applied in the real world, various issues arise: it is computationally expensive, since the complete training set needs to be stored to classify unseen data, and it is intolerant of irrelevant features. Moreover, the training data may be imbalanced, with some classes having considerably more instances than others. Thus, a selected subset of the training data is employed to improve the effectiveness of the k-NN classifier when dealing with large datasets. In this work, an alternative method is presented that enhances data selection by combining feature selection and instance selection for the k-NN classifier using Cooperative Binary Particle Swarm Optimisation (CBPSO). This method also addresses the limitations of the k-nearest neighbour classifier, particularly when handling high-dimensional and imbalanced data. A comparison study was performed on 20 real-world datasets from the UCI Machine Learning Repository; the resulting classification rates demonstrate the algorithm's performance, and the experimental outcomes confirm the efficacy of the proposed approach.
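A sketch of how a single binary particle could be evaluated in this kind of setup, assuming the bit-string concatenates an instance mask and a feature mask and fitness is the validation accuracy of a k-NN classifier trained on the selected subset (the CBPSO velocity and position updates themselves are omitted, and the data are random placeholders):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 30)), rng.integers(0, 2, 200)
X_val, y_val = rng.normal(size=(50, 30)), rng.integers(0, 2, 50)

def fitness(instance_mask, feature_mask, k=3):
    """Score one binary particle: train k-NN on the selected instances/features."""
    if instance_mask.sum() <= k or feature_mask.sum() == 0:
        return 0.0                                  # invalid particle
    clf = KNeighborsClassifier(n_neighbors=k)
    clf.fit(X_train[instance_mask][:, feature_mask], y_train[instance_mask])
    return clf.score(X_val[:, feature_mask], y_val)

particle = rng.integers(0, 2, size=200 + 30).astype(bool)   # instance bits + feature bits
print(fitness(particle[:200], particle[200:]))
```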


Author(s):  
Xiao He ◽  
Francesco Alesiani ◽  
Ammar Shaker

Many real-world large-scale regression problems can be formulated as Multi-task Learning (MTL) problems with a massive number of tasks, as in the retail and transportation domains. However, existing MTL methods still fail to offer both the generalization performance and the scalability required for such problems. Scaling up MTL methods to problems with a tremendous number of tasks is a big challenge. Here, we propose a novel algorithm, named Convex Clustering Multi-Task regression Learning (CCMTL), which integrates convex clustering on the k-nearest neighbor graph of the prediction models. Further, CCMTL efficiently solves the underlying convex problem with a newly proposed optimization method. CCMTL is accurate, efficient to train, and empirically scales linearly in the number of tasks. On both synthetic and real-world datasets, the proposed CCMTL outperforms seven state-of-the-art (SoA) multi-task learning methods in terms of prediction accuracy as well as computational efficiency. On a real-world retail dataset with 23,812 tasks, CCMTL requires only around 30 seconds to train on a single thread, while the SoA methods need hours or even days.
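A sketch of the kind of objective CCMTL optimizes, under the assumption that it combines per-task squared losses with a convex-clustering penalty over a k-nearest-neighbour graph of the task models (only the objective is shown; the authors' efficient solver is not reproduced):

```python
import numpy as np

def ccmtl_objective(W, tasks, knn_edges, lam=1.0):
    """W: (T, d) task weights; tasks: list of (X_t, y_t); knn_edges: list of (i, j)."""
    fit = sum(np.sum((X @ W[t] - y) ** 2) for t, (X, y) in enumerate(tasks))
    cluster = sum(np.linalg.norm(W[i] - W[j]) for i, j in knn_edges)   # convex clustering penalty
    return fit + lam * cluster

rng = np.random.default_rng(0)
tasks = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(4)]
W = rng.normal(size=(4, 5))
knn_edges = [(0, 1), (1, 2), (2, 3)]            # toy 1-NN graph over the task models
print(ccmtl_objective(W, tasks, knn_edges))
```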

