Concept Extraction and Prerequisite Relation Learning from Educational Data

Author(s):  
Weiming Lu ◽  
Yangfan Zhou ◽  
Jiale Yu ◽  
Chenhao Jia

Prerequisite relations among concepts are crucial for educational applications. However, it is difficult to automatically extract domain-specific concepts and learn the prerequisite relations among them without labeled data. In this paper, we first extract high-quality phrases from a set of educational data and identify the domain-specific concepts with a graph-based ranking method. Then, we propose an iterative prerequisite relation learning framework, called iPRL, which combines a learning-based model and a recovery-based model to leverage both concept pair features and dependencies among learning materials. In experiments, we evaluated our approach on two real-world datasets, the Textbook Dataset and the MOOC Dataset, and validated that it achieves better performance than existing methods. Finally, we illustrate some examples produced by our approach.

Author(s):  
Hai-Feng Guo ◽  
Lixin Han ◽  
Shoubao Su ◽  
Zhou-Bao Sun

Multi-Instance Multi-Label learning (MIML) is a popular framework for supervised classification where an example is described by multiple instances and associated with multiple labels. Previous MIML approaches have focused on predicting labels for instances. The idea behind tackling the problem is to identify its equivalent in the traditional supervised learning framework. Motivated by recent advances in deep learning, in this paper we still consider the problem of predicting labels and attempt to model deep learning within the MIML framework. The proposed approach enables us to train a deep convolutional neural network with images from social networks where images are well labeled, even with several or uncorrelated labels. Experiments on real-world datasets demonstrate the effectiveness of our proposed approach.


Author(s):  
Lile Li ◽  
Quan Do ◽  
Wei Liu

Data across many business domains can be represented by two or more coupled datasets. Correlations among these coupled datasets have been studied in the literature to build more accurate cross-domain recommender systems. However, existing cross-domain recommendation methods mostly assume that the coupled mode of the datasets shares identical latent factors, which limits the discovery of potentially useful domain-specific properties of the original data. In this paper, we propose a novel cross-domain recommendation method called Coupled Factorization Machine (CoFM) that addresses this limitation. Compared to existing models, ours is the first that uses factorization machines to capture the common characteristics of coupled domains while preserving the differences among them. Our experiments with real-world datasets confirm the advantages of our method in making cross-domain recommendations.
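For readers unfamiliar with factorization machines, the following is a minimal sketch of the standard second-order factorization machine score for a single feature vector. It is not the CoFM coupling itself, which the abstract does not specify; the function names `fm_predict` and `fm_predict_naive` and all parameter values are illustrative assumptions.

```python
from itertools import combinations


def fm_predict(x, w0, w, V):
    """Second-order factorization machine score for one feature vector.

    x: list of n feature values; w0: bias; w: list of n linear weights;
    V: n rows of k latent factors (one row per feature).
    All names here are illustrative, not taken from the CoFM paper.
    """
    n, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(n))
    # Pairwise term via the O(n*k) identity:
    # sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f [ (sum_i V[i][f] x_i)^2 - sum_i (V[i][f] x_i)^2 ]
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict_naive(x, w0, w, V):
    """Same score computed with the explicit double sum, for checking."""
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = sum(
        sum(a * b for a, b in zip(V[i], V[j])) * x[i] * x[j]
        for i, j in combinations(range(n), 2)
    )
    return w0 + linear + pairwise
```

Both functions return the same value; the first avoids the quadratic loop over feature pairs, which matters when the feature vector is long and sparse.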


Author(s):  
Lei Guo ◽  
Li Tang ◽  
Tong Chen ◽  
Lei Zhu ◽  
Quoc Viet Hung Nguyen ◽  
...  

Shared-account Cross-domain Sequential Recommendation (SCSR) is the task of recommending the next item based on a sequence of recorded user behaviors, where multiple users share a single account and their behaviors are available in multiple domains. Existing work on SCSR mainly relies on mining sequential patterns via RNN-based models, which are not expressive enough to capture the relationships among multiple entities. Moreover, all existing algorithms try to bridge two domains via knowledge transfer in the latent space, and the explicit cross-domain graph structure remains unexploited. In this work, we propose a novel graph-based solution, namely DA-GCN, to address the above challenges. Specifically, we first link users and items in each domain as a graph. Then, we devise a domain-aware graph convolution network to learn user-specific node representations. To fully account for users' domain-specific preferences on items, two novel attention mechanisms are further developed to selectively guide the message passing process. Extensive experiments on two real-world datasets are conducted to demonstrate the superiority of our DA-GCN method.


Author(s):  
Chun-Hsiang Wang ◽  
Kang-Chun Fan ◽  
Chuan-Ju Wang ◽  
Ming-Feng Tsai

Customer reviews on platforms such as TripAdvisor and Amazon provide rich information about the ways that people convey sentiment in certain domains. Given these kinds of user reviews, this paper proposes UGSD, a representation learning framework for constructing domain-specific sentiment dictionaries from online customer reviews, in which we leverage the relationship between user-generated reviews and the ratings of the reviews to associate the reviewer sentiment with certain entities. The proposed framework has the following three main advantages. First, no additional annotations of words or external dictionaries are needed for the proposed framework; the only resources needed are the review texts and entity ratings. Second, the framework is applicable across a variety of user-generated content from different domains to construct domain-specific sentiment dictionaries. Finally, each word in the constructed dictionary is associated with a low-dimensional dense representation and a degree of relatedness to a certain rating. These enable us to obtain more fine-grained dictionaries and enhance the application scalability of the constructed dictionaries, as the word representations can be adopted for various tasks or applications, such as entity ranking and dictionary expansion. The experimental results on three real-world datasets show that the framework is effective in constructing high-quality domain-specific sentiment dictionaries from customer reviews.


Entropy ◽  
2021 ◽  
Vol 23 (8) ◽  
pp. 964
Author(s):  
Aïssatou Diallo ◽  
Johannes Fürnkranz

Ordinal embedding is the task of computing a meaningful multidimensional representation of objects, for which only qualitative constraints on their distance functions are known. In particular, we consider comparisons of the form “Which object from the pair (j, k) is more similar to object i?”. In this paper, we generalize this framework to the case where the ordinal constraints are not given at the level of individual points, but at the level of sets, and propose a distributional triplet embedding approach in a scalable learning framework. We show that the query complexity of our approach is on par with the single-item approach. Without having access to features of the items to be embedded, we show the applicability of our model on toy datasets for the task of reconstruction and demonstrate the validity of the obtained embeddings in experiments on synthetic and real-world datasets.
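The point-level triplet constraint described above can be sketched as follows. This is a generic illustration of triplet comparisons "(i, j, k): i is more similar to j than to k", not the distributional, set-level method the paper proposes; the function names, the hinge margin, and the toy coordinates are assumptions made for the example.

```python
import math


def triplet_hinge_loss(emb, triplets, margin=1.0):
    """Mean hinge loss over triplets (i, j, k).

    Each triplet states that object i should lie closer to j than to k.
    emb maps an object id to its coordinates (hypothetical layout).
    """
    total = 0.0
    for i, j, k in triplets:
        d_ij = math.dist(emb[i], emb[j])
        d_ik = math.dist(emb[i], emb[k])
        # Zero loss once d_ik exceeds d_ij by at least the margin.
        total += max(0.0, margin + d_ij - d_ik)
    return total / len(triplets)


def fraction_satisfied(emb, triplets):
    """Share of triplets whose ordering the embedding reproduces."""
    ok = sum(
        1
        for i, j, k in triplets
        if math.dist(emb[i], emb[j]) < math.dist(emb[i], emb[k])
    )
    return ok / len(triplets)
```

An embedding method would adjust the coordinates in `emb` to reduce the loss; `fraction_satisfied` is the kind of reconstruction score one could report afterwards.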


Author(s):  
Adrian Brown ◽  
Christian Borgs ◽  
Sean Randall ◽  
Rainer Schnell

ABSTRACT Objectives: As privacy-preserving record linkage (PPRL) emerges as a method for linking sensitive data, efficient blocking techniques that help maintain high levels of linkage quality are required. This research looks at the use of a Q-gram Fingerprinting blocking technique, with Multibit Trees, and applies this method to real-world datasets. Approach: Data comprised ten years of hospital and mortality records from several Australian states, totalling over 25 million records. Each record contained a linkage key, as defined by the jurisdiction, which was used to assess quality (i.e. used as a ‘gold standard’). Different parameter sets were defined for the linkage tests, with a privacy-preserved file created for each parameter set. The files contained the jurisdictional linkage key and a Cryptographic Long-term Key (the CLK is a Bloom filter comprising all fields in the parameter set). Each file was run through an implementation of the Q-gram Fingerprinting blocking algorithm as a deduplication technique, using different similarity thresholds. The quality metrics of precision, recall and F-measure were calculated. Results: Resultant quality varied for each parameter set. Adding suburb and postcode reduced the linkage quality. The best parameter set returned an F-measure of 0.951. In general, precision was high in all settings, but recall fell as more fields were added to the CLK. We will report details for all parameter settings and their corresponding results. Conclusion: The Q-gram Fingerprinting blocking technique shows promise for maintaining high quality linkage in reasonable time. Determining which fields to include in the CLK for the linkage of specific datasets is important to maximise linkage quality, as is selecting optimal similarity thresholds. Developing new technology is important for progressing the implementation of PPRL in real-world settings.
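To make the CLK and the quality metrics concrete, here is a minimal sketch of a Bloom-filter encoding of a record's q-grams, a Dice similarity between two encodings, and the F-measure. It does not implement Q-gram Fingerprinting or Multibit Trees blocking. The filter length, number of hash functions, secret key, and salted SHA-256 hashing scheme are all assumptions for illustration; the abstract does not state the parameters used in the study.

```python
import hashlib


def qgrams(value, q=2):
    """Padded q-grams of a string (q=2 gives bigrams)."""
    padded = "_" * (q - 1) + value.lower() + "_" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def clk(fields, length=1000, num_hashes=10, secret=b"hypothetical-key"):
    """Set bit positions of one Bloom filter holding the q-grams of all fields.

    length, num_hashes and secret are illustrative values, and the salted
    SHA-256 scheme is a stand-in for whatever keyed hashing a real
    deployment would agree on.
    """
    bits = set()
    for field in fields:
        for gram in qgrams(field):
            for h in range(num_hashes):
                digest = hashlib.sha256(
                    secret + bytes([h]) + gram.encode("utf-8")
                ).digest()
                bits.add(int.from_bytes(digest[:8], "big") % length)
    return bits


def dice(a, b):
    """Dice coefficient between two sets of set-bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall from pair counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Two records would be treated as a candidate match when the Dice score of their encodings exceeds a chosen similarity threshold. Each added field contributes more q-grams to the same filter, so a discrepancy in any one field lowers the score of a true match, which is consistent with the reported drop in recall as fields were added.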


Author(s):  
Lu Zhang ◽  
Zhu Sun ◽  
Jie Zhang ◽  
Yu Lei ◽  
Chen Li ◽  
...  

Studies on next point-of-interest (POI) recommendation mainly seek to learn users' transition patterns with certain historical check-ins. However, in reality, users' movements are typically uncertain (i.e., fuzzy and incomplete), and most existing methods then suffer from the transition pattern vanishing issue. To ease this issue, we propose a novel interactive multi-task learning (iMTL) framework to better exploit the interplay between activity and location preference. Specifically, iMTL introduces: (1) a temporal-aware activity encoder equipped with fuzzy characterization over uncertain check-ins to unveil the latent activity transition patterns; (2) a spatial-aware location preference encoder to capture the latent location transition patterns; and (3) a task-specific decoder to make use of the learned latent transition patterns and enhance both activity and location prediction tasks in an interactive manner. Extensive experiments on three real-world datasets show the superiority of iMTL.


2017 ◽  
Vol 93 (4) ◽  
pp. 177-202 ◽  
Author(s):  
Emily E. Griffith

ABSTRACT Auditors are more likely to identify misstatements in complex estimates if they recognize problematic patterns among an estimate's underlying assumptions. Rich problem representations aid pattern recognition, but auditors likely have difficulty developing them given auditors' limited domain-specific expertise in this area. In two experiments, I predict and find that a relational cue in a specialist's work highlighting aggressive assumptions improves auditors' problem representations and subsequent judgments about estimates. However, this improvement only occurs when a situational factor (e.g., risk) increases auditors' epistemic motivation to incorporate the cue into their problem representations. These results suggest that auditors do not always respond to cues in specialists' work. More generally, this study highlights the role of situational factors in increasing auditors' epistemic motivation to develop rich problem representations, which contribute to high-quality audit judgments in this and other domains where pattern recognition is important.


2021 ◽  
Vol 37 (1) ◽  
pp. 635-656
Author(s):  
Farzana Anowar ◽  
Samira Sadaoui

2021 ◽  
Vol 21 (3) ◽  
pp. 1-17
Author(s):  
Wu Chen ◽  
Yong Yu ◽  
Keke Gai ◽  
Jiamou Liu ◽  
Kim-Kwang Raymond Choo

In existing ensemble learning algorithms (e.g., random forest), each base learner’s model needs the entire dataset for sampling and training. However, this may not be practical in many real-world applications, and it incurs additional computational costs. To achieve better efficiency, we propose a decentralized framework: Multi-Agent Ensemble. The framework leverages edge computing to facilitate ensemble learning techniques by focusing on the balancing of access restrictions (small sub-dataset) and accuracy enhancement. Specifically, network edge nodes (learners) are utilized to model classifications and predictions in our framework. Data is then distributed to multiple base learners who exchange data via an interaction mechanism to achieve improved prediction. The proposed approach relies on a training model rather than conventional centralized learning. Findings from the experimental evaluations using 20 real-world datasets suggest that Multi-Agent Ensemble outperforms other ensemble approaches in terms of accuracy even though the base learners require fewer samples (i.e., significant reduction in computation costs).
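The core idea of base learners that each see only a small sub-dataset can be sketched with a simple majority-vote ensemble. This is a generic illustration under strong assumptions: the learners are plain 1-nearest-neighbour classifiers, the data is split round-robin, and the interaction mechanism through which learners exchange data in Multi-Agent Ensemble is omitted because the abstract does not describe it. All function names are hypothetical.

```python
from collections import Counter


def split_into_shards(data, num_learners):
    """Round-robin split so each learner holds a disjoint sub-dataset."""
    return [data[i::num_learners] for i in range(num_learners)]


def nearest_neighbor_label(shard, x):
    """1-NN prediction using only this learner's (features, label) pairs."""
    closest = min(
        shard,
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )
    return closest[1]


def ensemble_predict(shards, x):
    """Majority vote over the learners that hold at least one sample."""
    votes = Counter(nearest_neighbor_label(s, x) for s in shards if s)
    return votes.most_common(1)[0][0]
```

For example, six labeled points split across three learners give each learner two samples, and the ensemble still separates the two clusters:

```python
data = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
        ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
shards = split_into_shards(data, 3)
print(ensemble_predict(shards, (0.5, 0.5)))
print(ensemble_predict(shards, (5.5, 5.5)))
```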

