Clustering Approaches for Top-k Recommender Systems

2019 ◽  
Vol 28 (05) ◽  
pp. 1950019 ◽  
Author(s):  
Nicolás Torres ◽  
Marcelo Mendoza

Clustering-based recommender systems restrict the search for similar users to small user clusters, providing fast recommendations on large-scale datasets. The groups can then be distributed naturally across different data partitions, scaling up the number of users the recommender system can handle. Unfortunately, as the number of users and items included in a cluster solution increases, the precision of a clustering-based recommender system decreases. We present a novel approach that introduces a cluster-based distance function used for neighborhood computation. In our approach, clusters generated from the training data provide the basis for neighborhood selection. Then, to expand the search for relevant users, we use a novel measure that exploits the global cluster structure to infer distances to users outside the cluster. Empirical studies on five widely known benchmark datasets show that our proposal is very competitive in terms of precision, recall, and NDCG. However, the main strength of our method is scalability, reaching speedups of 20× in a sequential computing evaluation framework and up to 100× in a parallel architecture. These results show that an efficient implementation of our cluster-based CF method can handle very large datasets while also providing good precision, avoiding the high computational costs of more sophisticated techniques.
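The idea of a cluster-based distance can be illustrated with a minimal sketch. The abstract does not define the authors' measure, so the rule used here (exact distance inside a cluster, centroid-to-centroid distance as an estimate across clusters) is an assumption chosen for illustration, and the names `cluster_distance`, `nearest_neighbors`, and the toy ratings are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length rating vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cluster_distance(u, v, ratings, assignment, centroids):
    """Assumed rule: exact distance within a cluster, and the distance
    between the two cluster centroids as an estimate across clusters."""
    cu, cv = assignment[u], assignment[v]
    if cu == cv:
        return euclidean(ratings[u], ratings[v])
    return euclidean(centroids[cu], centroids[cv])


def nearest_neighbors(u, k, ratings, assignment, centroids):
    """Return the k users closest to u; ties are broken by user name."""
    others = [v for v in ratings if v != u]
    others.sort(
        key=lambda v: (cluster_distance(u, v, ratings, assignment, centroids), v)
    )
    return others[:k]


# Toy data: four users, four items, two precomputed clusters (hypothetical).
ratings = {
    "ann": [5, 4, 1, 1],
    "bob": [4, 5, 1, 2],
    "cat": [1, 1, 5, 4],
    "dan": [2, 1, 4, 5],
}
assignment = {"ann": 0, "bob": 0, "cat": 1, "dan": 1}
centroids = {
    c: centroid([ratings[u] for u in ratings if assignment[u] == c])
    for c in set(assignment.values())
}

neighbors = nearest_neighbors("ann", 2, ratings, assignment, centroids)
print(neighbors)
```

In this sketch, users outside the query user's cluster all receive the same estimated distance, so only in-cluster pairs need an exact comparison; this is one plausible way such a scheme could save computation, not a description of the paper's implementation.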

Author(s):  
Jun Huang ◽  
Linchuan Xu ◽  
Jing Wang ◽  
Lei Feng ◽  
Kenji Yamanishi

Existing multi-label learning (MLL) approaches mainly assume all the labels are observed and construct classification models with a fixed set of target labels (known labels). However, in some real applications, multiple latent labels may exist outside this set and hide in the data, especially for large-scale data sets. Discovering and exploring the latent labels hidden in the data may not only find interesting knowledge but also help us to build a more robust learning model. In this paper, a novel approach named DLCL (i.e., Discovering Latent Class Labels for MLL) is proposed which can not only discover the latent labels in the training data but also predict new instances with the latent and known labels simultaneously. Extensive experiments show a competitive performance of DLCL against other state-of-the-art MLL approaches.


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-15 ◽  
Author(s):  
Hanwen Liu ◽  
Huaizhen Kou ◽  
Chao Yan ◽  
Lianyong Qi

Nowadays, scholarly recommender systems often recommend academic papers based on users’ personalized retrieval demands. Typically, a recommender system analyzes the keywords typed by a user and then returns his or her preferred papers in an efficient and economical manner. In practice, a single paper often covers only some of the keywords that a user is interested in. Therefore, the recommender system needs to return a set of papers that collectively covers all the queried keywords. However, existing recommender systems use only exact keyword matching for recommendation decisions, neglecting the correlations among different papers. As a consequence, they may output a set of papers from multiple disciplines that differ from the user’s real research field. In view of this shortcoming, we propose a keyword-driven and popularity-aware paper recommendation approach based on an undirected paper citation graph, named PRkeyword+pop. Finally, we conduct large-scale experiments on the real-life Hep-Th dataset to further demonstrate the usefulness and feasibility of PRkeyword+pop. Experimental results demonstrate the advantages of PRkeyword+pop in searching for a set of satisfactory papers compared with other competitive approaches.
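The requirement that a returned set of papers collectively covers all queried keywords can be sketched as a greedy set cover with a popularity tie-break. This is not PRkeyword+pop: the abstract describes a method built on an undirected citation graph, which this sketch does not model. The function `greedy_cover`, the citation counts used as a popularity signal, and the toy papers are all hypothetical.

```python
def greedy_cover(query, papers):
    """Greedily pick papers until every queried keyword is covered.

    Each step takes the paper covering the most still-uncovered keywords,
    preferring the more cited paper on ties. Returns the chosen paper ids
    and any keywords that no paper could cover.
    """
    remaining = set(query)
    chosen = []
    while remaining:
        best = max(
            (p for p in papers if p not in chosen),
            key=lambda p: (
                len(remaining & papers[p]["keywords"]),
                papers[p]["citations"],
            ),
            default=None,
        )
        # Stop when no unused paper covers any remaining keyword.
        if best is None or not (remaining & papers[best]["keywords"]):
            break
        chosen.append(best)
        remaining -= papers[best]["keywords"]
    return chosen, remaining


# Toy corpus (hypothetical ids, keywords, and citation counts).
papers = {
    "p1": {"keywords": {"graph", "search"}, "citations": 10},
    "p2": {"keywords": {"graph"}, "citations": 50},
    "p3": {"keywords": {"privacy"}, "citations": 5},
    "p4": {"keywords": {"search", "privacy"}, "citations": 7},
}

chosen, uncovered = greedy_cover({"graph", "search", "privacy"}, papers)
print(chosen, uncovered)
```

A cover chosen this way ignores how the papers relate to each other, which is exactly the shortcoming the abstract attributes to keyword-only matching; a graph-based method would additionally constrain the chosen papers to be connected through citations.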


2020 ◽  
Vol 34 (03) ◽  
pp. 2677-2684
Author(s):  
Marjaneh Safaei ◽  
Pooyan Balouchian ◽  
Hassan Foroosh

Action recognition in still images poses a great challenge due to (i) less available training data and (ii) the absence of temporal information. To address the first challenge, we introduce a dataset for STill image Action Recognition (STAR), containing over 1M images across 50 different human body-motion action categories. UCF-STAR is the largest dataset in the literature for action recognition in still images. The key characteristics of UCF-STAR include (1) focusing on human body-motion rather than relatively static human-object interaction categories, (2) collecting images from the wild to benefit from a varied set of action representations, (3) appending multiple human-annotated labels per image rather than just the action label, and (4) including a rich, structured, and multi-modal set of metadata for each image. This departs from existing datasets, which typically provide a single annotation in a smaller number of images and categories, with no metadata. UCF-STAR exposes the intrinsic difficulty of action recognition through its realistic scene and action complexity. To benchmark and demonstrate the benefits of UCF-STAR as a large-scale dataset, and to show the role of “latent” motion information in recognizing human actions in still images, we present a novel approach relying on predicting temporal information, yielding higher accuracy on 5 widely-used datasets.


2021 ◽  
Vol 11 (4) ◽  
pp. 1733
Author(s):  
Yuseok Ban ◽  
Kyungjae Lee

Many studies have been conducted on recommender systems in both the academic and industrial fields, as they are currently broadly used in various digital platforms to make personalized suggestions. Despite the improvement in the accuracy of recommenders, the diversity of interest areas recommended to a user tends to be reduced, and the sparsity of explicit feedback from users has been an important issue for making progress in recommender systems. In this paper, we introduce a novel approach, namely re-enrichment learning, which effectively leverages the implicit logged feedback from users to enhance user retention in a platform by enriching their interest areas. The approach consists of (i) graph-based domain transfer and (ii) metadata saliency, which (i) find an adaptive and collaborative domain representing the relations among many users’ metadata and (ii) extract attentional features from a user’s implicit logged feedback, respectively. The experimental results show that our proposed approach has a better capacity to enrich the diversity of interests of a user by means of implicit feedback and to help recommender systems achieve more balanced personalization. Our approach, finally, helps recommenders improve user retention, i.e., encouraging users to click more items or dwell longer on the platform.


Author(s):  
Jipeng Zhang ◽  
Roy Ka-Wei Lee ◽  
Ee-Peng Lim ◽  
Wei Qin ◽  
Lei Wang ◽  
...  

Solving math word problems (MWPs) is challenging due to the limitation in training data, where only one “standard” solution is available. MWP models often simply fit this solution rather than truly understand or solve the problem. The generalization of models (to diverse word scenarios) is thus limited. To address this problem, this paper proposes a novel approach, TSN-MD, which leverages a teacher network to integrate the knowledge of equivalent solution expressions and then to regularize the learning behavior of the student network. In addition, we introduce a multiple-decoder student network that generates multiple candidate solution expressions, from which the final answer is chosen by voting. In experiments, we conduct extensive comparisons and ablation studies on two large-scale MWP benchmarks, and show that TSN-MD can surpass state-of-the-art works by a large margin. More intriguingly, the visualization results demonstrate that TSN-MD not only produces correct final answers but also generates diverse equivalent expressions of the solution.
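Voting over candidate solution expressions can be sketched as follows. The abstract does not specify the voting rule, so a simple majority vote over the evaluated numeric answers is assumed here; the candidate strings stand in for the outputs of several decoders, and `evaluate` and `vote` are hypothetical helper names.

```python
import ast
import operator
from collections import Counter

# Arithmetic operators the toy evaluator accepts.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr):
    """Evaluate an arithmetic expression using only numbers and + - * /."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expr!r}")

    return walk(ast.parse(expr, mode="eval"))


def vote(candidates):
    """Return the answer produced by the largest number of candidates."""
    answers = Counter(round(evaluate(e), 6) for e in candidates)
    return answers.most_common(1)[0][0]


# Three hypothetical decoder outputs: two equivalent expressions, one wrong.
candidates = ["3 * (4 + 2)", "3 * 4 + 3 * 2", "3 * 4 + 2"]
answer = vote(candidates)
print(answer)
```

Two of the three candidates are different but equivalent expressions, so they agree on the answer and outvote the incorrect one; this is the sense in which diverse equivalent expressions can make the final answer more robust.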


2019 ◽  
Vol 44 (2) ◽  
pp. 399-416
Author(s):  
Yoke Yie Chen ◽  
Nirmalie Wiratunga ◽  
Robert Lothian

Purpose Recommender system approaches such as collaborative and content-based filtering rely on user ratings and product descriptions to recommend products. More recently, recommender system research has focussed on exploiting knowledge from user-generated content such as product reviews to enhance recommendation performance. The purpose of this paper is to show that the performance of a recommender system can be enhanced by integrating explicit knowledge extracted from product reviews with implicit knowledge extracted from analysis of consumer’s purchase behaviour.
Design/methodology/approach The authors introduce a sentiment and preference-guided strategy for product recommendation by integrating not only explicit, user-generated and sentiment-rich content but also implicit knowledge gleaned from users’ product purchase preferences. Integration of both of these knowledge sources helps to model sentiment over a set of product aspects. The authors show how established dimensionality reduction and feature weighting approaches from text classification can be adopted to weight and select an optimal subset of aspects for recommendation tasks. The authors compare the proposed approach against several baseline methods as well as the state-of-the-art better method, which recommends products that are superior to a query product.
Findings Evaluation results from seven different product categories show that aspect weighting and selection significantly improves state-of-the-art recommendation approaches.
Research limitations/implications The proposed approach recommends products by analysing user sentiment on product aspects. Therefore, the proposed approach can be used to develop recommender systems that can explain to users why a product is recommended. This is achieved by presenting an analysis of sentiment distribution over individual aspects that describe a given product.
Originality/value This paper describes a novel approach to integrate consumer purchase behaviour analysis and aspect-level sentiment analysis to enhance recommendation. In particular, the authors introduce the idea of aspect weighting and selection to help users identify better products. Furthermore, the authors demonstrate the practical benefits of this approach on a variety of product categories and compare the approach with the current state-of-the-art approaches.


2018 ◽  
Vol 7 (4.33) ◽  
pp. 5
Author(s):  
S. Masrom ◽  
N. Khairuddin ◽  
A. Abdul Rahman ◽  
A. Azizan ◽  
A. S.A. Rahman

To date, a variety of prediction approaches have been used in recommender systems. Among the most widely known are Content Based Filtering (CBF) and Collaborative Filtering (CF). According to the literature, CF based on user ratings has been widely used, but the approach faces two common problems, namely cold start and sparsity. As an alternative, Trust Aware Recommender Systems (TARS) have been introduced for CF based on user ratings. Research on improving TARS is progressing rapidly, but shortcomings in algorithm evaluation have started to appear. Many researchers who introduce a new TARS approach evaluate its performance on different users’ views. As a result, the performance of different TARS from different publications is not comparable and is difficult to analyze. Therefore, this paper aims to provide common groups of users’ views based on trusted users in TARS. The paper then presents a comparison study of different TARS techniques on the identified common groups in terms of accuracy error, rating coverage, and user coverage. The results therefore provide a relative comparison between different TARS.
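The three evaluation measures can be sketched under commonly used definitions, which are assumed here because the abstract does not spell them out: accuracy error as mean absolute error (MAE) over the test ratings the system could predict, rating coverage as the fraction of test ratings that received a prediction, and user coverage as the fraction of test users with at least one prediction. The function name `evaluate_predictions` and the toy data are hypothetical.

```python
def evaluate_predictions(test_ratings, predictions):
    """Compute MAE, rating coverage, and user coverage.

    test_ratings maps (user, item) to the true rating. predictions maps
    (user, item) to a predicted rating, or to None (or is missing) when
    the recommender could not produce a prediction.
    """
    predicted = [
        (user, item, rating)
        for (user, item), rating in test_ratings.items()
        if predictions.get((user, item)) is not None
    ]
    # MAE is defined only over the ratings that were actually predicted.
    mae = (
        sum(abs(predictions[(u, i)] - r) for u, i, r in predicted) / len(predicted)
        if predicted
        else None
    )
    rating_coverage = len(predicted) / len(test_ratings)
    all_users = {user for user, _ in test_ratings}
    covered_users = {user for user, _, _ in predicted}
    user_coverage = len(covered_users) / len(all_users)
    return mae, rating_coverage, user_coverage


# Toy test set: u2 and u3 receive no usable prediction (hypothetical data).
test_ratings = {
    ("u1", "a"): 4,
    ("u1", "b"): 2,
    ("u2", "a"): 5,
    ("u3", "c"): 3,
}
predictions = {("u1", "a"): 3.5, ("u1", "b"): 3.0, ("u2", "a"): None}

mae, rating_coverage, user_coverage = evaluate_predictions(test_ratings, predictions)
print(mae, rating_coverage, user_coverage)
```

Reporting coverage alongside error matters because a recommender can lower its error simply by declining to predict hard cases; comparing systems on the same user groups, as the abstract proposes, keeps these numbers comparable.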


2020 ◽  
Vol 34 (05) ◽  
pp. 9177-9184
Author(s):  
Jiancheng Wang ◽  
Jingjing Wang ◽  
Changlong Sun ◽  
Shoushan Li ◽  
Xiaozhong Liu ◽  
...  

Sentiment analysis in dialogues plays a critical role in dialogue data analysis. However, previous studies on sentiment classification in dialogues largely ignore topic information, which is important for capturing overall information in some types of dialogues. In this study, we focus on the sentiment classification task in an important type of dialogue, namely customer service dialogue, and propose a novel approach which captures overall information to enhance the classification performance. Specifically, we propose a topic-aware multi-task learning (TML) approach which learns topic-enriched utterance representations in customer service dialogue by capturing various kinds of topic information. In the experiments, we introduce a large-scale, high-quality annotated corpus for the sentiment classification task in customer service dialogue, and empirical studies on this corpus show that our approach significantly outperforms several strong baselines.


Author(s):  
Fakhri G Abbas ◽  
Nadia Najjar ◽  
David Wilson

Conversational recommender systems help to guide users in exploring the search space in order to discover items of interest. During the exploration process, the user provides feedback on recommended items to refine subsequent recommendations. Critiquing as a way of feedback has proven effective for conversational interactions. In addition, diversifying the recommended items during exploration can help to increase user understanding of the search space, which critiquing alone will not achieve. Both aspects are important elements for recommender applications in the food domain. Diversity in diet has been shown to predict nutritional health, and conversational exploration can help to introduce new food items. In this paper, we introduce a novel approach that brings together critique and diversity to support conversational recommendation in the recipe domain. Initial evaluation in comparison to a baseline similarity-based recommender shows that the proposed approach increases diversity during the exploration process.


Author(s):  
Yijian Chuan ◽  
Chaoyi Zhao ◽  
Zhenrui He ◽  
Lan Wu

We develop a novel approach to explain why AdaBoost is a successful classifier. By introducing a measure of the influence of the noise points (ION) in the training data for the binary classification problem, we prove that there is a strong connection between the ION and the test error. We further identify that the ION of AdaBoost decreases as the iteration number or the complexity of the base learners increases. We confirm that, in some complicated situations, it is impossible to obtain a consistent classifier without deep trees as the base learners of AdaBoost. We apply AdaBoost to portfolio management in empirical studies on the Chinese market, which corroborate our theoretical propositions.

