EMD Based Semantic User Similarity using Past Travel Histories

2022 ◽  
Vol 24 (3) ◽  
pp. 0-0

The low cost and easy availability of handheld mobile devices, together with the ubiquity of location acquisition services such as GPS and GSM networks, have made it easy to log and share the location histories of mobile users. This work aims to measure semantic similarity between users from their past travel histories. Such a similarity measure has applications in tourism-related recommender systems and information retrieval. The paper presents an Earth Mover’s Distance (EMD) based semantic user similarity measure computed from users' GPS logs. The measure is applied to and evaluated on the GPS dataset of 182 users collected from April 2007 to August 2012 by Microsoft's GeoLife project. It is compared with conventional similarity measures used in the literature, such as Jaccard, Dice, and Pearson's correlation. The EMD-based approach improves on the existing approaches by 10.70% in average RMSE and 5.73% in average MAE.
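The abstract does not say how the paper builds its histograms or which ground distance it uses, so the following is only a minimal sketch of the kinds of measures it names. It shows the Jaccard and Dice baselines on sets of visited location categories, and EMD in the special case of two histograms over the same ordered, unit-spaced bins, where EMD reduces to the sum of absolute differences between cumulative distributions. The function names, the category labels, and the ordered-bin assumption are all hypothetical and are not taken from the paper.

```python
from itertools import accumulate


def jaccard(a, b):
    """Jaccard similarity: |A ∩ B| / |A ∪ B| (0.0 when both sets are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a, b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|) (0.0 when both sets are empty)."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def emd_1d(p, q):
    """EMD between two histograms over the same ordered, unit-spaced bins.

    Assumption: bins lie on a line with distance 1 between neighbours.
    Both histograms are normalised to total mass 1 before comparison.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    norm_p = [x / total_p for x in p]
    norm_q = [x / total_q for x in q]
    # In one dimension, the optimal transport cost equals the L1 distance
    # between the two cumulative distributions.
    return sum(abs(cp - cq)
               for cp, cq in zip(accumulate(norm_p), accumulate(norm_q)))


# Hypothetical users described by the location categories they visited.
user_a = {"park", "museum", "cafe"}
user_b = {"museum", "cafe", "mall"}
print(jaccard(user_a, user_b), dice(user_a, user_b))

# Hypothetical visit counts over three ordered bins.
print(emd_1d([1, 0, 0], [0, 0, 1]))
```

Unlike the set-overlap baselines, EMD takes into account how far apart the mismatched bins are. A general ground distance between unordered categories would require solving a transportation problem rather than using this closed form.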

2022 ◽  
Vol 24 (3) ◽  
pp. 1-17
Author(s):  
Sunita Tiwari ◽  
Saroj Kaushik


2019 ◽  
Vol 8 (3) ◽  
pp. 6756-6762

A recommendation algorithm comprises two important steps: 1) rating prediction and 2) recommendation. Rating prediction is a cumulative function of the similarity score between two movies and the rating history of those movies by other users. There are various methods for rating prediction, such as the weighted sum method, regression, and deviation-based methods. All of these rely on finding items similar to those previously viewed or rated by the target user, on the assumption that a user tends to rate similar items similarly. The similarities can be computed using various measures such as Euclidean distance, cosine similarity, adjusted cosine similarity, Pearson correlation, and Jaccard similarity. All of these well-known approaches calculate the similarity score between two movies from rating data alone, and hence cannot accurately model the rating behavior of a user. In this paper, we show that the accuracy of rating prediction can be enhanced by incorporating ontological domain knowledge into the similarity computation. This paper introduces a new ontological semantic similarity measure between two movies. For experimental evaluation, the performance of the proposed approach is compared with two existing approaches, 1) Adjusted Cosine Similarity (ACS) and 2) the Weighted Slope One (WSO) algorithm, in terms of two performance measures: 1) execution time and 2) Mean Absolute Error (MAE). The open-source MovieLens (ml-1m) dataset is used for experimental evaluation. As our results show, the ontological semantic similarity measure improves rating prediction compared with the existing well-known approaches.
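The weighted sum method and MAE mentioned above can be sketched as follows. This is an illustrative version of the standard item-based weighted sum, not the paper's ontological measure. The similarity scores are taken as given inputs, and the function names and toy data are hypothetical.

```python
def predict_weighted_sum(user_ratings, similarities):
    """Predict a user's rating for a target item with the weighted sum method.

    user_ratings: {item: rating} for items the user has already rated.
    similarities: {item: similarity between the target item and that item}.
    Returns None when no rated item has a non-zero similarity to the target.
    """
    rated = [i for i in user_ratings if i in similarities]
    numerator = sum(similarities[i] * user_ratings[i] for i in rated)
    denominator = sum(abs(similarities[i]) for i in rated)
    return numerator / denominator if denominator else None


def mean_absolute_error(predicted, actual):
    """MAE: average absolute difference between predicted and true ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical data: the user rated movies A and B; we predict movie T.
ratings = {"A": 4.0, "B": 2.0}
sims_to_target = {"A": 0.75, "B": 0.25}
prediction = predict_weighted_sum(ratings, sims_to_target)
print(prediction)
print(mean_absolute_error([prediction, 4.0], [4.0, 4.0]))
```

Any similarity measure, rating-based or ontological, can supply `sims_to_target`, so in this sketch the choice of measure is the only thing that changes the prediction.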


Author(s):  
Mourad Oussalah ◽  
Muhidin Mohamed

Abstract: Determining the extent to which two text snippets are semantically equivalent is a well-researched topic in natural language processing, information retrieval, and text summarization. Sentence-to-sentence similarity scoring is used extensively in both generic and query-based document summarization as an indicator of significance or similarity. Nevertheless, most of these applications use the semantic similarity measure only as a tool, without paying attention to the inherent properties of such tools, which ultimately restrict the scope and technical soundness of the underlying applications. This paper aims to help fill this gap. It investigates three popular WordNet hierarchical semantic similarity measures, namely path-length, Wu and Palmer, and Leacock and Chodorow, in terms of both algebraic and intuitive properties, highlighting their inherent limitations and theoretical constraints. We have especially examined properties related to the range and scope of the semantic similarity score, incremental monotonicity evolution, monotonicity with respect to the hyponymy/hypernymy relationship, and a set of interactive properties. The extension from word semantic similarity to sentence similarity has also been investigated using a pairwise canonical extension, and the properties of the resulting sentence-to-sentence similarity are examined. Next, to overcome the inherent limitations of WordNet semantic similarity in handling the various part-of-speech word categories, a WordNet “All word-To-Noun conversion” that makes use of the Categorial Variation Database (CatVar) is put forward and evaluated on a publicly available dataset, with a comparison against some state-of-the-art methods. The findings demonstrate the feasibility of the proposal and open up new opportunities in information retrieval and natural language processing tasks.
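To make the three hierarchical measures concrete, here is a small sketch on a hypothetical toy taxonomy rather than on WordNet itself. The conventions used (root depth counted as 1, path similarity as 1/(1 + edges), Leacock and Chodorow as -log(path length in nodes / (2 × maximum depth))) are one common choice and are stated here as assumptions. The paper's exact definitions may differ.

```python
from math import log

# Hypothetical single-rooted taxonomy: each node maps to its parent.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def ancestors(node):
    """Return the chain from node up to the root, starting with node."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("nodes do not share a root")


def path_edges(a, b):
    """Number of edges on the shortest path between a and b in the tree."""
    lcs = lowest_common_subsumer(a, b)
    return depth(a) + depth(b) - 2 * depth(lcs)


def path_similarity(a, b):
    return 1 / (1 + path_edges(a, b))


def wu_palmer(a, b):
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def leacock_chodorow(a, b):
    max_depth = max(depth(n) for n in PARENT)
    return -log((path_edges(a, b) + 1) / (2 * max_depth))


for pair in [("dog", "cat"), ("dog", "car")]:
    print(pair, path_similarity(*pair), wu_palmer(*pair), leacock_chodorow(*pair))
```

All three scores depend only on path lengths and depths in the hierarchy, which helps explain why range and monotonicity properties of the kind the abstract mentions can be studied algebraically.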


2012 ◽  
Vol 38 (2) ◽  
pp. 229-235 ◽  
Author(s):  
Wen-Qing LI ◽  
Xin SUN ◽  
Chang-You ZHANG ◽  
Ye FENG
