Extended Explicit Semantic Analysis for Calculating Semantic Relatedness of Web Resources

Author(s):  
Philipp Scholl ◽  
Doreen Böhnstedt ◽  
Renato Domínguez García ◽  
Christoph Rensing ◽  
Ralf Steinmetz

Author(s):  
Khaoula Mrhar ◽  
Mounia Abik

Explicit Semantic Analysis (ESA) is an approach to measuring the semantic relatedness between terms or documents based on their similarity to documents of a reference corpus, usually Wikipedia. ESA has received considerable attention in natural language processing (NLP) and information retrieval. However, ESA's interpretation step multiplies a huge Wikipedia index matrix by a term vector to produce a high-dimensional concept vector. Consequently, both the interpretation and the similarity steps are expensive, and much of the running time is spent on unnecessary operations. This paper proposes an enhancement to ESA, called optimize-ESA, that reduces the dimensionality at the interpretation stage by computing semantic similarity within a specific domain. The experimental results show clearly that our method correlates much better with human judgement than the full ESA approach.
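The following is a minimal sketch of the ESA interpretation step and of the kind of domain restriction described above. The toy "Wikipedia" concepts, the TF-IDF weighting, and the `allowed` domain filter are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy reference corpus: each document stands in for one Wikipedia concept.
concepts = {
    "Computer": "computer program software hardware cpu memory",
    "Biology":  "cell organism gene protein evolution species",
    "Music":    "melody rhythm harmony instrument song concert",
}

vectorizer = TfidfVectorizer()
# Rows: concepts, columns: terms -> the (concept x term) index used by ESA.
concept_matrix = vectorizer.fit_transform(concepts.values()).toarray()

def interpret(text, allowed=None):
    """Map a text to its ESA concept vector.

    `allowed` optionally restricts interpretation to a domain-specific
    subset of concepts (the dimension-reduction idea of optimize-ESA).
    """
    term_vec = vectorizer.transform([text]).toarray()[0]
    concept_vec = concept_matrix @ term_vec          # index matrix x term vector
    if allowed is not None:                          # zero out off-domain concepts
        mask = np.array([name in allowed for name in concepts])
        concept_vec = concept_vec * mask
    return concept_vec

def relatedness(a, b, allowed=None):
    """Cosine similarity of two ESA concept vectors."""
    va, vb = interpret(a, allowed), interpret(b, allowed)
    denom = np.linalg.norm(va) * np.linalg.norm(vb)
    return float(va @ vb / denom) if denom else 0.0

print(relatedness("software cpu", "computer memory"))
print(relatedness("software cpu", "computer memory", allowed={"Computer"}))
```

With a real Wikipedia index the concept vectors have hundreds of thousands of dimensions, which is why restricting interpretation to a domain-specific subset of concepts reduces the cost of both steps.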


Author(s):  
Patrick Chan ◽  
Yoshinori Hijikata ◽  
Toshiya Kuramochi ◽  
Shogo Nishida

Computing the semantic relatedness between two words or phrases is an important problem in fields such as information retrieval and natural language processing. Explicit Semantic Analysis (ESA), a state-of-the-art approach to this problem, uses word frequency to estimate relevance; consequently, the relevance of low-frequency words cannot always be estimated well. To improve the relevance estimates of low-frequency words and concepts, the authors apply regression to word frequency, a word's location in an article, and its text style to calculate relevance. The relevance value is then used to compute semantic relatedness. Empirical evaluation shows that, for low-frequency words, the authors' method yields better estimates of semantic relatedness than ESA. Furthermore, when all words of the dataset are considered, the combination of the authors' proposed method and the conventional approach outperforms the conventional approach alone.
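A minimal sketch of the regression idea: predict a word's relevance to an article from its frequency, its position, and text-style cues, and use that score instead of a raw frequency weight. The feature set, the hand-made training labels, and the linear model below are illustrative assumptions, not the authors' exact setup.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Each row: [word frequency, relative position of first occurrence (0 = top),
#            appears in the title (0/1), appears in bold/emphasis (0/1)]
X_train = np.array([
    [12, 0.05, 1, 1],   # frequent word, early, in title and bold
    [ 1, 0.10, 1, 0],   # rare word, but early and in the title
    [ 1, 0.90, 0, 0],   # rare word, late, plain text
    [ 5, 0.50, 0, 1],
])
# Assumed human-annotated relevance of each word to its article.
y_train = np.array([0.95, 0.80, 0.10, 0.55])

model = LinearRegression().fit(X_train, y_train)

def relevance(freq, position, in_title, in_bold):
    """Predicted relevance of a word to an article, clipped to [0, 1]."""
    score = model.predict([[freq, position, in_title, in_bold]])[0]
    return float(np.clip(score, 0.0, 1.0))

# A low-frequency word that appears early in the title still gets a high
# weight, which is the point of the method for rare words.
print(relevance(freq=1, position=0.02, in_title=1, in_bold=0))
```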


2020 ◽  
Author(s):  
Mala Saraswat ◽  
Shampa Chakraverty

Abstract With the advent of e-commerce sites and social media, users express their preferences and tastes freely through user-generated content such as reviews and comments. To promote cross-selling, e-commerce sites such as eBay and Amazon regularly use such inputs from multiple domains and suggest items in which users may be interested. In this paper, we propose a topic coherence-based cross-domain recommender model. The core idea is to use topic modeling to extract topics from user-generated content such as reviews and to combine them with reliable semantic coherence techniques to link different domains, using Wikipedia as a reference corpus. We experiment with different topic coherence measures, such as pointwise mutual information (PMI) and explicit semantic analysis (ESA). The experimental results demonstrate that our approach yields 22.6% higher precision with PMI as the coherence measure, and 54.4% higher precision with ESA, compared with a cross-domain recommender system based on semantic clustering.
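A minimal sketch of PMI-based topic coherence, one of the measures compared above. The tiny hand-made reference corpus and the example topic words are illustrative assumptions; in the paper, Wikipedia serves as the reference corpus.

```python
import math
from itertools import combinations

# Assumed toy reference corpus: each document is a set of its words.
reference_corpus = [
    {"camera", "lens", "photo", "zoom"},
    {"camera", "battery", "photo"},
    {"novel", "author", "plot", "character"},
    {"plot", "character", "story"},
]

def doc_freq(*words):
    """Number of reference documents containing all of the given words."""
    return sum(1 for doc in reference_corpus if all(w in doc for w in words))

def pmi_coherence(topic_words, eps=1e-12):
    """Average pointwise mutual information over all word pairs of a topic."""
    n_docs = len(reference_corpus)
    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1 = doc_freq(w1) / n_docs
        p2 = doc_freq(w2) / n_docs
        p12 = doc_freq(w1, w2) / n_docs
        scores.append(math.log((p12 + eps) / (p1 * p2 + eps)))
    return sum(scores) / len(scores)

# A coherent topic (photography) scores higher than a mixed one, so coherent
# topics can be used to link items across domains.
print(pmi_coherence(["camera", "photo", "lens"]))
print(pmi_coherence(["camera", "plot", "battery"]))
```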

