An Improved Method for Semantic Similarity Calculation Based on Stop-Words

Author(s):  
Haodi Li ◽  
Qingcai Chen ◽  
Xiaolong Wang
2020 ◽  
Vol 309 ◽  
pp. 03004
Author(s):  
Ying Wang ◽  
Xiwei Feng ◽  
Yue Zhang ◽  
Haiming Chen ◽  
Lijie Xing

This paper explores an improved method for calculating the semantic similarity of words by combining HowNet and CiLin. First, we design an algorithm that improves the sememe similarity calculation of HowNet: it comprehensively considers the influence of each part of the sememe on the overall meaning and changes how each part is calculated, which improves HowNet-based word similarity. At the same time, we adopt different strategies for the different results obtained in the CiLin similarity calculation. Experiments on the RG data set show that the method reaches a modified Pearson coefficient of 0.87.
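
Results of this kind are usually reported as the correlation between a method's similarity scores and human ratings for the same word pairs. The sketch below computes the standard Pearson coefficient; it is not the "modified" coefficient mentioned in the abstract, whose definition is not given here. The function name `pearson` and the sample numbers are illustrative assumptions, not data from the RG set.

```python
import math


def pearson(xs, ys):
    """Standard Pearson correlation between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five word pairs (made up for illustration only).
human_ratings = [3.9, 3.1, 2.4, 1.0, 0.2]       # e.g. a 0-4 rating scale
method_scores = [0.95, 0.70, 0.65, 0.30, 0.10]  # e.g. a 0-1 similarity

print(f"Pearson r = {pearson(human_ratings, method_scores):.3f}")
```

A coefficient close to 1 means the method ranks and spaces the word pairs much as the human raters do.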


2012 ◽  
Vol 263-266 ◽  
pp. 1588-1592
Author(s):  
Jiu Qing Li ◽  
Chi Zhang ◽  
Peng Zhou Zhang

To address inefficient resource tagging and low-precision retrieval in specialized fields, a method for analyzing tag semantic relevancy based on a controlled database is proposed. The characteristics of specialized fields and the method for building the controlled database are discussed. A domain-ontology correlation calculation is used to obtain semantic correlation. A tag semantic similarity calculation is developed to obtain semantic similarity, and normalization is used to increase its accuracy. With semantic correlation and similarity as parameters, the semantic relevancy within a specialized field can be obtained. The method has been applied successfully in real projects in a specialized field, where it improved the efficiency of resource tagging and retrieval.


2010 ◽  
Vol 5 (7) ◽  
pp. 17-23 ◽  
Author(s):  
Cheng Xianyi ◽  
Sun Ping ◽  
Zhu Qian ◽  
Cai Yuehong

Informatics ◽  
2019 ◽  
Vol 6 (2) ◽  
pp. 19 ◽  
Author(s):  
Rajat Pandit ◽  
Saptarshi Sengupta ◽  
Sudip Kumar Naskar ◽  
Niladri Sekhar Dash ◽  
Mohini Mohan Sardar

Semantic similarity is a long-standing problem in natural language processing (NLP). It is a topic of great interest because understanding it can provide insight into how human beings comprehend meaning and make associations between words. However, when the problem is viewed from the standpoint of machine understanding, particularly for under-resourced languages, it poses a different problem altogether. In this paper, semantic similarity is explored in Bangla, a less-resourced language. To improve the situation in such languages, the most rudimentary method (path-based) and the latest state-of-the-art method (Word2Vec) for semantic similarity calculation were augmented using cross-lingual resources in English. Two semantic similarity approaches were explored in Bangla, namely the path-based and distributional models, and their cross-lingual counterparts were synthesized in light of the English WordNet and corpora. The proposed methods were evaluated on a dataset comprising 162 Bangla word pairs, which were annotated by five expert raters. The correlation scores obtained between the four metrics and human evaluation scores demonstrate the marked enhancement that the cross-lingual approach brings to semantic similarity calculation for Bangla.
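
The two families of measures named in this abstract can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: the path-based score uses the common definition 1 / (1 + shortest path length) over an is-a hierarchy, and the distributional score is the cosine between two word vectors. The toy taxonomy `TOY_HYPERNYMS`, the function names, and the example vectors are all hypothetical; they are not taken from WordNet or from any trained Word2Vec model.

```python
import math
from collections import deque

# Hypothetical toy taxonomy (child -> parent "is-a" links), for illustration only.
TOY_HYPERNYMS = {
    "dog": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "bird": "animal",
}


def build_graph(hypernyms):
    """Build an undirected adjacency map from child -> parent links."""
    graph = {}
    for child, parent in hypernyms.items():
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns the edge count or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def path_similarity(graph, a, b):
    """Path-based similarity: 1 / (1 + shortest path length)."""
    dist = shortest_path_length(graph, a, b)
    return None if dist is None else 1.0 / (1.0 + dist)


def cosine_similarity(u, v):
    """Distributional similarity: cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


graph = build_graph(TOY_HYPERNYMS)
print(path_similarity(graph, "dog", "cat"))           # path: dog-canine-mammal-feline-cat
print(cosine_similarity([0.2, 0.9, 0.1], [0.3, 0.8, 0.0]))  # made-up vectors
```

A cross-lingual variant in the spirit of the abstract would translate each Bangla word into English first and run the same two measures on the English resources; that translation step is not shown here.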


2014 ◽  
Vol 1049-1050 ◽  
pp. 1514-1517
Author(s):  
Sai Dong Lv ◽  
Ji Li Xie

Marking systems for subjective questions are currently attracting attention. The common grading principle for subjective questions is to compare how similar the examinee's answer is to the reference answer. Based on an improved semantic similarity algorithm, sentence similarity is calculated to obtain the degree of similarity between the answer and the reference answer, and a score is given accordingly. An experiment based on semantic similarity was designed. The results show that the proposed multi-level fusion similarity calculation method improves on the original method and, by integrating the advantages of several methods, produces results that meet the requirements of the scoring system.


2014 ◽  
Vol 668-669 ◽  
pp. 1237-1242
Author(s):  
Wen Ming Guo ◽  
Liang Sun

In light of the data differences between network television and the Internet, this paper addresses the problem of rating in IPTV by introducing time context information and computing latent scores on the basis of the traditional item-based collaborative filtering recommendation algorithm. User-item and item-time models are constructed and the item similarity calculation is optimized, which eases the difficulty of finding similar items caused by data scarcity. The experimental results show that, compared with traditional item-based collaborative filtering, the improved method clearly increases recommendation precision and reduces the impact of data scarcity to some extent.
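
For orientation, the baseline this abstract compares against, traditional item-based collaborative filtering, can be sketched as follows. This is a minimal illustration under stated assumptions: items are compared by cosine similarity over the users who scored them, and a missing score is predicted as a similarity-weighted average of the user's other scores. The time-context and latent-score parts of the paper are not reproduced, and the function names and sample data are hypothetical.

```python
import math


def item_vectors(ratings):
    """Turn {user: {item: score}} into {item: {user: score}}."""
    items = {}
    for user, scored in ratings.items():
        for item, score in scored.items():
            items.setdefault(item, {})[user] = score
    return items


def item_cosine(a, b):
    """Cosine similarity between two items' {user: score} vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = math.sqrt(sum(s * s for s in a.values()))
    norm_b = math.sqrt(sum(s * s for s in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(ratings, user, target):
    """Predict a user's score for `target` from the items they already scored."""
    items = item_vectors(ratings)
    if target not in items:
        return None
    weighted_sum = 0.0
    weight_total = 0.0
    for item, score in ratings.get(user, {}).items():
        if item == target:
            continue
        sim = item_cosine(items[target], items[item])
        weighted_sum += sim * score
        weight_total += abs(sim)
    return weighted_sum / weight_total if weight_total else None


# Hypothetical viewing scores (made up for illustration only).
ratings = {
    "u1": {"news": 5, "drama": 3},
    "u2": {"news": 4, "drama": 2, "sports": 4},
    "u3": {"drama": 1, "sports": 5},
}
print(predict(ratings, "u1", "sports"))
```

With sparse data, many item pairs share no users and their similarity falls to zero, which is the data-scarcity difficulty the abstract refers to.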

