Research on the Calculation Method of Semantic Similarity Based on Concept Hierarchy

Author(s):  
Kai Wang
2012 ◽  
Vol 263-266 ◽  
pp. 1588-1592
Author(s):  
Jiu Qing Li ◽  
Chi Zhang ◽  
Peng Zhou Zhang

To address inefficient resource tagging and low-precision retrieval in specialized fields, a method for analyzing tag semantic relevancy based on a controlled database is proposed. The characteristics of specialized fields and the method for building the controlled database are discussed. A domain-ontology correlation calculation method is used to obtain semantic correlation. A tag semantic similarity calculation method is developed to obtain semantic similarity, and normalization is used to improve the accuracy of the similarity values. With semantic correlation and similarity as parameters, the semantic relevancy within a specialized field can be obtained. The method has been applied successfully in real specialized-field projects, where it improved the efficiency of resource tagging and retrieval.


2014 ◽  
Vol 1049-1050 ◽  
pp. 1514-1517
Author(s):  
Sai Dong Lv ◽  
Ji Li Xie

Automated marking of subjective questions currently attracts considerable attention. The usual grading principle for subjective questions is to compare how similar the examinee's answer is to the reference answer. Using an improved semantic similarity algorithm, sentence similarity is calculated to obtain the degree of similarity between the exam answer and the reference answer, from which a score is assigned. An experiment based on semantic similarity is designed; the results show that the proposed multi-level fusion similarity calculation method improves on the original method and, by integrating the advantages of various methods, produces results that meet the requirements of the scoring system.
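The abstract does not specify which similarity levels are fused or how. As a rough illustration of the general idea only, the sketch below fuses two simple measures (word-set overlap and token-sequence similarity) with a fixed weight and scales the result to a score. The function names, the two chosen measures, and the weight `w_overlap` are all assumptions made for this example, not the authors' method.

```python
from difflib import SequenceMatcher


def jaccard(a_tokens, b_tokens):
    """Word-set overlap: |A ∩ B| / |A ∪ B| (1.0 when both are empty)."""
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sequence_similarity(a_tokens, b_tokens):
    """Order-sensitive similarity of the two token sequences, in [0, 1]."""
    return SequenceMatcher(None, a_tokens, b_tokens).ratio()


def score_answer(answer, reference, max_score, w_overlap=0.5):
    """Fuse the two similarity levels and scale to the question's marks.

    w_overlap is a hypothetical fusion weight chosen for illustration.
    """
    a_tokens = answer.lower().split()
    r_tokens = reference.lower().split()
    fused = (w_overlap * jaccard(a_tokens, r_tokens)
             + (1.0 - w_overlap) * sequence_similarity(a_tokens, r_tokens))
    return round(fused * max_score, 1)
```

An answer identical to the reference receives full marks, an empty answer receives zero, and partially matching answers fall in between. A real system would replace the surface measures with semantic ones (for example, similarity derived from a lexical resource).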


2013 ◽  
Vol 433-435 ◽  
pp. 1662-1665
Author(s):  
Huan Hai Yang ◽  
Ming Yu Sun

Considering the weaknesses of traditional retrieval methods based on keyword matching, the paper introduces semantics into information retrieval and proposes an ontology-based semantic retrieval model. It presents a method for constructing a domain ontology, implements semantic reasoning using Jena, and improves a semantic similarity calculation method.


2020 ◽  
Vol 309 ◽  
pp. 03004
Author(s):  
Ying Wang ◽  
Xiwei Feng ◽  
Yue Zhang ◽  
Haiming Chen ◽  
Lijie Xing

This paper explores an improved method for calculating the semantic similarity of words that combines HowNet and CiLin. First, we design an algorithm that improves HowNet's sememe similarity calculation: it comprehensively considers the influence of each part of the sememe on the overall meaning and improves HowNet-based word similarity by changing how each part is calculated. At the same time, we adopt different strategies for the different results obtained in the CiLin similarity calculation. Experiments on the RG data set show that the Pearson coefficient of the modified method reaches 0.87.
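Evaluation on a word-pair benchmark such as RG works by correlating the similarities a method computes with human similarity ratings for the same pairs. A minimal sketch of the Pearson coefficient used for that comparison follows; the function name `pearson` is chosen for this example, and no benchmark data is included.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Calling `pearson(computed_similarities, human_ratings)` gives a value in [-1, 1]; values close to 1 mean the method ranks and spaces word pairs much as human judges do.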


2011 ◽  
pp. 2232-2243
Author(s):  
Xiaodan Zhang ◽  
Liping Jing ◽  
Xiaohua Hu ◽  
Michael Ng ◽  
Jiali Xia ◽  
...  

Recent research shows that ontology, used as background knowledge, can improve document clustering quality through its concept hierarchy. Previous studies take term semantic similarity as an important measure for incorporating domain knowledge into the clustering process, for example in clustering initialization and term re-weighting. However, few studies have focused on how different types of term similarity measures affect clustering performance in a given domain. In this article, we conduct a comparative study of how different term semantic similarity measures, including path-based, information-content-based and feature-based measures, affect document clustering. Term re-weighting of the document vector is an important method for integrating a domain ontology into the clustering process. In detail, the weight of a term is augmented by the weights of its co-occurring concepts. Spherical k-means is used to evaluate document vector re-weighting on two real-world datasets: Disease10 and OHSUMED23. Experimental results on nine different semantic measures show that: (1) no single type of similarity measure significantly outperforms the others; (2) several similarity measures perform more stably than the others; (3) term re-weighting has positive effects on medical document clustering, but the effect might not be significant when documents contain few terms.
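The re-weighting step described above can be sketched on a toy hierarchy. Everything specific here is an assumption for illustration: the `PARENT` hierarchy is invented, the path-based measure is the simple 1 / (1 + path length) form, and the update rule (add each co-occurring concept's weight scaled by its similarity) is one plausible reading of "augmented by the weights of its co-occurring concepts", not the article's exact formula.

```python
from math import sqrt

# Hypothetical toy concept hierarchy: child -> parent (None marks the root).
PARENT = {
    "disease": None,
    "infection": "disease",
    "cancer": "disease",
    "influenza": "infection",
    "pneumonia": "infection",
    "leukemia": "cancer",
}


def ancestors(concept):
    """Return the chain from the concept up to the root, concept first."""
    path = [concept]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def path_similarity(a, b):
    """Path-based similarity: 1 / (1 + number of edges between a and b)."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    steps_in_b = {c: i for i, c in enumerate(chain_b)}
    for steps_a, c in enumerate(chain_a):
        if c in steps_in_b:  # lowest common ancestor found
            return 1.0 / (1 + steps_a + steps_in_b[c])
    raise ValueError("concepts share no common ancestor")


def reweight(doc):
    """Augment each term weight by its co-occurring concepts' weights,
    scaled by similarity, then normalize to unit length (the form
    spherical k-means, which compares vectors by cosine, expects)."""
    boosted = {}
    for term, weight in doc.items():
        extra = sum(path_similarity(term, other) * w
                    for other, w in doc.items() if other != term)
        boosted[term] = weight + extra
    norm = sqrt(sum(v * v for v in boosted.values()))
    return {t: v / norm for t, v in boosted.items()} if norm else boosted
```

For a document vector such as `{"influenza": 2.0, "leukemia": 1.0}`, each weight gains a share of the other's in proportion to their closeness in the hierarchy, so related concepts reinforce each other before clustering.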


2014 ◽  
Vol 20 (1) ◽  
pp. 33-46
Author(s):  
Mye Sohn ◽  
Jun Hyeok Yim ◽  
Seongil Lee ◽  
Hyun Jung Lee
