Research on the Calculation Method of Semantic Similarity Based on Concept Hierarchy

An Analysis Method of Semantic Relevancy on Controlled Tag

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.263-266.1588 ◽

2012 ◽

Vol 263-266 ◽

pp. 1588-1592

Author(s):

Jiu Qing Li ◽

Chi Zhang ◽

Peng Zhou Zhang

Keyword(s):

Semantic Similarity ◽

Calculation Method ◽

Domain Ontology ◽

Analysis Method ◽

Semantic Correlation ◽

Retrieval Efficiency ◽

Similarity Calculation

To address inefficient resource tagging and low-precision retrieval in specialized fields, an analysis method for tag semantic relevancy based on a controlled database was proposed. The characteristics of specialized fields and the method for building the controlled database were discussed. A domain ontology correlation calculation method was used to obtain semantic correlation. A tag semantic similarity calculation method was developed to obtain semantic similarity, and normalization was used to increase the accuracy of the similarity values. With semantic correlation and similarity as parameters, the semantic relevancy within a specialized field can be obtained. The method was applied successfully in specialized fields in actual projects, where it improved resource-tagging and retrieval efficiency.


Research on the Subjective Topic Grading Algorithm Based on the Computer Course

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.1049-1050.1514 ◽

2014 ◽

Vol 1049-1050 ◽

pp. 1514-1517

Author(s):

Sai Dong Lv ◽

Ji Li Xie

Keyword(s):

Semantic Similarity ◽

Calculation Method ◽

Original Method ◽

Sentence Similarity ◽

Similarity Calculation ◽

Similarity Algorithm ◽

Computer Course ◽

Calculation Results ◽

Multi Level ◽

Level Fusion

Subjective question marking systems are currently attracting attention. The usual principle for grading a subjective question is to compare how similar the examinee's answer is to the reference answer. Based on an improved semantic similarity algorithm, sentence similarity is calculated to obtain the degree of similarity between the examinee's answer and the reference answer, and a score is given accordingly. An experiment based on semantic similarity was designed, and the results show that the proposed multi-level fusion similarity calculation method improves on the original method: by integrating the advantages of various methods, it makes the calculation results meet the requirements of the scoring system.


Study on Application of Domain Ontology in Semantic Information Retrieval

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.433-435.1662 ◽

2013 ◽

Vol 433-435 ◽

pp. 1662-1665

Author(s):

Huan Hai Yang ◽

Ming Yu Sun

Keyword(s):

Information Retrieval ◽

Semantic Similarity ◽

Calculation Method ◽

Semantic Information ◽

Domain Ontology ◽

Semantic Retrieval ◽

Retrieval Model ◽

Retrieval Method ◽

Similarity Calculation ◽

Semantic Information Retrieval

Considering the weaknesses of traditional retrieval methods based on keyword matching, the paper introduced semantics into information retrieval and proposed an ontology-based semantic retrieval model. The paper offered a construction method for domain ontology, implemented semantic reasoning using Jena, and improved a semantic similarity calculation method.


An improved semantic similarity algorithm based on HowNet and CiLin

MATEC Web of Conferences ◽

10.1051/matecconf/202030903004 ◽

2020 ◽

Vol 309 ◽

pp. 03004

Author(s):

Ying Wang ◽

Xiwei Feng ◽

Yue Zhang ◽

Haiming Chen ◽

Lijie Xing

Keyword(s):

Semantic Similarity ◽

Calculation Method ◽

Improved Method ◽

Data Set ◽

Specific Calculation ◽

Word Similarity ◽

Pearson Coefficient ◽

Similarity Calculation ◽

Similarity Algorithm

This paper explores an improved method for calculating the semantic similarity of words by combining HowNet and CiLin. First, we design the algorithm around an improved calculation of HowNet’s sememe similarity: it comprehensively considers the influence of each part of the sememe description on the overall meaning, and improves the HowNet-based word similarity by changing the specific calculation method for each part. At the same time, we adopt different strategies for the different results obtained in the CiLin similarity calculation. Experiments on the RG data set show that the Pearson coefficient of the improved method reaches 0.87.
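The evaluation figure quoted above is a Pearson coefficient between the similarity scores a method produces and reference human ratings. A minimal sketch of that statistic follows; the function name `pearson` and the sample values are illustrative assumptions, not taken from the paper or from the RG data set.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, all left unnormalized (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for four word pairs (made-up numbers for illustration).
human_ratings = [3.9, 3.1, 1.2, 0.4]
model_scores = [0.95, 0.70, 0.35, 0.10]
r = pearson(human_ratings, model_scores)
```

A coefficient close to 1 means the method ranks and spaces the word pairs much as the human raters do.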


Medical Document Clustering Using Ontology-Based Term Similarity Measures

Strategic Advancements in Utilizing Data Mining and Warehousing Technologies ◽

10.4018/978-1-60566-717-1.ch007 ◽

2011 ◽

pp. 121-132

Author(s):

Zhang Xiaodan ◽

Jing Liping ◽

Hu Xiaohua ◽

Ng Michael ◽

Xia Jiali ◽

...

Keyword(s):

Semantic Similarity ◽

Domain Knowledge ◽

Document Clustering ◽

Similarity Measures ◽

Concept Hierarchy ◽

Term Similarity ◽

Feature Based ◽

Document Vector ◽

Real World Datasets ◽

Medical Document

Recent research shows that ontology as background knowledge can improve document clustering quality with its concept hierarchy knowledge. Previous studies take term semantic similarity as an important measure for incorporating domain knowledge into the clustering process, for example in clustering initialization and term re-weighting. However, few studies have focused on how different types of term similarity measures affect clustering performance for a given domain. In this article, we conduct a comparative study of how different term semantic similarity measures, including path-based, information-content-based and feature-based measures, affect document clustering. Term re-weighting of the document vector is an important method for integrating domain ontology into the clustering process. In detail, the weight of a term is augmented by the weights of its co-occurring concepts. Spherical k-means is used to evaluate document vector re-weighting on two real-world datasets: Disease10 and OHSUMED23. Experimental results on nine different semantic measures show that: (1) no single type of similarity measure significantly outperforms the others; (2) several similarity measures have more stable performance than the others; (3) term re-weighting has positive effects on medical document clustering, but the effect might not be significant when documents are short of terms.
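The re-weighting step described above can be illustrated with a short sketch. It assumes one plausible reading of the abstract: each term's weight is increased by the similarity-weighted sum of the weights of the other terms in the same document, with pairs at or below a threshold ignored. The names `reweight`, `toy_sim`, `TOY_SIMILARITY` and the threshold are hypothetical; the exact formula used by the authors is not given here.

```python
def reweight(doc, sim, threshold=0.0):
    """Return a new term->weight dict where each weight is augmented by
    similar co-occurring terms (assumed form: w_t + sum(sim(t, u) * w_u))."""
    new_weights = {}
    for term, weight in doc.items():
        boost = 0.0
        for other, other_weight in doc.items():
            if other == term:
                continue
            s = sim(term, other)
            if s > threshold:  # only sufficiently similar terms contribute
                boost += s * other_weight
        new_weights[term] = weight + boost
    return new_weights

# Toy symmetric similarity table standing in for an ontology-based measure.
TOY_SIMILARITY = {frozenset(("asthma", "bronchitis")): 0.5}

def toy_sim(a, b):
    return TOY_SIMILARITY.get(frozenset((a, b)), 0.0)

doc_vector = {"asthma": 2.0, "bronchitis": 1.0, "fracture": 1.0}
reweighted = reweight(doc_vector, toy_sim)
# "asthma" and "bronchitis" reinforce each other; "fracture" is unchanged.
```

In practice `toy_sim` would be replaced by a path-based, information-content-based or feature-based measure computed over the domain ontology.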


Medical Document Clustering Using Ontology-Based Term Similarity Measures

Medical Informatics ◽

10.4018/978-1-60566-050-9.ch169 ◽

2011 ◽

pp. 2232-2243

Author(s):

Xiaodan Zhang ◽

Liping Jing ◽

Xiaohua Hu ◽

Michael Ng ◽

Jiali Xia ◽

...

Keyword(s):

Semantic Similarity ◽

Domain Knowledge ◽

Document Clustering ◽

Similarity Measures ◽

Concept Hierarchy ◽

Term Similarity ◽

Feature Based ◽

Document Vector ◽

Real World Datasets ◽

Medical Document

Recent research shows that ontology as background knowledge can improve document clustering quality with its concept hierarchy knowledge. Previous studies take term semantic similarity as an important measure for incorporating domain knowledge into the clustering process, for example in clustering initialization and term re-weighting. However, few studies have focused on how different types of term similarity measures affect clustering performance for a given domain. In this article, we conduct a comparative study of how different term semantic similarity measures, including path-based, information-content-based and feature-based measures, affect document clustering. Term re-weighting of the document vector is an important method for integrating domain ontology into the clustering process. In detail, the weight of a term is augmented by the weights of its co-occurring concepts. Spherical k-means is used to evaluate document vector re-weighting on two real-world datasets: Disease10 and OHSUMED23. Experimental results on nine different semantic measures show that: (1) no single type of similarity measure significantly outperforms the others; (2) several similarity measures have more stable performance than the others; (3) term re-weighting has positive effects on medical document clustering, but the effect might not be significant when documents are short of terms.
