Comparison of Text-Based and Feature-Based Semantic Similarity Between Android Apps

Abstract: Semantic measures are used in handling different issues in several research areas, such as artificial intelligence, natural language processing, knowledge engineering, bioinformatics, and information retrieval. Hierarchical feature-based semantic measures have been proposed to estimate the semantic similarity between two concepts/words depending on the features extracted from a semantic taxonomy (hierarchy) of a given lexical source. The central issue in these measures is the constant weighting assumption that all elements in the semantic representation of the concept possess the same relevance. In this paper, a new weighting-based semantic similarity measure is proposed to address the issues in hierarchical feature-based measures. Four mechanisms are introduced to weigh the degree of relevance of features in the semantic representation of a concept by using topological parameters (edge, depth, descendants, and density) in a semantic taxonomy. With the semantic taxonomy of WordNet, the proposed semantic measure is evaluated for word semantic similarity in four gold-standard datasets. Experimental results show that the proposed measure outperforms hierarchical feature-based semantic measures in all the datasets. Comparison results also imply that the proposed measure is more effective than information-content measures in measuring semantic similarity.
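To make the constant-weighting issue concrete, the following is a minimal sketch, not the measure proposed in the paper. It assumes a tiny hand-made taxonomy (`PARENT`) in place of WordNet, treats a concept's features as itself plus its ancestors, and uses a hypothetical depth-based weight as a stand-in for one topological parameter. All function names are illustrative.

```python
# Toy is-a taxonomy (child -> parent); a hypothetical stand-in for WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "oak": "plant",
}


def features(concept):
    """Return the concept together with all of its ancestors."""
    found = set()
    while concept is not None:
        found.add(concept)
        concept = PARENT[concept]
    return found


def depth(concept):
    """Number of edges between the root and the concept."""
    return len(features(concept)) - 1


def weight(feature):
    """Assumed weighting: deeper (more specific) features count more."""
    return 1.0 + depth(feature)


def uniform_similarity(a, b):
    """Constant weighting: every shared or distinct feature counts as 1."""
    fa, fb = features(a), features(b)
    return len(fa & fb) / len(fa | fb)


def weighted_similarity(a, b):
    """Ratio of shared feature weight to total feature weight."""
    fa, fb = features(a), features(b)
    shared = sum(weight(f) for f in fa & fb)
    total = sum(weight(f) for f in fa | fb)
    return shared / total


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "oak")]:
        print(pair, uniform_similarity(*pair), weighted_similarity(*pair))
```

Under the uniform scheme the root feature "entity" counts as much as a specific feature such as "dog"; the weighted variant lets specific features dominate the score. Edge, descendant, or density information could be substituted in `weight` under the same structure.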


A REVIEW ON FEATURE BASED APPROACH IN SEMANTIC SIMILARITY FOR MULTIPLE ONTOLOGY

Acta Informatica Malaysia ◽ 10.26480/aim.01.2017.07.09 ◽ 2017 ◽ Vol 1 (1) ◽ pp. 7-9 ◽ Cited By ~ 2

Author(s): Nurul Aswa Omar ◽ Shahreen Kasim ◽ Mohd Farhan Md Fudzee ◽ Azizul Azhar Ramli ◽ Hairulnizam Mahdin ◽ ...

Keyword(s): Semantic Similarity ◽ Feature Based


Medical Document Clustering Using Ontology-Based Term Similarity Measures

Strategic Advancements in Utilizing Data Mining and Warehousing Technologies ◽ 10.4018/978-1-60566-717-1.ch007 ◽ 2011 ◽ pp. 121-132

Author(s): Zhang Xiaodan ◽ Jing Liping ◽ Hu Xiaohua ◽ Ng Michael ◽ Xia Jiali ◽ ...

Keyword(s): Semantic Similarity ◽ Domain Knowledge ◽ Document Clustering ◽ Similarity Measures ◽ Concept Hierarchy ◽ Term Similarity ◽ Feature Based ◽ Document Vector ◽ Real World Datasets ◽ Medical Document

Recent research shows that ontology as background knowledge can improve document clustering quality with its concept hierarchy knowledge. Previous studies take term semantic similarity as an important measure to incorporate domain knowledge into the clustering process, for example in clustering initialization and term re-weighting. However, few studies have focused on how different types of term similarity measures affect clustering performance in a given domain. In this article, we conduct a comparative study of how different term semantic similarity measures, including path-based, information-content-based, and feature-based measures, affect document clustering. Term re-weighting of the document vector is an important method for integrating a domain ontology into the clustering process. In detail, the weight of a term is augmented by the weights of its co-occurring concepts. Spherical k-means is used to evaluate document vector re-weighting on two real-world datasets: Disease10 and OHSUMED23. Experimental results on nine different semantic measures show that (1) no single type of similarity measure significantly outperforms the others; (2) several similarity measures perform more stably than the others; and (3) term re-weighting has positive effects on medical document clustering, but the effect might not be significant when documents contain few terms.
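The re-weighting step can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: it assumes that a term's new weight is its old weight plus the similarity-scaled weights of the other terms in the same document. The similarity table `SIM`, the helper `toy_sim`, and the example weights are hypothetical. The cosine function reflects the fact that spherical k-means compares unit-length vectors.

```python
import math


def reweight(doc, sim, threshold=0.0):
    """Add to each term the similarity-scaled weights of co-occurring terms."""
    out = {}
    for t1, w1 in doc.items():
        boost = 0.0
        for t2, w2 in doc.items():
            if t1 == t2:
                continue
            s = sim(t1, t2)
            if s > threshold:
                boost += s * w2
        out[t1] = w1 + boost
    return out


def normalize(vec):
    """Scale a sparse vector to unit length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    if norm == 0.0:
        return dict(vec)
    return {t: w / norm for t, w in vec.items()}


def cosine(u, v):
    """Cosine similarity, the comparison used by spherical k-means."""
    u, v = normalize(u), normalize(v)
    return sum(w * v.get(t, 0.0) for t, w in u.items())


# Hypothetical ontology-derived similarities between term pairs.
SIM = {frozenset(("asthma", "bronchitis")): 0.8}


def toy_sim(a, b):
    return SIM.get(frozenset((a, b)), 0.0)


if __name__ == "__main__":
    doc = {"asthma": 2.0, "bronchitis": 1.0, "invoice": 1.0}
    print(reweight(doc, toy_sim))
```

In this toy document the two related medical terms reinforce each other while the unrelated term keeps its original weight, which is the effect the abstract attributes to re-weighting. With very short documents few such pairs co-occur, so the adjustment has little to act on.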


Medical Document Clustering Using Ontology-Based Term Similarity Measures

Medical Informatics ◽ 10.4018/978-1-60566-050-9.ch169 ◽ 2011 ◽ pp. 2232-2243

Author(s): Xiaodan Zhang ◽ Liping Jing ◽ Xiaohua Hu ◽ Michael Ng ◽ Jiali Xia ◽ ...

Keyword(s): Semantic Similarity ◽ Domain Knowledge ◽ Document Clustering ◽ Similarity Measures ◽ Concept Hierarchy ◽ Term Similarity ◽ Feature Based ◽ Document Vector ◽ Real World Datasets ◽ Medical Document

Recent research shows that ontology as background knowledge can improve document clustering quality with its concept hierarchy knowledge. Previous studies take term semantic similarity as an important measure to incorporate domain knowledge into the clustering process, for example in clustering initialization and term re-weighting. However, few studies have focused on how different types of term similarity measures affect clustering performance in a given domain. In this article, we conduct a comparative study of how different term semantic similarity measures, including path-based, information-content-based, and feature-based measures, affect document clustering. Term re-weighting of the document vector is an important method for integrating a domain ontology into the clustering process. In detail, the weight of a term is augmented by the weights of its co-occurring concepts. Spherical k-means is used to evaluate document vector re-weighting on two real-world datasets: Disease10 and OHSUMED23. Experimental results on nine different semantic measures show that (1) no single type of similarity measure significantly outperforms the others; (2) several similarity measures perform more stably than the others; and (3) term re-weighting has positive effects on medical document clustering, but the effect might not be significant when documents contain few terms.


Semantic Similarity for English and Arabic Texts: A Review

Journal of Information & Knowledge Management ◽ 10.1142/s0219649220500331 ◽ 2020 ◽ Vol 19 (04) ◽ pp. 2050033

Author(s): Marwah Alian ◽ Arafat Awajan

Keyword(s): Semantic Similarity ◽ Language Processing ◽ Pearson Correlation ◽ Plagiarism Detection ◽ Embedding Technique ◽ Semantic Similarity Measurement ◽ Feature Based ◽ Arabic And English ◽ Descriptive Feature ◽ Degree Of Similarity

Semantic similarity is the task of measuring relations between sentences or words to determine their degree of similarity or resemblance. Several applications of natural language processing require semantic similarity measurement to achieve good results; these applications include plagiarism detection, textual entailment, text summarisation, paraphrase identification, and information extraction. Many researchers have proposed new methods to measure the semantic similarity of Arabic and English texts. In this research, these methods are reviewed and compared. Results show that the precision of the corpus-based approach exceeds 0.70. The precision of the descriptive feature-based technique is between 0.670 and 0.86, with a Pearson correlation coefficient of over 0.70. Meanwhile, the word embedding technique has a correlation of 0.67, and its accuracy is in the range 0.76–0.80. The best results are achieved by the feature-based approach.
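The Pearson correlation figures quoted above compare a method's similarity scores with human judgments over the same word or sentence pairs. A minimal sketch of that computation follows; the two score lists are made-up illustrative values, not data from any of the reviewed studies, and the function name is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical scores for four text pairs, both scaled to [0, 1].
    human = [0.1, 0.4, 0.5, 0.9]
    system = [0.2, 0.3, 0.6, 0.8]
    print(pearson(human, system))
```

A value near 1 means the method ranks and spaces the pairs much as the human raters do; it says nothing about precision or accuracy, which are reported separately in the review.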


Assessing semantic similarity between concepts: A weighted‐feature‐based approach

Concurrency and Computation Practice and Experience ◽ 10.1002/cpe.5594 ◽ 2020 ◽ Vol 32 (7)

Author(s): Shahbaz Hassan Wasti ◽ Muhammad Jawad Hussain ◽ Guangjian Huang ◽ Aftab Akram ◽ Yuncheng Jiang ◽ ...

Keyword(s): Semantic Similarity ◽ Feature Based


X-Similarity Comparison by using Wordnet

JOIV International Journal on Informatics Visualization ◽ 10.30630/joiv.1.4-2.79 ◽ 2017 ◽ Vol 1 (4-2) ◽ pp. 188

Author(s): Shahreen Kasim ◽ Nurul Aswa Omar ◽ Nurul Suhaida Mohammad Akbar ◽ Rohayanti Hassan ◽ Masrah Azrifah Azmi Murad

Keyword(s): Semantic Web ◽ Semantic Similarity ◽ Semantic Description ◽ Similarity Comparison ◽ Readable Form ◽ Feature Based ◽ Machine Readable ◽ Machine Readable Form

The Semantic Web is an extension of the existing web that represents information more meaningfully for humans and computers. It enables the description of contents and services in machine-readable form. It also enables annotating, discovering, publishing, advertising, and composing services to be automated. The Semantic Web was developed on the basis of ontology, which is regarded as its backbone. With it, machine-readable content on the current web becomes machine-understandable. Moreover, an ontology provides a common vocabulary and a grammar for publishing data, and can provide a semantic description of data that can be used to preserve ontologies and keep them ready for inference. Many methods are used in feature-based semantic similarity. This research presents the X-Similarity feature-based method applied to a single ontology.
