Comparison of Text-Based and Feature-Based Semantic Similarity Between Android Apps

Author(s):  
Md Kafil Uddin ◽  
Qiang He ◽  
Jun Han ◽  
Caslon Chua
2018 ◽  
Vol 24 (6) ◽  
pp. 861-886 ◽  
Author(s):  
ABDULGABBAR SAIF ◽  
UMMI ZAKIAH ZAINODIN ◽  
NAZLIA OMAR ◽  
ABDULLAH SAEED GHAREB

Abstract: Semantic measures are used in handling different issues in several research areas, such as artificial intelligence, natural language processing, knowledge engineering, bioinformatics, and information retrieval. Hierarchical feature-based semantic measures have been proposed to estimate the semantic similarity between two concepts/words depending on the features extracted from a semantic taxonomy (hierarchy) of a given lexical source. The central issue in these measures is the constant weighting assumption that all elements in the semantic representation of the concept possess the same relevance. In this paper, a new weighting-based semantic similarity measure is proposed to address the issues in hierarchical feature-based measures. Four mechanisms are introduced to weigh the degree of relevance of features in the semantic representation of a concept by using topological parameters (edge, depth, descendants, and density) in a semantic taxonomy. With the semantic taxonomy of WordNet, the proposed semantic measure is evaluated for word semantic similarity in four gold-standard datasets. Experimental results show that the proposed measure outperforms hierarchical feature-based semantic measures in all the datasets. Comparison results also imply that the proposed measure is more effective than information-content measures in measuring semantic similarity.


2017 ◽  
Vol 1 (1) ◽  
pp. 7-9 ◽  
Author(s):  
Nurul Aswa Omar ◽  
Shahreen Kasim ◽  
Mohd Farhan Md Fudzee ◽  
Azizul Azhar Ramli ◽  
Hairulnizam Mahdin ◽  
...  

Author(s):  
Zhang Xiaodan ◽  
Jing Liping ◽  
Hu Xiaohua ◽  
Ng Michael ◽  
Xia Jiali ◽  
...  

Recent research shows that an ontology used as background knowledge can improve document clustering quality through its concept hierarchy. Previous studies take term semantic similarity as an important measure for incorporating domain knowledge into the clustering process, for example in clustering initialization and term re-weighting. However, few studies have focused on how different types of term similarity measures affect clustering performance for a given domain. In this article, we conduct a comparative study of how different term semantic similarity measures, including path-based, information-content-based, and feature-based measures, affect document clustering. Term re-weighting of the document vector is an important method for integrating a domain ontology into the clustering process. In detail, the weight of a term is augmented by the weights of its co-occurring concepts. Spherical k-means is used to evaluate document vector re-weighting on two real-world datasets: Disease10 and OHSUMED23. Experimental results on nine different semantic measures show that: (1) no single type of similarity measure significantly outperforms the others; (2) several similarity measures have more stable performance than the others; (3) term re-weighting has positive effects on medical document clustering, but the effect might not be significant when documents contain few terms.
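As a rough illustration of the term re-weighting idea in this abstract, the sketch below adds to each term's weight a similarity-scaled share of the weights of the other terms in the same document. The abstract does not give the formula, so the similarity-weighted sum is an assumption, and the function names (`reweight`, `normalize`, `cosine`) and the sample similarity value are hypothetical. Unit-length normalization is included because spherical k-means compares documents by cosine similarity.

```python
import math

def reweight(doc, sim):
    """Return a re-weighted copy of a document vector.

    doc: dict mapping term -> weight (e.g. a tf-idf value).
    sim: dict mapping (term_a, term_b) -> similarity in [0, 1];
         pairs that are absent count as similarity 0.
    Assumed rule: new_w(t) = w(t) + sum over other terms u of sim(t, u) * w(u).
    """
    new_doc = {}
    for t, w in doc.items():
        boost = sum(
            sim.get((t, u), sim.get((u, t), 0.0)) * wu
            for u, wu in doc.items()
            if u != t
        )
        new_doc[t] = w + boost
    return new_doc

def normalize(doc):
    """Scale a vector to unit length, as spherical k-means expects."""
    norm = math.sqrt(sum(w * w for w in doc.values()))
    return {t: w / norm for t, w in doc.items()} if norm else dict(doc)

def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

# Hypothetical document and a single hypothetical term-term similarity.
doc = {"tumor": 2.0, "neoplasm": 1.0, "aspirin": 1.0}
sim = {("tumor", "neoplasm"): 0.9}
reweighted = reweight(doc, sim)
unit = normalize(reweighted)
```

In this toy document the two related terms reinforce each other while the unrelated term keeps its original weight, which is the effect the abstract describes for co-occurring concepts.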


2011 ◽  
pp. 2232-2243
Author(s):  
Xiaodan Zhang ◽  
Liping Jing ◽  
Xiaohua Hu ◽  
Michael Ng ◽  
Jiali Xia ◽  
...  

Recent research shows that an ontology used as background knowledge can improve document clustering quality through its concept hierarchy. Previous studies take term semantic similarity as an important measure for incorporating domain knowledge into the clustering process, for example in clustering initialization and term re-weighting. However, few studies have focused on how different types of term similarity measures affect clustering performance for a given domain. In this article, we conduct a comparative study of how different term semantic similarity measures, including path-based, information-content-based, and feature-based measures, affect document clustering. Term re-weighting of the document vector is an important method for integrating a domain ontology into the clustering process. In detail, the weight of a term is augmented by the weights of its co-occurring concepts. Spherical k-means is used to evaluate document vector re-weighting on two real-world datasets: Disease10 and OHSUMED23. Experimental results on nine different semantic measures show that: (1) no single type of similarity measure significantly outperforms the others; (2) several similarity measures have more stable performance than the others; (3) term re-weighting has positive effects on medical document clustering, but the effect might not be significant when documents contain few terms.


2020 ◽  
Vol 19 (04) ◽  
pp. 2050033
Author(s):  
Marwah Alian ◽  
Arafat Awajan

Semantic similarity is the task of measuring relations between sentences or words to determine the degree of similarity or resemblance. Several applications of natural language processing require semantic similarity measurement to achieve good results; these applications include plagiarism detection, text entailment, text summarisation, paraphrasing identification, and information extraction. Many researchers have proposed new methods to measure the semantic similarity of Arabic and English texts. In this research, these methods are reviewed and compared. Results show that the precision of the corpus-based approach exceeds 0.70. The precision of the descriptive feature-based technique is between 0.670 and 0.86, with a Pearson correlation coefficient of over 0.70. Meanwhile, the word embedding technique has a correlation of 0.67, and its accuracy is in the range 0.76–0.80. The best results are achieved by the feature-based approach.


2020 ◽  
Vol 32 (7) ◽  
Author(s):  
Shahbaz Hassan Wasti ◽  
Muhammad Jawad Hussain ◽  
Guangjian Huang ◽  
Aftab Akram ◽  
Yuncheng Jiang ◽  
...  

2017 ◽  
Vol 1 (4-2) ◽  
pp. 188
Author(s):  
Shahreen Kasim ◽  
Nurul Aswa Omar ◽  
Nurul Suhaida Mohammad Akbar ◽  
Rohayanti Hassan ◽  
Masrah Azrifah Azmi Murad

The semantic web is an extension of the existing web that represents information more meaningfully for humans and computers. It enables the description of contents and services in machine-readable form. It also enables annotating, discovering, publishing, advertising, and composing services to be automated. The semantic web was developed on the basis of ontology, which is regarded as its backbone. With it, machine-readable content in the current web becomes machine-understandable. Moreover, an ontology provides a common vocabulary and a grammar for publishing data, and can provide a semantic description of data that can be used to preserve ontologies and keep them ready for inference. Many methods have been used for feature-based semantic similarity. This research presents the X-Similarity feature-based method for a single ontology.


2015 ◽  
Vol 51 (3) ◽  
pp. 215-234 ◽  
Author(s):  
Yuncheng Jiang ◽  
Xiaopei Zhang ◽  
Yong Tang ◽  
Ruihua Nie

2014 ◽  
Vol 55 ◽  
pp. 101-113 ◽  
Author(s):  
Albert Solé-Ribalta ◽  
David Sánchez ◽  
Montserrat Batet ◽  
Francesc Serratosa

2012 ◽  
Vol 39 (9) ◽  
pp. 7718-7728 ◽  
Author(s):  
David Sánchez ◽  
Montserrat Batet ◽  
David Isern ◽  
Aida Valls
