A Document Similarity Computation Method Based on Word Embedding and Citation Analysis

Author(s):  
K. Lamiya ◽  
Anuraj Mohan
Author(s):  
Sonia Alouane-Ksouri ◽  
Minyar Sassi Hidri

This work contributes to Arabic text-based document analysis for plagiarism detection. The analysis follows the triadic computation model of document similarity. The authors propose a hybrid segmentation prototype for Arabic text documents that chains several processing steps to produce the similarity rate between the documents of an Arabic corpus. It combines two segmentation systems with a morphological analysis to obtain a matrix representation suited to triadic similarity computation across three abstraction levels: documents, sentences, and words.
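One possible reading of the three-level matrix representation can be sketched as follows. This is a minimal illustration, not the authors' prototype: the function names `segment` and `build_matrices` are hypothetical, and naive punctuation and whitespace splitting stands in for the two segmentation systems and the morphological analysis described in the abstract.

```python
import re

def segment(doc):
    """Split a document into sentences, each as a list of words.

    Assumption: naive splitting on sentence punctuation and whitespace;
    a real Arabic pipeline would use proper segmentation and morphology.
    """
    sentences = [s.strip() for s in re.split(r"[.!?؟]+", doc) if s.strip()]
    return [s.split() for s in sentences]

def build_matrices(docs):
    """Build document-by-sentence and sentence-by-word matrices."""
    sentences = []  # sentences[j] = list of words in sentence j
    owner = []      # owner[j] = index of the document containing sentence j
    for i, doc in enumerate(docs):
        for words in segment(doc):
            sentences.append(words)
            owner.append(i)

    vocab = sorted({w for words in sentences for w in words})
    col = {w: k for k, w in enumerate(vocab)}

    # doc_sent[i][j] = 1 if sentence j belongs to document i
    doc_sent = [[1 if owner[j] == i else 0 for j in range(len(sentences))]
                for i in range(len(docs))]

    # sent_word[j][k] = number of times word k occurs in sentence j
    sent_word = [[0] * len(vocab) for _ in sentences]
    for j, words in enumerate(sentences):
        for w in words:
            sent_word[j][col[w]] += 1

    return doc_sent, sent_word, vocab

docs = ["the cat sat. the dog ran.", "the cat ran."]
doc_sent, sent_word, vocab = build_matrices(docs)
```

The two matrices link documents to sentences and sentences to words, so a similarity computed at one level can be propagated to the other two.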


2013 ◽  
Vol 347-350 ◽  
pp. 3287-3291
Author(s):  
Yun Xia Wang ◽  
Zhi Liang Wang ◽  
Cheng Chong Gao

To realize cloud manufacturing (CMfg) production in group enterprises, the manufacturing resources and modeling technologies of the cloud pool were studied. According to the characteristics of group enterprises, manufacturing resources were analyzed and classified into human, equipment, material, cooperation, and other resources. Then, a method for mapping manufacturing resources into virtual resources was investigated, and a layered platform for cloud manufacturing was proposed. Taking a CNC machine tool as an example, an ontology model was built with the Semantic Web and OWL based on ontology theory. Finally, using a semantic similarity computation method and case-based reasoning, the virtual resources were intelligently searched and matched so that manufacturing resources can be unified, shared, and reused.


Author(s):  
Hongtao Huang ◽  
Cunliang Liang ◽  
Haizhi Ye

The probability information content-based method for computing FCA concept similarity relies on the frequency of concepts in a corpus. It uses only the occurrence probability as its information content metric, which leads to lower accuracy. This article introduces a semantic information content-based method for FCA concept similarity evaluation. In addition to the occurrence probability, it uses the superordinate and subordinate semantic relationships of concepts to measure information content, which captures how generic or specific a concept is more accurately. The semantic information content similarity can then be calculated with the help of an ISA hierarchy derived from the domain ontology. Unlike probability information content, the evaluation of semantic information content is independent of the corpus. Furthermore, semantic information content can be used for FCA concept similarity evaluation, and a weighted bipartite graph is used to improve the efficiency of the similarity evaluation. The experimental results show that this semantic information content-based FCA concept similarity computation method is more accurate than the probabilistic information content-based method, without loss of time performance.
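To illustrate how information content can be derived from an ISA hierarchy rather than from corpus frequencies, the sketch below uses a common hyponym-count formulation, IC(c) = 1 - log(hypo(c) + 1) / log(N), together with Lin's similarity measure. This is not the article's method: the toy hierarchy, the concept names, and the function names are all assumptions made for illustration.

```python
import math

# Hypothetical toy ISA hierarchy, stored as child -> parent.
PARENT = {
    "vehicle": "entity",
    "car": "vehicle",
    "truck": "vehicle",
    "sedan": "car",
    "animal": "entity",
    "dog": "animal",
}
NODES = set(PARENT) | set(PARENT.values())

def ancestors(c):
    """Return c followed by its superordinate concepts, nearest first."""
    chain = [c]
    while c in PARENT:
        c = PARENT[c]
        chain.append(c)
    return chain

def hyponym_count(c):
    """Number of concepts strictly below c in the hierarchy."""
    return sum(1 for n in NODES if n != c and c in ancestors(n))

def intrinsic_ic(c):
    """Corpus-independent IC: leaves score 1, the root scores 0."""
    return 1.0 - math.log(hyponym_count(c) + 1) / math.log(len(NODES))

def lin_similarity(a, b):
    """Lin similarity using the most specific common subsumer of a and b."""
    above_b = set(ancestors(b))
    lcs = next(n for n in ancestors(a) if n in above_b)
    denom = intrinsic_ic(a) + intrinsic_ic(b)
    return 0.0 if denom == 0 else 2.0 * intrinsic_ic(lcs) / denom

print(lin_similarity("sedan", "truck"))  # share the subsumer "vehicle"
print(lin_similarity("sedan", "dog"))    # share only the root "entity"
```

Because the score depends only on the shape of the hierarchy, no corpus counts are needed, which matches the corpus independence the abstract highlights.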

