A New Sentence Similarity Computing Technique Using Order and Semantic Similarity

Author(s):  
Nityam Agarwal ◽  
Poorvi Seth ◽  
Merin Meleet
2013 ◽  
Vol 718-720 ◽  
pp. 2248-2251
Author(s):  
Pei Ying Zhang

An FAQ system is a question answering system that finds the matching question in a question-answer collection and returns its corresponding answer to the user. Matching user questions to the corresponding question-answer pairs has become a major challenge in FAQ systems. This paper proposes a sentence similarity metric for questions based on their semantic similarity as well as the length of the question. Experiments show that this method can improve the accuracy and intelligence of the answering system and has some practical value.
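The abstract does not give the formula, so the following is only a minimal sketch of how a question similarity that mixes a semantic score with a length term could drive FAQ lookup. The word-overlap (Jaccard) score stands in for the paper's semantic similarity, and the names `length_similarity`, `overlap_similarity`, `question_similarity`, `best_match` and the weight `alpha` are all assumptions made for illustration.

```python
def length_similarity(a_tokens, b_tokens):
    """Length term: 1.0 for equal lengths, smaller as the lengths diverge."""
    la, lb = len(a_tokens), len(b_tokens)
    if la + lb == 0:
        return 1.0
    return 1.0 - abs(la - lb) / (la + lb)


def overlap_similarity(a_tokens, b_tokens):
    """Jaccard word overlap, used here as a stand-in for a semantic score."""
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def question_similarity(q1, q2, alpha=0.8):
    """Weighted mix of the two terms; alpha is a hypothetical weight."""
    t1, t2 = q1.lower().split(), q2.lower().split()
    return (alpha * overlap_similarity(t1, t2)
            + (1 - alpha) * length_similarity(t1, t2))


def best_match(query, faq):
    """Return the stored question most similar to the query."""
    return max(faq, key=lambda question: question_similarity(query, question))


# Hypothetical question-answer collection.
faq = {
    "how do i reset my password": "Use the reset link on the login page.",
    "what are your opening hours": "We are open from 9 to 5.",
}
matched = best_match("how can i reset my password", faq)
answer = faq[matched]
```

A real system would replace `overlap_similarity` with a thesaurus-based or embedding-based measure; the lookup logic would stay the same.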


2014 ◽  
Vol 1049-1050 ◽  
pp. 1514-1517
Author(s):  
Sai Dong Lv ◽  
Ji Li Xie

Subjective question marking systems are currently receiving considerable attention. The common principle for grading subjective questions is to compare how similar the examinee's answer is to the reference answer. Using an improved semantic similarity algorithm, sentence similarity is calculated to obtain the degree of similarity between the exam answer and the reference answer, from which a score is assigned. An experiment based on semantic similarity was designed, and its results show that the proposed multi-level fusion similarity calculation method improves on the original method by integrating the advantages of various methods, so that the calculated results meet the requirements of the scoring system.


Author(s):  
Mourad Oussalah ◽  
Muhidin Mohamed

Determining the extent to which two text snippets are semantically equivalent is a well-researched topic in natural language processing, information retrieval and text summarization. Sentence-to-sentence similarity scoring is used extensively in both generic and query-based summarization of documents as a significance or similarity indicator. Nevertheless, most of these applications use the semantic similarity measure only as a tool, without paying attention to the inherent properties of such tools, which ultimately restrict the scope and technical soundness of the underlying applications. This paper aims to help fill this gap. It investigates three popular WordNet hierarchical semantic similarity measures, namely path-length, Wu and Palmer, and Leacock and Chodorow, in terms of both algebraic and intuitive properties, highlighting their inherent limitations and theoretical constraints. We have especially examined properties related to the range and scope of the semantic similarity score, incremental monotonicity evolution, monotonicity with respect to the hyponymy/hypernymy relationship, and a set of interactive properties. The extension from word semantic similarity to sentence similarity has also been investigated using a pairwise canonical extension, and the properties of the resulting sentence-to-sentence similarity are examined and scrutinized. Next, to overcome the inherent limitations of WordNet semantic similarity in accounting for the various part-of-speech word categories, a WordNet "All word-To-Noun conversion" that makes use of the Categorial Variation Database (CatVar) is put forward and evaluated on a publicly available dataset, with a comparison against some state-of-the-art methods. The findings demonstrate the feasibility of the proposal and open up new opportunities in information retrieval and natural language processing tasks.
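To make the three measures concrete, here is a minimal sketch of their commonly cited definitions on a small hand-made taxonomy. The taxonomy and every function name are hypothetical; this is not WordNet data and not the authors' code, and depth is counted in nodes so that the root has depth 1.

```python
import math

# Hypothetical toy is-a taxonomy (child -> parent); not real WordNet data.
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}


def ancestors(node):
    """Return the chain from node up to the root, node first."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(node))


def lcs(a, b):
    """Lowest common subsumer: the deepest ancestor shared by a and b."""
    seen = set(ancestors(b))
    return next(n for n in ancestors(a) if n in seen)


def path_edges(a, b):
    """Number of edges on the shortest is-a path between a and b."""
    common = lcs(a, b)
    return (depth(a) - depth(common)) + (depth(b) - depth(common))


MAX_DEPTH = max(depth(n) for n in PARENT)


def path_sim(a, b):
    """Path-length measure: inverse of the path length, in (0, 1]."""
    return 1.0 / (1 + path_edges(a, b))


def wup_sim(a, b):
    """Wu and Palmer: depth of the subsumer relative to both concepts."""
    return 2.0 * depth(lcs(a, b)) / (depth(a) + depth(b))


def lch_sim(a, b):
    """Leacock and Chodorow: path counted in nodes, scaled by taxonomy depth."""
    return -math.log((path_edges(a, b) + 1) / (2.0 * MAX_DEPTH))
```

On this toy hierarchy, `dog` and `cat` score 1/3 under path length, 2/3 under Wu and Palmer, and ln 2 (about 0.693) under Leacock and Chodorow. The first two measures stay within (0, 1], while the third can exceed 1, which illustrates the kind of range property the abstract refers to.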


2020 ◽  
Author(s):  
M Krishna Siva Prasad ◽  
Poonam Sharma

Short text or sentence similarity is crucial in various natural language processing activities. Traditional measures of sentence similarity consider word order, semantic features and role annotations of text to derive the similarity. These measures do not suit short texts or sentences with negation. Hence, this paper proposes an approach to determine the semantic similarity of sentences and also presents an algorithm to handle negation. In sentence similarity, word pair similarity plays a significant role, so this paper also discusses the similarity between word pairs. Existing semantic similarity measures do not handle antonyms accurately; this paper therefore proposes an algorithm to handle antonyms. It also presents an antonym dataset with 111 word pairs and corresponding expert ratings. The existing semantic similarity measures are tested on the dataset, and the correlation results show that the expert ratings are consistent with the scores obtained from the semantic similarity measures. Sentence similarity is handled by two proposed algorithms: the first deals with typical sentences, and the second deals with contradiction in the sentences. The SICK dataset, which contains sentences with negation, is used to evaluate sentence similarity. The algorithm helped improve the sentence similarity results.


Author(s):  
Gandhis Ulta Abriani ◽  
Muhammad Ainul Yaqin

Sentence similarity is computed by calculating the similarity between the words of the sentences. In several earlier studies, the sentence similarity computation stopped at the similarity values between the words in the sentences and treated them as the final result. Sentence similarity computation, however, aims to combine the overall similarity at sentence level into one single similarity value as the final result. Contextual word similarity is computed using WordNet Similarity for Java (WS4J) with three approaches: Wu-Palmer, Lin, and path. Because WS4J can only report similarity between pairs of words, the values are weighted using the Analytical Hierarchy Process (AHP). Two aspects are used as criteria in the AHP computation to determine the criterion weights: noun and verb. The word similarity matrices are then combined with the corresponding criterion weights to obtain the sentence similarity value.
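The weighting step can be sketched as follows, assuming the common column-normalisation approximation of AHP priority weights. The pairwise judgement (nouns three times as important as verbs) and the per-category similarity values are hypothetical and are not taken from the paper.

```python
def ahp_weights(matrix):
    """Approximate AHP priorities: normalise each column, then average each row."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        sum(matrix[i][j] / col_sums[j] for j in range(n)) / n
        for i in range(n)
    ]


# Hypothetical pairwise comparison matrix over the two criteria.
# Entry [i][j] states how much more important criterion i is than criterion j.
PAIRWISE = [
    [1.0, 3.0],        # noun compared with (noun, verb)
    [1.0 / 3.0, 1.0],  # verb compared with (noun, verb)
]


def sentence_similarity(noun_sim, verb_sim, weights):
    """Combine per-category word similarities into a single sentence score."""
    w_noun, w_verb = weights
    return w_noun * noun_sim + w_verb * verb_sim


weights = ahp_weights(PAIRWISE)
# Hypothetical averaged word similarities for nouns and verbs.
score = sentence_similarity(0.8, 0.4, weights)
```

With this judgement matrix the weights come out as 0.75 for nouns and 0.25 for verbs, so the example score is 0.7. In practice the noun and verb inputs would be aggregated from the word similarity values that a tool such as WS4J produces.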


2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
Xiao Li ◽  
Qingsheng Li

To address the problem that existing sentence similarity algorithms approach the solution from a single direction, an algorithm for sentence semantic similarity based on syntactic structure is proposed. First, the sentence constituents are analyzed. Sentence similarity is then converted into word similarity on the basis of syntactic structure, word similarity is converted into concept similarity through word disambiguation, and, finally, the semantic similarity comparison is carried out. The paper also gives more detailed comparison rules for the modifier words in a sentence, which also contribute to the sentence. Under the same test conditions, the experiments show that the proposed algorithm agrees better with human intuition and has higher accuracy.


2018 ◽  
Vol 2 (2) ◽  
pp. 70-82 ◽  
Author(s):  
Binglu Wang ◽  
Yi Bu ◽  
Win-bin Huang

In the field of scientometrics, the principal purpose of author co-citation analysis (ACA) is to map knowledge domains by quantifying the relationship between co-cited author pairs. However, traditional ACA has been criticized because its input, which simply counts authors' co-citation frequencies, is insufficiently informative. To address this issue, this paper introduces a new method, named Document- and Keyword-Based Author Co-Citation Analysis (DKACA), that reconstructs the raw co-citation matrices by taking into account document unit counts and the keywords of references. Building on traditional ACA, DKACA counts co-citation pairs by document units instead of authors, from the global network perspective. Moreover, by incorporating keyword information from cited papers, DKACA captures the semantic similarity between co-cited papers. In the method validation part, we implemented network visualization and MDS measurement to evaluate the effectiveness of DKACA. Results suggest that the proposed DKACA method not only reveals insights that were previously unknown but also improves the performance and accuracy of knowledge domain mapping, representing a new basis for further studies.
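For orientation, the traditional author-level counting that the abstract describes as the baseline can be sketched as below. This illustrates plain ACA co-citation frequencies only, not DKACA itself; the sample data and the name `cocitation_counts` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each citing paper maps to the set of authors it cites.
CITING_PAPERS = {
    "paper_1": {"Smith", "Jones", "Lee"},
    "paper_2": {"Smith", "Jones"},
    "paper_3": {"Lee", "Jones"},
}


def cocitation_counts(citing_papers):
    """Count, for every author pair, how many papers cite both authors."""
    counts = Counter()
    for cited_authors in citing_papers.values():
        # Sorting makes each unordered pair appear under one canonical key.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


counts = cocitation_counts(CITING_PAPERS)
```

Here the pairs (Jones, Lee) and (Jones, Smith) are each co-cited twice and (Lee, Smith) once. These raw frequencies form the co-citation matrix that, according to the abstract, DKACA rebuilds from document units and keywords.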

