Word Similarity Computing Based on HowNet and Synonymy Thesaurus

Author(s):  
Hongmei Nie ◽  
Jiaqing Zhou ◽  
Hui Wang ◽  
Minshuo Li
Keyword(s):  
2012 ◽  
Vol 12 ◽  
Author(s):  
Amanda Post Silveira

This is a preliminary study in which we investigate the acquisition of English as second language (L2[1]) word stress by native speakers of Brazilian Portuguese (BP, L1[2]). In this paper, we show results of a multiple choice forced choice perception test in which native speakers of American English and native speakers of Dutch judged the production of English words bearing pre-final stress that were both cognates and non-cognates with BP words. The tokens were produced by native speakers of American English and by Brazilians that speak English as a second language. The results have shown that American and Dutch listeners were consistent in their judgments on native and non-native stress productions and both speakers' groups produced variation in stress in relation to the canonical pattern. However, the variability found in American English points to the prosodic patterns of English and the variability found in Brazilian English points to the stress patterns of Portuguese. It occurs especially in words whose forms activate neighboring similar words in the L1. Transfer from the L1 appears both at segmental and prosodic levels in BP English. [1] L2 stands for second language, foreign language, target language. [2] L1 stands for first language, mother tongue, source language.


Author(s):  
Herry Sujaini

Extended Word Similarity Based (EWSB) Clustering is a word clustering algorithm based on the value of words similarity obtained from the computation of a corpus. One of the benefits of clustering with this algorithm is to improve the translation of a statistical machine translation. Previous research proved that EWSB algorithm could improve the Indonesian-English translator, where the algorithm was applied to Indonesian language as target language.This paper discusses the results of a research using EWSB algorithm on a Indonesian to Minang statistical machine translator, where the algorithm is applied to Minang language as the target language. The research obtained resulted that the EWSB algorithm is quite effective when used in Minang language as the target language. The results of this study indicate that EWSB algorithm can improve the translation accuracy by 6.36%.


Author(s):  
Kun Wang ◽  
Yanpeng Cui ◽  
Jianwei Hu ◽  
Yu Zhang ◽  
Wei Zhao ◽  
...  

2020 ◽  
Vol 16 (3) ◽  
pp. 263-290
Author(s):  
Hui Guan ◽  
Chengzhen Jia ◽  
Hongji Yang

Since computing semantic similarity tends to simulate the thinking process of humans, semantic dissimilarity must play a part in this process. In this paper, we present a new approach for semantic similarity measuring by taking consideration of dissimilarity into the process of computation. Specifically, the proposed measures explore the potential antonymy in the hierarchical structure of WordNet to represent the dissimilarity between concepts and then combine the dissimilarity with the results of existing methods to achieve semantic similarity results. The relation between parameters and the correlation value is discussed in detail. The proposed model is then applied to different text granularity levels to validate the correctness on similarity measurement. Experimental results show that the proposed approach not only achieves high correlation value against human ratings but also has effective improvement to existing path-distance based methods on the word similarity level, in the meanwhile effectively correct existing sentence similarity method in some cases in Microsoft Research Paraphrase Corpus and SemEval-2014 date set.


2014 ◽  
Vol 701-702 ◽  
pp. 413-417
Author(s):  
Jie Ran ◽  
Ji Ya Huang ◽  
Zu Xiao

Word similarity computing is a crucial question in information processing technology. In this paper, an integrated word similarity computing method is proposed by analyzed morpheme's similarity, word order's similarity and word length's similarity, and parameters of the method are decided by experiments. The experiments show that this method has high efficiency.


Author(s):  
Yiling Liu ◽  
Yangsheng Ji ◽  
Chong Gu ◽  
Shouling Cui ◽  
Jiangtao Jia

2020 ◽  
Vol 8 ◽  
pp. 311-329
Author(s):  
Kushal Arora ◽  
Aishik Chakraborty ◽  
Jackie C. K. Cheung

In this paper, we propose LexSub, a novel approach towards unifying lexical and distributional semantics. We inject knowledge about lexical-semantic relations into distributional word embeddings by defining subspaces of the distributional vector space in which a lexical relation should hold. Our framework can handle symmetric attract and repel relations (e.g., synonymy and antonymy, respectively), as well as asymmetric relations (e.g., hypernymy and meronomy). In a suite of intrinsic benchmarks, we show that our model outperforms previous approaches on relatedness tasks and on hypernymy classification and detection, while being competitive on word similarity tasks. It also outperforms previous systems on extrinsic classification tasks that benefit from exploiting lexical relational cues. We perform a series of analyses to understand the behaviors of our model. 1 Code available at https://github.com/aishikchakraborty/LexSub .


Sign in / Sign up

Export Citation Format

Share Document