Word Similarity Calculation by Using the Edit Distance Metrics with Consonant Normalization

2012 ◽  
Vol 155-156 ◽  
pp. 375-380 ◽  
Author(s):  
Wu Ling Ren ◽  
Jin Ju Guo

To make word similarity results more reasonable and accurate, a new word similarity algorithm is proposed. It uses the HowNet primitive hierarchical tree structure and calculates the distance between two primitives with a method for computing WordNet node distance, which considers tree depth, density, path, connection strength, and other factors. The algorithm also improves the method that converts distance into similarity. Finally, the algorithm is compared experimentally with related algorithms. The results show that the proposed algorithm effectively improves the precision and accuracy of word similarity calculation.
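The abstract does not give the improved distance-to-similarity conversion. As a point of reference only, the following is a minimal sketch of a baseline conversion often used in HowNet-based work, sim = alpha / (d + alpha). The function name and the default value of alpha are assumptions for illustration, not details taken from this paper.

```python
def distance_to_similarity(distance, alpha=1.6):
    """Map a non-negative primitive distance to a similarity in (0, 1].

    Baseline conversion sim = alpha / (distance + alpha).
    `alpha` is an adjustable parameter (assumed default): it is the
    distance at which the similarity drops to 0.5.
    This is NOT the improved conversion proposed in the paper.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return alpha / (distance + alpha)


if __name__ == "__main__":
    # Identical primitives (distance 0) get similarity 1.0;
    # similarity decreases monotonically as distance grows.
    for d in (0, 1, 2, 5):
        print(d, round(distance_to_similarity(d), 3))
```

With this form, a distance equal to alpha always yields a similarity of 0.5, which makes alpha easy to interpret when tuning.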


2020 ◽  
Vol 309 ◽  
pp. 03004
Author(s):  
Ying Wang ◽  
Xiwei Feng ◽  
Yue Zhang ◽  
Haiming Chen ◽  
Lijie Xing

This paper explores an improved method for calculating the semantic similarity of words that combines HowNet and CiLin. First, we design an algorithm based on an improved HowNet sememe similarity calculation, which comprehensively considers the influence of each part of the sememe on the overall meaning and improves HowNet-based word similarity by changing how each part of the sememe is calculated. At the same time, we adopt different strategies for the different results obtained in the CiLin similarity calculation. Experiments on the RG data set show that the modified Pearson coefficient of the method reaches 0.87.
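The abstract does not describe how the Pearson coefficient is modified, so the sketch below shows only the standard Pearson correlation that such evaluations compare human ratings and computed similarities with. The function name and the sample scores are hypothetical and are not taken from the RG data set.

```python
import math


def pearson(xs, ys):
    """Standard Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical human ratings vs. computed similarities for four word pairs.
    human = [3.9, 3.1, 1.2, 0.4]
    computed = [0.95, 0.70, 0.35, 0.10]
    print(round(pearson(human, computed), 3))
```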


2014 ◽  
Vol 7 (1) ◽  
pp. 48 ◽  
Author(s):  
Huihui Zhang ◽  
Zhengtao Yu ◽  
Longhua Shen ◽  
Jianyi Guo ◽  
Xudong Hong

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Qifeng Gong

Applying artificial intelligence to English requires processing large amounts of English text, but errors in English word similarity reduce overall translation accuracy and data processing efficiency. Therefore, this paper proposes an accurate estimation of English word similarity based on a semantic network, combining a variety of computing methods into a compound computing structure. The experimental results show that the error between the semantic network-based English word similarity calculation method and manual evaluation is small, and that the accuracy of English word similarity calculation is improved to a certain extent. In addition, compared with other English word similarity calculation methods, the semantic network-based method is more in line with people's cognition and understanding of knowledge, has higher reliability, and has practical value in the field of English.


2009 ◽  
Vol 5 (2) ◽  
pp. 57
Author(s):  
Gábor Kovács ◽  
Gábor Árpád Németh ◽  
Zoltán Pap ◽  
Mahadevan Subramaniam

This paper proposes a string edit distance based test selection method to generate compact test sets for telecommunications software. Following the results of previous research, a trace in a test set is considered redundant if its edit distance from the others is less than a given parameter. The algorithm first determines the minimum cardinality of the target test set in accordance with the provided parameter, then selects the test set with the highest sum of internal edit distances. The selection problem is reduced to an assignment problem in bipartite graphs.
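The redundancy criterion described above can be illustrated with a minimal sketch. It computes the Levenshtein edit distance with unit costs and then applies a simple greedy filter. The greedy filter and the function names are assumptions for illustration; the paper itself selects the set with the highest sum of internal edit distances via an assignment problem, which is not reproduced here.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def filter_redundant(traces, threshold):
    """Greedy filter (illustrative assumption, not the paper's method).

    A trace is kept only if its edit distance to every already kept
    trace is at least `threshold`; otherwise it is treated as redundant.
    """
    kept = []
    for trace in traces:
        if all(edit_distance(trace, k) >= threshold for k in kept):
            kept.append(trace)
    return kept


if __name__ == "__main__":
    # Hypothetical traces encoded as strings of event symbols.
    print(filter_redundant(["abc", "abd", "xyz"], threshold=2))
```

A greedy pass depends on the order of the input traces, which is one reason an optimization-based selection, as in the paper, may be preferable.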


2014 ◽  
Author(s):  
Ryan Cotterell ◽  
Nanyun Peng ◽  
Jason Eisner

2012 ◽  
Vol 12 ◽  
Author(s):  
Amanda Post Silveira

This is a preliminary study in which we investigate the acquisition of English as a second language (L2[1]) word stress by native speakers of Brazilian Portuguese (BP, L1[2]). In this paper, we show the results of a multiple-choice, forced-choice perception test in which native speakers of American English and native speakers of Dutch judged the production of English words bearing pre-final stress that were either cognates or non-cognates of BP words. The tokens were produced by native speakers of American English and by Brazilians who speak English as a second language. The results show that American and Dutch listeners were consistent in their judgments of native and non-native stress productions, and that both speaker groups produced variation in stress relative to the canonical pattern. However, the variability found in American English points to the prosodic patterns of English, whereas the variability found in Brazilian English points to the stress patterns of Portuguese. This occurs especially in words whose forms activate similar neighboring words in the L1. Transfer from the L1 appears at both the segmental and prosodic levels in BP English. [1] L2 stands for second language, foreign language, target language. [2] L1 stands for first language, mother tongue, source language.

