Word similarity metrics and multilateral comparison

Author(s):  
Brett Kessler
1997 ◽  
Vol 102 (5) ◽  
pp. 3189-3189 ◽  
Author(s):  
Paul Iverson ◽  
Edward T. Auer ◽  
Lynne E. Bernstein

2019 ◽  
Vol 78 (14) ◽  
pp. 1249-1261
Author(s):  
O. Rubel ◽  
S. K. Abramov ◽  
V. V. Abramova ◽  
V. V. Lukin

2012 ◽  
Vol 12 ◽  
Author(s):  
Amanda Post Silveira

This is a preliminary study investigating the acquisition of English as a second language (L2[1]) word stress by native speakers of Brazilian Portuguese (BP, L1[2]). We report the results of a multiple-choice, forced-choice perception test in which native speakers of American English and native speakers of Dutch judged productions of English words bearing pre-final stress, including both cognates and non-cognates of BP words. The tokens were produced by native speakers of American English and by Brazilians who speak English as a second language. The results show that American and Dutch listeners were consistent in their judgments of native and non-native stress productions, and that both speaker groups produced variation in stress relative to the canonical pattern. However, the variability found in American English points to the prosodic patterns of English, whereas the variability found in Brazilian English points to the stress patterns of Portuguese. This occurs especially in words whose forms activate similar neighboring words in the L1. Transfer from the L1 appears at both the segmental and prosodic levels in BP English. [1] L2 stands for second language, foreign language, target language. [2] L1 stands for first language, mother tongue, source language.


Author(s):  
Herry Sujaini

Extended Word Similarity Based (EWSB) clustering is a word clustering algorithm based on word similarity values computed from a corpus. One benefit of clustering with this algorithm is improved statistical machine translation output. Previous research showed that the EWSB algorithm could improve an Indonesian-English translator when the algorithm was applied to Indonesian as the target language. This paper discusses the results of applying the EWSB algorithm to an Indonesian-to-Minang statistical machine translator, with the algorithm applied to Minang as the target language. The results indicate that the EWSB algorithm is quite effective when Minang is the target language, improving translation accuracy by 6.36%.


2021 ◽  
Vol 11 (5) ◽  
pp. 2040-2049
Author(s):  
Vinaya Kumar Katneni ◽  
Mudagandur S. Shekhar ◽  
Ashok Kumar Jangam ◽  
Balasubramanian C. Paran ◽  
Ashok Selvaraj ◽  
...  

Author(s):  
Kun Wang ◽  
Yanpeng Cui ◽  
Jianwei Hu ◽  
Yu Zhang ◽  
Wei Zhao ◽  
...  

2020 ◽  
Vol 16 (3) ◽  
pp. 263-290
Author(s):  
Hui Guan ◽  
Chengzhen Jia ◽  
Hongji Yang

Since computing semantic similarity aims to simulate human thinking, semantic dissimilarity must play a part in this process. In this paper, we present a new approach to measuring semantic similarity that incorporates dissimilarity into the computation. Specifically, the proposed measures explore potential antonymy in the hierarchical structure of WordNet to represent the dissimilarity between concepts, and then combine this dissimilarity with the results of existing methods to obtain semantic similarity scores. The relation between the parameters and the correlation value is discussed in detail. The proposed model is then applied at different levels of text granularity to validate its similarity measurements. Experimental results show that the proposed approach not only achieves a high correlation with human ratings but also improves on existing path-distance-based methods at the word similarity level, while also correcting an existing sentence similarity method in some cases on the Microsoft Research Paraphrase Corpus and the SemEval-2014 data set.
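
As a rough illustration of the general idea in this abstract (a path-based similarity score discounted by an antonymy-based dissimilarity), the following is a minimal sketch only. It is not the authors' measure: the toy taxonomy `HYPERNYMS`, the antonym set `ANTONYMS`, the weight `alpha`, and the multiplicative combination rule are all assumptions made for illustration, and a hand-written graph stands in for WordNet.

```python
from collections import deque

# Toy hypernym graph (child -> parent); a hypothetical stand-in for WordNet.
HYPERNYMS = {
    "hot": "temperature",
    "cold": "temperature",
    "warm": "temperature",
    "temperature": "property",
    "big": "size",
    "size": "property",
}

# Hypothetical antonym pairs, stored in both directions.
ANTONYMS = {("hot", "cold"), ("cold", "hot")}


def neighbors(node):
    """Return the nodes adjacent to `node` in the undirected taxonomy."""
    adjacent = set()
    if node in HYPERNYMS:
        adjacent.add(HYPERNYMS[node])
    adjacent.update(child for child, parent in HYPERNYMS.items() if parent == node)
    return adjacent


def path_length(a, b):
    """Breadth-first search: number of edges between a and b, or None."""
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def path_similarity(a, b):
    """Path-distance similarity: 1 / (1 + shortest path length)."""
    dist = path_length(a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def dissimilarity(a, b):
    """Assumed dissimilarity: 1.0 for a listed antonym pair, else 0.0."""
    return 1.0 if (a, b) in ANTONYMS else 0.0


def combined_similarity(a, b, alpha=0.5):
    """Assumed combination rule: discount path similarity by alpha * dissimilarity."""
    return path_similarity(a, b) * (1.0 - alpha * dissimilarity(a, b))
```

In this toy graph, "hot" and "cold" are as close in the taxonomy as "hot" and "warm", so a pure path measure cannot separate them; the antonymy discount is what ranks "hot"/"warm" above "hot"/"cold".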


2021 ◽  
Vol 25 (4) ◽  
pp. 763-787
Author(s):  
Alladoumbaye Ngueilbaye ◽  
Hongzhi Wang ◽  
Daouda Ahmat Mahamat ◽  
Ibrahim A. Elgendy ◽  
Sahalu B. Junaidu

Knowledge extraction, data mining, e-learning, and web application platforms use heterogeneous and distributed data. The proliferation of these multifaceted platforms raises many challenges, such as high scalability, the coexistence of complex similarity metrics, and the need for data quality evaluation. In this study, an extended, complete formal taxonomy and algorithms for detecting and correcting contextual data quality anomalies were developed and implemented on structured data. Our methods detected and corrected more data anomalies than existing taxonomy techniques, and also highlighted a shortcoming of the Support Vector Machine (SVM). The proposed techniques will therefore be relevant to detecting and correcting errors in large contextual data (Big Data).

