Automated Short-Answer Grading using Semantic Similarity based on Word Embedding

2021 ◽  
Vol 12 (3) ◽  
pp. 571
Author(s):  
Fetty Fitriyanti Lubis ◽  
Mutaqin Mutaqin ◽  
Atina Putri ◽  
Dana Waskita ◽  
Tri Sulistyaningtyas ◽  
...  
Author(s):  
Tianyuan Zhou ◽  
João Sedoc ◽  
Jordan Rodu

Many tasks in natural language processing require the alignment of word embeddings. Embedding alignment relies on the geometric properties of the manifold of word vectors. This paper focuses on supervised linear alignment and studies how the shape of the target embedding relates to the quality of the alignment. We assess the performance of aligned word vectors on semantic similarity tasks and find that the isotropy of the target embedding is critical to the alignment. Furthermore, aligning with isotropic noise can deliver satisfactory results. We provide a theoretical framework and guarantees that aid in understanding the empirical results.
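As a rough illustration of supervised linear alignment, the sketch below uses orthogonal Procrustes, one common closed-form choice; the abstract does not say which linear map the authors fit, so this is an assumption. The function names `procrustes_align` and `isotropy_ratio` are hypothetical, the data is synthetic, and the singular-value ratio is only a crude proxy for isotropy, not the paper's measure.

```python
import numpy as np


def procrustes_align(source, target):
    """Return the orthogonal W minimizing ||source @ W - target||_F.

    Rows of `source` and `target` are vectors for the same words,
    in the same order (the supervision signal).
    """
    # The SVD of the cross-covariance matrix gives the closed-form solution.
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def isotropy_ratio(embedding):
    """Crude isotropy proxy (an assumption, not the paper's measure):
    smallest over largest singular value of the centered matrix.
    Values near 1 mean variance is spread evenly across directions."""
    centered = embedding - embedding.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return float(singular_values[-1] / singular_values[0])


# Synthetic demo: the source is an exactly rotated copy of the target.
rng = np.random.default_rng(0)
target = rng.standard_normal((200, 5))
rotation, _ = np.linalg.qr(rng.standard_normal((5, 5)))
source = target @ rotation.T

w = procrustes_align(source, target)
alignment_error = float(np.linalg.norm(source @ w - target))
target_isotropy = isotropy_ratio(target)
```

In this noise-free setting the recovered map undoes the rotation, so `alignment_error` is close to zero. With real embeddings the residual stays positive, and comparing it across targets of different `isotropy_ratio` is one way to probe the effect the abstract describes.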


2021 ◽  
pp. 1-12
Author(s):  
Fuqiang Zhao ◽  
Zhengyu Zhu ◽  
Ping Han

This paper presents DFRVec, a novel model for measuring semantic similarity between words that encodes several kinds of semantic information about a word in WordNet into a vector space. First, three sub-models are proposed: 1) DefVec, which encodes the definitions of a word in WordNet; 2) FormVec, which encodes the part of speech (POS) of a word in WordNet; 3) RelVec, which encodes the relations of a word in WordNet. Then, the three sub-models are combined with an existing word embedding to form the new model for generating the vector of a word. Finally, based on DFRVec and the path information in WordNet, a new method, DFRVec+Path, is presented for measuring semantic similarity between words. Experiments on ten benchmark datasets show that DFRVec+Path can outperform many existing methods on semantic similarity measurement.


2020 ◽  
Author(s):  
Remo De Oliveira Gresta ◽  
Elder Cirilo

Identifiers are one of the most important sources of domain information in software development. It is therefore recognized that the proper use of names directly affects the code's comprehensibility, maintainability, and quality. Our goal in this work is to expand the current knowledge about names by considering not only their quality but also their contextual similarity. To that end, we extracted names from four large-scale open-source projects written in Java. We then computed the semantic similarity between classes and their attributes/variables using fastText, a word embedding algorithm. As a result, we observed that source code, in general, preserves an acceptable level of contextual similarity, that developers avoid using names outside the default dictionary (e.g., domain), and that files with more changes and maintained by distinct contributors tend to have better contextual similarity.
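The similarity computation between a class name and an attribute name could look roughly like the sketch below. It is not the authors' pipeline: the abstract does not say how identifiers are tokenized or how token vectors are combined, so splitting on camelCase and averaging are assumptions. `TOY_VECTORS` is a hypothetical hand-made table standing in for a trained fastText model, and every function name is illustrative.

```python
import re

import numpy as np

# Hypothetical toy vectors standing in for a trained embedding model.
TOY_VECTORS = {
    "order": np.array([0.9, 0.1, 0.0]),
    "item": np.array([0.8, 0.2, 0.1]),
    "total": np.array([0.7, 0.3, 0.0]),
    "socket": np.array([0.0, 0.1, 0.9]),
}


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]


def embed(name, vectors):
    """Average the vectors of the known tokens; None if no token is known."""
    known = [vectors[t] for t in split_identifier(name) if t in vectors]
    if not known:
        return None
    return np.mean(known, axis=0)


def name_similarity(class_name, attribute_name, vectors=TOY_VECTORS):
    """Cosine similarity between two identifiers, or None if either is unknown."""
    a = embed(class_name, vectors)
    b = embed(attribute_name, vectors)
    if a is None or b is None:
        return None
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


related = name_similarity("OrderItem", "orderTotal")
unrelated = name_similarity("OrderItem", "socket")
```

With these toy vectors, `related` comes out higher than `unrelated`. A real fastText model would also produce vectors for out-of-vocabulary tokens from character n-grams, which this lookup table does not imitate.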


2021 ◽  
Vol 193 ◽  
pp. 92-101
Author(s):  
MD. Asif Iqbal ◽  
Omar Sharif ◽  
Mohammed Moshiul Hoque ◽  
Iqbal H. Sarkar
