Embedding Metadata-Enriched Graphs

2022
Author(s):  
Stefan Bachhofner ◽  
Peb Ruswono Aryan ◽  
Bernhard Krabina ◽  
Robert David

This paper presents on-going research in which we study the problem of embedding metadata-enriched graphs, with a focus on knowledge graphs, in a vector space with transformer-based deep neural networks. Experimentally, we compare ceteris paribus the performance of a transformer-based model with other, non-transformer approaches. Given their recent success in natural language processing, we hypothesize that the former is superior in performance. We test this hypothesis by comparing the performance of transformer embeddings with non-transformer embeddings on different downstream tasks. Our research might contribute to a better understanding of how random walks influence the learning of features, which might be useful in the design of deep learning architectures for graphs when the input is generated with random walks.
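
Both families of models in this comparison consume the same kind of input: token sequences produced by random walks over the graph. The sketch below is not the authors' code; the toy graph, edge labels, and walk parameters are invented for illustration. It shows how a metadata-enriched walk corpus can be generated before being fed either to a transformer encoder or to a skip-gram-style baseline.

    import random

    # Toy knowledge graph as an adjacency list; the edge labels are the
    # "metadata" that an enriched walk interleaves with node identifiers.
    graph = {
        "Vienna":  [("capitalOf", "Austria"), ("locatedIn", "Europe")],
        "Austria": [("memberOf", "EU"), ("capital", "Vienna")],
        "Europe":  [("contains", "Austria")],
        "EU":      [("hasMember", "Austria")],
    }

    def random_walk(start, length, rng):
        """Generate one metadata-enriched walk: node, edge label, node, ..."""
        walk = [start]
        current = start
        for _ in range(length):
            neighbors = graph.get(current, [])
            if not neighbors:
                break
            edge, nxt = rng.choice(neighbors)
            walk.extend([edge, nxt])  # interleave edge labels with nodes
            current = nxt
        return walk

    rng = random.Random(42)
    corpus = [random_walk(node, 4, rng) for node in graph for _ in range(10)]
    # Each walk is now a "sentence" of graph tokens that either model
    # family can embed, keeping the input fixed across the comparison.
    print(corpus[0])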

Author(s):  
Omer Levy

A fundamental challenge in natural language processing is to represent words as mathematical entities that can be read, reasoned about, and manipulated by computational models. The current leading approach represents words as vectors in a continuous real-valued space, such that similarities in the vector space correlate with semantic similarities between words. This chapter surveys various frameworks and methods for acquiring word vectors, while tying together related ideas and concepts.
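
The key property, geometric proximity standing in for semantic similarity, is usually measured with cosine similarity. Below is a minimal illustration; the three-dimensional vectors are made up purely for demonstration.

    import math

    def cosine(u, v):
        """Cosine of the angle between two vectors: 1.0 means same direction."""
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    # Invented toy vectors; real word vectors have hundreds of dimensions.
    vectors = {
        "king":  [0.8, 0.6, 0.1],
        "queen": [0.7, 0.7, 0.2],
        "apple": [0.1, 0.2, 0.9],
    }

    print(cosine(vectors["king"], vectors["queen"]))  # high: related words
    print(cosine(vectors["king"], vectors["apple"]))  # low: unrelated words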


Digital
2021
Vol 1 (3)
pp. 145-161
Author(s):  
Kowshik Bhowmik ◽  
Anca Ralescu

This article presents a systematic literature review on quantifying the proximity between independently trained monolingual word embedding spaces. A search was carried out in the broader context of inducing bilingual lexicons from cross-lingual word embeddings, especially for low-resource languages, and the returned articles were then classified. Cross-lingual word embeddings have drawn the attention of researchers in the field of natural language processing (NLP). Although existing methods have yielded satisfactory results for resource-rich languages and languages related to them, some researchers have pointed out that the same is not true for low-resource and distant languages. In this paper, we report on methods proposed to better represent low-resource and distant languages in the cross-lingual word embedding space.
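
One standard technique in this literature for relating two independently trained spaces is orthogonal Procrustes alignment: learn an orthogonal map from a seed dictionary and measure how well it transfers. The sketch below is a generic illustration of that idea, not a method from any specific surveyed paper; the matrices are random placeholders for real seed-dictionary vectors.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 10))   # source-language seed vectors
    # Hidden orthogonal rotation standing in for the true cross-lingual map.
    true_W = np.linalg.qr(rng.standard_normal((10, 10)))[0]
    Y = X @ true_W                      # target-language seed vectors

    # Solve min_W ||XW - Y||_F subject to W orthogonal:
    # the closed-form solution is W = U V^T, where U S V^T = SVD(X^T Y).
    U, _, Vt = np.linalg.svd(X.T @ Y)
    W = U @ Vt

    # In this noiseless toy setting the mapping is recovered exactly;
    # with real embeddings the residual quantifies how far apart the
    # two monolingual spaces are.
    print(np.allclose(X @ W, Y))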


2020
Vol 34 (05)
pp. 7448-7455
Author(s):  
Zied Bouraoui ◽  
Jose Camacho-Collados ◽  
Luis Espinosa-Anke ◽  
Steven Schockaert

While many methods for learning vector space embeddings have been proposed in the field of Natural Language Processing, these methods typically do not distinguish between categories and individuals. Intuitively, if individuals are represented as vectors, we can think of categories as (soft) regions in the embedding space. Unfortunately, meaningful regions can be difficult to estimate, especially since we often have few examples of individuals that belong to a given category. To address this issue, we rely on the fact that different categories are often highly interdependent. In particular, categories often have conceptual neighbors, which are disjoint from but closely related to the given category (e.g., fruit and vegetable). Our hypothesis is that more accurate category representations can be learned by relying on the assumption that the regions representing such conceptual neighbors should be adjacent in the embedding space. We propose a simple method for identifying conceptual neighbors and then show that incorporating these conceptual neighbors indeed leads to more accurate region-based representations.
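
As a toy illustration of this intuition (not the paper's actual model), the sketch below decides category membership relative to a conceptual neighbor's centroid rather than by an absolute distance threshold, so the neighbor bounds the category's region where the two meet. All vectors are invented two-dimensional points.

    import numpy as np

    # Few known individuals per category, as the abstract assumes.
    fruit = np.array([[1.0, 2.0], [1.2, 1.8], [0.9, 2.2]])
    vegetable = np.array([[2.5, 1.0], [2.8, 1.2], [2.4, 0.8]])  # neighbor

    def in_category(x, members, neighbor_members):
        """x is in the category if it lies closer to the category's
        centroid than to the conceptual neighbor's centroid, so the
        boundary between the two regions is fixed by the neighbor."""
        d_cat = np.linalg.norm(x - members.mean(axis=0))
        d_nbr = np.linalg.norm(x - neighbor_members.mean(axis=0))
        return d_cat < d_nbr

    query = np.array([1.6, 1.6])  # a borderline individual
    print(in_category(query, fruit, vegetable))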


Author(s):  
Carlos Periñán-Pascual

The development of a model to quantify semantic similarity and relatedness between words has been the major focus of many studies in various fields, e.g. psychology, linguistics, and natural language processing. Unlike the measures proposed by most previous research, this article aims at automatically estimating the strength of association between words that may or may not be semantically related. We demonstrate that the performance of the model depends not only on the combination of independently constructed word embeddings (namely, corpus- and network-based embeddings) but also on the way these word vectors interact. The research concludes that the weighted average of the cosine-similarity coefficients derived from independent word embeddings in a double vector space tends to yield high correlations with human judgements. Moreover, we demonstrate that evaluating word associations through a measure that relies not only on the rank ordering of word pairs but also on the strength of associations can reveal findings that go unnoticed by traditional measures such as Spearman’s and Pearson’s correlation coefficients.
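
The scoring idea described here reduces to a simple computation. The sketch below is a hedged reconstruction, not the article's implementation: the toy vectors stand in for corpus- and network-based embeddings, and the weight alpha is an illustrative parameter rather than a value from the paper.

    import numpy as np

    # Two independently constructed embedding spaces for the same words.
    corpus_emb = {"coffee": np.array([0.9, 0.1, 0.3]),
                  "cup":    np.array([0.8, 0.2, 0.4])}
    network_emb = {"coffee": np.array([0.2, 0.9, 0.1]),
                   "cup":    np.array([0.3, 0.8, 0.2])}

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    def association(w1, w2, alpha=0.5):
        """Weighted average of the per-space cosine coefficients:
        one similarity from the corpus space, one from the network space."""
        return (alpha * cosine(corpus_emb[w1], corpus_emb[w2])
                + (1 - alpha) * cosine(network_emb[w1], network_emb[w2]))

    print(association("coffee", "cup"))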

