scholarly journals Modelling Semantic Categories Using Conceptual Neighborhood

2020 ◽  
Vol 34 (05) ◽  
pp. 7448-7455
Author(s):  
Zied Bouraoui ◽  
Jose Camacho-Collados ◽  
Luis Espinosa-Anke ◽  
Steven Schockaert

While many methods for learning vector space embeddings have been proposed in the field of Natural Language Processing, these methods typically do not distinguish between categories and individuals. Intuitively, if individuals are represented as vectors, we can think of categories as (soft) regions in the embedding space. Unfortunately, meaningful regions can be difficult to estimate, especially since we often have few examples of individuals that belong to a given category. To address this issue, we rely on the fact that different categories are often highly interdependent. In particular, categories often have conceptual neighbors, which are disjoint from but closely related to the given category (e.g. fruit and vegetable). Our hypothesis is that more accurate category representations can be learned by relying on the assumption that the regions representing such conceptual neighbors should be adjacent in the embedding space. We propose a simple method for identifying conceptual neighbors and then show that incorporating these conceptual neighbors indeed leads to more accurate region based representations.

Author(s):  
Santosh Kumar Mishra ◽  
Rijul Dhir ◽  
Sriparna Saha ◽  
Pushpak Bhattacharyya

Image captioning is the process of generating a textual description of an image that aims to describe the salient parts of the given image. It is an important problem, as it involves computer vision and natural language processing, where computer vision is used for understanding images, and natural language processing is used for language modeling. A lot of works have been done for image captioning for the English language. In this article, we have developed a model for image captioning in the Hindi language. Hindi is the official language of India, and it is the fourth most spoken language in the world, spoken in India and South Asia. To the best of our knowledge, this is the first attempt to generate image captions in the Hindi language. A dataset is manually created by translating well known MSCOCO dataset from English to Hindi. Finally, different types of attention-based architectures are developed for image captioning in the Hindi language. These attention mechanisms are new for the Hindi language, as those have never been used for the Hindi language. The obtained results of the proposed model are compared with several baselines in terms of BLEU scores, and the results show that our model performs better than others. Manual evaluation of the obtained captions in terms of adequacy and fluency also reveals the effectiveness of our proposed approach. Availability of resources : The codes of the article are available at https://github.com/santosh1821cs03/Image_Captioning_Hindi_Language ; The dataset will be made available: http://www.iitp.ac.in/∼ai-nlp-ml/resources.html .


2022 ◽  
Author(s):  
Stefan Bachhofner ◽  
Peb Ruswono Aryan ◽  
Bernhard Krabina ◽  
Robert David

This paper presents an on-going research where we studythe problem of embedding meta-data enriched graphs, with a focus onknowledge graphs in a vector space with transformer based deep neuralnetworks. Experimentally, we compare ceteris paribus the performance ofa transformer-based model with other non-transformer approaches. Dueto their recent success in natural language processing we hypothesizethat the former is superior in performance. We test this hypothesizesby comparing the performance of transformer embeddings with non-transformer embeddings on different downstream tasks. Our researchmight contribute to a better understanding of how random walks in-fluence the learning of features, which might be useful in the design ofdeep learning architectures for graphs when the input is generated withrandom walks.


Computers ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 22
Author(s):  
Frederik Bäumer ◽  
Joschka Kersting ◽  
Michaela Geierhos

The vision of On-the-Fly (OTF) Computing is to compose and provide software services ad hoc, based on requirement descriptions in natural language. Since non-technical users write their software requirements themselves and in unrestricted natural language, deficits occur such as inaccuracy and incompleteness. These deficits are usually met by natural language processing methods, which have to face special challenges in OTF Computing because maximum automation is the goal. In this paper, we present current automatic approaches for solving inaccuracies and incompletenesses in natural language requirement descriptions and elaborate open challenges. In particular, we will discuss the necessity of domain-specific resources and show why, despite far-reaching automation, an intelligent and guided integration of end users into the compensation process is required. In this context, we present our idea of a chat bot that integrates users into the compensation process depending on the given circumstances.


Author(s):  
Omer Levy

A fundamental challenge in natural-language processing is to represent words as mathematical entities that can be read, reasoned, and manipulated by computational models. The current leading approach represents words as vectors in a continuous real-valued space, in such a way that similarities in the vector space correlate with semantic similarities between words. This chapter surveys various frameworks and methods for acquiring word vectors, while tying together related ideas and concepts.


2021 ◽  
pp. 233-252
Author(s):  
Upendar Rao Rayala ◽  
Karthick Seshadri

Sentiment analysis is perceived to be a multi-disciplinary research domain composed of machine learning, artificial intelligence, deep learning, image processing, and social networks. Sentiment analysis can be used to determine opinions of the public about products and to find the customers' interest and their feedback through social networks. To perform any natural language processing task, the input text/comments should be represented in a numerical form. Word embeddings represent the given text/sentences/words as a vector that can be employed in performing subsequent natural language processing tasks. In this chapter, the authors discuss different techniques that can improve the performance of sentiment analysis using concepts and techniques like traditional word embeddings, sentiment embeddings, emoticons, lexicons, and neural networks. This chapter also traces the evolution of word embedding techniques with a chronological discussion of the recent research advancements in word embedding techniques.


Digital ◽  
2021 ◽  
Vol 1 (3) ◽  
pp. 145-161
Author(s):  
Kowshik Bhowmik ◽  
Anca Ralescu

This article presents a systematic literature review on quantifying the proximity between independently trained monolingual word embedding spaces. A search was carried out in the broader context of inducing bilingual lexicons from cross-lingual word embeddings, especially for low-resource languages. The returned articles were then classified. Cross-lingual word embeddings have drawn the attention of researchers in the field of natural language processing (NLP). Although existing methods have yielded satisfactory results for resource-rich languages and languages related to them, some researchers have pointed out that the same is not true for low-resource and distant languages. In this paper, we report the research on methods proposed to provide better representation for low-resource and distant languages in the cross-lingual word embedding space.


Sign in / Sign up

Export Citation Format

Share Document