Syntactic Coherence in Word Embedding Spaces

Word embeddings have recently become a vital part of many Natural Language Processing (NLP) systems. Word embeddings are a suite of techniques that represent the words of a language as vectors in an n-dimensional real space, and these vectors have been shown to encode a significant amount of syntactic and semantic information. When used in NLP systems, these representations have improved performance across a wide range of NLP tasks. However, it is not clear how syntactic properties interact with the more widely studied semantic properties of words, or which factors in the modeling formulation encourage embedding spaces to pick up the syntactic rather than the semantic behavior of words. We investigate several aspects of word embedding spaces and modeling assumptions that maximize syntactic coherence, that is, the degree to which words with similar syntactic properties form distinct neighborhoods in the embedding space. We do so in order to understand which of the existing models maximize syntactic coherence, making them a more reliable source for extracting syntactic category (POS) information. Our analysis shows that the syntactic coherence of S-CODE is superior to that of other more popular and more recent embedding techniques such as Word2vec, fastText, GloVe and LexVec, when measured under compatible parameter settings. Our investigation also gives deeper insight into the geometry of the embedding space with respect to syntactic coherence, and into how this is influenced by context size, word frequency, and the dimensionality of the embedding space.
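The notion of syntactic coherence described above can be illustrated with a small sketch. The measure used here (the share of each word's nearest neighbours that carry the same POS tag) is an assumption chosen for illustration and is not necessarily the metric used in the paper. The names `neighbor_purity`, `toy_vectors`, and `toy_tags` are hypothetical, and the vectors are hand-made rather than taken from any trained model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def neighbor_purity(vectors, tags, k=1):
    """Average fraction of each word's k nearest neighbours sharing its tag.

    This is an illustrative proxy for syntactic coherence, not the
    paper's definition.
    """
    total = 0.0
    for word, vec in vectors.items():
        others = [w for w in vectors if w != word]
        # Rank all other words by similarity to the current word.
        others.sort(key=lambda w: cosine(vec, vectors[w]), reverse=True)
        nearest = others[:k]
        total += sum(tags[w] == tags[word] for w in nearest) / len(nearest)
    return total / len(vectors)

# Toy, hand-made 2-d vectors (hypothetical, not from a trained model).
toy_vectors = {
    "run": [1.0, 0.1], "walk": [0.9, 0.2],
    "dog": [0.1, 1.0], "cat": [0.2, 0.9],
}
toy_tags = {"run": "VERB", "walk": "VERB", "dog": "NOUN", "cat": "NOUN"}
purity = neighbor_purity(toy_vectors, toy_tags, k=1)
```

On real embeddings one would typically use a larger `k` and a tagged vocabulary; a purity close to 1 would suggest that words of the same category cluster together under this proxy.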


Towards Robust Word Embeddings for Noisy Texts

Applied Sciences ◽

10.3390/app10196893 ◽

2020 ◽

Vol 10 (19) ◽

pp. 6893

Author(s):

Yerai Doval ◽

Jesús Vilares ◽

Carlos Gómez-Rodríguez

Keyword(s):

Social Media ◽

Word Embedding ◽

Simple Extension ◽

Word Embeddings ◽

Explicit Approach ◽

Wide Range

Research on word embeddings has mainly focused on improving their performance on standard corpora, disregarding the difficulties posed by noisy texts such as tweets and other types of non-standard writing from social media. In this work, we propose a simple extension to the skipgram model in which we introduce the concept of bridge-words: artificial words added to the model to strengthen the similarity between standard words and their noisy variants. Our new embeddings outperform baseline models on noisy texts across a wide range of evaluation tasks, both intrinsic and extrinsic, while retaining good performance on standard texts. To the best of our knowledge, this is the first explicit approach to dealing with these types of noisy texts at the word embedding level that goes beyond support for out-of-vocabulary words.


Multi-Sense Embeddings per Word

10.31219/osf.io/udfhn ◽

2020 ◽

Author(s):

Masashi Sugiyama

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Research Area ◽

Word Embedding ◽

The Other ◽

Word Embeddings ◽

Word Similarity ◽

Better Than ◽

Non Parametric

Recently, word embeddings have been used successfully in many natural language processing problems, and how to train a robust and accurate word embedding system efficiently is a popular research area. Since many, if not all, words have more than one sense, it is necessary to learn a separate vector for each sense of a word. Therefore, in this project, we explore two multi-sense word embedding models: the Multi-Sense Skip-gram (MSSG) model and the Non-parametric Multi-Sense Skip-gram (NP-MSSG) model. Furthermore, we propose an extension of the Multi-Sense Skip-gram model, called the Incremental Multi-Sense Skip-gram (IMSSG) model, which can learn the vectors of all senses of a word incrementally. We evaluate all the systems on a word similarity task and show that IMSSG performs better than the other models.


Sentiment-Aware Word Embedding for Emotion Classification

Applied Sciences ◽

10.3390/app9071334 ◽

2019 ◽

Vol 9 (7) ◽

pp. 1334 ◽

Cited By ~ 4

Author(s):

Xingliang Mao ◽

Shuai Chang ◽

Jinjing Shi ◽

Fangfang Li ◽

Ronghua Shi

Keyword(s):

Language Processing ◽

Word Embedding ◽

Emotional Word ◽

Word Embeddings ◽

Emotion Classification ◽

Emotional Information ◽

Input Text ◽

Classification Tasks ◽

Emotional Knowledge ◽

Emotional Lexicon

Word embeddings are effective intermediate representations for capturing semantic regularities between words in natural language processing (NLP) tasks. We propose a sentiment-aware word embedding for emotion classification, which integrates sentiment evidence into the emotional embedding component of a term vector. We take advantage of multiple types of emotional knowledge, such as existing emotional lexicons, to build emotional word vectors that represent emotional information. The emotional word vector is then combined with the traditional word embedding to construct a hybrid representation, which contains both semantic and emotional information and serves as the input to the emotion classification experiments. Our method maintains the interpretability of word embeddings and leverages external emotional information in addition to the input text sequences. Extensive results on several machine learning models show that the proposed methods can improve the accuracy of emotion classification tasks.
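The hybrid representation described above can be sketched as follows. The abstract only says that the emotional word vector is "combined" with the traditional embedding; plain concatenation is assumed here for illustration, and the zero-vector fallback for words missing from the lexicon is likewise an assumption. The function `hybrid_vector`, the two emotion categories, and all numeric values are hypothetical.

```python
def hybrid_vector(word, semantic, emotional, emo_dim):
    """Concatenate a word's semantic vector with its emotional vector.

    Words missing from the emotional lexicon get an all-zero emotional
    part (an assumption made for this sketch).
    """
    emotional_part = emotional.get(word, [0.0] * emo_dim)
    return list(semantic[word]) + list(emotional_part)

# Toy values, invented for illustration only.
semantic = {"happy": [0.2, 0.7, 0.1], "table": [0.9, 0.0, 0.3]}
# Hypothetical lexicon scores over two emotion categories: (joy, sadness).
emotional = {"happy": [1.0, 0.0]}

happy_vec = hybrid_vector("happy", semantic, emotional, emo_dim=2)
table_vec = hybrid_vector("table", semantic, emotional, emo_dim=2)
```

Keeping the emotional dimensions as a separate, labelled suffix is one way to preserve interpretability: each trailing component still corresponds to a named emotion category.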


A Causal Inference Method for Reducing Gender Bias in Word Embedding Relations

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6486 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9434-9441

Author(s):

Zekun Yang ◽

Juan Feng

Keyword(s):

Gender Bias ◽

Language Processing ◽

State Of The Art ◽

Word Embedding ◽

Coreference Resolution ◽

Word Embeddings ◽

Inference Method ◽

Sentence Level ◽

Statistical Dependency ◽

And Gender

Word embedding has become essential for natural language processing as it boosts the empirical performance of various tasks. However, recent research has found that gender bias is incorporated in neural word embeddings, and that downstream tasks relying on these biased word vectors also produce gender-biased results. While some gender-debiasing methods for word embeddings have been developed, these methods mainly focus on reducing the gender bias associated with the gender direction and fail to reduce the gender bias present in word embedding relations. In this paper, we design a simple causal approach for mitigating gender bias in word vector relations by utilizing the statistical dependency between gender-definition word embeddings and gender-biased word embeddings. Our method attains state-of-the-art results on gender-debiasing tasks, lexical- and sentence-level evaluation tasks, and downstream coreference resolution tasks.


NMT Multi-Sense Embeddings per Word

10.31219/osf.io/k623t ◽

2019 ◽

Author(s):

William Jin

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Research Area ◽

Word Embedding ◽

The Other ◽

Word Embeddings ◽

Word Similarity ◽

Better Than ◽

Non Parametric


Inducing Relational Knowledge from BERT

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6242 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7456-7463 ◽

Cited By ~ 3

Author(s):

Zied Bouraoui ◽

Jose Camacho-Collados ◽

Steven Schockaert

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Language Model ◽

Language Models ◽

Word Embeddings ◽

Relational Knowledge ◽

Wide Range ◽

Fine Tune ◽

Standard Word

One of the most remarkable properties of word embeddings is the fact that they capture certain types of semantic and syntactic relationships. Recently, pre-trained language models such as BERT have achieved groundbreaking results across a wide range of Natural Language Processing tasks. However, it is unclear to what extent such models capture relational knowledge beyond what is already captured by standard word embeddings. To explore this question, we propose a methodology for distilling relational knowledge from a pre-trained language model. Starting from a few seed instances of a given relation, we first use a large text corpus to find sentences that are likely to express this relation. We then use a subset of these extracted sentences as templates. Finally, we fine-tune a language model to predict whether a given word pair is likely to be an instance of some relation, when given an instantiated template for that relation as input.


Getting in Shape: Word Embedding SubSpaces

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/761 ◽

2019 ◽

Author(s):

Tianyuan Zhou ◽

João Sedoc ◽

Jordan Rodu

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Semantic Similarity ◽

Language Processing ◽

Theoretical Framework ◽

Word Embedding ◽

Word Embeddings ◽

Empirical Results ◽

Linear Alignment ◽

The Relationship

Many tasks in natural language processing require the alignment of word embeddings. Embedding alignment relies on the geometric properties of the manifold of word vectors. This paper focuses on supervised linear alignment and studies how the shape of the target embedding relates to the alignment. We assess the performance of aligned word vectors on semantic similarity tasks and find that the isotropy of the target embedding is critical to the alignment. Furthermore, aligning with isotropic noise can deliver satisfactory results. We provide a theoretical framework and guarantees that aid in the understanding of the empirical results.
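Supervised linear alignment, as discussed above, can be sketched with the orthogonal Procrustes solution. This specific technique is an assumption chosen for illustration, since the abstract does not say which linear alignment the authors use. The function name `procrustes_align` and the random data are hypothetical; the target is built as an exact rotation of the source so that the result is easy to check.

```python
import numpy as np

def procrustes_align(source, target):
    """Return the orthogonal W minimising ||source @ W - target||_F.

    Rows of `source` and `target` are vectors for the same words.
    """
    # SVD of the cross-covariance matrix gives the optimal orthogonal map.
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

rng = np.random.default_rng(0)
source = rng.normal(size=(50, 4))  # hypothetical source embedding
# Random orthogonal matrix used to construct a perfectly alignable target.
true_rotation, _ = np.linalg.qr(rng.normal(size=(4, 4)))
target = source @ true_rotation

w = procrustes_align(source, target)
residual = np.linalg.norm(source @ w - target)  # close to zero here
```

Restricting `W` to orthogonal matrices preserves distances and angles within the source space, which is why the shape of the target embedding matters for how well such an alignment can fit.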


Incorporating Extra Knowledge to Enhance Word Embedding

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/686 ◽

2020 ◽

Author(s):

Arpita Roy ◽

Shimei Pan

Keyword(s):

Natural Language ◽

Language Processing ◽

Semantic Relatedness ◽

Word Embedding ◽

Word Embeddings ◽

Future Directions ◽

Mathematical Representations ◽

Text Corpora ◽

Other Information ◽

Semantic Properties

Word embedding, a process for automatically learning mathematical representations of words from unlabeled text corpora, has gained a lot of attention recently. Since words are the basic units of a natural language, the more precisely we can represent the morphological, syntactic and semantic properties of words, the better we can support downstream Natural Language Processing (NLP) tasks. Since traditional word embeddings are mainly designed to capture the semantic relatedness between co-occurring words in a predefined context, they may not be effective at encoding other information that is important for different NLP applications. In this survey, we summarize recent advances in incorporating extra knowledge to enhance word embedding. We also identify the limitations of existing work and point out a few promising future directions.


Word Embedding for the French Natural Language in Health Care: Comparative Study

JMIR Medical Informatics ◽

10.2196/12310 ◽

2019 ◽

Vol 7 (3) ◽

pp. e12310 ◽

Cited By ~ 5

Author(s):

Emeric Dynomant ◽

Romain Lelong ◽

Badisse Dahamna ◽

Clément Massonnaud ◽

Gaétan Kerdelhué ◽

...

Keyword(s):

Natural Language ◽

Language Processing ◽

Feature Learning ◽

Discharge Summary ◽

Word Embedding ◽

University Hospital ◽

Training Time ◽

Formal Evaluation ◽

Wide Range ◽

Human Validation

Background: Word embedding technologies, a set of language modeling and feature learning techniques in natural language processing (NLP), are now used in a wide range of applications. However, no formal evaluation and comparison have been made of the ability of each of the 3 currently best-known unsupervised implementations (Word2Vec, GloVe, and FastText) to keep track of the semantic similarities existing between words when trained on the same dataset.

Objective: The aim of this study was to compare embedding methods trained on a corpus of French health-related documents produced in a professional context. The best method will then help us develop a new semantic annotator.

Methods: Unsupervised embedding models were trained on 641,279 documents originating from the Rouen University Hospital. These data are not structured and cover a wide range of documents produced in a clinical setting (discharge summaries, procedure reports, and prescriptions). In total, 4 rated evaluation tasks were defined (cosine similarity, odd one, analogy-based operations, and human formal evaluation) and applied to each model, along with embedding visualization.

Results: Word2Vec had the highest score on 3 out of 4 rated tasks (analogy-based operations, odd one similarity, and human validation), particularly with the skip-gram architecture.

Conclusions: Although this implementation had the best rate of conservation of semantic properties, each model has its own qualities and defects, such as the training time, which is very short for GloVe, or the conservation of morphological similarity observed with FastText. Models and test sets produced by this study will be the first to be publicly available through a graphical interface to help advance French biomedical research.
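Two of the rated tasks named above, the odd-one task and analogy-based operations, can be sketched on top of cosine similarity. The definitions below are common formulations assumed for illustration; the study's exact scoring protocol is not given in the abstract. The function names and the toy English vectors are hypothetical and do not come from the study's French clinical corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def odd_one_out(words, vectors):
    """Return the word with the lowest mean similarity to the others."""
    def mean_similarity(word):
        others = [o for o in words if o != word]
        return sum(cosine(vectors[word], vectors[o]) for o in others) / len(others)
    return min(words, key=mean_similarity)

def analogy(a, b, c, vectors):
    """Return the word closest to vec(b) - vec(a) + vec(c).

    The three query words are excluded from the candidates.
    """
    query = [vb - va + vc
             for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))

# Toy, hand-made 3-d vectors (hypothetical, not from a trained model).
vectors = {
    "man": [1.0, 0.0, 0.0], "woman": [0.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0], "queen": [0.0, 1.0, 1.0],
    "aspirin": [0.1, 0.1, -1.0],
}
odd = odd_one_out(["king", "queen", "aspirin"], vectors)
answer = analogy("man", "woman", "king", vectors)
```

With trained models, the same two functions would be run over curated test sets and scored by the fraction of items answered as expected.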


A Survey of Cross-lingual Word Embedding Models

Journal of Artificial Intelligence Research ◽

10.1613/jair.1.11640 ◽

2019 ◽

Vol 65 ◽

pp. 569-631 ◽

Cited By ~ 19

Author(s):

Sebastian Ruder ◽

Ivan Vulić ◽

Anders Søgaard

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Word Meaning ◽

Word Embedding ◽

Word Embeddings ◽

Objective Functions ◽

Future Challenges ◽

Cross Lingual ◽

Data Requirements

Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages. In this survey, we provide a comprehensive typology of cross-lingual word embedding models. We compare their data requirements and objective functions. The recurring theme of the survey is that many of the models presented in the literature optimize for the same objectives, and that seemingly different models are often equivalent, modulo optimization strategies, hyper-parameters, and such. We also discuss the different ways cross-lingual word embeddings are evaluated, as well as future challenges and research horizons.
