Specializing Word Embeddings (for Parsing) by Information Bottleneck (Extended Abstract)

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/658 ◽

2020 ◽

Author(s):

Xiang Lisa Li ◽

Jason Eisner

Keyword(s):

Dimensionality Reduction ◽

Semantic Information ◽

State Of The Art ◽

Word Embedding ◽

Discrete Version ◽

Word Embeddings ◽

Continuous Version ◽

Continuous Vector ◽

Information Bottleneck ◽

Art Performance

Pre-trained word embeddings like ELMo and BERT contain rich syntactic and semantic information, resulting in state-of-the-art performance on various tasks. We propose a very fast variational information bottleneck (VIB) method to nonlinearly compress these embeddings, keeping only the information that helps a discriminative parser. We compress each word embedding to either a discrete tag or a continuous vector. In the discrete version, our automatically compressed tags form an alternative tag set: we show experimentally that our tags capture most of the information in traditional POS tag annotations, but our tag sequences can be parsed more accurately at the same level of tag granularity. In the continuous version, we show experimentally that moderately compressing the word embeddings by our method yields a more accurate parser in 8 of 9 languages, unlike simple dimensionality reduction.

Download Full-text

Word Embeddings as Metric Recovery in Semantic Spaces

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00098 ◽

2016 ◽

Vol 4 ◽

pp. 273-286 ◽

Cited By ~ 15

Author(s):

Tatsunori B. Hashimoto ◽

David Alvarez-Melis ◽

Tommi S. Jaakkola

Keyword(s):

Random Walks ◽

Manifold Learning ◽

State Of The Art ◽

Inductive Reasoning ◽

Semantic Space ◽

Word Embedding ◽

Word Embeddings ◽

Recovery Algorithm ◽

Series Completion ◽

Semantic Spaces

Continuous word representations have been remarkably useful across NLP tasks but remain poorly understood. We ground word embeddings in semantic spaces studied in the cognitive-psychometric literature, taking these spaces as the primary objects to recover. To this end, we relate log co-occurrences of words in large corpora to semantic similarity assessments and show that co-occurrences are indeed consistent with an Euclidean semantic space hypothesis. Framing word embedding as metric recovery of a semantic space unifies existing word embedding algorithms, ties them to manifold learning, and demonstrates that existing algorithms are consistent metric recovery methods given co-occurrence counts from random walks. Furthermore, we propose a simple, principled, direct metric recovery algorithm that performs on par with the state-of-the-art word embedding and manifold learning methods. Finally, we complement recent focus on analogies by constructing two new inductive reasoning datasets—series completion and classification—and demonstrate that word embeddings can be used to solve them as well.

Download Full-text

AutoExtend: Combining Word Embeddings with Semantic Resources

Computational Linguistics ◽

10.1162/coli_a_00294 ◽

2017 ◽

Vol 43 (3) ◽

pp. 593-617 ◽

Cited By ~ 4

Author(s):

Sascha Rothe ◽

Hinrich Schütze

Keyword(s):

Semantic Information ◽

State Of The Art ◽

Word Sense Disambiguation ◽

Input Word ◽

Word Sense ◽

Word Embeddings ◽

Training Corpus ◽

Context Similarity ◽

Sense Disambiguation ◽

Semantic Resources

We present AutoExtend, a system that combines word embeddings with semantic resources by learning embeddings for non-word objects like synsets and entities and learning word embeddings that incorporate the semantic information from the resource. The method is based on encoding and decoding the word embeddings and is flexible in that it can take any word embeddings as input and does not need an additional training corpus. The obtained embeddings live in the same vector space as the input word embeddings. A sparse tensor formalization guarantees efficiency and parallelizability. We use WordNet, GermaNet, and Freebase as semantic resources. AutoExtend achieves state-of-the-art performance on Word-in-Context Similarity and Word Sense Disambiguation tasks.

Download Full-text

Appearance and Motion Enhancement for Video-Based Person Re-Identification

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6802 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11394-11401

Author(s):

Shuzhao Li ◽

Huimin Yu ◽

Haoji Hu

Keyword(s):

Semantic Information ◽

State Of The Art ◽

Complex Model ◽

The State ◽

Final Model ◽

Backbone Network ◽

Proposed Model ◽

Art Performance ◽

Attribute Recognition

In this paper, we propose an Appearance and Motion Enhancement Model (AMEM) for video-based person re-identification to enrich the two kinds of information contained in the backbone network in a more interpretable way. Concretely, human attribute recognition under the supervision of pseudo labels is exploited in an Appearance Enhancement Module (AEM) to help enrich the appearance and semantic information. A Motion Enhancement Module (MEM) is designed to capture the identity-discriminative walking patterns through predicting future frames. Despite a complex model with several auxiliary modules during training, only the backbone model plus two small branches are kept for similarity evaluation which constitute a simple but effective final model. Extensive experiments conducted on three popular video-based person ReID benchmarks demonstrate the effectiveness of our proposed model and the state-of-the-art performance compared with existing methods.

Download Full-text

A Causal Inference Method for Reducing Gender Bias in Word Embedding Relations

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6486 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9434-9441

Author(s):

Zekun Yang ◽

Juan Feng

Keyword(s):

Gender Bias ◽

Language Processing ◽

State Of The Art ◽

Word Embedding ◽

Coreference Resolution ◽

Word Embeddings ◽

Inference Method ◽

Sentence Level ◽

Statistical Dependency ◽

And Gender

Word embedding has become essential for natural language processing as it boosts empirical performances of various tasks. However, recent research discovers that gender bias is incorporated in neural word embeddings, and downstream tasks that rely on these biased word vectors also produce gender-biased results. While some word-embedding gender-debiasing methods have been developed, these methods mainly focus on reducing gender bias associated with gender direction and fail to reduce the gender bias presented in word embedding relations. In this paper, we design a causal and simple approach for mitigating gender bias in word vector relation by utilizing the statistical dependency between gender-definition word embeddings and gender-biased word embeddings. Our method attains state-of-the-art results on gender-debiasing tasks, lexical- and sentence-level evaluation tasks, and downstream coreference resolution tasks.

Download Full-text

Pronunciation-Enhanced Chinese Word Embedding

Cognitive Computation ◽

10.1007/s12559-021-09850-9 ◽

2021 ◽

Author(s):

Qinjuan Yang ◽

Haoran Xie ◽

Gary Cheng ◽

Fu Lee Wang ◽

Yanghui Rao

Keyword(s):

Sentiment Analysis ◽

Text Classification ◽

Semantic Information ◽

Word Embedding ◽

Chinese Characters ◽

Learning Method ◽

Word Embeddings ◽

Chinese Word ◽

Word Similarity ◽

Meaning Structure

AbstractChinese word embeddings have recently garnered considerable attention. Chinese characters and their sub-character components, which contain rich semantic information, are incorporated to learn Chinese word embeddings. Chinese characters can represent a combination of meaning, structure, and pronunciation. However, existing embedding learning methods focus on the structure and meaning of Chinese characters. In this study, we aim to develop an embedding learning method that can make complete use of the information represented by Chinese characters, including phonology, morphology, and semantics. Specifically, we propose a pronunciation-enhanced Chinese word embedding learning method, where the pronunciations of context characters and target characters are simultaneously encoded into the embeddings. Evaluation of word similarity, word analogy reasoning, text classification, and sentiment analysis validate the effectiveness of our proposed method.

Download Full-text

Invariant Representations through Adversarial Forgetting

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5850 ◽

2020 ◽

Vol 34 (04) ◽

pp. 4272-4279

Author(s):

Ayush Jaiswal ◽

Daniel Moyer ◽

Greg Ver Steeg ◽

Wael AbdAlmageed ◽

Premkumar Natarajan

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Empirical Results ◽

Information Bottleneck ◽

Novel Approach ◽

Adversarial Training ◽

Invariant Representations ◽

Art Performance ◽

Forgetting Mechanism

We propose a novel approach to achieving invariance for deep neural networks in the form of inducing amnesia to unwanted factors of data through a new adversarial forgetting mechanism. We show that the forgetting mechanism serves as an information-bottleneck, which is manipulated by the adversarial training to learn invariance to unwanted factors. Empirical results show that the proposed framework achieves state-of-the-art performance at learning invariance in both nuisance and bias settings on a diverse collection of datasets and tasks.

Download Full-text

Near-Lossless Binarization of Word Embeddings

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33017104 ◽

2019 ◽

Vol 33 ◽

pp. 7104-7111 ◽

Cited By ~ 3

Author(s):

Julien Tissier ◽

Christophe Gravier ◽

Amaury Habrard

Keyword(s):

Sentiment Analysis ◽

Semantic Similarity ◽

Text Classification ◽

Semantic Information ◽

State Of The Art ◽

Floating Point ◽

Word Embeddings ◽

Binary Vectors ◽

Starting Point ◽

Memory Footprint

Word embeddings are commonly used as a starting point in many NLP models to achieve state-of-the-art performances. However, with a large vocabulary and many dimensions, these floating-point representations are expensive both in terms of memory and calculations which makes them unsuitable for use on low-resource devices. The method proposed in this paper transforms real-valued embeddings into binary embeddings while preserving semantic information, requiring only 128 or 256 bits for each vector. This leads to a small memory footprint and fast vector operations. The model is based on an autoencoder architecture, which also allows to reconstruct original vectors from the binary ones. Experimental results on semantic similarity, text classification and sentiment analysis tasks show that the binarization of word embeddings only leads to a loss of ∼2% in accuracy while vector size is reduced by 97%. Furthermore, a top-k benchmark demonstrates that using these binary vectors is 30 times faster than using real-valued vectors.

Download Full-text

ACV-tree: A New Method for Sentence Similarity Modeling

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/575 ◽

2018 ◽

Cited By ~ 7

Author(s):

Yuquan Le ◽

Zhi-Jie Wang ◽

Zhe Quan ◽

Jiawei He ◽

Bin Yao

Keyword(s):

Language Processing ◽

Semantic Information ◽

State Of The Art ◽

Word Embeddings ◽

The Core ◽

Tree Kernel ◽

Network Methods ◽

Syntactic Information ◽

Sentence Similarity ◽

Attention Weight

Sentence similarity modeling lies at the core of many natural language processing applications, and thus has received much attention. Owing to the success of word embeddings, recently, popular neural network methods have achieved sentence embedding, obtaining attractive performance. Nevertheless, most of them focused on learning semantic information and modeling it as a continuous vector, while the syntactic information of sentences has not been fully exploited. On the other hand, prior works have shown the benefits of structured trees that include syntactic information, while few methods in this branch utilized the advantages of word embeddings and another powerful technique ? attention weight mechanism. This paper makes the first attempt to absorb their advantages by merging these techniques in a unified structure, dubbed as ACV-tree. Meanwhile, this paper develops a new tree kernel, known as ACVT kernel, that is tailored for sentence similarity measure based on the proposed structure. The experimental results, based on 19 widely-used datasets, demonstrate that our model is effective and competitive, compared against state-of-the-art models.

Download Full-text

Key phrase Extraction by Improving TextRank with an Integration of Word Embedding and Syntactic Information

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999200820155846 ◽

2020 ◽

Vol 13 ◽

Author(s):

Sheng Zhang ◽

Qi Luo ◽

Yukun Feng ◽

Ke Ding ◽

Daniela Gifu ◽

...

Keyword(s):

Semantic Information ◽

Performance Enhancement ◽

Word Embedding ◽

The Other ◽

Test Set ◽

Pagerank Algorithm ◽

Phrase Extraction ◽

Extraction Algorithm ◽

Syntactic Information ◽

Key Phrase Extraction

Background: As a known key phrase extraction algorithm, TextRank is an analogue of PageRank algorithm, which relied heavily on the statistics of term frequency in the manner of co-occurrence analysis. Objective: The frequency-based characteristic made it a neck-bottle for performance enhancement, and various improved TextRank algorithms were proposed in the recent years. Most of improvements incorporated semantic information into key phrase extraction algorithm and achieved improvement. Method: In this research, taking both syntactic and semantic information into consideration, we integrated syntactic tree algorithm and word embedding and put forward an algorithm of Word Embedding and Syntactic Information Algorithm (WESIA), which improved the accuracy of the TextRank algorithm. Results: By applying our method on a self-made test set and a public test set, the result implied that the proposed unsupervised key phrase extraction algorithm outperformed the other algorithms to some extent.

Download Full-text

Learning emotional word embeddings for sentiment analysis

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201993 ◽

2021 ◽

pp. 1-13

Author(s):

Qingtian Zeng ◽

Xishi Zhao ◽

Xiaohui Hu ◽

Hua Duan ◽

Zhongying Zhao ◽

...

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

State Of The Art ◽

Research Problem ◽

Emotional Word ◽

Classification Model ◽

Data Sets ◽

Word Embeddings ◽

Real World Data ◽

Text Documents

Word embeddings have been successfully applied in many natural language processing tasks due to its their effectiveness. However, the state-of-the-art algorithms for learning word representations from large amounts of text documents ignore emotional information, which is a significant research problem that must be addressed. To solve the above problem, we propose an emotional word embedding (EWE) model for sentiment analysis in this paper. This method first applies pre-trained word vectors to represent document features using two different linear weighting methods. Then, the resulting document vectors are input to a classification model and used to train a text sentiment classifier, which is based on a neural network. In this way, the emotional polarity of the text is propagated into the word vectors. The experimental results on three kinds of real-world data sets demonstrate that the proposed EWE model achieves superior performances on text sentiment prediction, text similarity calculation, and word emotional expression tasks compared to other state-of-the-art models.

Download Full-text