Mention Recommendation for Twitter with End-to-end Memory Network

Author(s):  
Haoran Huang ◽  
Qi Zhang ◽  
Xuanjing Huang

In this study, we investigated the problem of recommending usernames when people use the "@" sign to mention others in Twitter-like social media. With the extremely rapid development of social networking services, this problem has received considerable attention in recent years. Previous methods have studied the problem from different aspects. Because most Twitter-like microblogging services limit the length of posts, statistical learning methods may be affected by the problems of word sparseness and synonymy. Although recent progress in neural word embedding methods has advanced the state of the art in many natural language processing tasks, the benefits of word embedding have not been taken into consideration for this problem. In this work, we proposed a novel end-to-end memory network architecture to perform this task. We incorporated users' interests through an external memory, and also applied a hierarchical attention mechanism to model those interests better. Experimental results on a dataset we collected from Twitter demonstrated that the proposed method outperforms state-of-the-art approaches.
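As an illustration of the kind of architecture this abstract describes, here is a minimal PyTorch sketch of a memory network with two-level (word, then tweet) attention over a candidate user's history; all class names, dimensions, and the bilinear scoring head are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MentionMemoryNetwork(nn.Module):
    """Illustrative end-to-end memory network: scores a candidate user for
    a query post by attending over that user's historical tweets (the
    external memory) with a two-level (word, then tweet) attention."""

    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim, padding_idx=0)
        self.word_attn = nn.Linear(dim, 1)   # word-level attention scores
        self.out = nn.Bilinear(dim, dim, 1)  # query-memory matching score

    def encode(self, tokens):                # tokens: (..., seq_len)
        e = self.embed(tokens)               # (..., seq_len, dim)
        a = F.softmax(self.word_attn(e).squeeze(-1), dim=-1)
        return torch.einsum('...s,...sd->...d', a, e)  # attention-weighted sum

    def forward(self, query, history):
        # query: (batch, q_len); history: (batch, n_tweets, t_len)
        q = self.encode(query)               # (batch, dim)
        m = self.encode(history)             # (batch, n_tweets, dim)
        # tweet-level attention conditioned on the query post
        p = F.softmax(torch.einsum('bd,bnd->bn', q, m), dim=-1)
        u = torch.einsum('bn,bnd->bd', p, m) # user-interest vector
        return self.out(q, u).squeeze(-1)    # relevance score per candidate
```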

2020 ◽  
Vol 10 (12) ◽  
pp. 4386 ◽  
Author(s):  
Sandra Rizkallah ◽  
Amir F. Atiya ◽  
Samir Shaheen

Embedding words from a dictionary as vectors in a space has become an active research field, due to its many uses in natural language processing applications. Distances between the vectors should reflect the relatedness between the corresponding words. The problem with existing word embedding methods is that they often fail to distinguish between synonymous, antonymous, and unrelated word pairs; meanwhile, polarity detection is crucial for applications such as sentiment analysis. In this work we propose an embedding approach designed to capture polarity. The approach embeds the word vectors on a sphere, where the dot product between any two vectors represents their similarity: vectors of synonymous words lie close to each other on the sphere, while a word and its antonym lie at opposite poles. The vectors are computed with a simple relaxation algorithm. The proposed word embedding successfully distinguishes between synonyms, antonyms, and unrelated word pairs, achieving results that are better than those of some state-of-the-art techniques and competitive with the others.
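The relaxation idea lends itself to a compact sketch. The following toy NumPy implementation (our assumption for illustration, not the authors' algorithm) keeps all vectors on the unit sphere, pulls synonym pairs toward a dot product of +1, and pushes antonym pairs toward -1:

```python
import numpy as np

def relax_on_sphere(n_words, syn_pairs, ant_pairs, dim=50, lr=0.1, epochs=200, seed=0):
    """Toy relaxation: keep all word vectors on the unit sphere, pull
    synonyms together (target dot product +1) and push antonyms to
    opposite poles (target dot product -1). Illustrative only."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_words, dim))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    for _ in range(epochs):
        for pairs, target in ((syn_pairs, 1.0), (ant_pairs, -1.0)):
            for i, j in pairs:
                # move each vector toward (or away from) its partner
                err = target - W[i] @ W[j]
                W[i] += lr * err * W[j]
                W[j] += lr * err * W[i]
                # project back onto the sphere after every update
                W[i] /= np.linalg.norm(W[i])
                W[j] /= np.linalg.norm(W[j])
    return W

W = relax_on_sphere(4, syn_pairs=[(0, 1)], ant_pairs=[(0, 2)])
print(W[0] @ W[1], W[0] @ W[2])  # ~ +1 for the synonyms, ~ -1 for the antonyms
```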


2020 ◽  
Vol 32 (23) ◽  
pp. 17309-17320
Author(s):  
Rolandos Alexandros Potamias ◽  
Georgios Siolas ◽  
Andreas-Georgios Stafylopatis

Abstract: Figurative language (FL) is ubiquitous in social media discussion forums and chats, posing extra challenges to sentiment analysis. Identification of FL schemas in short texts remains largely an unresolved issue in the broader field of natural language processing, mainly due to their contradictory and metaphorical content. The main FL expression forms are sarcasm, irony, and metaphor. In the present paper, we employ advanced deep learning methodologies to tackle the problem of identifying these FL forms. Significantly extending our previous work (Potamias et al., in: International Conference on Engineering Applications of Neural Networks, Springer, Berlin, pp 164–175, 2019), we propose a neural network methodology that builds on a recently proposed pre-trained transformer-based network architecture, further enhanced with a recurrent convolutional neural network. With this setup, data preprocessing is kept to a minimum. The performance of the devised hybrid neural architecture is tested on four benchmark datasets and contrasted with other relevant state-of-the-art methodologies and systems. Results demonstrate that the proposed methodology achieves state-of-the-art performance on all benchmark datasets, outperforming all other methodologies and published studies, often by a large margin.
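A plausible reading of this setup, sketched below in PyTorch with the Hugging Face transformers library, stacks a recurrent-convolutional head on a pre-trained transformer; the roberta-base checkpoint and all layer sizes are illustrative assumptions rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class TransformerRCNN(nn.Module):
    """Sketch of a hybrid classifier in the spirit of the paper: a
    pre-trained transformer provides contextual token representations,
    and a recurrent-convolutional head classifies the sequence."""

    def __init__(self, model_name="roberta-base", n_classes=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.rnn = nn.LSTM(hidden, 128, batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(256, 64, kernel_size=3, padding=1)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, input_ids, attention_mask):
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state
        r, _ = self.rnn(h)                            # (batch, seq, 256)
        c = torch.relu(self.conv(r.transpose(1, 2)))  # (batch, 64, seq)
        pooled = c.max(dim=2).values                  # global max pooling
        return self.fc(pooled)

tok = AutoTokenizer.from_pretrained("roberta-base")
batch = tok(["Oh great, another Monday."], return_tensors="pt")
logits = TransformerRCNN()(batch["input_ids"], batch["attention_mask"])
```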


2020 ◽  
Vol 34 (05) ◽  
pp. 9434-9441
Author(s):  
Zekun Yang ◽  
Juan Feng

Word embedding has become essential for natural language processing, as it boosts the empirical performance of various tasks. However, recent research has shown that gender bias is incorporated in neural word embeddings, and downstream tasks that rely on these biased word vectors also produce gender-biased results. While some word-embedding gender-debiasing methods have been developed, they mainly focus on reducing the bias associated with the gender direction and fail to reduce the gender bias present in word embedding relations. In this paper, we design a simple causal approach for mitigating gender bias in word vector relations by exploiting the statistical dependency between gender-definition word embeddings and gender-biased word embeddings. Our method attains state-of-the-art results on gender-debiasing tasks, lexical- and sentence-level evaluation tasks, and downstream coreference resolution tasks.
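The statistical-dependency idea can be sketched with a ridge regression: treat the part of each word vector that is predictable from the gender-definition vectors as gender information and remove it. The parameterization below is an illustrative NumPy sketch under that assumption, not the authors' released code:

```python
import numpy as np

def debias_by_gender_regression(V_gender, V_other, ridge=1.0):
    """Regress the embeddings of non-gender words on the embeddings of
    gender-definition words (samples = embedding dimensions), treat the
    predictable part as gender information, and subtract it.
    V_gender: (n_gender, dim) gender-definition word vectors
    V_other:  (n_other, dim) vectors to debias
    """
    X = V_gender.T            # (dim, n_gender)
    Y = V_other.T             # (dim, n_other)
    # ridge regression weights: W = (X^T X + lambda*I)^-1 X^T Y
    W = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)
    Y_gender_part = X @ W     # component predictable from gender words
    return (Y - Y_gender_part).T
```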


Author(s):  
Rexhina Blloshmi ◽  
Simone Conia ◽  
Rocco Tripodi ◽  
Roberto Navigli

Despite the recent great success of the sequence-to-sequence paradigm in Natural Language Processing, the majority of current studies in Semantic Role Labeling (SRL) still frame the problem as a sequence labeling task. In this paper we go against the flow and propose GSRL (Generating Senses and RoLes), the first sequence-to-sequence model for end-to-end SRL. Our approach benefits from recently proposed decoder-side pretraining techniques to generate both sense and role labels for all the predicates in an input sentence at once, in an end-to-end fashion. Evaluated on standard gold benchmarks, GSRL achieves state-of-the-art results in both dependency- and span-based English SRL, proving empirically that our simple generation-based model can learn to produce complex predicate-argument structures. Finally, we propose a framework for evaluating the robustness of an SRL model in a variety of synthetic low-resource scenarios, which can aid human annotators in creating better, more diverse, and more challenging gold datasets. We release GSRL at github.com/SapienzaNLP/gsrl.
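To make the generation framing concrete, here is a minimal fine-tuning and decoding sketch with a generic pre-trained seq2seq model from the transformers library; the predicate markers, the linearized output format, and the bart-base backbone are illustrative assumptions of ours, not the released GSRL model:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

# Sketch of SRL as sequence generation: the decoder emits a linearized
# string of predicate senses and role-labeled arguments. Marker and
# output formats here are assumptions for illustration.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

sentence = "The cat <p> ate </p> the fish ."                 # predicate marked inline
target = "ate : eat.01 | ARG0 = The cat | ARG1 = the fish"   # linearized senses + roles

# fine-tuning step: standard teacher-forced cross-entropy on the target
inputs = tokenizer(sentence, return_tensors="pt")
labels = tokenizer(text_target=target, return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss
loss.backward()

# inference: generate senses and roles for the whole sentence at once
generated = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```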


2020 ◽  
Vol 34 (05) ◽  
pp. 8384-8391
Author(s):  
Hui Liu ◽  
Yongzheng Zhang ◽  
Yipeng Wang ◽  
Zheng Lin ◽  
Yige Chen

Text classification is a basic task in natural language processing, but small character perturbations in words can greatly decrease the effectiveness of text classification models; this is known as a character-level adversarial example attack. Defending against character-level adversarial examples poses two main challenges: out-of-vocabulary words in the word embedding model and the distribution difference between training and inference. Both challenges make character-level adversarial examples difficult to defend against. In this paper, we propose a framework that jointly uses character embeddings and adversarial stability training to overcome these two challenges. Our experimental results on five text classification datasets show that models based on our framework can effectively defend against character-level adversarial examples: our models defend against 93.19% of gradient-based adversarial examples and 94.83% of natural adversarial examples, outperforming state-of-the-art defense models.
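A minimal sketch of the two ingredients, assuming a character-CNN word composer and a KL-based stability term (the paper's exact loss and weighting may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharCNNClassifier(nn.Module):
    """Composes word representations from characters, so perturbed or
    out-of-vocabulary words still receive meaningful vectors."""
    def __init__(self, n_chars=128, dim=64, n_classes=2):
        super().__init__()
        self.char_embed = nn.Embedding(n_chars, dim, padding_idx=0)
        self.conv = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.fc = nn.Linear(dim, n_classes)

    def forward(self, char_ids):             # (batch, n_words, word_len)
        b, w, l = char_ids.shape
        e = self.char_embed(char_ids.reshape(b * w, l)).transpose(1, 2)
        words = torch.relu(self.conv(e)).max(dim=2).values.reshape(b, w, -1)
        return self.fc(words.mean(dim=1))    # average word vectors, classify

def stability_loss(model, clean, perturbed, labels, alpha=1.0):
    """Sketch of adversarial stability training: the usual loss on the
    clean input plus a term keeping predictions on the perturbed input
    close to those on the clean input. `alpha` is an assumed knob."""
    logits_c = model(clean)
    logits_p = model(perturbed)
    ce = F.cross_entropy(logits_c, labels)
    stab = F.kl_div(F.log_softmax(logits_p, dim=-1),
                    F.softmax(logits_c, dim=-1), reduction="batchmean")
    return ce + alpha * stab
```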


2020 ◽  
Vol 16 (2) ◽  
pp. 126-142
Author(s):  
Nada Ben-Lhachemi ◽  
El Habib Nfaoui

Learning text representations is at the core of numerous natural language processing applications. Word embedding is a type of text representation that allows words with similar meanings to have similar representations. Word embedding techniques capture semantic similarities between linguistic items based on their distributional properties in large samples of text data. Although these techniques are very efficient, handling semantic and pragmatic ambiguity with high accuracy remains a challenging research task. In this article, we propose a new feature, a semantic score, that handles ambiguities between words. We use external knowledge bases and the Huffman Coding algorithm to compute this score, which captures the semantic relatedness between all fragments composing a given text. We combine this feature with word embedding methods to improve text representation. We evaluate our method on a hashtag recommendation task on Twitter, where text is noisy and short. The experimental results demonstrate that, compared with state-of-the-art algorithms, our method achieves good results.
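The abstract does not specify how the knowledge bases and Huffman coding interact, so the following is only a loose sketch of the building blocks: standard Huffman coding over (assumed) concept frequencies, plus a crude shared-prefix relatedness score. Both functions and the scoring rule are our assumptions for illustration:

```python
import heapq

def huffman_codes(freqs):
    """Standard Huffman coding over items with given frequencies."""
    heap = [(f, i, {item: ""}) for i, (item, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {k: "0" + v for k, v in c1.items()}
        merged.update({k: "1" + v for k, v in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

def prefix_relatedness(code_a, code_b):
    """Shared-prefix ratio as a crude relatedness score in [0, 1]."""
    shared = 0
    for x, y in zip(code_a, code_b):
        if x != y:
            break
        shared += 1
    return shared / max(len(code_a), len(code_b))

codes = huffman_codes({"cat": 10, "dog": 9, "quark": 1})
print(prefix_relatedness(codes["cat"], codes["dog"]))
```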


2020 ◽  
Vol 10 (17) ◽  
pp. 5772
Author(s):  
Adrián Javaloy ◽  
Ginés García-Mateos

Deep learning methods are gaining popularity in different application domains, and especially in natural language processing. It is commonly believed that, given a large enough dataset and an adequate network architecture, almost any processing problem can be solved. A frequent and widely used design is the encoder-decoder architecture, where the input data is transformed into an intermediate code by means of an encoder, and a decoder then takes this code to produce its output. Different types of networks can be used in the encoder and the decoder, depending on the problem of interest, such as convolutional neural networks (CNN) or long short-term memory networks (LSTM). This paper uses a recently proposed method for the encoder, called the Causal Feature Extractor (CFE). It is based on causal convolutions (i.e., convolutions that depend only on one direction of the input), dilation (i.e., increasing the aperture size of the convolutions), and bidirectionality (i.e., independent networks in both directions). Preliminary results are presented on three different tasks (bilingual translation, LaTeX decompilation, and audio transcription) and compared with state-of-the-art methods. The proposed method achieves promising results, showing its versatility in working with text, audio, and images. Moreover, it has a shorter training time, requiring less time per iteration, and makes good use of attention mechanisms based on attention matrices.
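The three named ingredients of the CFE (causality, dilation, bidirectionality) can be sketched in a few lines of PyTorch; depths, widths, and the concatenation of the two directions below are illustrative choices, not the paper's configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """Causal convolution: the output at position t sees only inputs up
    to t, achieved by left-padding before an ordinary Conv1d."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                    # x: (batch, channels, seq)
        return self.conv(F.pad(x, (self.pad, 0)))

class BidirectionalCFE(nn.Module):
    """Sketch of a CFE-style encoder: a stack of dilated causal
    convolutions, plus an independent stack run on the time-reversed
    sequence for the backward direction."""
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.fwd = nn.ModuleList(CausalConv1d(channels, dilation=d) for d in dilations)
        self.bwd = nn.ModuleList(CausalConv1d(channels, dilation=d) for d in dilations)

    def run(self, stack, x):
        for conv in stack:
            x = torch.relu(conv(x))
        return x

    def forward(self, x):
        f = self.run(self.fwd, x)
        b = self.run(self.bwd, x.flip(-1)).flip(-1)
        return torch.cat([f, b], dim=1)      # (batch, 2*channels, seq)
```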


2019 ◽  
Vol 26 (11) ◽  
pp. 1297-1304 ◽  
Author(s):  
Yuqi Si ◽  
Jingqi Wang ◽  
Hua Xu ◽  
Kirk Roberts

Abstract
Objective: Neural network-based representations ("embeddings") have dramatically advanced natural language processing (NLP) tasks, including clinical NLP tasks such as concept extraction. Recently, however, more advanced embedding methods and representations (eg, ELMo, BERT) have further pushed the state of the art in NLP, yet there are no common best practices for how to integrate these representations into clinical tasks. The purpose of this study, then, is to explore the space of possible options for utilizing these new models in clinical concept extraction, including comparing them to traditional word embedding methods (word2vec, GloVe, fastText).
Materials and Methods: Both off-the-shelf, open-domain embeddings and pretrained clinical embeddings from MIMIC-III (Medical Information Mart for Intensive Care III) are evaluated. We explore a battery of embedding methods consisting of traditional word embeddings and contextual embeddings and compare these on 4 concept extraction corpora: i2b2 2010, i2b2 2012, SemEval 2014, and SemEval 2015. We also analyze the impact of the pretraining time of a large language model like ELMo or BERT on extraction performance. Last, we present an intuitive way to understand the semantic information encoded by contextual embeddings.
Results: Contextual embeddings pretrained on a large clinical corpus achieve new state-of-the-art performance across all concept extraction tasks. The best-performing model outperforms all state-of-the-art methods with respective F1-measures of 90.25, 93.18 (partial), 80.74, and 81.65.
Conclusions: We demonstrate the potential of contextual embeddings through the state-of-the-art performance these methods achieve on clinical concept extraction. Additionally, we demonstrate that contextual embeddings encode valuable semantic information not accounted for in traditional word representations.
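As a concrete (and deliberately simplified) example of the contextual-embedding recipe being compared, the sketch below tags tokens from BERT's contextual vectors; the bert-base-uncased checkpoint and the plain linear tagger are illustrative stand-ins for the clinical models evaluated in the study:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Feed a sentence through a pre-trained transformer and tag each token
# from its contextual vector (e.g. BIO labels over concept types).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
tagger = torch.nn.Linear(encoder.config.hidden_size, 7)  # BIO x 3 types + O (assumed scheme)

inputs = tokenizer("Patient denies chest pain .", return_tensors="pt")
with torch.no_grad():
    token_vecs = encoder(**inputs).last_hidden_state     # (1, seq, hidden)
logits = tagger(token_vecs)                              # per-token label scores
pred = logits.argmax(dim=-1)                             # untrained here; train tagger first
```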


2020 ◽  
Vol 10 (21) ◽  
pp. 7711
Author(s):  
Arthur Flor de Sousa Neto ◽  
Byron Leite Dantas Bezerra ◽  
Alejandro Héctor Toselli

The increasing migration of physical manuscripts to the digital environment has made it common for systems to offer automatic mechanisms for offline Handwritten Text Recognition (HTR). However, varied scenarios and writing styles pose challenges to recognition accuracy, and, to mitigate this problem, optical models are often paired with language models to assist in decoding the text. To this end, dictionaries of characters and words are generated from the dataset and linguistic restrictions are imposed on the recognition process. This work instead proposes the use of spelling correction techniques for text post-processing, both to achieve better results and to eliminate the linguistic dependence between the optical model and the decoding stage. In addition, an encoder-decoder neural network architecture, together with a training methodology, is developed and presented to accomplish the spelling correction goal. To demonstrate the effectiveness of this new approach, we conducted experiments on five text-line datasets widely known in the field of HTR, three state-of-the-art optical models for text recognition, and eight spelling correction techniques, ranging from traditional statistical methods to current neural network approaches in Natural Language Processing (NLP). Finally, our proposed spelling correction model is analyzed statistically through HTR system metrics, reaching an average sentence correction rate 54% higher than the state-of-the-art decoding method on the tested datasets.
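A toy character-level encoder-decoder of the general kind this abstract describes might look as follows; the GRU backbone, sizes, and teacher-forced training step are our assumptions, not the paper's architecture or training methodology:

```python
import torch
import torch.nn as nn

class Seq2SeqSpeller(nn.Module):
    """Toy character-level encoder-decoder for post-HTR spelling
    correction: reads the noisy recognized line and re-generates the
    corrected line character by character."""
    def __init__(self, n_chars=128, dim=256):
        super().__init__()
        self.embed = nn.Embedding(n_chars, dim, padding_idx=0)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, n_chars)

    def forward(self, noisy_ids, target_ids):
        # noisy_ids: (batch, src_len); target_ids: (batch, tgt_len)
        _, h = self.encoder(self.embed(noisy_ids))
        dec, _ = self.decoder(self.embed(target_ids), h)  # teacher forcing
        return self.out(dec)                              # (batch, tgt_len, n_chars)

model = Seq2SeqSpeller()
noisy = torch.randint(1, 128, (2, 40))   # stand-in for recognized characters
target = torch.randint(1, 128, (2, 40))  # stand-in for ground-truth characters
logits = model(noisy, target[:, :-1])    # predict each next character
loss = nn.functional.cross_entropy(logits.reshape(-1, 128), target[:, 1:].reshape(-1))
```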

