On Using Composite Word Embeddings To Improve Biomedical Term Similarity

Abstract: This paper presents a comprehensive evaluation of data pre-processing and word embedding techniques for Arabic document classification in the domain of health-related communication on social media. We evaluate 26 text pre-processing techniques applied to Arabic tweets when training a classifier to identify health-related tweets. For this task we use the traditional machine learning classifiers KNN, SVM, Multinomial NB and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another used only for testing the generality of our models. Our results indicate that only four of the 26 pre-processing techniques significantly improve classification accuracy. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier, with an F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to BLSTM led to the most accurate model, with an F1 score of 75.2% and accuracy of 90.7%, compared to an F1 score of 90.8% achieved by Mazajak CBOW for the same architecture but with a lower accuracy of 70.89%. Our results also show that the performance of the best traditional classifier we trained is comparable to the deep learning methods on the first data set, but significantly worse on the second data set.

A Privacy-Preserving Similarity Search Scheme over Encrypted Word Embeddings

Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services, 2019. DOI: 10.1145/3366030.3366081

Author(s): Daisuke Aritomo, Chiemi Watanabe, Masaki Matsubara, Atsuyuki Morishima

Keyword(s): Similarity Search, Privacy Preserving, Word Embeddings

Development and Evaluation of Novel Ophthalmology Domain-Specific Neural Word Embeddings to Predict Visual Prognosis

International Journal of Medical Informatics, 2021, pp. 104464. DOI: 10.1016/j.ijmedinf.2021.104464

Author(s): Sophia Wang, Benjamin Tseng, Tina Hernandez-Boussard

Keyword(s): Word Embeddings, Visual Prognosis, Domain Specific

An Evaluation of Neural Machine Translation and Pre-trained Word Embeddings in Multilingual Neural Sentiment Analysis

2020 IEEE International Conference on Progress in Informatics and Computing (PIC), 2020. DOI: 10.1109/pic50277.2020.9350849

Author(s): George Manias, Argyro Mavrogiorgou, Athanasios Kiourtis, Dimosthenis Kyriazis

Keyword(s): Sentiment Analysis, Machine Translation, Word Embeddings, Neural Machine Translation

A Comparison of Self-Supervised Speech Representations As Input Features For Unsupervised Acoustic Word Embeddings

2021 IEEE Spoken Language Technology Workshop (SLT), 2021. DOI: 10.1109/slt48900.2021.9383625

Author(s): Lisa van Staden, Herman Kamper

Keyword(s): Word Embeddings

Learning emotional word embeddings for sentiment analysis

Journal of Intelligent & Fuzzy Systems, 2021, pp. 1-13. DOI: 10.3233/jifs-201993

Author(s): Qingtian Zeng, Xishi Zhao, Xiaohui Hu, Hua Duan, Zhongying Zhao, ...

Keyword(s): Sentiment Analysis, Language Processing, State Of The Art, Research Problem, Emotional Word, Classification Model, Data Sets, Word Embeddings, Real World Data, Text Documents

Word embeddings have been successfully applied in many natural language processing tasks due to their effectiveness. However, state-of-the-art algorithms for learning word representations from large amounts of text ignore emotional information, which is a significant research problem that must be addressed. To solve this problem, we propose an emotional word embedding (EWE) model for sentiment analysis. The method first applies pre-trained word vectors to represent document features using two different linear weighting methods. The resulting document vectors are then input to a neural-network-based classification model and used to train a text sentiment classifier. In this way, the emotional polarity of the text is propagated into the word vectors. Experimental results on three kinds of real-world data sets demonstrate that the proposed EWE model achieves superior performance on text sentiment prediction, text similarity calculation, and word emotional expression tasks compared to other state-of-the-art models.
