Comparison of Various Word Embeddings for Hate-Speech Detection

Morphological Skip-Gram: Replacing FastText characters n-gram with morphological knowledge

INTELIGENCIA ARTIFICIAL ◽

10.4114/intartif.vol24iss67pp1-17 ◽

2021 ◽

Vol 24 (67) ◽

pp. 1-17

Author(s):

Flávio Arthur O. Santos ◽

Thiago Dias Bispo ◽

Hendrik Teixeira Macedo ◽

Cleber Zanchettin

Keyword(s):

Language Processing ◽

Hate Speech ◽

Named Entity Recognition ◽

Entity Recognition ◽

Training Phase ◽

Word Embeddings ◽

Speech Detection ◽

Word Representation ◽

N Gram ◽

Good Word

Natural language processing systems have attracted much interest from industry. The field includes applications such as machine translation, sentiment analysis, named entity recognition, and question answering, among others. Word embeddings (i.e., continuous word representations) are an essential component of these applications, generally serving as the word representation fed to machine learning models. Popular methods for training word embeddings include GloVe and Word2Vec. They achieve good word representations despite two limitations: both ignore the morphological information of words and assign only one representation vector to each word. As a result, the word embeddings do not properly account for different word contexts and are unaware of a word's inner structure. To mitigate this problem, another word embedding method, FastText, represents each word as a bag of character n-grams. A continuous vector describes each n-gram, and the final word representation is the sum of its character n-gram vectors. Nevertheless, using all character n-grams of a word is a poor approach, since some n-grams have no semantic relation to their words and increase the amount of potentially useless information. This approach also increases training time. In this work, we propose a new method for training word embeddings whose goal is to replace the FastText bag of character n-grams with a bag of word morphemes obtained through morphological analysis of the word. Thus, words with similar contexts and morphemes are represented by vectors close to each other. To evaluate our new approach, we performed intrinsic evaluations on 15 different tasks, and the results show performance competitive with FastText. Moreover, the proposed model is 40% faster than FastText in the training phase. We also outperform the baseline approaches in extrinsic evaluations on hate speech detection and NER tasks in different scenarios.
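The contrast this abstract draws between a bag of character n-grams and a bag of morphemes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SubwordTable` class, the n-gram range of 3 to 6, the random untrained vectors, and the hand-written segmentation of "unhappiness" are all assumptions made for illustration, whereas the paper obtains morphemes from a morphological analysis and learns the vectors during training.

```python
import numpy as np


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


class SubwordTable:
    """Hypothetical lookup table: one vector per subword unit.

    Vectors are random here; a real model would learn them.
    """

    def __init__(self, dim=8, seed=0):
        self.dim = dim
        self.rng = np.random.default_rng(seed)
        self.vectors = {}

    def lookup(self, unit):
        # Create a vector the first time a unit is seen.
        if unit not in self.vectors:
            self.vectors[unit] = self.rng.normal(size=self.dim)
        return self.vectors[unit]

    def compose(self, units):
        # The word representation is the sum of its unit vectors.
        return np.sum([self.lookup(u) for u in units], axis=0)


# Assumed, hand-written segmentation (not produced by a real analyzer).
MORPHEMES = {"unhappiness": ["un", "happi", "ness"]}

word = "unhappiness"
ngram_units = char_ngrams(word)
morpheme_units = MORPHEMES[word]

table = SubwordTable()
ngram_vector = table.compose(ngram_units)
morpheme_vector = table.compose(morpheme_units)

print(len(ngram_units), "character n-grams vs", len(morpheme_units), "morphemes")
```

Under these assumptions the word yields far fewer morpheme units than character n-grams, so fewer vectors are summed and updated per word. This is consistent with, but does not by itself demonstrate, the training speed-up the abstract reports.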


Racist and Sexist Hate Speech Detection: Literature Review

2020 International Conference on Intelligent Data Science Technologies and Applications (IDSTA) ◽

10.1109/idsta50958.2020.9264052 ◽

2020 ◽

Author(s):

Othman Istaiteh ◽

Razan Al-Omoush ◽

Sara Tedmori

Keyword(s):

Literature Review ◽

Hate Speech ◽

Speech Detection


An Ensemble Method for Radicalization and Hate Speech Detection Online Empowered by Sentic Computing

Cognitive Computation ◽

10.1007/s12559-021-09845-6 ◽

2021 ◽

Author(s):

Oscar Araque ◽

Carlos A. Iglesias

Keyword(s):

Hate Speech ◽

Ensemble Method ◽

Speech Detection ◽

Sentic Computing


SWE2: SubWord Enriched and Significant Word Emphasized Framework for Hate Speech Detection

Proceedings of the 29th ACM International Conference on Information & Knowledge Management ◽

10.1145/3340531.3411990 ◽

2020 ◽

Author(s):

Guanyi Mou ◽

Pengyi Ye ◽

Kyumin Lee

Keyword(s):

Hate Speech ◽

Speech Detection


Twitter Hate Speech Detection using Stacked Weighted Ensemble (SWE) Model

2020 Fifth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN) ◽

10.1109/icrcicn50933.2020.9296199 ◽

2020 ◽

Author(s):

Sujatha Arun Kokatnoor ◽

Balachandran Krishnan

Keyword(s):

Hate Speech ◽

Speech Detection


Study on BERT Model for Hate Speech Detection

2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA) ◽

10.1109/iceca49313.2020.9297560 ◽

2020 ◽

Author(s):

Shailja Gupta ◽

Sachin Lakra ◽

Manpreet Kaur

Keyword(s):

Hate Speech ◽

Speech Detection


Hate Speech Detection on Twitter Using Multinomial Logistic Regression Classification Method

2019 IEEE International Conference on Internet of Things and Intelligence System (IoTaIS) ◽

10.1109/iotais47347.2019.8980379 ◽

2019 ◽

Cited By ~ 1

Author(s):

Purnama Sari Br Ginting ◽

Budhi Irawan ◽

Casi Setianingsih

Keyword(s):

Logistic Regression ◽

Hate Speech ◽

Multinomial Logistic Regression ◽

Classification Method ◽

Speech Detection


DeepHate: Hate Speech Detection via Multi-Faceted Text Representations

12th ACM Conference on Web Science ◽

10.1145/3394231.3397890 ◽

2020 ◽

Cited By ~ 1

Author(s):

Rui Cao ◽

Roy Ka-Wei Lee ◽

Tuan-Anh Hoang

Keyword(s):

Hate Speech ◽

Speech Detection


A Web Interface for Analyzing Hate Speech

Future Internet ◽

10.3390/fi13030080 ◽

2021 ◽

Vol 13 (3) ◽

pp. 80

Author(s):

Lazaros Vrysis ◽

Nikolaos Vryzas ◽

Rigas Kotsakis ◽

Theodora Saridou ◽

Maria Matsiola ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Graphical User Interface ◽

Hate Speech ◽

Web Interface ◽

Learning Models ◽

Speech Detection ◽

Media Services ◽

The Web ◽

Machine Learning Models

Social media services make it possible for an increasing number of people to express their opinions publicly. In this context, large amounts of hateful comments are published daily. The PHARM project aims at monitoring and modeling hate speech against refugees and migrants in Greece, Italy, and Spain. To this end, a web interface for creating and querying a multi-source database containing hate speech-related content is implemented and evaluated. The selected sources include Twitter, YouTube, and Facebook comments and posts, as well as comments and articles from a selected list of websites. The interface allows users to search the existing database, scrape social media using keywords, annotate records through a dedicated platform, and contribute new content to the database. Furthermore, functionality for hate speech detection and sentiment analysis of texts is provided, making use of novel methods and machine learning models. The interface can be accessed online through a graphical user interface compatible with modern internet browsers. To evaluate the interface, a multifactor questionnaire was formulated to record users’ opinions about the web interface and the corresponding functionality.


To BAN or Not to BAN: Bayesian Attention Networks for Reliable Hate Speech Detection

Cognitive Computation ◽

10.1007/s12559-021-09826-9 ◽

2021 ◽

Author(s):

Kristian Miok ◽

Blaž Škrlj ◽

Daniela Zaharie ◽

Marko Robnik-Šikonja

Keyword(s):

Monte Carlo ◽

Hate Speech ◽

Classification Performance ◽

Reliability Estimation ◽

Superior Performance ◽

Speech Detection ◽

Attention Networks ◽

Reliability Estimates ◽

Viable Mechanism ◽

Affective Dimensions

Hate speech is an important problem in the management of user-generated content. To remove offensive content or ban misbehaving users, content moderators need reliable hate speech detectors. Recently, deep neural networks based on the transformer architecture, such as the (multilingual) BERT model, have achieved superior performance in many natural language classification tasks, including hate speech detection. So far, these methods have not been able to quantify their output in terms of reliability. We propose a Bayesian method using Monte Carlo dropout within the attention layers of the transformer models to provide well-calibrated reliability estimates. We evaluate and visualize the results of the proposed approach on hate speech detection problems in several languages. Additionally, we test whether affective dimensions can enhance the information extracted by the BERT model in hate speech classification. Our experiments show that Monte Carlo dropout provides a viable mechanism for reliability estimation in transformer networks. Used within the BERT model, it offers state-of-the-art classification performance and can detect less trusted predictions.
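The general idea of Monte Carlo dropout mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration under strong assumptions: it uses a toy logistic classifier over a feature vector instead of the attention layers of a transformer, and the function name `mc_dropout_predict`, the weights, and the dropout rate are all hypothetical. Dropout stays active at prediction time, the input is passed through the model many times, and the spread of the predictions serves as a rough reliability signal.

```python
import numpy as np


def mc_dropout_predict(x, weights, bias, drop_prob=0.1, passes=100, seed=0):
    """Run several stochastic forward passes with dropout left on.

    Returns the mean predicted probability and its standard deviation
    across passes (toy logistic model, assumed for illustration).
    """
    rng = np.random.default_rng(seed)
    keep = 1.0 - drop_prob
    probs = np.empty(passes)
    for t in range(passes):
        mask = rng.random(x.shape) < keep   # Bernoulli keep mask
        dropped = x * mask / keep           # inverted dropout scaling
        logit = dropped @ weights + bias
        probs[t] = 1.0 / (1.0 + np.exp(-logit))
    return probs.mean(), probs.std()


# Hypothetical feature vector and parameters, not taken from the paper.
x = np.array([0.5, -1.2, 0.8, 2.0])
weights = np.array([1.0, 0.3, -0.7, 0.9])
bias = -0.2

mean_prob, spread = mc_dropout_predict(x, weights, bias, drop_prob=0.3)
print(f"mean probability: {mean_prob:.3f}, spread: {spread:.3f}")
```

A larger spread across passes suggests a less trusted prediction. How well such estimates are calibrated depends on the model and data, which is what the paper evaluates for BERT-based hate speech detectors.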
