Time of Your Hate: The Challenge of Time in Hate Speech Detection on Social Media

2020 ◽  
Vol 10 (12) ◽  
pp. 4180 ◽  
Author(s):  
Komal Florio ◽  
Valerio Basile ◽  
Marco Polignano ◽  
Pierpaolo Basile ◽  
Viviana Patti

The availability of large annotated corpora from social media and the development of powerful classification approaches have made it possible, to an unprecedented degree, to monitor users’ opinions and sentiments on online social platforms over time. Such linguistic data are strongly affected by events and topical discourse, and this aspect is crucial when detecting phenomena such as hate speech, especially from a diachronic perspective. We address this challenge by focusing on a real case study: the “Contro l’odio” platform for monitoring hate speech against immigrants in the Italian Twittersphere. We explored the temporal robustness of a BERT model for Italian (AlBERTo), the current benchmark in non-diachronic detection settings. We tested different training strategies to evaluate how classification performance is affected by adding data temporally distant from the test set, and hence potentially different in topic and language use. Our analysis highlights the limits that a supervised classification model encounters on data heavily influenced by events. Our results show that AlBERTo is highly sensitive to the temporal distance of the fine-tuning set; with an adequate time window, however, performance increases while requiring less annotated data than a traditional classifier.
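The temporal fine-tuning experiments described above can be approximated with the HuggingFace transformers library. Below is a minimal sketch, assuming an AlBERTo-style checkpoint and hypothetical CSV slices (`train_window.csv`, `test_window.csv`, each with `text` and `label` columns); the checkpoint id, file names, and hyperparameters are illustrative assumptions, not the authors’ exact setup.

```python
# Minimal sketch: fine-tune an Italian BERT-style model on one time slice
# and evaluate on a temporally distant test slice. Checkpoint id, file
# names, and hyperparameters are assumptions, not the paper's setup.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import load_dataset

MODEL_NAME = "m-polignano-uniba/bert_uncased_L-12_H-768_A-12_italian_alb3rt0"  # assumed id

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

# Hypothetical CSV slices with "text" and "label" columns, one per time window.
data = load_dataset("csv", data_files={"train": "train_window.csv",
                                       "test": "test_window.csv"})
data = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["test"],
)
trainer.train()
print(trainer.evaluate())  # compare metrics across training windows of varying temporal distance
```

Re-running this loop while varying which time slices enter `train_window.csv` reproduces the kind of training-window comparison the study reports.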

Author(s):  
M. Ali Fauzi ◽  
Anny Yuniarti

Due to the massive increase of user-generated web content, particularly on social media networks where anyone can post statements freely and without limitation, the amount of hateful activity is also increasing. Social media and microblogging services such as Twitter allow user tweets to be read and analyzed in near real time. Twitter is a logical source of data for hate speech analysis, since its users are likely to express their emotions about an event by posting tweets. Such analysis can support early identification of hate speech so that it can be prevented from spreading widely. Manually filtering out hateful content on Twitter is costly and not scalable, so automatic hate speech detection needs to be developed for tweets in the Indonesian language. In this study, we used an ensemble method for hate speech detection in Indonesian. We employed five stand-alone classification algorithms, namely Naïve Bayes, K-Nearest Neighbours, Maximum Entropy, Random Forest, and Support Vector Machines, and two ensemble methods, hard voting and soft voting, on a Twitter hate speech dataset. The experimental results showed that the ensemble method can improve classification performance. The best result was achieved using soft voting, with an F1 measure of 79.8% on the unbalanced dataset and 84.7% on the balanced dataset. Although the improvement is not dramatic, using an ensemble method reduces the risk of choosing a poor classifier for deciding whether new tweets constitute hate speech.
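The five-classifier ensemble with hard and soft voting maps directly onto scikit-learn’s VotingClassifier. The sketch below is a minimal reconstruction under assumed TF-IDF features and default hyperparameters, not the authors’ exact configuration; Maximum Entropy is implemented as logistic regression, its usual scikit-learn equivalent.

```python
# Minimal sketch of the five-classifier soft-voting ensemble; feature
# extraction and hyperparameters are assumptions, not the paper's setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression  # Maximum Entropy classifier
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

ensemble = VotingClassifier(
    estimators=[
        ("nb", MultinomialNB()),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("maxent", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100)),
        ("svm", SVC(probability=True)),  # probability=True is required for soft voting
    ],
    voting="soft",  # average predicted class probabilities; "hard" uses majority vote
)

model = make_pipeline(TfidfVectorizer(), ensemble)
# tweets, labels = load_indonesian_hate_speech()  # hypothetical data loader
# model.fit(tweets, labels)
```

Switching `voting` between `"hard"` and `"soft"` reproduces the two ensemble variants the abstract compares.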


2021 ◽  
Vol 5 (7) ◽  
pp. 34
Author(s):  
Konstantinos Perifanos ◽  
Dionysis Goutsos

Hateful and abusive speech presents a major challenge for all online social media platforms. Recent advances in Natural Language Processing and Natural Language Understanding allow for more accurate detection of hate speech in textual streams. This study presents a new multimodal approach to hate speech detection, combining Computer Vision and Natural Language Processing models for abusive context detection. Our study focuses on Twitter messages and, more specifically, on hateful, xenophobic, and racist speech in Greek aimed at refugees and migrants. In our approach, we combine transfer learning and fine-tuning of Bidirectional Encoder Representations from Transformers (BERT) and Residual Neural Networks (ResNet). Our contribution includes the development of a new dataset for hate speech classification, consisting of tweet IDs, along with the code to obtain their visual appearance as they would have been rendered in a web browser. We have also released a pre-trained language model trained on Greek tweets, which was used in our experiments. We report a consistently high level of accuracy (accuracy = 0.970, F1 score = 0.947 for our best model) in racist and xenophobic speech detection.
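A late-fusion architecture of this kind, concatenating a BERT text embedding with a ResNet image embedding before a classification head, can be sketched in PyTorch as follows. The fusion strategy, the multilingual placeholder checkpoint (the authors use their own Greek tweet model), and the single linear head are assumptions for illustration.

```python
# Minimal sketch of late-fusion multimodal classification: BERT encodes
# the tweet text, ResNet encodes its rendered appearance, and the two
# feature vectors are concatenated for a linear head. Details assumed.
import torch
import torch.nn as nn
from transformers import AutoModel
import torchvision.models as tv

class MultimodalHateClassifier(nn.Module):
    def __init__(self, text_model="bert-base-multilingual-cased", num_labels=2):
        super().__init__()
        self.text_encoder = AutoModel.from_pretrained(text_model)  # placeholder checkpoint
        resnet = tv.resnet50(weights=tv.ResNet50_Weights.DEFAULT)
        self.image_encoder = nn.Sequential(*list(resnet.children())[:-1])  # drop final FC
        self.head = nn.Linear(self.text_encoder.config.hidden_size + 2048, num_labels)

    def forward(self, input_ids, attention_mask, pixel_values):
        # [CLS] token embedding as the text representation
        text_feat = self.text_encoder(input_ids=input_ids,
                                      attention_mask=attention_mask).last_hidden_state[:, 0]
        img_feat = self.image_encoder(pixel_values).flatten(1)  # (batch, 2048)
        return self.head(torch.cat([text_feat, img_feat], dim=-1))
```

In practice the tokenizer output and a preprocessed screenshot tensor of the rendered tweet would feed `forward()`; both encoders can be fine-tuned end to end or frozen for transfer learning.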


2021 ◽  
Vol 29 ◽  
Author(s):  
Diogo Cortiz ◽  
Arkaitz Zubiaga

In this paper, we discuss some of the ethical and technical challenges of using Artificial Intelligence for online content moderation. As a case study, we use an AI model developed to detect hate speech on social networks, a concept for which varying definitions are given in the scientific literature and for which consensus is lacking. We argue that while AI can play a central role in dealing with information overload on social media, it risks violating freedom of expression if a project is not well conducted. We present some ethical and technical challenges involved in the entire pipeline of an AI project, from data collection to model evaluation, that hinder the large-scale use of hate speech detection algorithms. Finally, we argue that AI can assist with the detection of hate speech on social media, provided that the final judgment about the content is made through a process with human involvement.


Author(s):  
Safa Alsafari

Large and accurately labeled textual corpora are vital to developing efficient hate speech classifiers. This paper introduces an ensemble-based semi-supervised learning approach to leverage the availability of abundant social media content. Starting with a reliable hate speech dataset, we train and test diverse classifiers that are then used to label a corpus of one million tweets. Next, we investigate several strategies to select the most confident labels from the obtained pseudo labels. We assess these strategies by re-training all the classifiers with the seed dataset augmented with the trusted pseudo-labeled data. Finally, we demonstrate that our approach improves classification performance over supervised hate speech classification methods.
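The confidence-selection step can be illustrated with a single-classifier variant of the pseudo-labeling loop (the paper uses an ensemble of diverse classifiers and compares several selection strategies). This is a minimal sketch assuming dense feature arrays and an illustrative probability threshold.

```python
# Minimal sketch of confidence-based pseudo-labeling with one classifier.
# The threshold and the use of dense arrays are illustrative assumptions;
# the paper trains an ensemble and compares several selection strategies.
import numpy as np

def pseudo_label(clf, X_seed, y_seed, X_unlabeled, threshold=0.95):
    """Train on the seed set, label the unlabeled pool, and keep only
    predictions whose maximum class probability exceeds the threshold."""
    clf.fit(X_seed, y_seed)
    proba = clf.predict_proba(X_unlabeled)
    confident = proba.max(axis=1) >= threshold  # trusted pseudo labels only
    X_aug = np.concatenate([X_seed, X_unlabeled[confident]])
    y_aug = np.concatenate([y_seed, proba[confident].argmax(axis=1)])
    return X_aug, y_aug  # re-train any classifier on the augmented set
```

Re-training all classifiers on `(X_aug, y_aug)` and evaluating against the supervised baseline mirrors the assessment described above.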


2021 ◽  
Vol 13 (3) ◽  
pp. 80
Author(s):  
Lazaros Vrysis ◽  
Nikolaos Vryzas ◽  
Rigas Kotsakis ◽  
Theodora Saridou ◽  
Maria Matsiola ◽  
...  

Social media services make it possible for an increasing number of people to express their opinions publicly. In this context, large amounts of hateful comments are published daily. The PHARM project aims at monitoring and modeling hate speech against refugees and migrants in Greece, Italy, and Spain. In this direction, a web interface for the creation and querying of a multi-source database containing hate speech-related content was implemented and evaluated. The selected sources include Twitter, YouTube, and Facebook comments and posts, as well as comments and articles from a selected list of websites. The interface allows users to search the existing database, scrape social media using keywords, annotate records through a dedicated platform, and contribute new content to the database. Furthermore, hate speech detection and sentiment analysis of texts are provided, making use of novel methods and machine learning models. The interface can be accessed online through a graphical user interface compatible with modern web browsers. For the evaluation of the interface, a multifactor questionnaire was formulated, aiming to record users’ opinions about the web interface and the corresponding functionality.
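A keyword-search endpoint of the kind such an interface exposes might look like the Flask sketch below; the route name, database schema, and SQLite backend are all assumptions for illustration, as the paper does not describe the implementation at this level.

```python
# Minimal sketch of a keyword-search endpoint over a multi-source hate
# speech database. Route, schema, and storage backend are assumptions.
from flask import Flask, jsonify, request
import sqlite3

app = Flask(__name__)
DB_PATH = "pharm.db"  # hypothetical SQLite store of collected records

@app.route("/search")
def search():
    keyword = request.args.get("q", "")
    con = sqlite3.connect(DB_PATH)
    rows = con.execute(
        "SELECT source, text, label FROM records WHERE text LIKE ?",
        (f"%{keyword}%",),
    ).fetchall()
    con.close()
    return jsonify([{"source": s, "text": t, "label": l} for s, t, l in rows])

if __name__ == "__main__":
    app.run(debug=True)
```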


Author(s):  
Kristian Miok ◽  
Blaž Škrlj ◽  
Daniela Zaharie ◽  
Marko Robnik-Šikonja

Hate speech is an important problem in the management of user-generated content. To remove offensive content or ban misbehaving users, content moderators need reliable hate speech detectors. Recently, deep neural networks based on the transformer architecture, such as the (multilingual) BERT model, have achieved superior performance in many natural language classification tasks, including hate speech detection. So far, these methods have not been able to quantify their output in terms of reliability. We propose a Bayesian method using Monte Carlo dropout within the attention layers of the transformer models to provide well-calibrated reliability estimates. We evaluate and visualize the results of the proposed approach on hate speech detection problems in several languages. Additionally, we test whether affective dimensions can enhance the information extracted by the BERT model in hate speech classification. Our experiments show that Monte Carlo dropout provides a viable mechanism for reliability estimation in transformer networks. Used within the BERT model, it offers state-of-the-art classification performance and can detect less trusted predictions.
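Monte Carlo dropout at inference time can be sketched as follows for a HuggingFace-style transformer classifier. Note that the paper places dropout inside the attention layers specifically, whereas this simplified sketch re-enables every dropout module; the number of stochastic passes is illustrative.

```python
# Minimal sketch of Monte Carlo dropout: keep dropout active at inference
# and average class probabilities over several stochastic forward passes.
# The paper targets attention-layer dropout; this version re-enables all
# dropout modules, and n_samples is an illustrative choice.
import torch

def mc_dropout_predict(model, input_ids, attention_mask, n_samples=20):
    model.eval()
    for m in model.modules():  # re-enable dropout during inference
        if isinstance(m, torch.nn.Dropout):
            m.train()
    with torch.no_grad():
        probs = torch.stack([
            torch.softmax(model(input_ids=input_ids,
                                attention_mask=attention_mask).logits, dim=-1)
            for _ in range(n_samples)
        ])
    mean = probs.mean(dim=0)  # averaged class probabilities
    std = probs.std(dim=0)    # spread across passes ~ predictive uncertainty
    return mean, std          # high std flags less trusted predictions
```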


Author(s):  
Neeraj Vashistha ◽  
Arkaitz Zubiaga

The exponential increase in the use of the Internet and social media over the last two decades has changed human interaction. This has led to many positive outcomes, but at the same time it has brought risks and harms. While the volume of harmful content online, such as hate speech, is not manageable by humans, interest within the academic community in investigating automated means of hate speech detection has increased. In this study, we analyse six publicly available datasets by combining them into a single homogeneous dataset and classifying their instances into three classes: abusive, hateful, or neither. We create a baseline model and improve its performance scores using various optimisation techniques. After attaining a competitive performance score, we create a tool that identifies and scores a page with an effective metric in near-real time, using the result as feedback to re-train our model. We demonstrate the competitive performance of our multilingual model on two languages, English and Hindi, achieving performance comparable or superior to most monolingual models.
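Merging heterogeneous datasets into a single three-class corpus typically reduces to mapping each source’s label scheme onto the shared classes. The pandas sketch below assumes hypothetical file names and per-source label mappings; the actual six datasets and their label schemes are not specified here.

```python
# Minimal sketch of merging several hate speech datasets into one corpus
# with a shared three-class scheme. File names and the label mapping are
# illustrative assumptions, not the paper's actual sources.
import pandas as pd

LABEL_MAP = {  # per-source labels mapped into {abusive, hateful, neither}
    "offensive": "abusive",
    "abusive": "abusive",
    "hate": "hateful",
    "hateful": "hateful",
    "normal": "neither",
    "none": "neither",
}

frames = []
for path in ["dataset_en_1.csv", "dataset_en_2.csv", "dataset_hi_1.csv"]:  # hypothetical files
    df = pd.read_csv(path)[["text", "label"]]
    df["label"] = df["label"].str.lower().map(LABEL_MAP)
    frames.append(df.dropna(subset=["label"]))  # drop labels outside the shared scheme

corpus = pd.concat(frames, ignore_index=True).drop_duplicates(subset="text")
print(corpus["label"].value_counts())
```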


2019 ◽  
pp. 203
Author(s):  
Kent Roach

It is argued that neither the approach taken to terrorist speech in Bill C-51 nor that taken in Bill C-59 is satisfactory. A case study of the Othman Hamdan case, including his calls on the Internet for “lone wolves” “swiftly to activate,” is featured, along with the use of immigration law after his acquittal for counselling murder and other crimes. Hamdan’s acquittal suggests that the new Bill C-59 terrorist speech offence and take-down powers, based on counselling terrorism offences without specifying a particular terrorism offence, may not reach Hamdan’s Internet postings. One coherent response would be to repeal terrorist speech offences while making greater use of court-ordered take-downs of speech on the Internet and of programs to counter violent extremism. Another coherent response would be to criminalize the promotion and advocacy of terrorist activities (as opposed to terrorist offences in general in Bill C-51, or terrorism offences without identifying a specific terrorist offence in Bill C-59) and to provide for defences designed to protect fundamental freedoms, such as those under section 319(3) of the Criminal Code that apply to hate speech. Unfortunately, neither Bill C-51 nor Bill C-59 pursues either of these options. The result is that speech such as Hamdan’s will continue to be subject to the vagaries of take-downs by social media companies and immigration law.


PLoS ONE ◽  
2020 ◽  
Vol 15 (8) ◽  
pp. e0237861
Author(s):  
Marzieh Mozafari ◽  
Reza Farahbakhsh ◽  
Noël Crespi
