Emotionally Informed Hate Speech Detection: A Multi-target Perspective

Author(s):  
Patricia Chiril ◽  
Endang Wahyu Pamungkas ◽  
Farah Benamara ◽  
Véronique Moriceau ◽  
Viviana Patti

Abstract Hate speech and harassment are widespread in online communication, due to users' freedom and anonymity and the lack of regulation provided by social media platforms. Hate speech is topically focused (misogyny, sexism, racism, xenophobia, homophobia, etc.), and each specific manifestation of hate speech targets different vulnerable groups based on characteristics such as gender (misogyny, sexism), ethnicity, race, religion (xenophobia, racism, Islamophobia), sexual orientation (homophobia), and so on. Most automatic hate speech detection approaches cast the problem into a binary classification task without addressing either the topical focus or the target-oriented nature of hate speech. In this paper, we propose to tackle, for the first time, hate speech detection from a multi-target perspective. We leverage manually annotated datasets to investigate the problem of transferring knowledge from different datasets with different topical focuses and targets. Our contribution is threefold: (1) we explore the ability of hate speech detection models to capture common properties from topic-generic datasets and transfer this knowledge to recognize specific manifestations of hate speech; (2) we experiment with the development of models to detect both topics (racism, xenophobia, sexism, misogyny) and hate speech targets, going beyond standard binary classification, to investigate how to detect hate speech at a finer level of granularity and how to transfer knowledge across different topics and targets; and (3) we study the impact of affective knowledge encoded in sentic computing resources (SenticNet, EmoSenticNet) and in semantically structured hate lexicons (HurtLex) in determining specific manifestations of hate speech. We experimented with different neural models, including multitask approaches. Our study shows that: (1) training a model on the combined training sets of several topic-specific datasets is more effective than training a model on a topic-generic dataset; (2) the multi-task approach outperforms a single-task model when detecting both the hatefulness of a tweet and its topical focus in the context of a multi-label classification approach; and (3) the models incorporating EmoSenticNet emotions, the first-level emotions of SenticNet, a blend of SenticNet and EmoSenticNet emotions, or affective features based on HurtLex obtained the best results. Our results demonstrate that multi-target hate speech detection from existing datasets is feasible, which is a first step towards hate speech detection for a specific topic/target when dedicated annotated data are missing. Moreover, we show that domain-independent affective knowledge, injected into our models, helps finer-grained hate speech detection.
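To make the affective-knowledge injection concrete, here is a minimal sketch of how lexicon-based features could be concatenated with standard text features before classification. The tiny inline lexicon, its category names, and the toy data are placeholder assumptions, not the actual HurtLex resource or the authors' models.

```python
# Sketch: injecting lexicon-based affective features alongside text features,
# in the spirit of the paper's HurtLex experiments. The inline lexicon and
# category names are illustrative placeholders, not the real HurtLex.
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Placeholder lexicon: word -> affective/hate category (assumption).
LEXICON = {"vermin": "animals", "invader": "conflict", "harpy": "female_slur"}
CATEGORIES = sorted(set(LEXICON.values()))

def lexicon_features(texts):
    """One count per lexicon category found in each text."""
    feats = np.zeros((len(texts), len(CATEGORIES)))
    for i, text in enumerate(texts):
        for token in text.lower().split():
            if token in LEXICON:
                feats[i, CATEGORIES.index(LEXICON[token])] += 1
    return csr_matrix(feats)

texts = ["they are vermin and invaders", "lovely weather today"]
labels = [1, 0]  # 1 = hateful, 0 = not (toy labels)

tfidf = TfidfVectorizer()
X = hstack([tfidf.fit_transform(texts), lexicon_features(texts)])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```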

2019 ◽  
Vol 22 (1) ◽  
pp. 69-80 ◽  
Author(s):  
Stefanie Ullmann ◽  
Marcus Tomalin

Abstract In this paper we explore quarantining as a more ethical method for limiting the spread of Hate Speech via online social media platforms. Currently, companies like Facebook, Twitter, and Google generally respond reactively to such material: offensive messages that have already been posted are reviewed by human moderators if complaints from users are received, and the offensive posts are removed only if the complaints are upheld, by which time they have already caused the recipients psychological harm. In addition, this approach has frequently been criticised for restricting freedom of expression, since it requires the service providers to elaborate and implement censorship regimes. In the last few years, an emerging generation of automatic Hate Speech detection systems has started to offer new strategies for dealing with this particular kind of offensive online material. Anticipating the future efficacy of such systems, the present article advocates an approach to online Hate Speech detection that is analogous to the quarantining of malicious computer software. If a given post is reliably classified as harmful by an automatic system, then it can be temporarily quarantined and the direct recipients can receive an alert, which protects them from the harmful content in the first instance. The quarantining framework is an example of more ethical online safety technology that can be extended to the handling of Hate Speech. Crucially, it provides flexible options for obtaining a more justifiable balance between freedom of expression and appropriate censorship.
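The quarantining flow the authors describe can be sketched in a few lines: a post whose predicted harm probability exceeds a confidence threshold is withheld and the recipient is alerted first. The classifier stub, threshold value, and message format below are illustrative assumptions, not part of the paper.

```python
# Sketch of the quarantining flow described above: a post whose predicted
# harm probability exceeds a threshold is held back and the recipient is
# alerted instead of being shown the content. The classifier stub,
# threshold, and message format are all illustrative assumptions.
from dataclasses import dataclass

QUARANTINE_THRESHOLD = 0.8  # assumed confidence cut-off

@dataclass
class Post:
    author: str
    text: str

def harm_probability(post: Post) -> float:
    """Stand-in for an automatic Hate Speech classifier."""
    return 0.93 if "hateful" in post.text.lower() else 0.02

def deliver(post: Post):
    p = harm_probability(post)
    if p >= QUARANTINE_THRESHOLD:
        # Quarantine: withhold the content and warn the recipient first.
        return {"status": "quarantined",
                "alert": f"A message from {post.author} was flagged "
                         f"as potentially harmful (p={p:.2f}). View anyway?"}
    return {"status": "delivered", "content": post.text}

print(deliver(Post("user42", "some hateful slur")))
print(deliver(Post("user42", "see you at lunch")))
```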


2019 ◽  
pp. 174387211988012 ◽  
Author(s):  
Anne Wagner ◽  
Sarah Marusek

The legitimacy of public memory and socially normative standards of civility is questioned through rumors that abound on online social media platforms. On the Net, rumors are particularly prone to fuel acts of bullying and frameworks of hate speech. Legislative attempts to limit rumors operate differently in France and throughout Europe than in the United States. This article examines the impact of online rumors, the mob mentality, and the politicization of bullying critics within a cyber culture that operates within the limitations of law.


Author(s):  
Mehdi Surani ◽  
Ramchandra Mangrulkar

Public shaming on social media platforms like Twitter, Instagram, and Facebook has increased in recent years. It affects an individual's social, political, mental, and financial life, with impacts ranging from mild bullying to severe depression. With the growing leniency of these social platforms, many people have started misusing the opportunity by turning to online bullying and hate speech. When something is posted online, it stays there forever, and it becomes extremely hard to remove anything from the digital world. Manually locating and categorizing such comments is a lengthy procedure that cannot be relied upon. To address this challenge, the identification and classification of shamers was automated using a classic SVM model trained on a given quantity of data. To identify the negative content being posted and discussed online, this paper further explores a deep learning system that can successfully classify such content into the proper labels. A text-based Convolutional Neural Network (CNN) is the model proposed in this paper for this analysis.
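A minimal text-based CNN of the kind the paper proposes might look as follows; the vocabulary size, embedding width, and filter settings are illustrative choices rather than the authors' exact configuration.

```python
# A minimal text-CNN binary classifier in the spirit of the proposed model;
# all hyperparameters below are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE, MAX_LEN = 20_000, 100  # assumed preprocessing limits

model = tf.keras.Sequential([
    layers.Input(shape=(MAX_LEN,), dtype="int32"),        # padded token-id sequences
    layers.Embedding(VOCAB_SIZE, 128),                    # learned word embeddings
    layers.Conv1D(64, kernel_size=5, activation="relu"),  # n-gram-like convolution filters
    layers.GlobalMaxPooling1D(),                          # keep the strongest filter response
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),                # shaming vs. not shaming
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```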


2021 ◽  
Vol 10 (3) ◽  
pp. 376-378
Author(s):  
Nuhdi Futuhal Arifin ◽  
A. Jauhar Fuad

This article reviews the background of the emergence of Post-truth and its impact. It uses a literature review, examining various sources in mass media and social media. The results show that in Indonesia, Post-truth was rife on social media during and after the 2019 elections. Post-truth on social media does not stop there but continues to roll on to various problems that exist in the country, starting with ethnicity, religion, and race. Hoaxes and hate speech that use social media platforms as a means of spreading are not trivial matters, because the series of attacks may continue to surge. Pressure from social media often forms wild and uncontrollable opinions, which can be exploited by some groups for certain interests.


Informatics ◽  
2021 ◽  
Vol 8 (4) ◽  
pp. 69
Author(s):  
Wassen Aldjanabi ◽  
Abdelghani Dahou ◽  
Mohammed A. A. Al-qaness ◽  
Mohamed Abd Elaziz ◽  
Ahmed Mohamed Helmi ◽  
...  

As social media platforms offer a medium for opinion expression, social phenomena such as hatred, offensive language, racism, and all forms of verbal violence have increased spectacularly. These behaviors are not confined to specific countries, groups, or communities; they extend into people's everyday lives. This study investigates offensive and hate speech on Arab social media in order to build an accurate offensive and hate speech detection system. More precisely, we develop a classification system for detecting offensive and hate speech using a multi-task learning (MTL) model built on top of a pre-trained Arabic language model. We train the MTL model on the same task using cross-corpora that represent variations in the offensive and hate context, so that it learns both global and dataset-specific contextual representations. The developed MTL model performed strongly, outperforming existing models in the literature on three out of four datasets for Arabic offensive and hate speech detection tasks.
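A sketch of the shared-encoder/per-corpus-head structure described above: one encoder learns global representations while a separate classification head per corpus captures dataset-specific context. The toy Transformer encoder stands in for the pre-trained Arabic language model, and all dimensions are assumptions.

```python
# Multi-task learner with a shared encoder and one head per corpus,
# mirroring the global vs. dataset-specific split described above.
# The small randomly initialised encoder is a stand-in for the
# pre-trained Arabic language model.
import torch
import torch.nn as nn

class MultiTaskClassifier(nn.Module):
    def __init__(self, vocab=1000, dim=64, n_corpora=4, n_labels=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # shared across tasks
        # One classification head per corpus (dataset-specific context).
        self.heads = nn.ModuleList(
            [nn.Linear(dim, n_labels) for _ in range(n_corpora)]
        )

    def forward(self, token_ids, corpus_id):
        hidden = self.encoder(self.embed(token_ids))
        pooled = hidden.mean(dim=1)            # simple mean pooling
        return self.heads[corpus_id](pooled)   # corpus-specific logits

model = MultiTaskClassifier()
batch = torch.randint(0, 1000, (8, 32))        # 8 toy sequences of length 32
logits = model(batch, corpus_id=0)
print(logits.shape)  # torch.Size([8, 2])
```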


Author(s):  
Ioannis Mollas ◽  
Zoe Chrysopoulou ◽  
Stamatis Karlos ◽  
Grigorios Tsoumakas

Abstract Online hate speech is a recent problem in our society that is rising at a steady pace, leveraging the vulnerabilities of the regimes that characterise most social media platforms. This phenomenon is primarily fostered by offensive comments, either during user interaction or in the form of posted multimedia content. Nowadays, giant corporations own platforms where millions of users log in every day, and protection from exposure to such phenomena appears necessary both to comply with the corresponding legislation and to maintain a high level of service quality. A robust and reliable system for detecting and preventing the uploading of such content would have a significant impact on our digitally interconnected society. Several aspects of our daily lives are undeniably linked to our social profiles, making us vulnerable to abusive behaviours. As a result, the lack of accurate hate speech detection mechanisms would severely degrade the overall user experience, while their erroneous operation would pose many ethical concerns. In this paper, we present 'ETHOS' (multi-labEl haTe speecH detectiOn dataSet), a textual dataset with two variants, binary and multi-label, based on YouTube and Reddit comments validated using the Figure-Eight crowdsourcing platform. Furthermore, we present the annotation protocol used to create this dataset: an active sampling procedure for balancing our data in relation to the various aspects defined. Our key assumption is that, even when gaining only a small amount of labelled data from such a time-consuming process, we can guarantee hate speech occurrences in the examined material.
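One way an active-sampling step for balancing annotation might work, in the spirit of the ETHOS protocol: from an unlabelled pool, preferentially queue items that a seed model predicts as the under-represented (hateful) class. The seed model and selection rule below are assumptions, not the authors' exact procedure.

```python
# Sketch of an active-sampling step for balancing annotation: while the
# hateful class is under-represented in the labelled set, queue the pool
# items a seed model scores as most likely hateful; otherwise sample at
# random. Seed model and rule are illustrative assumptions.
import random

def select_for_annotation(pool, predict_hate_prob, labelled, batch=10):
    """Pick the unlabelled texts most likely to add hateful examples."""
    n_hate = sum(1 for _, y in labelled if y == 1)
    n_safe = len(labelled) - n_hate
    if n_hate < n_safe:  # hateful class is scarce: chase high-probability items
        return sorted(pool, key=predict_hate_prob, reverse=True)[:batch]
    return random.sample(pool, min(batch, len(pool)))

# Toy usage with a keyword-based stand-in for the seed model.
pool = ["I hate them all", "nice dog", "awful people", "sunny day"]
labelled = [("offensive slur", 1), ("hello", 0), ("good morning", 0)]
prob = lambda t: 0.9 if any(w in t for w in ("hate", "awful")) else 0.1
print(select_for_annotation(pool, prob, labelled, batch=2))
```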


2020 ◽  
Vol 2 (3) ◽  
pp. 192-215 ◽  
Author(s):  
Nikolaos Pitropakis ◽  
Kamil Kokot ◽  
Dimitra Gkatzia ◽  
Robert Ludwiniak ◽  
Alexios Mylonas ◽  
...  

The proliferation of social media platforms has changed the way people interact online. However, engagement with social media comes at a price: the users' privacy. Breaches of users' privacy, such as the Cambridge Analytica scandal, reveal how users' data can be weaponized in political campaigns, which often trigger hate speech and anti-immigration views. Hate speech detection is a challenging task due to the different sources of hate that can have an impact on the language used, as well as the lack of relevant annotated data. To tackle this, we collected and manually annotated an immigration-related dataset of publicly available Tweets in UK, US, and Canadian English. In an empirical study, we explored anti-immigration speech detection utilizing various language features (word n-grams, character n-grams) and measured their impact on a number of trained classifiers. Our work demonstrates that using word n-grams results in higher precision, recall, and F-score compared to character n-grams. Finally, we discuss the implications of these results for future work on hate speech detection and social media data analysis in general.
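The word- versus character-n-gram comparison reported above is easy to reproduce in outline with scikit-learn; the toy corpus, labels, and classifier choice here are purely illustrative.

```python
# Sketch of the word- vs. character-n-gram comparison: the same classifier
# is trained on two different n-gram featurizations. Corpus and labels
# are toy illustrations, not the annotated dataset from the paper.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["send them all back", "welcome refugees warmly",
         "they do not belong here", "migration enriches our city"]
labels = [1, 0, 1, 0]  # 1 = anti-immigration speech (toy labels)

configs = {
    "word 1-2 grams": CountVectorizer(analyzer="word", ngram_range=(1, 2)),
    "char 2-4 grams": CountVectorizer(analyzer="char", ngram_range=(2, 4)),
}
for name, vec in configs.items():
    clf = make_pipeline(vec, LogisticRegression())
    clf.fit(texts, labels)
    print(name, clf.score(texts, labels))  # training accuracy on the toy set
```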


2020 ◽  
Vol 34 (01) ◽  
pp. 386-393
Author(s):  
Shivang Chopra ◽  
Ramit Sawhney ◽  
Puneet Mathur ◽  
Rajiv Ratn Shah

Code-switching in linguistically diverse, low-resource languages is often semantically complex and lacks sophisticated methodologies that can be applied to real-world data for precisely detecting hate speech. In an attempt to bridge this gap, we introduce a three-tier pipeline that employs profanity modeling, deep graph embeddings, and author profiling to retrieve instances of hate speech in Hindi-English code-switched language (Hinglish) on social media platforms like Twitter. Through extensive comparison against several baselines on two real-world datasets, we demonstrate how targeted hate embeddings combined with social network-based features outperform the state of the art, both quantitatively and qualitatively. Additionally, we present an expert-in-the-loop algorithm for bias elimination in the proposed model pipeline and study the prevalence and performance impact of the debiasing. Finally, we discuss the computational, practical, ethical, and reproducibility aspects of deploying our pipeline across the Web.
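A conceptual sketch of the three-tier pipeline: each tier yields a feature block that is concatenated before classification. All three extractors below are placeholder stand-ins for the paper's learned components (profanity model, graph embedding, author profile), and every value is a toy assumption.

```python
# Conceptual sketch of the three-tier pipeline: profanity modelling, deep
# graph embeddings, and author profiling each produce a feature block that
# is concatenated before classification. All three tiers here are stubs,
# not the paper's trained components.
import numpy as np
from sklearn.linear_model import LogisticRegression

def profanity_score(text):             # tier 1: profanity modelling (stub)
    return [float(any(w in text.lower() for w in ("gaali", "nafrat")))]

def graph_embedding(user_id):          # tier 2: deep graph embedding (stub)
    rng = np.random.default_rng(user_id)
    return rng.normal(size=8).tolist()

def author_profile(user_id):           # tier 3: author profiling (stub)
    return [user_id % 2, (user_id * 7) % 5]

def featurize(text, user_id):
    return profanity_score(text) + graph_embedding(user_id) + author_profile(user_id)

X = [featurize("yeh kya nafrat hai", 3), featurize("great match today", 4)]
y = [1, 0]  # toy hate / non-hate labels
clf = LogisticRegression().fit(X, y)
print(clf.predict(X))
```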

