Detecting Hate Speech in Cross-Lingual and Multi-lingual Settings Using Language Agnostic Representations

Offensive content is pervasive in social media and a cause for concern to companies and government organizations. Several studies have recently been published investigating methods to detect the various forms of such content (e.g., hate speech, cyberbullying, and cyberaggression). The clear majority of these studies deal with English, partly because most available annotated datasets contain English data. In this article, we take advantage of available English datasets by applying cross-lingual contextual word embeddings and transfer learning to make predictions in low-resource languages. We project predictions on comparable data in Arabic, Bengali, Danish, Greek, Hindi, Spanish, and Turkish. We report results of 0.8415 F1 macro for Bengali in the TRAC-2 shared task [23], 0.8532 F1 macro for Danish and 0.8701 F1 macro for Greek in OffensEval 2020 [58], 0.8568 F1 macro for Hindi in the HASOC 2019 shared task [27], and 0.7513 F1 macro for Spanish in SemEval-2019 Task 5 (HatEval) [7], showing that our approach compares favorably to the best systems submitted to recent shared tasks on these languages. Additionally, we report competitive performance on Arabic and Turkish using the training and development sets of the OffensEval 2020 shared task. The results for all languages confirm the robustness of cross-lingual contextual embeddings and transfer learning for this task.
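The scores quoted in this abstract are macro-averaged F1 values. As a minimal sketch of how that metric is usually computed (not the authors' evaluation code), the function below averages the per-class F1 scores with equal weight. The function name `macro_f1` and the `OFF`/`NOT` labels in the usage note are illustrative assumptions.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores.

    Illustrative helper (hypothetical name); gold and pred are
    equal-length sequences of class labels.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equal in length")

    # Count true positives, false positives and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the true class but was missed

    # Per-class F1 = 2*TP / (2*TP + FP + FN); a class with no counts scores 0.
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

For example, with gold labels `["OFF", "NOT", "NOT", "OFF", "NOT", "NOT"]` and predictions `["OFF", "NOT", "OFF", "NOT", "NOT", "NOT"]`, the per-class F1 scores are 0.5 for `OFF` and 0.75 for `NOT`, so the macro average is 0.625. Because every class counts equally, the minority class influences the score as much as the majority class, which matters for imbalanced offensive-language datasets.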


A joint learning approach with knowledge injection for zero-shot cross-lingual hate speech detection

Information Processing & Management, 2021, Vol 58 (4), pp. 102544. DOI: 10.1016/j.ipm.2021.102544

Author(s): Endang Wahyu Pamungkas, Valerio Basile, Viviana Patti

Keyword(s): Hate Speech, Learning Approach, Joint Learning, Speech Detection, Cross Lingual


Exposing the limits of Zero-shot Cross-lingual Hate Speech Detection

2021. DOI: 10.18653/v1/2021.acl-short.114

Author(s): Debora Nozza

Keyword(s): Hate Speech, Speech Detection, Cross Lingual


Cross-lingual Capsule Network for Hate Speech Detection in Social Media

2021. DOI: 10.1145/3465336.3475102

Author(s): Aiqi Jiang, Arkaitz Zubiaga

Keyword(s): Social Media, Hate Speech, Speech Detection, Cross Lingual


Letter to Mark Zuckerberg on Hate Speech

PsycEXTRA Dataset, 2020. DOI: 10.1037/e507152020-001

Author(s): Arthur C. Evans

Keyword(s): Hate Speech


Propensity Score Matching in Cross-Lingual Test Equating

PsycEXTRA Dataset, 2012. DOI: 10.1037/e662962012-001

Author(s): Xin Liu, Xiaobin Zhou, Jianjun Zhu, Jing-Jen Wang

Keyword(s): Propensity Score, Propensity Score Matching, Test Equating, Cross Lingual


Exploring the assessment of hate speech and other complaints by the BCCSA

Ecquid Novi African Journalism Studies, 2007, Vol 28 (1-2), pp. 30-55. DOI: 10.3368/ajs.28.1-2.30

Author(s): L. Venter

Keyword(s): Hate Speech


The Roma in Post-Communist Bulgaria: Growing Social Marginalization and State Policies

Journal of Asian Social Science Research, 2020, Vol 2 (1), pp. 1-24. DOI: 10.15575/jassr.v2i1.1

Author(s): Yorgos Christidis

Keyword(s): Political Parties, Social Exclusion, Hate Crime, Hate Speech, Poor Performance, State Policies, Social Marginalization, Popular Support, Economic Problems, The Poor

This article analyzes the growing impoverishment and marginalization of the Roma in Bulgarian society and the evolution of Bulgaria’s post-1989 policies towards the Roma. It examines the results of the policies so far and the reasons behind the “poor performance” of the policies implemented. It is believed that post-communist Bulgaria has successfully re-integrated the ethnic Turkish minority, given both the assimilation campaign carried out against it in the 1980s and the tragic events that took place in ex-Yugoslavia in the 1990s. Bulgaria’s successful “ethnic model”, however, has failed to include the Roma. The “Roma issue” has emerged as one of the most serious and intractable issues facing Bulgaria since 1990. A growing part of its population has been living in circumstances of poverty and marginalization that seem only to deteriorate as the years go by. State policies introduced since 1999 have largely failed to produce tangible results or to reverse the socio-economic marginalization of the Roma: discrimination, poverty, and social exclusion continue to be the norm. NGOs point out that many of the announced measures have not been properly implemented, and that existing legislation to tackle discrimination, hate crime, and hate speech is not enforced. Bulgaria’s political parties are averse to dealing with the Roma issue. Policies addressing the socio-economic problems of the Roma, including hate speech and crime, do not enjoy popular support and are seen as politically damaging.
