Offensive Language: Recently Published Documents

TOTAL DOCUMENTS: 215 (five years: 173)
H-INDEX: 9 (five years: 4)

Author(s): Tharindu Ranasinghe, Marcos Zampieri

Offensive content is pervasive in social media and a reason for concern to companies and government organizations. Several studies have recently been published investigating methods to detect the various forms of such content (e.g., hate speech, cyberbullying, and cyberaggression). The clear majority of these studies deal with English, partly because most available annotated datasets contain English data. In this article, we take advantage of available English datasets by applying cross-lingual contextual word embeddings and transfer learning to make predictions in low-resource languages. We project predictions on comparable data in Arabic, Bengali, Danish, Greek, Hindi, Spanish, and Turkish. We report results of 0.8415 F1 macro for Bengali in the TRAC-2 shared task [23], 0.8532 F1 macro for Danish and 0.8701 F1 macro for Greek in OffensEval 2020 [58], 0.8568 F1 macro for Hindi in the HASOC 2019 shared task [27], and 0.7513 F1 macro for Spanish in SemEval-2019 Task 5 (HatEval) [7], showing that our approach compares favorably to the best systems submitted to recent shared tasks on these five languages. Additionally, we report competitive performance on Arabic and Turkish using the training and development sets of the OffensEval 2020 shared task. The results for all languages confirm the robustness of cross-lingual contextual embeddings and transfer learning for this task.
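The core idea behind this kind of cross-lingual transfer is that a classifier trained only on English labels can be applied unchanged to another language, because both languages are mapped into one shared embedding space. A minimal sketch of that idea, using a toy hand-built "cross-lingual" embedding space and a nearest-centroid classifier in place of the XLM-R contextual embeddings and fine-tuned transformer the paper actually uses (all words, vectors, and labels below are invented for illustration):

```python
# Toy "cross-lingual" embedding space: translation pairs share a vector,
# which is the property that makes zero-shot transfer possible.
EMB = {
    "idiot":  (1.0, 0.0),  "stupid": (0.9, 0.1),   # English, offensive
    "hello":  (0.0, 1.0),  "friend": (0.1, 0.9),   # English, neutral
    "dum":    (0.9, 0.1),  "hej":    (0.0, 1.0),   # "Danish" counterparts
}

def sentence_vec(tokens):
    """Average the embeddings of known tokens into one sentence vector."""
    xs = [EMB[t] for t in tokens if t in EMB]
    n = max(len(xs), 1)
    return (sum(x for x, _ in xs) / n, sum(y for _, y in xs) / n)

def train_centroids(examples):
    """Nearest-centroid 'classifier' over embedded sentences."""
    cents = {}
    for label in {y for _, y in examples}:
        vecs = [sentence_vec(x) for x, y in examples if y == label]
        cents[label] = (sum(v[0] for v in vecs) / len(vecs),
                        sum(v[1] for v in vecs) / len(vecs))
    return cents

def predict(cents, tokens):
    """Assign the label whose centroid is closest to the sentence vector."""
    v = sentence_vec(tokens)
    return min(cents, key=lambda l: (v[0] - cents[l][0]) ** 2
                                    + (v[1] - cents[l][1]) ** 2)

# Train only on English...
english = [(["idiot"], "OFF"), (["stupid"], "OFF"),
           (["hello"], "NOT"), (["friend"], "NOT")]
model = train_centroids(english)

# ...then predict on target-language tokens never seen during training.
print(predict(model, ["dum"]))   # offensive word in the target language
print(predict(model, ["hej"]))   # neutral word in the target language
```

Because the classifier only ever sees vectors, not surface forms, any language whose words land in the same space inherits the English decision boundary; that is what the shared-task results above measure at scale.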


2022, Vol 5 (1), pp. 9
Author(s): Junjie Liu, Yong Yang, Xiaochao Fan, Ge Ren, Liang Yang, ...

The rapid identification of offensive language in social media is of great significance for preventing viral spread and reducing the dissemination of malicious information, such as cyberbullying and content related to self-harm. In existing research, the public datasets of offensive language are small, the label quality is uneven, and the performance of pre-trained models is not satisfactory. To overcome these problems, we proposed a multi-semantic fusion model based on data augmentation (MSF). Data augmentation was carried out through back translation, reducing the impact of small datasets on performance. At the same time, we used a novel fusion mechanism that combines word-level semantic features and character n-gram features. The experimental results on two datasets showed that the proposed model can effectively extract the semantic information of offensive language and achieve state-of-the-art performance on both datasets.
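The two building blocks of this abstract can be sketched concretely: back-translation augmentation (paraphrase a sentence through another language and keep the original label) and feature fusion by concatenating word-level and character n-gram views. Real back translation needs a machine-translation system; the lookup table below is a hypothetical stand-in for the round trip, and the feature scheme is a simplification of whatever the MSF model actually fuses:

```python
def char_ngrams(text, n=3):
    """Character n-grams capture obfuscations like 'st*pid' or 'stuupid'."""
    s = f"<{text}>"                      # boundary markers
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def fused_features(text):
    """Fusion by concatenating word-level and character-level features."""
    words = [f"w:{w}" for w in text.lower().split()]
    chars = [f"c:{g}" for g in char_ngrams(text.lower())]
    return words + chars

# Hypothetical back-translation table standing in for an MT round trip
# (e.g. en -> de -> en); a paraphrase inherits the original label.
BACK_TRANSLATION = {"you are stupid": "you are dumb"}

def augment(dataset):
    out = list(dataset)
    for text, label in dataset:
        para = BACK_TRANSLATION.get(text)
        if para:
            out.append((para, label))
    return out

data = [("you are stupid", "OFF")]
augmented = augment(data)
print(len(augmented))        # 2: the original plus its back-translated copy
print(fused_features("hi"))  # word feature plus two character trigrams
```

Concatenation is the simplest fusion choice; the point is that the classifier downstream sees both views at once, so word-level semantics and spelling-level evidence can compensate for each other on small, noisy datasets.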


2021, Vol 6 (2)
Author(s): Christine Riggle, Mary Samouelian

Inclusive and conscious archival description can support consistency in researching and describing marginalized groups and can serve to provide context and a counter-narrative reflecting the perspective of the documented community. It can also help to address the power imbalances between creators and subjects of records. In this article, the authors describe efforts to prepare best practice guidelines for inclusive description and for revising descriptions to remediate outdated, problematic, or offensive language and meet modern standards. They also share how the project team is working together to create meaningful and enduring changes that both provide a better experience for staff and users and support Harvard Business School’s Action Plan for Racial Equality.


2021, pp. 312-317
Author(s): Rina Zviel-Girshin, Tanara Zingano Kuhn, Ana R. Luís, Kristina Koppel, Branislava Šandrih Todorović, ...

Despite the unquestionable academic interest in corpus-based approaches to language education, the use of corpora by teachers in their everyday practice is still not very widespread. One way to promote the use of corpora in language teaching is to build pedagogically appropriate corpora, labelled with different types of problems (for instance, sensitive content, offensive language, or structural problems), so that teachers can select authentic examples according to their needs. Because manually labelling corpora is extremely time-consuming, we propose to use crowdsourcing for this task. After a first exploratory phase, we are currently developing a multimode, multilanguage game in which players first identify problematic sentences and then classify them.
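Crowdsourced labels like the ones this game collects are usually noisy, so each sentence needs several player judgments before a corpus label is assigned. A minimal sketch of one common aggregation scheme, majority vote with an agreement score (the vote threshold and category names below are invented, not taken from the paper):

```python
from collections import Counter

def aggregate(votes, min_votes=3):
    """Majority-vote aggregation of crowd labels for one sentence.

    votes: list of problem-type labels submitted by players.
    Returns (winning_label, agreement_ratio), or (None, 0.0) if the
    sentence has not yet received enough judgments.
    """
    if len(votes) < min_votes:
        return None, 0.0
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)

# Four players flagged the same sentence with a problem type.
votes = ["offensive", "offensive", "sensitive", "offensive"]
print(aggregate(votes))   # ('offensive', 0.75)
```

Low-agreement sentences can then be routed back into the game for more judgments instead of being labelled prematurely, which is one way crowdsourcing keeps quality comparable to manual annotation.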


Author(s): Vildan Mercan, Akhtar Jamil, Alaa Ali Hameed, Irfan Ahmed Magsi, Sibghatullah Bazai, ...

Information, 2021, Vol 12 (10), pp. 418
Author(s): Daniela America da Silva, Henrique Duarte Borges Louro, Gildarcio Sousa Goncalves, Johnny Cardoso Marques, Luiz Alberto Vieira Dias, ...

In recent years, we have seen widespread use of Artificial Intelligence (AI) applications on the Internet and beyond. Natural Language Processing and Machine Learning are important sub-fields of AI that have made Chatbots and Conversational AI applications possible. These algorithms build language models from historical data; however, historical data can be intrinsically discriminatory. This article investigates whether a Conversational AI can identify offensive language, and it shows how large language models often produce unethical behavior because of bias in the historical data. Our low-level proof of concept presents the challenges of detecting offensive language in social media and discusses steps toward strong results in detecting offensive language and unethical behavior with a Conversational AI.
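The proof-of-concept pattern this abstract describes can be sketched as a guardrail in a conversational pipeline: score each user message with an offensive-language detector before the bot is allowed to reply. The tiny lexicon detector and the function names below are placeholders for illustration; a real system would plug in a trained classifier like those discussed in the articles above:

```python
# Placeholder lexicon; a production system would use a trained classifier.
OFFENSIVE_LEXICON = {"idiot", "stupid", "moron"}

def offensive_score(text):
    """Fraction of tokens flagged by the (toy) detector."""
    tokens = text.lower().split()
    hits = sum(1 for t in tokens if t in OFFENSIVE_LEXICON)
    return hits / max(len(tokens), 1)

def respond(message, threshold=0.2):
    """Run the detector before generating any reply."""
    if offensive_score(message) >= threshold:
        return "I can't engage with that. Let's keep things respectful."
    return f"Echo: {message}"        # stand-in for the actual chatbot reply

print(respond("you are an idiot"))
print(respond("hello there"))
```

Putting the detector in front of (and, symmetrically, behind) the language model is one concrete way to intercept the biased or unethical outputs that historical training data can produce, rather than trying to scrub the model itself.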

