Effective hate-speech detection in Twitter data using recurrent neural networks

2018 ◽  
Vol 48 (12) ◽  
pp. 4730-4742 ◽  
Author(s):  
Georgios K. Pitsilis ◽  
Heri Ramampiaro ◽  
Helge Langseth
2020 ◽  
Vol 13 (4) ◽  
pp. 485-525
Author(s):  
Femi Emmanuel Ayo ◽  
Olusegun Folorunso ◽  
Friday Thomas Ibharalu ◽  
Idowu Ademola Osinuga

Purpose: Hate speech is an expression of intense hatred. Twitter has become a popular analytical tool for the prediction and monitoring of abusive behaviors. Hate speech detection with social media data has received special research attention in recent studies; hence the need to design a generic metadata architecture and an efficient feature extraction technique to enhance hate speech detection.
Design/methodology/approach: This study proposes hybrid embeddings enhanced with a topic inference method, together with an improved cuckoo search neural network, for hate speech detection in Twitter data. The hybrid embeddings technique combines Term Frequency-Inverse Document Frequency (TF-IDF) for word-level feature extraction with Long Short-Term Memory (LSTM), a variant of the recurrent neural network architecture, for sentence-level feature extraction. The features extracted by the hybrid embeddings then serve as input to the improved cuckoo search neural network, which predicts whether a tweet is hate speech, offensive language or neither.
Findings: The proposed method showed better results than other related methods when tested on the collected Twitter datasets. To validate the performance of the proposed method, a t-test and post hoc multiple comparisons were used to compare the significance and means of the proposed method with those of other related methods for hate speech detection. A paired-sample t-test was also conducted to validate the performance of the proposed method against other related methods.
Research limitations/implications: The evaluation results showed that the proposed method outperforms other related methods, with a mean F1-score of 91.3.
Originality/value: The main novelty of this study is the use of an automatic topic spotting measure based on a naïve Bayes model to improve feature representation.
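The word-level TF-IDF step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the textbook weighting (term frequency times the natural log of the inverse document frequency, with no smoothing), and the function name `tf_idf` and the toy token lists are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: plain tf * log(N / df) weighting, without smoothing.
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical token lists standing in for preprocessed tweets.
tweets = [
    ["you", "are", "great"],
    ["you", "are", "awful"],
    ["great", "day"],
]
weights = tf_idf(tweets)
```

In this toy corpus, a term that occurs in only one tweet (such as "awful") receives a higher weight than a term shared by several tweets (such as "you"), which is the property that makes TF-IDF useful as a word-level feature.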


2020 ◽  
Vol 34 (1) ◽  
pp. 81-88
Author(s):  
Aya Elouali ◽  
Zakaria Elberrichi ◽  
Nadia Elouali

2019 ◽  
Author(s):  
Gustavo Henrique Paetzold ◽  
Marcos Zampieri ◽  
Shervin Malmasi

Author(s):  
Muhammad Moin Khan ◽  
Khurram Shahzad ◽  
Muhammad Kamran Malik

Hate speech is a specific type of controversial content that is widely legislated as a crime that must be identified and blocked. However, due to the sheer volume and velocity of the Twitter data stream, hate speech detection cannot be performed manually. To address this issue, several studies have been conducted for hate speech detection in European languages, whereas little attention has been paid to low-resource South Asian languages, making social media vulnerable for millions of users. In particular, to the best of our knowledge, no study has been conducted for hate speech detection in Roman Urdu text, which is widely used in the sub-continent. In this study, we have scraped more than 90,000 tweets and manually parsed them to identify 5,000 Roman Urdu tweets. Subsequently, we have employed an iterative approach to develop guidelines and used them for generating the Hate Speech Roman Urdu 2020 corpus. The tweets in this corpus are classified at three levels: Neutral-Hostile, Simple-Complex, and Offensive-Hate speech. As another contribution, we have used five supervised learning techniques, including a deep learning technique, to evaluate and compare their effectiveness for hate speech detection. The results show that Logistic Regression outperformed all other techniques, including deep learning techniques, for the two levels of classification, achieving an F1 score of 0.906 for distinguishing between Neutral-Hostile tweets and 0.756 for distinguishing between Offensive-Hate speech tweets.
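For readers unfamiliar with the metric reported above, the following sketch shows how an F1 score is computed for one class of a two-way split such as Neutral-Hostile. It is an illustration only: the abstract does not state whether the reported scores are per-class or averaged, and the labels and predictions below are invented, not taken from the corpus.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for a single class: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and classifier predictions.
y_true = ["hostile", "hostile", "neutral", "neutral", "hostile"]
y_pred = ["hostile", "neutral", "neutral", "hostile", "hostile"]
score = f1_score(y_true, y_pred, positive="hostile")
```

Here two hostile tweets are correctly flagged, one neutral tweet is wrongly flagged and one hostile tweet is missed, so precision and recall are both 2/3 and the F1 score is also 2/3.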

