Analysis Text of Hate Speech Detection Using Recurrent Neural Network

Automated Amharic Hate Speech Posts and Comments Detection Model Using Recurrent Neural Network

10.21203/rs.3.rs-114533/v1 ◽

2020 ◽

Author(s):

Surafel Getachew Tesfaye ◽

Kula Kakeba

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Performance Test ◽

Hate Speech ◽

Short Term Memory ◽

Detection Methods ◽

Batch Size ◽

Neural Network Models ◽

Data Set ◽

Speech Detection

Abstract: Over the last few years, social activity on the internet, especially on social media platforms, has increased drastically. Unfortunately, social networks have also become a place where hate speech proliferates, and many people's social lives are disturbed by hate speech posts and the conflicts those posts trigger. Studies confirm that online hate speech has offline consequences. Although there is a large body of research on automated hate speech detection, most of it targets other languages, and labeled data for applying automated analysis and detection methods to Amharic is scarce. This research therefore aimed to prepare a large labeled Amharic dataset by collecting posts and comments from selected Facebook pages of activists who participated actively. The collected data were labeled manually as hate or free, following guidelines provided by the researcher, and pre-processed with data cleaning and normalization techniques. Recurrent neural network models for automated hate speech detection in Amharic Facebook posts were developed using Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) layers, with word n-grams for feature extraction and word2vec to represent each unique word as a vector. Experiments on the two models used 80% of the dataset for training and 10% for validation, in order to train the models and select the best hyper-parameter combination; the remaining 10% was used to test the models after training. The LSTM-based RNN with batch size 128, learning rate 0.001, the RMSProp optimizer, and 0.5 dropout, trained for 100 epochs, achieved an accuracy of 97.9% in classifying posts as hate speech or free. This result was confirmed by a model performance test and by inference on user-generated data.
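
The 80/10/10 partition described in the abstract can be illustrated with a minimal Python sketch. The function name `split_80_10_10`, the fixed seed, and the `REPORTED_CONFIG` dictionary are hypothetical and used only for illustration; the abstract does not say how the authors shuffled or partitioned their data, so this is one plausible reading and not their implementation.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle items and split them into train/validation/test parts.

    Assumption: a simple random split; the paper may have used a
    stratified or otherwise different procedure.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    n_train = n * 8 // 10  # 80% for training
    n_val = n // 10        # 10% for validation
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder (about 10%) for testing
    return train, val, test


# Hyper-parameters the abstract reports for the best LSTM model.
# The dictionary layout itself is hypothetical.
REPORTED_CONFIG = {
    "batch_size": 128,
    "learning_rate": 0.001,
    "optimizer": "RMSProp",
    "dropout": 0.5,
    "epochs": 100,
}
```

The validation part is what would be used to compare hyper-parameter combinations, while the test part is held back until training is finished.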

Text Analysis For Hate Speech Detection Using Backpropagation Neural Network

2018 International Conference on Control, Electronics, Renewable Energy and Communications (ICCEREC) ◽

10.1109/iccerec.2018.8712109 ◽

2018 ◽

Cited By ~ 3

Author(s):

Nabiila Adani Setyadi ◽

Muhammad Nasrun ◽

Casi Setianingsih

Keyword(s):

Neural Network ◽

Text Analysis ◽

Hate Speech ◽

Backpropagation Neural Network ◽

Speech Detection

Neural Network Applications in Hate Speech Detection

Advances in Computer and Electrical Engineering - Neural Networks for Natural Language Processing ◽

10.4018/978-1-7998-1159-6.ch012 ◽

2020 ◽

pp. 188-204

Author(s):

Brian Tuan Khieu ◽

Melody Moh

Keyword(s):

Neural Network ◽

Social Media ◽

Hate Speech ◽

Speech Detection ◽

Network Applications ◽

Current State ◽

New Directions ◽

Key Techniques ◽

Neural Network Applications ◽

Positive Results

This chapter presents a literature survey of the current state of hate speech detection models, with a focus on neural network applications in the area. The growth and freedom of social media have facilitated the dissemination of both positive and negative ideas. Proponents of hate speech are among the key abusers of the privileges afforded by social media, and the companies behind these networks have a vested interest in identifying such speech. Manual moderation is too cumbersome and slow to deal with the torrent of content generated on these sites, which is why many have turned to machine learning. Neural network applications in this area have been very promising and have yielded positive results. However, there are newly discovered and unaddressed problems with the current state of hate speech detection. The authors' survey identifies the key techniques and methods used to identify hate speech and discusses promising new directions for the field as well as newly identified issues.

Comparison Between Traditional Machine Learning Models And Neural Network Models For Vietnamese Hate Speech Detection

2020 RIVF International Conference on Computing and Communication Technologies (RIVF) ◽

10.1109/rivf48685.2020.9140745 ◽

2020 ◽

Cited By ~ 2

Author(s):

Son T. Luu ◽

Hung P. Nguyen ◽

Kiet Van Nguyen ◽

Ngan Luu-Thuy Nguyen

Keyword(s):

Neural Network ◽

Machine Learning ◽

Hate Speech ◽

Network Models ◽

Learning Models ◽

Neural Network Models ◽

Speech Detection ◽

Machine Learning Models

Hate speech detection in Twitter using hybrid embeddings and improved cuckoo search-based neural networks

International Journal of Intelligent Computing and Cybernetics ◽

10.1108/ijicc-06-2020-0061 ◽

2020 ◽

Vol 13 (4) ◽

pp. 485-525

Author(s):

Femi Emmanuel Ayo ◽

Olusegun Folorunso ◽

Friday Thomas Ibharalu ◽

Idowu Ademola Osinuga

Keyword(s):

Neural Network ◽

Neural Networks ◽

Feature Extraction ◽

Hate Speech ◽

Short Term Memory ◽

Cuckoo Search ◽

Research Attention ◽

Content Type ◽

Speech Detection ◽

Sentence Level

Purpose: Hate speech is an expression of intense hatred. Twitter has become a popular analytical tool for the prediction and monitoring of abusive behaviors. Hate speech detection with social media data has received special research attention in recent studies; hence the need to design a generic metadata architecture and an efficient feature extraction technique to enhance hate speech detection. Design/methodology/approach: This study proposes hybrid embeddings enhanced with a topic inference method and an improved cuckoo search neural network for hate speech detection in Twitter data. The hybrid embeddings technique combines Term Frequency-Inverse Document Frequency (TF-IDF) for word-level feature extraction with Long Short-Term Memory (LSTM), a variant of the recurrent neural network architecture, for sentence-level feature extraction. The features extracted by the hybrid embeddings then serve as input to the improved cuckoo search neural network, which predicts whether a tweet is hate speech, offensive language, or neither. Findings: The proposed method showed better results than other related methods when tested on the collected Twitter datasets. To validate its performance, a t-test and post hoc multiple comparisons were used to compare the significance and means of the proposed method against other related methods for hate speech detection. A paired-sample t-test was also conducted for the same purpose. Research limitations/implications: The evaluation results showed that the proposed method outperforms other related methods, with a mean F1-score of 91.3. Originality/value: The main novelty of this study is the use of an automatic topic spotting measure based on a naïve Bayes model to improve feature representation.
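
As an illustration of the word-level TF-IDF features mentioned in the abstract, here is a minimal Python sketch. It uses one common textbook variant (term frequency normalized by document length, inverse document frequency as the natural log of N/df); the abstract does not state which weighting the authors used, and the function name `tf_idf` and the toy tweets are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / len(doc), idf = ln(N / df).
    Libraries often apply smoothing, so their numbers will differ.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy example with two tokenized "tweets" (made-up data).
toy_docs = [["good", "morning"], ["good", "night"]]
toy_weights = tf_idf(toy_docs)
```

In this variant a term that occurs in every document, such as "good" in the toy data, receives a weight of zero, while terms unique to one document are weighted up.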

Three-Class Overlapped Speech Detection Using a Convolutional Recurrent Neural Network

10.21437/interspeech.2021-149 ◽

2021 ◽

Author(s):

Jee-weon Jung ◽

Hee-Soo Heo ◽

Youngki Kwon ◽

Joon Son Chung ◽

Bong-Jin Lee

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Speech Detection

Hate Speech Identification in Text Written in Indonesian with Recurrent Neural Network

2019 International Conference on Advanced Computer Science and information Systems (ICACSIS) ◽

10.1109/icacsis47736.2019.8979959 ◽

2019 ◽

Author(s):

Erryan Sazany ◽

Indra Budi

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Hate Speech ◽

Speech Identification

A Deep Learning Approach for Automatic Hate Speech Detection in the Saudi Twittersphere

Applied Sciences ◽

10.3390/app10238614 ◽

2020 ◽

Vol 10 (23) ◽

pp. 8614 ◽

Cited By ~ 1

Author(s):

Raghad Alshalan ◽

Hend Al-Khalifa

Keyword(s):

Neural Network ◽

Hate Speech ◽

Characteristic Curve ◽

Network Models ◽

Neural Network Models ◽

Speech Detection ◽

Language Representation ◽

Significant Research ◽

Speech Problem ◽

Gated Recurrent Units

With the rise of hate speech in the Twittersphere, significant research efforts have been undertaken to provide automatic solutions for detecting it, ranging from simple machine learning models to more complex deep neural network models. Despite this, research investigating the hate speech problem in Arabic is still limited. This paper therefore aimed to investigate several neural network models based on the convolutional neural network (CNN) and the recurrent neural network (RNN) to detect hate speech in Arabic tweets. It also evaluated the recent language representation model bidirectional encoder representations from transformers (BERT) on the task of Arabic hate speech detection. To conduct our experiments, we first built a new hate speech dataset containing 9316 annotated tweets. Then, we conducted a set of experiments on two datasets to evaluate four models: CNN, gated recurrent units (GRU), CNN + GRU, and BERT. Our experimental results on our dataset and an out-of-domain dataset showed that the CNN model gave the best performance, with an F1-score of 0.79 and an area under the receiver operating characteristic curve (AUROC) of 0.89.
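
The two metrics reported in the abstract, F1-score and AUROC, can be sketched in plain Python for the binary case. This is a minimal illustration and not the authors' evaluation code: the function names and toy labels are hypothetical, and the abstract does not say whether the reported F1 is for the positive class or averaged across classes.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is quadratic in the
    number of examples, which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores (made-up data).
toy_true = [1, 1, 0, 0]
toy_pred = [1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
```

F1 depends on a fixed decision threshold, whereas AUROC summarizes ranking quality across all thresholds, which is one reason papers often report both.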
