Hate Speech Detection on Twitter Using Long Short-Term Memory (LSTM) Method

Hate speech is currently a prominent topic of discussion in Indonesia, particularly on social media. Hate speech is communication that disparages a person or group based on characteristics such as race, ethnicity, gender, citizenship, religion, or organization. Twitter is one of the social media platforms people use to express feelings and opinions through tweets, including tweets that express hatred, and it has a significant influence on building or damaging a person's image. This study aims to classify Indonesian-language tweets as hate speech or not hate speech using the Bidirectional Long Short-Term Memory (BiLSTM) method, with word2vec feature extraction using the Continuous Bag-of-Words (CBOW) architecture. The BiLSTM is evaluated in terms of accuracy, precision, recall, and F-measure. Using word2vec with the CBOW architecture and a BiLSTM trained for 10 epochs with a learning rate of 0.001 and 200 neurons in the hidden layer yields an accuracy of 94.66%, with precision of 99.08%, recall of 93.74%, and F-measure of 96.29%. In contrast, the BiLSTM with three layers reaches an accuracy of 96.93%. Adding one layer to the BiLSTM thus increased accuracy by 2.27 percentage points.
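The four evaluation measures named in this abstract can be illustrated with a minimal sketch. It assumes the standard binary-classification definitions, with hate speech as the positive class; the function name `classification_metrics` and the confusion counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F-measure from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with "hate speech" treated as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F-measure (F1) is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only for illustration, not the paper's data.
    scores = classification_metrics(tp=90, fp=10, fn=30, tn=70)
    for name, value in scores.items():
        print(f"{name}: {value:.4f}")
```

Guarding each division against a zero denominator keeps the function usable on degenerate test splits, for example one in which the model predicts no positive instances.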


Long Short-Term Memory Model for Classification of English-PtBR Cross-Lingual Hate Speech

Journal of Computer Science ◽

10.3844/jcssp.2019.1546.1571 ◽

2019 ◽

Vol 15 (10) ◽

pp. 1546-1571

Author(s):

Thiago D. Bispo ◽

Hendrik T. Macedo ◽

Flávio de O. Santos ◽

Rafael P. da Silva ◽

Leonardo N. Matos ◽

...

Keyword(s):

Hate Speech ◽

Short Term Memory ◽

Memory Model ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Cross Lingual


Comparative Analysis of Deep Learning Techniques for the Classification of Hate Speech

NIGERIAN ANNALS OF PURE AND APPLIED SCIENCES ◽

10.46912/napas.227 ◽

2021 ◽

Vol 4 (1) ◽

pp. 121-128

Author(s):

A Iorliam ◽

S Agber ◽

MP Dzungwe ◽

DK Kwaghtyo ◽

S Bum

Keyword(s):

Neural Network ◽

Social Media ◽

Deep Learning ◽

Hate Speech ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Learning Techniques ◽

Or Groups ◽

Long Short Term Memory

Social media provides opportunities for individuals to communicate anonymously and to express hateful feelings and opinions from the comfort of their rooms. This anonymity has become a shield for many individuals or groups who use social media to express deep hatred for other individuals or groups, tribes, races, religions, genders, and belief systems. In this study, a comparative analysis is performed using Long Short-Term Memory and Convolutional Neural Network deep learning techniques for hate speech classification. The analysis shows that the Long Short-Term Memory classifier achieved an accuracy of 92.47%, while the Convolutional Neural Network classifier achieved an accuracy of 92.74%. These results show that deep learning techniques can effectively distinguish hate speech from normal speech.


Detecting hate speech against politicians in Arabic community on social media

International Journal of Web Information Systems ◽

10.1108/ijwis-08-2019-0036 ◽

2020 ◽

Vol 16 (3) ◽

pp. 295-313

Author(s):

Imane Guellil ◽

Ahsan Adeel ◽

Faical Azouaou ◽

Sara Chennoufi ◽

Hanene Maafi ◽

...

Keyword(s):

Social Media ◽

Deep Learning ◽

Hate Speech ◽

Short Term Memory ◽

Arabic Language ◽

Short Term ◽

Speech Corpus ◽

Term Memory ◽

Content Type ◽

Speech Detection

Purpose: This paper proposes an approach for detecting hate speech against politicians in the Arabic community on social media (e.g. YouTube). Similar work has been presented in the literature for other languages such as English. However, to the best of the authors' knowledge, not much work has been conducted for the Arabic language. Design/methodology/approach: The approach uses both classical classification algorithms and deep learning algorithms. For the classical algorithms, the authors use Gaussian NB (GNB), Logistic Regression (LR), Random Forest (RF), SGD Classifier (SGD) and Linear SVC (LSVC). For deep learning classification, four different algorithms are applied: convolutional neural network (CNN), multilayer perceptron (MLP), long short-term memory (LSTM) and bidirectional long short-term memory (Bi-LSTM). For feature extraction, the authors use both Word2vec and FastText with their two implementations, namely Skip Gram (SG) and Continuous Bag of Words (CBOW). Findings: Simulation results show that LSVC, BiLSTM and MLP perform best, achieving an accuracy of up to 91% when combined with the SG model. The results also show that classification performed on a balanced corpus is more accurate than classification performed on an unbalanced corpus. Originality/value: The principal originality of this paper is the construction of a new hate speech corpus (Arabic_fr_en), which was annotated by three different annotators. The corpus covers the three languages used by Arabic speakers: Arabic, French and English. For Arabic, the corpus contains both Arabic written in Arabic script and Arabizi (i.e. Arabic words written with Latin letters). Another originality is the reliance on both shallow and deep learning classification, using different feature extraction models, namely Word2vec and FastText with their two implementations, SG and CBOW.
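The difference between the two embedding objectives named in this abstract, Skip Gram and Continuous Bag of Words, can be sketched by showing how each one builds training examples from a token sequence. This is a simplified illustration of the general technique, not the authors' pipeline; the function names and the window size are assumptions, and a real Word2vec or FastText implementation adds subsampling, negative sampling and the embedding model itself.

```python
def context_window(tokens, index, window):
    """Return the tokens within `window` positions of tokens[index], excluding it."""
    start = max(0, index - window)
    return tokens[start:index] + tokens[index + 1:index + 1 + window]


def cbow_examples(tokens, window=2):
    """CBOW: predict the target word from its surrounding context words.

    Each example is (list_of_context_words, target_word).
    """
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]


def skipgram_examples(tokens, window=2):
    """Skip Gram: predict each context word from the target word.

    Each example is (target_word, one_context_word).
    """
    return [(tokens[i], context)
            for i in range(len(tokens))
            for context in context_window(tokens, i, window)]


if __name__ == "__main__":
    # Hypothetical tokenized comment used only for illustration.
    sentence = ["this", "comment", "is", "hateful"]
    print(cbow_examples(sentence, window=1))
    print(skipgram_examples(sentence, window=1))
```

CBOW produces one example per position with the whole context as input, whereas Skip Gram produces one example per (target, context word) pair, so the same sentence yields more Skip Gram examples.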


Long Short-Term Memory for Hate Speech and Abusive Language Detection on Indonesian Youtube Comment Section

Proceedings of 2021 the 11th International Workshop on Computer Science and Engineering ◽

10.18178/wcse.2021.06.029 ◽

2021 ◽

Keyword(s):

Hate Speech ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Language Detection


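Implementasi Algoritma Long Short-Term Memory (LSTM) Untuk Mendeteksi Ujaran Kebencian (Hate Speech) Pada Kasus Pilpres 2019 [Implementation of the Long Short-Term Memory (LSTM) Algorithm to Detect Hate Speech in the Case of the 2019 Presidential Election]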

Matrik Jurnal Manajemen Teknik Informatika dan Rekayasa Komputer ◽

10.30812/matrik.v19i1.495 ◽

2019 ◽

Vol 19 (1) ◽

pp. 37-44

Author(s):

Aini Suri Talita ◽

Aristiawan Wiguna

Keyword(s):

Neural Network ◽

Social Media ◽

Hate Speech ◽

Short Term Memory ◽

Model Performance ◽

Literature Study ◽

Short Term ◽

Term Memory ◽

Testing Data ◽

Long Short Term Memory

Research involving Artificial Neural Networks (ANNs) or their derivatives has been published all around the world, specifically to solve data mining, classification, clustering, or detection problems. The Recurrent Neural Network is a class of ANN, and Long Short-Term Memory (LSTM) is one of its architectures commonly used in deep learning problems. In this paper, we use LSTM to detect hate speech on social media related to the 2019 Indonesian presidential election. The research proceeds through several steps: literature study, data collection, data preprocessing, training, and testing. The dataset consists of 950 sentences, while the testing data consist of 190 Facebook comments. The best model performance was reached with a recall of 0.7021, which means that 70.21% of all relevant instances in the testing data were categorized as relevant, in this case as hate speech (HS). The other performance measures, accuracy and precision, are still quite low because the testing data come directly from social media and are therefore likely to contain inconsistent word choices, informal words, or grammatically incorrect sentences.
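The abstract does not say which preprocessing steps were applied, so the following is only a hedged sketch of the kind of normalization often used on informal social media comments before training a sequence model. The function `normalize_comment`, the regular expressions and the sample comment are assumptions made for illustration.

```python
import re

# Patterns for common social media noise (assumed, not taken from the paper).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_WORD_RE = re.compile(r"[^\w\s]")
REPEAT_RE = re.compile(r"(.)\1{2,}")


def normalize_comment(text):
    """Lowercase a comment, strip URLs, mentions and punctuation, and
    shorten runs of three or more identical characters to two.

    Returns a list of whitespace-separated tokens.
    """
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_WORD_RE.sub(" ", text)
    text = REPEAT_RE.sub(r"\1\1", text)
    return text.split()


if __name__ == "__main__":
    # Hypothetical comment; prints the cleaned token list.
    print(normalize_comment("Halooo @user cek https://example.com SEKARANG!!!"))
```

A step like this only reduces surface variation; mapping informal or misspelled words to standard forms would additionally need a language-specific dictionary, which is outside the scope of this sketch.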
