Hate Speech Classification in Indonesian Language Tweets by Using Convolutional Neural Network

The rapid development of social media, together with the freedom of its users to express their opinions, has contributed to the spread of hate speech aimed at certain groups. Online hate speech can be identified by the use of derogatory words in social media posts. Various studies on hate speech classification have been carried out; however, very little research has addressed hate speech classification in the Indonesian language. This paper proposes a convolutional neural network method for classifying hate speech in Indonesian-language tweets. Datasets for both the training and testing stages were collected from Twitter. The collected tweets were categorized as hate speech or non-hate speech. We used TF-IDF as the term weighting method for feature extraction. The best training and validation accuracies were 90.85% and 88.34%, respectively, at 45 epochs. For the testing stage, experiments were conducted with different amounts of testing data. The highest testing accuracy was 82.5%, achieved on the dataset with 50 tweets in each category.
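
The abstract names TF-IDF as the term weighting method but does not say which variant was used. The following is a minimal sketch assuming the common form weight = (term count / document length) × log(N / document frequency); the function name `tf_idf` and the toy tweets are illustrative, not taken from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: (count / doc length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy tweets, already lowercased and tokenized.
tweets = [
    ["saya", "suka", "kopi"],
    ["saya", "suka", "teh"],
    ["saya", "benci", "macet"],
]
weights = tf_idf(tweets)
```

A term that occurs in every tweet ("saya") receives weight zero, while a term unique to one tweet ("kopi") receives the largest weight in that tweet. The resulting vectors could then be fed to a classifier such as the CNN the abstract describes.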


Study of Undersampling Method: Instance Hardness Threshold with Various Estimators for Hate Speech Classification

IJITEE (International Journal of Information Technology and Electrical Engineering) ◽

10.22146/ijitee.42152 ◽

2018 ◽

Vol 2 (2) ◽

Cited By ~ 3

Author(s):

Naufal Azmi Verdikha ◽

Teguh Bharata Adji ◽

Adhistya Erna Permanasari

Keyword(s):

Social Media ◽

Hate Speech ◽

Imbalanced Data ◽

Poor Performance ◽

Training Data ◽

Weighting Method ◽

Imbalanced Data Classification ◽

Data Problem ◽

Speech Classification ◽

Instance Hardness

A text classification system is needed to address the problem of hate speech in social media. However, hate speech texts are very hard to find in social media, which makes the distribution of the training data imbalanced. Classification with imbalanced data performs poorly. There are several methods for dealing with imbalanced data, one of which is undersampling with the Instance Hardness Threshold (IHT) method. IHT balances the dataset by eliminating data that are frequently misclassified. To find those data, IHT requires an estimator, which is a classifier. This research compares estimators for the IHT method on the imbalanced data problem in hate speech classification, using the TF-IDF weighting method. The best IHT method is determined by the class ratio of the dataset after undersampling, the duration of the undersampling process, and the Index of Balanced Accuracy (IBA). The results show that IHT with Logistic Regression, IHT(LR), has the fastest undersampling process (1.91 s), produces a perfectly balanced dataset with a class ratio of 1:1, and achieves the best IBA in all estimation processes. This makes IHT(LR) the best method for solving the imbalanced data problem in hate speech classification.
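
The idea behind IHT undersampling can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes each sample already has an estimator's probability for its own class (for example from cross-validated predictions) and ranks majority samples by that probability instead of applying an explicit threshold. The `iba` helper assumes the usual definition applied to the squared geometric mean, (1 + α(TPR − TNR)) · TPR · TNR, with α = 0.1 chosen here for illustration. All names and data are hypothetical.

```python
def iht_undersample(labels, p_true, majority_label):
    """Return sorted indices of the samples kept after undersampling.

    labels: class label of each sample.
    p_true: estimator's probability for each sample's own class
            (assumed to come from cross-validated predictions).
    """
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    # A low probability for the true class marks a "hard" instance, so keep
    # the easiest majority samples until both classes have the same size.
    majority.sort(key=lambda i: p_true[i], reverse=True)
    kept = majority[:len(minority)]
    return sorted(minority + kept)


def iba(tpr, tnr, alpha=0.1):
    """Index of Balanced Accuracy applied to the squared geometric mean."""
    return (1 + alpha * (tpr - tnr)) * tpr * tnr


# Hypothetical data: 0 = non-hate speech (majority), 1 = hate speech (minority).
labels = [0, 0, 0, 0, 0, 0, 1, 1]
p_true = [0.9, 0.4, 0.8, 0.3, 0.95, 0.6, 0.7, 0.5]
kept = iht_undersample(labels, p_true, majority_label=0)
```

Here the two easiest majority samples (indices 0 and 4) are kept alongside both minority samples, giving a 1:1 class ratio. Libraries such as imbalanced-learn provide an `InstanceHardnessThreshold` sampler that accepts an `estimator` argument; whether the study used it is not stated in the abstract.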


Multi-channel Convolutional Neural Network for Hate Speech Detection in Social Media

Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering - Advances of Science and Technology ◽

10.1007/978-3-030-93709-6_41 ◽

2022 ◽

pp. 603-618

Author(s):

Zeleke Abebaw ◽

Andreas Rauber ◽

Solomon Atnafu

Keyword(s):

Neural Network ◽

Social Media ◽

Convolutional Neural Network ◽

Hate Speech ◽

Speech Detection


Comparison of cephalometric measurements between conventional and automatic cephalometric analysis using convolutional neural network

Progress in Orthodontics ◽

10.1186/s40510-021-00358-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Sangmin Jeon ◽

Kyungmin Clara Lee

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Rapid Development ◽

Cephalometric Analysis ◽

Anatomical Landmarks ◽

Limits Of Agreement ◽

Mandibular Incisor ◽

Lateral Cephalograms ◽

Dental Measurements ◽

Cephalometric Measurements

Objective: The rapid development of artificial intelligence technologies for medical imaging has recently enabled automatic identification of anatomical landmarks on radiographs. The purpose of this study was to compare the results of an automatic cephalometric analysis using convolutional neural network with those obtained by a conventional cephalometric approach.

Material and methods: Cephalometric measurements of lateral cephalograms from 35 patients were obtained using an automatic program and a conventional program. Fifteen skeletal cephalometric measurements, nine dental cephalometric measurements, and two soft tissue cephalometric measurements obtained by the two methods were compared using paired t test and Bland-Altman plots.

Results: A comparison between the measurements from the automatic and conventional cephalometric analyses in terms of the paired t test confirmed that the saddle angle, linear measurements of maxillary incisor to NA line, and mandibular incisor to NB line showed statistically significant differences. All measurements were within the limits of agreement based on the Bland-Altman plots. The widths of limits of agreement were wider in dental measurements than those in the skeletal measurements.

Conclusions: Automatic cephalometric analyses based on convolutional neural network may offer clinically acceptable diagnostic performance. Careful consideration and additional manual adjustment are needed for dental measurements regarding tooth structures for higher accuracy and better performance.
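
The two comparison tools named in the abstract, the paired t test and Bland-Altman limits of agreement, can be illustrated with a short sketch. The measurement values below are hypothetical and are not data from the study; the 1.96 multiplier assumes the conventional 95% limits of agreement.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """t statistic of the paired differences a[i] - b[i] (n - 1 degrees of freedom)."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


def limits_of_agreement(a, b):
    """Bland-Altman bias and 95% limits of agreement: bias +/- 1.96 * SD."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical angle measurements in degrees from the two methods.
automatic = [82.0, 79.5, 84.0, 80.5, 83.0]
conventional = [81.0, 80.0, 83.0, 80.0, 82.0]

t_stat = paired_t_statistic(automatic, conventional)
bias, lower, upper = limits_of_agreement(automatic, conventional)
```

The t statistic would be compared against a t distribution with n − 1 degrees of freedom to obtain a p-value. A wider interval between `lower` and `upper` indicates weaker agreement between the two methods, which is the sense in which the abstract reports wider limits for dental measurements.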


Diabetic Retinal Grading Using Attention-Based Bilinear Convolutional Neural Network and Complement Cross Entropy

Entropy ◽

10.3390/e23070816 ◽

2021 ◽

Vol 23 (7) ◽

pp. 816

Author(s):

Pingping Liu ◽

Xiaokang Yang ◽

Baixin Jin ◽

Qiuzhan Zhou

Keyword(s):

Neural Network ◽

Image Processing ◽

Diabetic Retinopathy ◽

Convolutional Neural Network ◽

Image Classification ◽

Network Model ◽

Rapid Development ◽

Image Data ◽

Lesion Detection ◽

Great Success

Diabetic retinopathy (DR) is a common complication of diabetes mellitus (DM), and it is necessary to diagnose DR in the early stages of treatment. With the rapid development of convolutional neural networks in image processing, deep learning methods have achieved great success in medical image processing, and various systems have been proposed to detect fundus lesions. At present, however, image classification for diabetic retinopathy ignores the fine-grained properties of the diseased image, and most retinopathy image datasets have severely uneven distributions; both factors largely limit the network's ability to predict lesion classes. We propose a new non-homologous bilinear pooling convolutional neural network model and combine it with an attention mechanism to further improve the network's ability to extract specific image features. The experimental results show that, compared with the most popular fundus image classification models, the proposed network model can greatly improve prediction accuracy while maintaining computational efficiency.
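
For readers unfamiliar with bilinear pooling, the sketch below shows the standard form: the outer product of two feature vectors is summed over all spatial locations, followed by a signed square root and L2 normalization. It does not reproduce the paper's non-homologous variant or its attention mechanism, whose details the abstract does not give; the feature values and names are hypothetical.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Pool two feature maps given as lists of per-location feature vectors.

    Sums the outer product of the paired vectors over all locations, then
    applies a signed square root and L2 normalization.
    """
    dim_a, dim_b = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (dim_a * dim_b)
    for va, vb in zip(feats_a, feats_b):
        for i, x in enumerate(va):
            for j, y in enumerate(vb):
                pooled[i * dim_b + j] += x * y
    signed = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in signed)) or 1.0
    return [v / norm for v in signed]


# Hypothetical features from two network streams at two spatial locations.
stream_a = [[1.0, 0.0], [0.0, 2.0]]
stream_b = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
descriptor = bilinear_pool(stream_a, stream_b)
```

The descriptor has one entry per pair of channels (2 × 3 = 6 here), so it captures pairwise feature interactions that are useful for fine-grained distinctions.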


An energy-efficient convolutional neural network accelerator for speech classification based on FPGA and quantization

CCF Transactions on High Performance Computing ◽

10.1007/s42514-020-00055-4 ◽

2021 ◽

Author(s):

Dong Wen ◽

Jingfei Jiang ◽

Yong Dou ◽

Jinwei Xu ◽

Tao Xiao

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Energy Efficient ◽

Speech Classification


Spam Detection on Social Media Using Semantic Convolutional Neural Network

International Journal of Knowledge Discovery in Bioinformatics ◽

10.4018/ijkdb.2018010102 ◽

2018 ◽

Vol 8 (1) ◽

pp. 12-26 ◽

Cited By ~ 16

Author(s):

Gauri Jain ◽

Manisha Sharma ◽

Basant Agarwal

Keyword(s):

Neural Network ◽

Social Media ◽

Convolutional Neural Network ◽

State Of The Art ◽

Spam Detection ◽

Learning Technology ◽

The Social ◽

Social Media Text ◽

Current Article ◽

Semantic Layer

This article describes how spam detection in social media text is becoming increasingly important because of the exponential increase in spam volume over the network. The task is challenging, especially for texts restricted to a limited number of characters. Effective spam detection requires a larger number of effective features to be learned. In the current article, a deep learning technique known as a convolutional neural network (CNN) is proposed for spam detection, with a semantic layer added on top of it. The resulting model is called a semantic convolutional neural network (SCNN). The semantic layer is built by training random word vectors with Word2vec to obtain semantically enriched word embeddings. WordNet and ConceptNet are used to find a word similar to a given word when that word is missing from Word2vec. The architecture is evaluated on two corpora: the SMS Spam dataset (UCI repository) and a Twitter dataset (tweets scraped from public live tweets). The authors' approach outperforms state-of-the-art results, with 98.65% accuracy on the SMS Spam dataset and 94.40% accuracy on the Twitter dataset.
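
The fallback step described above, substituting a similar word when the original is missing from the embedding vocabulary, might look like the following sketch. Plain dictionaries stand in for a trained Word2vec model and for WordNet/ConceptNet lookups; the function name, data, and the choice to return `None` for fully unknown words are assumptions for illustration.

```python
def lookup_vector(word, embeddings, similar_words):
    """Return the vector for word, or for its first similar word that has one.

    embeddings: {word: vector}, standing in for a trained Word2vec model.
    similar_words: {word: [candidates]}, standing in for WordNet/ConceptNet.
    """
    if word in embeddings:
        return embeddings[word]
    for candidate in similar_words.get(word, []):
        if candidate in embeddings:
            return embeddings[candidate]
    return None  # The caller decides how to handle fully unknown words.


# Hypothetical toy resources.
embeddings = {"free": [0.9, 0.1], "prize": [0.8, 0.3]}
similar_words = {"award": ["trophy", "prize"]}

vec_known = lookup_vector("free", embeddings, similar_words)
vec_fallback = lookup_vector("award", embeddings, similar_words)
vec_missing = lookup_vector("xyzzy", embeddings, similar_words)
```

In this toy setup "award" is not in the vocabulary, its first candidate "trophy" is also missing, and the lookup falls through to the vector for "prize".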


Hate Speech Classification in Social Media Using Emotional Analysis

2018 7th Brazilian Conference on Intelligent Systems (BRACIS) ◽

10.1109/bracis.2018.00019 ◽

2018 ◽

Cited By ~ 6

Author(s):

Ricardo Martins ◽

Marco Gomes ◽

Jose Joao Almeida ◽

Paulo Novais ◽

Pedro Henriques

Keyword(s):

Social Media ◽

Hate Speech ◽

Speech Classification


Improving Testing Accuracy of Convolutional Neural Network for Steganalysis Using Segmented Subimages

Cloud Computing and Security - Lecture Notes in Computer Science ◽

10.1007/978-3-030-00015-8_27 ◽

2018 ◽

pp. 313-323

Author(s):

Yifeng Sun ◽

Xiaoyu Xu ◽

Haitao Song ◽

Guangming Tang ◽

Shunxiang Yang

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Testing Accuracy


Part-of-Speech Classification from Magnetoencephalography Data Using 1-Dimensional Convolutional Neural Network

10.31234/osf.io/6gqj8 ◽

2020 ◽

Author(s):

Alessandro Lopopolo ◽

Antal van den Bosch

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Mental State ◽

Human Subjects ◽

Brain Activity ◽

Neural Decoding ◽

Significant Progress ◽

Part Of Speech ◽

Syntactic Information ◽

Speech Classification

Neural decoding of speech and language refers to the extraction of information regarding the stimulus and the mental state of subjects from recordings of their brain activity while performing linguistic tasks. Recent years have seen significant progress in the decoding of speech from cortical activity. This study instead focuses on decoding linguistic information. We present a deep parallel temporal convolutional neural network (1DCNN) trained on part-of-speech (PoS) classification from magnetoencephalography (MEG) data collected during natural language reading. The network is trained on data from 15 human subjects separately, and yields above-chance accuracies on test data for all of them. The level of PoS was targeted because it offers a clean linguistic benchmark level that represents syntactic information and abstracts away from semantic or conceptual representations.
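
As a reminder of the basic operation behind a temporal (1-D) convolutional network, the sketch below applies a single filter along a time series and passes the result through a ReLU. It is a generic illustration under simplifying assumptions (one channel, one filter, no padding), not the deep parallel architecture of the study; the signal and kernel values are hypothetical.

```python
def conv1d_relu(signal, kernel):
    """Valid-mode 1-D cross-correlation followed by ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(signal) - width + 1):
        window = signal[start:start + width]
        out.append(max(0.0, sum(w * k for w, k in zip(window, kernel))))
    return out


# Hypothetical single-sensor time series and a filter that responds to rises.
signal = [0.0, 1.0, 3.0, 2.0, 0.0]
kernel = [-1.0, 1.0]
features = conv1d_relu(signal, kernel)
peak = max(features)  # Global max pooling over time.
```

A real network would stack many such filters across all MEG sensors and learn the kernel weights from the PoS labels.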


Integration of Convolutional Neural Network and Error Correction for Indoor Positioning

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi9020074 ◽

2020 ◽

Vol 9 (2) ◽

pp. 74

Author(s):

Eric Hsueh-Chan Lu ◽

Jing-Mei Ciou

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Error Correction ◽

Spatial Information ◽

Rapid Development ◽

Indoor Positioning ◽

Experimental Results ◽

Global Navigation Satellite Systems ◽

Signal Interference ◽

Position Vector

With the rapid development of surveying and spatial information technologies, more and more attention has been given to positioning. In outdoor environments, people can easily obtain positioning services through global navigation satellite systems (GNSS). In indoor environments, the GNSS signal is often lost, while other positioning approaches, such as dead reckoning and wireless signals, suffer from accumulated errors and signal interference. Therefore, this research uses images to realize a positioning service. The main concept of this work is to establish a model relating an indoor field image to its coordinate information and to determine position by image eigenvalue matching. Based on the architecture of PoseNet, the image is input into a 23-layer convolutional neural network at various sizes to train end-to-end location identification tasks, and the three-dimensional position vector of the camera is regressed. The experimental data are taken from an underground parking lot and the Palace Museum. The preliminary experimental results show that the proposed method can effectively improve the accuracy of indoor positioning by about 20% to 30%. In addition, this paper also discusses other architectures, field sizes, camera parameters, and error corrections for this neural network system. The preliminary experimental results show that the proposed angle error correction method can effectively improve positioning by about 20%.
