Classification of Abusive Thai Language Content in Social Media Using Deep Learning

Author(s):  
Ruangsung Wanasukapunt ◽  
Suphakant Phimoltares
2019 ◽  
Vol 11 (01n02) ◽  
pp. 1950002
Author(s):  
Rasim M. Alguliyev ◽  
Ramiz M. Aliguliyev ◽  
Fargana J. Abdullayeva

Data collected from social media now make it possible to analyze social events and to predict real-world events from the sentiments and opinions of users. Most cyber-attacks are carried out by hackers on the basis of discussions on social media. This paper proposes a method that predicts the occurrence of DDoS attacks by finding relevant texts in social media. To classify texts into positive and negative classes with high precision, a CNN model with 13 layers and an improved LSTM method are used. The negative and positive sentiments in social networking texts are then used to predict whether a DDoS attack will occur on the next day. To evaluate the efficiency of the proposed method, experiments were conducted on Twitter data. The proposed method achieved a recall, precision, F-measure, training loss, training accuracy, testing loss, and test accuracy of 0.85, 0.89, 0.87, 0.09, 0.78, 0.13, and 0.77, respectively.
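The reported recall, precision, and F-measure can be checked against one another. The sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 gives the harmonic mean (F1).

    Assumption: the abstract's F-measure is the balanced F1 score.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        return 0.0  # conventional value when both inputs are zero
    return (1 + beta_sq) * precision * recall / denominator


# Values quoted in the abstract above.
reported_precision = 0.89
reported_recall = 0.85

f1 = f_measure(reported_precision, reported_recall)
print(round(f1, 2))  # 0.87
```

Under this assumption the computed value rounds to the 0.87 quoted in the abstract, so the three figures are mutually consistent.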


2021 ◽  
Vol 10 (2) ◽  
pp. 1065-1069
Author(s):  
H. Park ◽  
G. Moon ◽  
K. Kim

Coronavirus disease (COVID-19) has been a significant worldwide disaster from December 2019 to the present. Information on COVID-19 is obtained through news media or social media, and researchers are conducting various studies in order to shorten the time needed to become aware of the COVID-19 disaster situation. In this paper, we build a chatbot from a COVID-19 data set so that it can be used in emergencies, and we use deep learning to analyze how the situation is changing.


Author(s):  
Rafly Indra Kurnia ◽  
Abba Suganda Girsang

This study classifies text based on the ratings of a provider application on the Google Play Store. User comments are classified using Word2vec and a deep learning algorithm, in this case Long Short-Term Memory (LSTM), according to the rating given, in order to obtain user sentiment. Two rating scales are used: a scale of 1-5, where rating 1 is the lowest and rating 5 is the highest, and a scale of 1-3, where rating 1 (negative) combines ratings 1 and 2, rating 2 (neutral) is rating 3, and rating 3 (positive) combines ratings 4 and 5. SMOTE oversampling is used to handle the imbalanced data. The data set contains 16,369 records. The training and testing data are taken from Indonesian-language user comments on the MyTelkomsel application on the play.google.com site, where each comment has a rating. This review data is very useful for companies making business decisions. Such data can also be obtained from social media, but social media does not provide a rating feature for every user comment. The goal of this research is that overall user satisfaction can also be estimated quickly for data from social media such as Twitter or Facebook, based on the rating inferred from each comment. The best F1 score and precision obtained using 5 classes with LSTM and SMOTE were 0.62 and 0.70, and the best F1 score and precision obtained using 3 classes with LSTM and SMOTE were 0.86 and 0.87.
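The collapse from the 5-point rating scale to the 3-class scale described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `to_three_class`, the `SENTIMENT_NAMES` table, and the sample ratings are hypothetical.

```python
from collections import Counter

# Class ids follow the abstract: 1 = negative, 2 = neutral, 3 = positive.
SENTIMENT_NAMES = {1: "negative", 2: "neutral", 3: "positive"}


def to_three_class(rating: int) -> int:
    """Map a Google Play star rating (1-5) to the 3-class scale."""
    if rating in (1, 2):
        return 1  # ratings 1 and 2 are combined into negative
    if rating == 3:
        return 2  # rating 3 is neutral
    if rating in (4, 5):
        return 3  # ratings 4 and 5 are combined into positive
    raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")


# Made-up ratings, used only to show the mapping.
sample_ratings = [5, 1, 3, 4, 2, 5]
labels = [to_three_class(r) for r in sample_ratings]
print(labels)  # [3, 1, 2, 3, 1, 3]
print(Counter(SENTIMENT_NAMES[label] for label in labels))
```

Counting the labels after the mapping shows how uneven the classes are, which is the imbalance that the abstract addresses with SMOTE oversampling.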


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Junaid Asghar ◽  
Saima Akbar ◽  
Muhammad Zubair Asghar ◽  
Bashir Ahmad ◽  
Mabrook S. Al-Rakhami ◽  
...  

We live in a digital era in which social media sites like Facebook, Google, Twitter, and YouTube are used by the majority of people, generating a lot of textual content. User-generated textual content discloses important information about people’s personalities, making it possible to identify a special type of people known as psychopaths. The aim of this work is to classify input text into psychopath and nonpsychopath traits. Most existing work on psychopath detection has been performed in the psychology domain using traditional approaches, such as the SRPIII technique, with limited dataset size. This motivates us to build an advanced computational model for psychopath detection in the text analytics domain. In this work, we investigate an advanced deep learning technique, namely attention-based BILSTM, for psychopath detection with an increased dataset size for efficient classification of the input text into psychopath vs. nonpsychopath classes.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Carlos Abel Córdova Sáenz ◽  
Marcelo Dias ◽  
Karin Becker

Fake news (FN) have affected people’s lives in unimaginable ways. The automatic classification of FN is a vital tool to prevent their dissemination and support fact-checking. Related work has shown that FN spread faster, deeper, and more broadly than truthful news on social media. Deep learning has produced state-of-the-art solutions in this field, mainly based on textual attributes. In this paper, we propose to combine compact representations of the textual news properties generated using DistilBERT with topological metrics extracted from their propagation network in social media. Using a dataset related to politics and distinct learning algorithms, we extensively assessed the components of the proposed solution. Regarding the textual attributes, we reached results comparable to state-of-the-art solutions using only the news title and contents, which is useful for early FN detection. We assessed the influential topological metrics and the effect of combining them with the textual features of the news. We also explored the use of ensembles. Our results were very promising, revealing the potential of the proposed features and of the adoption of ensembles.


Author(s):  
А.С. Бобин

When solving classification problems using deep learning, one faces the problem of model convergence. This problem arises from the limited amount of data in the samples.


2021 ◽  
Author(s):  
Ana Sofia Cardoso ◽  
Francesco Renna ◽  
Domingo Alcaraz-Segura ◽  
Ana Sofia Vaz

Crowdsourced social media data has become popular in the assessment of cultural ecosystem services (CES). Advances in deep learning show great potential for the timely assessment of CES at large scales. Here, we describe a procedure for automating the assessment of image elements pertaining to CES from social media. We focus on a binary (natural, human) and a multiclass (posing, species, nature, landscape, human activities, human structures) classification of those elements using two Convolutional Neural Networks (CNNs; VGG16 and ResNet152) with the weights from two large datasets (Places365 and ImageNet) and from our own dataset. We train those CNNs over Flickr and Wikiloc images from the Peneda-Geres region (Portugal) and evaluate their transferability to wider areas, using Sierra Nevada (Spain) as the test area. CNNs trained for Peneda-Geres performed well, with results for the binary classification (F1-score > 80%) exceeding those for the multiclass classification (> 60%). CNNs pre-trained with Places365 and ImageNet data performed significantly better than those trained with our data. Model performance decreased when transferred to Sierra Nevada, but remained satisfactory (> 60%). The combination of manual annotations, freely available CNNs and pre-trained local datasets thereby shows great relevance for supporting automated CES assessments from social media.

