Sentiment Analysis of Cyberbullying on Instagram User Comments

Muhammad Zidny Naf'an; Alhamda Adisoka Bimantara; Afiatari Larasati; Ezar Mega Risondang; Novanda Alim Setya Nugraha

doi:10.21108/jdsa.2019.2.20

Sentiment Analysis of Cyberbullying on Instagram User Comments

Journal of Data Science and Its Applications ◽

10.21108/jdsa.2019.2.20 ◽

2019 ◽

Vol 2 (1) ◽

pp. 88-98 ◽

Cited By ~ 1

Author(s):

Muhammad Zidny Naf'an ◽

Alhamda Adisoka Bimantara ◽

Afiatari Larasati ◽

Ezar Mega Risondang ◽

Novanda Alim Setya Nugraha

Keyword(s):

Social Media ◽

Feature Extraction ◽

Sentiment Analysis ◽

Cross Validation ◽

Experimental Results ◽

Training Data ◽

Bayes Classifier ◽

User Comments ◽

One Act ◽

Fold Cross Validation

Instagram is a social media platform for sharing images, photos, and videos, with many active users from various backgrounds. In addition to sharing posts, Instagram users can like and comment on other users' posts. However, the comment feature is often misused, for example for cyberbullying, which is an unlawful act. At the time of the study, Instagram did not provide a feature to detect cyberbullying. This study therefore aims to build a system that classifies comments according to whether or not they contain elements of cyberbullying, so that the classification results can be used to detect cyberbullying comments. Each comment passes through preprocessing and feature extraction with the TF-IDF method and is then classified with a Naïve Bayes classifier. The system is evaluated with K-fold cross-validation. Two experiments were run, one with stemming and one without. The training set contains 455 comments. The best experimental result was an accuracy of 84%, obtained both with and without stemming.
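The pipeline in this abstract (TF-IDF features fed to a Naïve Bayes classifier) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English comments, the whitespace tokenization, the smoothed IDF formula, and the names `fit_idf`, `tfidf`, and `NaiveBayes` are all assumptions made for illustration, and the preprocessing and stemming steps mentioned in the abstract are omitted.

```python
import math
from collections import Counter, defaultdict

def fit_idf(docs):
    """Learn a smoothed IDF weight per term from tokenized training documents."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}

def tfidf(doc, idf):
    """Map a tokenized document to {term: tf * idf}, ignoring unseen terms."""
    counts = Counter(t for t in doc if t in idf)
    total = sum(counts.values()) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items()}

class NaiveBayes:
    """Multinomial Naive Bayes over non-negative weights with Laplace smoothing."""

    def fit(self, vectors, labels):
        self.vocab = {t for vec in vectors for t in vec}
        self.class_counts = Counter(labels)
        self.weights = defaultdict(Counter)  # class -> term -> summed weight
        for vec, label in zip(vectors, labels):
            self.weights[label].update(vec)
        self.totals = {c: sum(w.values()) for c, w in self.weights.items()}
        return self

    def predict(self, vec):
        n = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for c, count in self.class_counts.items():
            score = math.log(count / n)  # log prior
            denom = self.totals[c] + len(self.vocab)
            for t, w in vec.items():
                if t in self.vocab:
                    score += w * math.log((self.weights[c][t] + 1) / denom)
            if score > best_score:
                best, best_score = c, score
        return best

# Hypothetical toy comments, not the 455 comments used in the study.
train = [
    ("you are so stupid and ugly", "bullying"),
    ("nobody likes you loser", "bullying"),
    ("what a beautiful photo", "neutral"),
    ("love this picture so much", "neutral"),
]
docs = [text.split() for text, _ in train]
idf = fit_idf(docs)
model = NaiveBayes().fit([tfidf(d, idf) for d in docs], [label for _, label in train])
prediction = model.predict(tfidf("you stupid loser".split(), idf))
```

The IDF table is fitted on the training comments only and reused for new comments, which keeps held-out folds from leaking into the features during cross-validation.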

PredAmyl-MLP: Prediction of Amyloid Proteins Using Multilayer Perceptron

Computational and Mathematical Methods in Medicine ◽

10.1155/2020/8845133 ◽

2020 ◽

Vol 2020 ◽

pp. 1-12

Author(s):

Yanjuan Li ◽

Zitong Zhang ◽

Zhixia Teng ◽

Xiaoyan Liu

Keyword(s):

Feature Extraction ◽

Feature Selection ◽

Prediction Model ◽

Multilayer Perceptron ◽

Type Ii Diabetes ◽

Cross Validation ◽

Experimental Results ◽

Type Ii ◽

Fold Cross Validation ◽

Better Than

Amyloid is generally an aggregate of insoluble fibrin; its abnormal deposition is the pathogenic mechanism of various diseases, such as Alzheimer’s disease and type II diabetes. Therefore, accurately identifying amyloid is necessary to understand its role in pathology. We proposed a machine learning-based prediction model called PredAmyl-MLP, which consists of the following three steps: feature extraction, feature selection, and classification. In the step of feature extraction, seven feature extraction algorithms and different combinations of them are investigated, and the combination of SVMProt-188D and tripeptide composition (TPC) is selected according to the experimental results. In the step of feature selection, maximum relevant maximum distance (MRMD) and binomial distribution (BD) are, respectively, used to remove the redundant or noise features, and the appropriate features are selected according to the experimental results. In the step of classification, we employed multilayer perceptron (MLP) to train the prediction model. The 10-fold cross-validation results show that the overall accuracy of PredAmyl-MLP reached 91.59%, and the performance was better than the existing methods.
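As an illustration of one of the feature extractors named above, tripeptide composition (TPC) is commonly defined as the frequency of each of the 20³ = 8000 possible tripeptides in a sequence. The sketch below follows that common definition. Whether PredAmyl-MLP uses exactly this normalization is an assumption, and the function name `tripeptide_composition` is hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]  # 8000 entries
INDEX = {tri: i for i, tri in enumerate(TRIPEPTIDES)}

def tripeptide_composition(sequence):
    """Return an 8000-dimensional vector of tripeptide frequencies."""
    vector = [0.0] * len(TRIPEPTIDES)
    windows = [sequence[i:i + 3] for i in range(len(sequence) - 2)]
    valid = [w for w in windows if w in INDEX]  # skip non-standard residues
    for w in valid:
        vector[INDEX[w]] += 1.0
    total = len(valid)
    return [v / total for v in vector] if total else vector

# "ACDACD" has four overlapping windows: ACD, CDA, DAC, ACD.
features = tripeptide_composition("ACDACD")
```

A vector like this would be concatenated with the other chosen descriptor and then passed through feature selection before classifier training.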

Extraction Opinion of Social Media in Higher Education Using Sentiment Analysis

bit-Tech ◽

10.32877/bt.v2i1.92 ◽

2019 ◽

Vol 2 (1) ◽

pp. 11-19 ◽

Cited By ~ 1

Author(s):

Thomas Edison Tarigan ◽

Robby C Buwono ◽

Sri Redjeki

Keyword(s):

Higher Education ◽

Social Media ◽

Sentiment Analysis ◽

Real Time ◽

Naive Bayes ◽

Training Data ◽

Test Results ◽

Bayes Classifier ◽

Tertiary Institution ◽

Management Performance

The purpose of this research is to extract opinions about a tertiary institution from Twitter using sentiment analysis. The results provide input to universities as a form of evaluation of management performance in running the institution. Sentiment is classified with the Naïve Bayes Classifier method into four classes: positive, normal, negative, and unknown. The study uses 1000 tweets as training data, labeled manually to determine the sentiment of each tweet, and 20 tweets for testing. The study produces a system that classifies sentiment automatically, with a test result of 75%. Obstacles in processing real-time tweets include duplicate tweets (spam tweets) and the complex and diverse structure of the Indonesian language.

Hospital Facebook Reviews Analysis Using a Machine Learning Sentiment Analyzer and Quality Classifier

Healthcare ◽

10.3390/healthcare9121679 ◽

2021 ◽

Vol 9 (12) ◽

pp. 1679

Author(s):

Afiq Izzudin A. Rahim ◽

Mohd Ismail Ibrahim ◽

Sook-Ling Chua ◽

Kamarul Imran Musa

Keyword(s):

Machine Learning ◽

Social Media ◽

Sentiment Analysis ◽

Cross Validation ◽

Healthcare Providers ◽

Hospital Quality ◽

Public Hospitals ◽

Learning System ◽

Support Vector ◽

Fold Cross Validation

While experts have recognised the significance and necessity of social media integration in healthcare, no systematic method has been devised in Malaysia or Southeast Asia to include social media input in the hospital quality improvement process. The goal of this work is to explain how to develop a machine learning system for classifying Facebook reviews of public hospitals in Malaysia by using service quality (SERVQUAL) dimensions and sentiment analysis. We developed a Machine Learning Quality Classifier (MLQC) based on the SERVQUAL model and a Machine Learning Sentiment Analyzer (MLSA) by manually annotating multiple batches of randomly chosen reviews. Logistic regression (LR), naive Bayes (NB), support vector machine (SVM), and other methods were used to train the classifiers. The performance of each classifier was tested using 5-fold cross validation. For topic classification, the average F1-score was between 0.687 and 0.757 for all models. In a 5-fold cross validation of each SERVQUAL dimension and in sentiment analysis, SVM consistently outperformed other methods. The study demonstrates how to use supervised learning to automatically identify SERVQUAL domains and sentiments from patient experiences on a hospital's Facebook page. Malaysian healthcare providers can use this content analysis technology to gather and assess data on patient care and improve hospital quality of care.
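The F1-scores reported here combine precision and recall for each class. The sketch below shows the standard per-class definitions and an unweighted (macro) average. The abstract does not say which averaging scheme the authors used, so macro averaging is an assumption, and the labels are invented toy data.

```python
def per_class_metrics(y_true, y_pred):
    """Return {class: (precision, recall, f1)} from two parallel label lists."""
    metrics = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[c] = (precision, recall, f1)
    return metrics

def macro_f1(y_true, y_pred):
    """Average the per-class F1 scores with equal weight per class."""
    scores = [f1 for _, _, f1 in per_class_metrics(y_true, y_pred).values()]
    return sum(scores) / len(scores)

# Toy labels for illustration only.
y_true = ["pos", "pos", "neg", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "neg", "pos", "pos"]
score = macro_f1(y_true, y_pred)
```

In a 5-fold setting, a function like this would be applied to each held-out fold and the fold scores averaged.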

Klasifikasi Berita Kriminal Menggunakan Naïve Bayes Classifier (NBC) dengan Pengujian K-Fold Cross Validation

Jurnal Sains dan Informatika ◽

10.34128/jsi.v5i2.177 ◽

2019 ◽

Vol 5 (2) ◽

pp. 108-117

Author(s):

Herfia Rhomadhona ◽

Jaka Permadi

Keyword(s):

Cross Validation ◽

Online Media ◽

Bayes Classifier ◽

Naïve Bayes ◽

Fold Cross Validation

Crime news is always a trending topic in every mass media outlet, especially online mass media. Online mass media provide several facilities that make it easier for the public to find news by topic, and they label each news item by category. However, they do not assign subcategories to the news. For example, if a user opens the crime category, all types of crime news are displayed without specific information about the type of crime. This problem can be addressed by classifying crime news by subcategory. This study uses the Naïve Bayes Classifier (NBC) method to classify news by subcategory. There are five subcategories: corruption, drugs, theft, rape, and murder. The study aims to determine the ability of NBC to classify news by testing it with the K-Fold Cross Validation technique for values of K from 3 to 10. The test results show that NBC is capable of classifying crime news, with a precision of 98.53%, a recall of 98.44%, and an accuracy of 99.38%.
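The evaluation protocol described above, K-fold cross-validation repeated for K from 3 to 10, can be sketched as follows. This is a generic illustration, not the authors' code: the majority-class placeholder stands in for the NBC model, the 40 toy samples and their labels are invented, and all function names are hypothetical.

```python
import random
from collections import Counter

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for held_out in range(k):
        train = [j for m, fold in enumerate(folds) if m != held_out for j in fold]
        yield train, folds[held_out]

def cross_validate(samples, labels, k, train_fn, predict_fn):
    """Return the mean accuracy over k folds for a pluggable classifier."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        hits = sum(predict_fn(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / k

# Placeholder classifier (hypothetical): always predicts the majority class.
def train_majority(samples, labels):
    return Counter(labels).most_common(1)[0][0]

def predict_majority(model, sample):
    return model

samples = list(range(40))                      # stand-in for 40 news articles
labels = ["korupsi"] * 25 + ["narkoba"] * 15   # toy labels, not the paper's data
results = {k: cross_validate(samples, labels, k, train_majority, predict_majority)
           for k in range(3, 11)}
```

Each sample is held out exactly once per value of K, so every reported score is computed on data the model did not see during training.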

The Sentiment Analysis Reviewing Indosat Services from Twitter Using the Naive Bayes Classifier

Journal of Applied Computer Science and Technology ◽

10.52158/jacost.v1i2.79 ◽

2020 ◽

Vol 1 (2) ◽

pp. 61-66

Author(s):

Febri Astiko ◽

Achmad Khodar

Keyword(s):

Machine Learning ◽

Social Media ◽

Sentiment Analysis ◽

Naive Bayes ◽

Learning Model ◽

Naïve Bayes ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

Machine Learning Model ◽

Bayes Algorithm

This study aims to design a machine learning model for sentiment analysis of Indosat Ooredoo service reviews on Twitter, using the Naive Bayes algorithm to classify reviews as positive or negative. The sentiment analysis uses machine learning to obtain patterns and a model that can be reused to predict new data.

Sentiment Analysis Of Online Lecture Opinions On Twitter Social Media Using Naive Bayes Classifier

10.1109/icomitee53461.2021.9650135 ◽

2021 ◽

Author(s):

Devi Ajeng Damaratih

Keyword(s):

Social Media ◽

Sentiment Analysis ◽

Naive Bayes ◽

Naïve Bayes ◽

Naive Bayes Classifier ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

Online Lecture

A sentiment analysis system for social media using machine learning techniques: Social enablement

Digital Scholarship in the Humanities ◽

10.1093/llc/fqy037 ◽

2018 ◽

Vol 34 (3) ◽

pp. 569-581 ◽

Cited By ~ 1

Author(s):

Sujata Rani ◽

Parteek Kumar

Keyword(s):

Machine Learning ◽

Social Media ◽

Sentiment Analysis ◽

Media Analysis ◽

Training Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Analysis Tool ◽

Data Set ◽

Learning Techniques

In this article, an innovative approach to sentiment analysis (SA) is presented. The proposed system handles Romanized or abbreviated text and spelling variations when performing sentiment analysis. A training data set of 3,000 movie reviews and tweets was manually labeled by native speakers of Hindi into three classes: positive, negative, and neutral. The system uses the WEKA (Waikato Environment for Knowledge Analysis) tool to convert the string data into numerical matrices and applies three machine learning techniques: Naive Bayes (NB), J48, and support vector machine (SVM). The system was tested on 100 movie reviews and tweets. SVM performed best among the classifiers, with an accuracy of 68% for movie reviews and 82% for tweets. The results are promising, and the system can be used in emerging applications such as SA of product reviews and social media analysis. It could also serve other cultural or social purposes, such as predicting or countering riots.

Analisis Sentimen Data Twitter Tentang Pasangan Capres-Cawapres Pemilu 2019 Dengan Metode Lexicon Based Dan Support Vector Machine

Jurnal Ilmiah FIFO ◽

10.22441/fifo.2019.v11i2.004 ◽

2019 ◽

Vol 11 (2) ◽

pp. 144

Author(s):

Danar Wido Seno ◽

Arief Wibowo

Keyword(s):

Social Media ◽

Support Vector Machine ◽

Sentiment Analysis ◽

Vice President ◽

Training Data ◽

Support Vector ◽

New Words ◽

Textual Data ◽

Data Content ◽

Combination Of Methods

The growth of written content on social media has produced many new words and abbreviations on Twitter, making it increasingly difficult for sentiment analysis to achieve high accuracy on Twitter text. In this study, the authors analyze sentiment toward the pairs of candidates for President and Vice President of Indonesia in the 2019 elections. To obtain higher accuracy and accommodate the evolving text on Twitter, the authors combine an unsupervised and a supervised method, namely Lexicon Based and Support Vector Machine. The study uses Twitter data from October 2018, collected with search keywords containing the names of each candidate pair, totaling 800 records. The best accuracy on these 800 records was 92.5%, obtained with an 80% training and 20% testing split, with per-class precision between 85.7% and 97.2% and per-class recall between 78.2% and 93.5%. Because the Lexicon Based method labels the dataset, the data for the Support Vector Machine no longer needs to be labeled manually, and the lexicon dictionary can be extended as content on Twitter evolves.
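The labeling step of this approach can be sketched as below: a lexicon assigns each tweet a label, and the labeled data is then split 80/20 for a supervised classifier. The tiny English lexicon, the three output classes, the example tweets, and the function names are all assumptions for illustration. The study itself works on Indonesian tweets and trains a Support Vector Machine on the labeled data, which is not reproduced here.

```python
import random

# Hypothetical toy lexicon; a real one would hold Indonesian terms.
POSITIVE = {"good", "great", "honest", "support"}
NEGATIVE = {"bad", "corrupt", "liar", "fail"}

def lexicon_label(text):
    """Label a text by counting positive and negative lexicon hits."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and split it into training and testing parts."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Invented example tweets, not data from the study.
tweets = [
    "a great and honest leader",
    "corrupt liar",
    "the debate is tonight",
    "i support this candidate",
    "bad plan that will fail",
    "good speech today",
    "voting starts in april",
    "great turnout at the rally",
    "another bad promise",
    "candidates met the press",
]
labeled = [(tweet, lexicon_label(tweet)) for tweet in tweets]
train_set, test_set = train_test_split(labeled)
```

Adding new slang or abbreviations to the two word sets changes the labels without any manual re-annotation, which is the benefit the abstract attributes to the lexicon.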

Analysis of Social Media Users Sentiments against Omnibus Law Based on Hashtags on Twitter

SISTEMASI ◽

10.32520/stmsi.v11i1.1685 ◽

2022 ◽

Vol 11 (1) ◽

pp. 197

Author(s):

Okta Fanny ◽

Heri Suroyo

Keyword(s):

Social Media ◽

Sentiment Analysis ◽

Main Topic ◽

Accuracy Score ◽

Test Results ◽

Bayes Classifier ◽

The Public ◽

Average Accuracy ◽

Positive Sentiment ◽

Negative Sentiment

The research concludes that sentiment analysis can be used to gauge public sentiment, especially that of Twitter users, toward the Omnibus Law. Neutral sentiment accounted for the largest share at 55%, followed by positive sentiment at 35% and negative sentiment at 10%. With the Naïve Bayes Classifier, classification tests on the pro hashtags gave an average accuracy of 92.1%, an average precision of 94.8%, and an average recall of 90.7%. On the counter hashtags, the average accuracy was 98.3%, the average precision 97.6%, and the average recall 98.7%. A text cloud analysis of the combined pro and counter hashtags shows that the dominant term is Omnibus Law, which indicates that the scraped hashtags do discuss the main topic, the Omnibus Law.

Discovery of Sustainable Transport Modes Underlying TripAdvisor Reviews With Sentiment Analysis

Advances in Business Information Systems and Analytics - Natural Language Processing for Global and Local Business ◽

10.4018/978-1-7998-4240-8.ch008 ◽

2021 ◽

pp. 180-199

Author(s):

Ainhoa Serna ◽

Jon Kepa Gerrikagoitia

Keyword(s):

Social Media ◽

Sentiment Analysis ◽

Language Processing ◽

Predictive Analytics ◽

Data Gathering ◽

Point Of View ◽

Training Data ◽

Complete Analysis ◽

Sustainable Transport ◽

Transport Modes

In recent years, digital technology and research methods have advanced natural language processing as a way to better understand consumers and what they share on social media. There are hardly any studies in transportation analysis that use TripAdvisor, and none offers a complete analysis from the point of view of sentiment analysis. The aim of this study is to investigate and discover the presence of sustainable transport modes, such as walking mobility, underlying non-categorized TripAdvisor texts, in order to have a positive impact on public services and businesses. The methodology follows a quantitative and qualitative approach based on knowledge discovery techniques. Data gathering, normalization, classification, polarity analysis, and labelling tasks were carried out to obtain a sentiment-labelled training data set in the transport domain as a valuable contribution to predictive analytics. This research has allowed the authors to discover sustainable transport modes underlying the texts, focused on walking mobility but extensible to other means of transport and social media sources.
