scholarly journals Improvised Spam Detection in Twitter Data Using Lightweight Detectors and Classifiers

Author(s):  
Velammal B. L. ◽  
Aarthy N.

Receiving spam messages is one of the most serious issues in social media, especially in Twitter, which is a widely used platform to reflect the opinions and emotions of an individual publicly as well as focused to a specific group of members with similar thoughts or discussion topic. In such focused discussion groups, getting spam message through social media sites is the most annoying issue. In this paper, a system is developed to detect spam tweets by using four lightweight detectors, namely blacklist domain detector, near duplicate detector, reliable ham detector, and multiclass detector. The detected tweets are then classified using ensemble classifiers such as naïve Bayes, logistic regression, and random forest. Voting method is applied to decide the labels for the tweets obtained after classification process. The proposed system has achieved an accuracy of 79% to detect spam tweets with the help of naïve Bayes classifier method and the value seems to be optimizing further with the availability of more sample data.

2020 ◽  
Vol 1 (2) ◽  
pp. 61-66
Author(s):  
Febri Astiko ◽  
Achmad Khodar

This study aims to design a machine learning model of sentiment analysis on Indosat Ooredoo service reviews on social media twitter using the Naive Bayes algorithm as a classifier of positive and negative labels. This sentiment analysis uses machine learning to get patterns an model that can be used again to predict new data.


Author(s):  
Chaithra V. D

<p align="justify">Revolution in social media has attracted the users towards video sharing sites like YouTube. It is the most popular social media site where people view, share and interact by commenting on the videos. There are various types of videos that are shared by the users like songs, movie trailers, news, entertainment etc. Nowadays the most trending videos is the unboxing videos and in particular unboxing of mobile phones which gets more views, likes/dislikes and comments. Analyzing the comments of the mobile unboxing videos provides the opinion of the viewers towards the mobile phone. Studying the sentiment expressed in these comments show if the mobile phone is getting positive or negative feedback. A Hybrid approach combining the lexicon approach Sentiment VADER and machine learning algorithm Naive Bayes is applied on the comments to predict the sentiment. Sentiment VADER has a good impact on the Naive Bayes classifier in predicting the sentiment of the comment. The classifier achieves an accuracy of 79.78% and F1 score of 83.72%.</p>


Author(s):  
Mohammad Zoqi Sarwani ◽  
Muhammad Shubkhan Salafudin ◽  
Dian Ahkam Sani

With the development of social media trends among students by using Facebook social media, students can communicate and pour out everything that is felt in the form of status. Personality is the character or various characters of a person - therefore, how a person to adjust to the surrounding environment for the achievement of communication smoothly. In the personality category, many things classify a person's category in the psychologist theory. In this exercise, the Big Five, the psychologist theory, is described in five codes, namely Openness, Conscientiousness, Extraversion, Agreeables, Neuroticism. Naive Bayes Classifier is used to determine the highest probability value with the aim to determine the highest value. The data used are two namely training data and testing data obtained from the Facebook status of students. From the data obtained can be tested in the system that the accuracy value is 88%.


2019 ◽  
Vol 24 (2) ◽  
pp. 140-153
Author(s):  
Gusti Nur Aulia ◽  
Eka Patriya

Pilpres saat ini cukup menyita perhatian, karena berbagai rumor yang beredar. Masyarakat juga menjadi sasaran elit politik, dimana suara mereka merupakan penentu keberlangsungan arah politik untuk lima tahun kedepan. Opini-opini positif, netral maupun negatif dapat menimbulkan ancaman munculnya berita bohong (hoax). Salah satu sarana yang digunakan masyarakat dalam mengekspresikan pilihan politiknya adalah melalui media sosial salah satunya twitter. Data seperti opini publik dapat diolah menjadi sebuah informasi yang bermanfaat, salah satunya melalui analisis sentimen. Pada penelitian ini, akan dilakukan analisis sentimen pada Twitter tentang pemilihan presiden 2019. Tahapan analisis sentimen pada penelitian ini terdiri dari akuisisi data, pre-processing, klasifikasi data, evaluasi data dan visualisasi data. Preprocessing dilakukan dengan case folding, normalisasi data, filtering, ubah kata baku, stopword dan stemming. Penelitian ini melakukan 2 metode yaitu dengan metode Lexicon Based dan Naïve Bayes Classifier. Hasil akhir dari analisis kemudian dihitung nilai akurasi menggunakan confusion matrix dan di visualisasikan menggunakan web server. Penentuan sentimen prediksi dilakukan menggunakan metode Lexicon Based dan Labelisasi dengan perhitungan secara manual. Data latih dan data uji akan digunakan dalam proses pelatihan dan pengujian menggunakan Naive Bayes Classifier. Hasil klasifikasi yang dilakukan oleh metode Naive Bayes Classifier disebut sentimen aktual. Perhitungan tingkat keakurasian antara sentimen prediksi terhadap sentimen aktual menggunakan pengujian confusion matrix. Hasil yang didapatkan adalah tingkat akurasi antara sentimen prediksi dan sentimen aktual dengan Lexicon Based sebesar 64,49% pada data uji dan pada data latih sebanyak 94,2% serta dengan menggunakan Labelisasi dan Naive Bayes Classifier sebesar 86,53% pada data uji dan data latih sebesar 94,08%. Hasil penelitian ini diharapkan dapat membantu melakukan riset atas opini masyarakat pada Twitter mengenai Pilpres 2019 yang mengandung sentimen positif, negatif atau netral.


Sign in / Sign up

Export Citation Format

Share Document