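Identifikasi Cyberbullying pada Kolom Komentar Instagram dengan Metode Support Vector Machine dan Semantic Similarity (Identification of Cyberbullying in Instagram Comment Sections Using the Support Vector Machine Method and Semantic Similarity)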

Author(s):  
Lintani Afina Hajar Raudhoti ◽  
Anisa Herdiani ◽  
Ade Romadhony

Instagram is a social media site for sharing photos and videos. Instagram users typically upload photos, follow one another, and like or comment on posted photos and videos. However, the popularity of this platform is not free from cyberbullying. Cyberbullying can be defined as the misuse of technology, via mobile phones, e-mail, chat rooms, or social media, to humiliate or threaten others. Comments that fall into the cyberbullying category can have negative effects, especially on the person targeted. Research on identifying cyberbullying sentences is therefore important. Cyberbullying sentences can be identified with machine learning that draws on corpus knowledge. This final project (Tugas Akhir) uses the Support Vector Machine (SVM) machine learning method to identify whether or not a sentence contains cyberbullying. However, SVM classification alone falls short when the test data contain words that do not appear in the training data. Adding information about other semantically related words can improve performance. Therefore, semantic relatedness information between words, taken from a dictionary, is added to improve identification accuracy. The results show that adding semantic information improves performance, measured as accuracy in the testing stage. Accuracy rose by 7 percentage points, from 67% to 74%.
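The idea of falling back from an unseen test word to a semantically related word seen in training can be sketched as follows. This is only an illustration under stated assumptions: the function name `expand_tokens`, the `related_words` dictionary, and the example words are hypothetical, and the abstract does not say which dictionary or similarity measure the authors used.

```python
def expand_tokens(tokens, train_vocab, related_words):
    """Swap out-of-vocabulary tokens for a related in-vocabulary word, if any.

    related_words is a hypothetical dictionary mapping a word to a list of
    semantically related words (for example, taken from a thesaurus).
    """
    expanded = []
    for tok in tokens:
        if tok in train_vocab:
            expanded.append(tok)
            continue
        # Pick the first related word that was seen during training.
        substitute = next(
            (w for w in related_words.get(tok, []) if w in train_vocab),
            None,
        )
        # Keep the original token when no related training word exists.
        expanded.append(substitute if substitute is not None else tok)
    return expanded


# Hypothetical toy data, not from the paper.
train_vocab = {"ugly", "nice", "photo"}
related_words = {"hideous": ["unsightly", "ugly"]}
print(expand_tokens(["so", "hideous"], train_vocab, related_words))
```

The expanded token list would then be vectorized and passed to the classifier in place of the raw tokens, so that a test comment shares features with the training data even when its exact wording is new.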

2021 ◽  
Vol 5 (3) ◽  
pp. 520-526
Author(s):  
Irbah Salsabila ◽  
Yuliant Sibaroni

Beauty products are an important need for many people, especially women. However, not all beauty products deliver the expected results. Reviews in the form of opinions can give consumers an overview of a product. In this study, reviews written on femaledaily.com were analyzed with a multi-aspect-based approach to determine which aspects of the beauty category they address. Each review first passes through a preprocessing stage to make it easier to process and is then classified with the Support Vector Machine (SVM) method, with the addition of Semantic Similarity and TF-IDF weighting. Tests using semantic similarity achieved an accuracy of 93% on the price aspect, 92% on the packaging aspect, and 86% on the scent aspect.
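TF-IDF weighting, as mentioned above, can be illustrated with a small sketch. It assumes one common variant (term frequency normalized by document length, multiplied by the natural log of the inverse document frequency); the abstract does not state which variant the authors used, and the function name `tf_idf` and the sample reviews are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical tokenized reviews, not from the paper.
reviews = [["cheap", "price"], ["nice", "scent"], ["price", "too", "high"]]
for row in tf_idf(reviews):
    print(row)
```

Under this variant, a term that occurs in every document gets weight zero, while a term concentrated in few documents is weighted more heavily.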


2014 ◽  
Vol 536-537 ◽  
pp. 578-582
Author(s):  
Shu Hui Chang ◽  
Gui Fa Teng ◽  
Jian Bin Ma

E-mail has become one of the most important applications on the Internet. At the same time, computer crimes involving e-mail are increasing rapidly. To help counter them, this paper describes authorship identification methods for Chinese e-mail documents, which can provide evidence for computer forensics. Features of e-mail form were extracted for authorship classification, and the SVM (support vector machine) algorithm was adopted to learn each author's features. Experiments gave satisfactory results on a limited dataset: the accuracy on a dataset of four authors was 92.56%. These results show that the approach is feasible for computer forensics.


2012 ◽  
Vol 268-270 ◽  
pp. 1844-1848
Author(s):  
Mu Hee Song

Due to the spread of personal computers and the internet, e-mail has become one of the most widely used means of communication. However, a massive amount of spam is polluting mailboxes every day, taking advantage of the ability to send mail to any number of random people over the internet. This paper introduces an efficient method of classifying e-mails using the SVM (Support Vector Machine) learning algorithm, which has recently shown high performance in document classification. The disposition of the words inside the e-mail documents is extracted, and classification performance is compared and examined as the DF value is changed to reduce the disposition space at the learning stage. To assess its performance, the SVM is compared with the Naïve Bayes classifier (which uses probabilistic methods) and a vector model classifier, in order to verify that the SVM learning algorithm performs better.
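Assuming that "DF" in the abstract above stands for document frequency (the abstract does not expand the abbreviation), reducing the feature space by a DF threshold might look like the sketch below. The function name `select_by_df`, the threshold, and the sample messages are hypothetical.

```python
from collections import Counter


def select_by_df(docs, min_df):
    """Keep only terms that occur in at least min_df documents."""
    df = Counter()
    for doc in docs:
        # set() so that a term counts once per document.
        df.update(set(doc))
    return {term for term, n in df.items() if n >= min_df}


# Hypothetical tokenized e-mails, not from the paper.
mails = [["free", "offer", "now"], ["free", "meeting"], ["offer", "free"]]
print(sorted(select_by_df(mails, min_df=2)))
```

Raising `min_df` shrinks the vocabulary, which is one way classification performance could be compared across different DF values.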


2021 ◽  
Vol 12 (3) ◽  
pp. 31-38
Author(s):  
Michelle Tais Garcia Furuya ◽  
Danielle Elis Garcia Furuya

E-mail is one of the main tools in use today and an example of how technology facilitates the exchange of information. On the other hand, one of the biggest obstacles faced by e-mail services is spam, the name given to unsolicited messages received by a user. Machine learning has been gaining prominence in recent years as an alternative for efficient identification of spam. In this area, different algorithms can be evaluated to identify which one performs best. The aim of the study is to assess the ability of machine learning algorithms to correctly classify e-mails and to identify which algorithm obtains the greatest accuracy. The database was taken from the Kaggle platform, and the data were processed by the Orange software with four algorithms: Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Machine (SVM) and Naive Bayes (NB). The data were divided into 80% for training and 20% for testing. The results show that Random Forest was the best-performing algorithm, with 99% accuracy.
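The 80/20 division described above can be sketched in plain Python. This is not the Orange workflow used in the study; the function name `split_train_test`, the fixed seed, and the placeholder data are assumptions made for illustration.

```python
import random


def split_train_test(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and split it into training and test parts."""
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder message IDs standing in for labeled e-mails.
train, test = split_train_test(range(10))
print(len(train), len(test))
```

Shuffling before the cut matters when the source file is ordered by class, since an unshuffled split could leave one class out of the test portion.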


Kybernetes ◽  
2016 ◽  
Vol 45 (6) ◽  
pp. 977-994 ◽  
Author(s):  
Oluyinka Aderemi Adewumi ◽  
Ayobami Andronicus Akinyelu

Purpose – Phishing is one of the major challenges faced by the world of e-commerce today. Because of phishing attacks, billions of dollars have been lost by many companies and individuals. The global impact of phishing attacks will continue to increase, and thus a more efficient phishing detection technique is required. The purpose of this paper is to investigate and report the use of a nature-inspired machine learning (ML)-based approach in the classification of phishing e-mails. Design/methodology/approach – ML-based techniques have been shown to be efficient in detecting phishing attacks. In this paper, the firefly algorithm (FFA) was integrated with a support vector machine (SVM) with the primary aim of developing an improved phishing e-mail classifier (known as FFA_SVM), capable of accurately detecting new phishing patterns as they occur. From a data set consisting of 4,000 phishing and ham e-mails, a set of features suitable for phishing e-mail detection was extracted and used to construct the hybrid classifier. Findings – The FFA_SVM was applied to a data set consisting of up to 4,000 phishing and ham e-mails. Simulation experiments were performed to evaluate and compare the performance of the classifier. The tests yielded a classification accuracy of 99.94 percent, a false positive rate of 0.06 percent and a false negative rate of 0.04 percent. Originality/value – To the best of the authors’ knowledge, the hybrid algorithm has not previously been applied, as in this work, to the classification and detection of phishing e-mail.
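For readers unfamiliar with the three metrics reported above, the sketch below computes them from confusion-matrix counts using the standard definitions, with phishing treated as the positive class. The function name `classification_rates` and the counts are hypothetical; they are not the paper's data.

```python
def classification_rates(tp, fp, tn, fn):
    """Return (accuracy, false positive rate, false negative rate)."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Share of ham e-mails wrongly flagged as phishing.
    false_positive_rate = fp / (fp + tn)
    # Share of phishing e-mails that slipped through as ham.
    false_negative_rate = fn / (fn + tp)
    return accuracy, false_positive_rate, false_negative_rate


# Hypothetical counts for illustration only.
acc, fpr, fnr = classification_rates(tp=90, fp=5, tn=95, fn=10)
print(f"accuracy={acc:.3f} FPR={fpr:.3f} FNR={fnr:.3f}")
```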


Author(s):  
Alistair Shilton ◽  
Marimuthu Palaniswami

This chapter presents a unified introduction to support vector machine (SVM) methods for binary classification, one-class classification, and regression. The SVM method for binary classification (binary SVC) is introduced first, and then extended to encompass one-class classification (clustering). Next, using the regularized risk approach as a motivation, the SVM method for regression (SVR) is described. These methods are then combined to obtain a single unified SVM formulation that encompasses binary classification, one-class classification, and regression (as well as some extensions of these), and the dual formulation of this unified model is derived. A mechanical analogy for binary and one-class SVCs is presented to give an intuitive explanation of how these two formulations operate. Finally, the unified SVM is extended to implement general cost functions, and an application of SVM classifiers to the problem of spam e-mail detection is considered.
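As background for the binary SVC mentioned above, the standard soft-margin primal problem is shown here in its common textbook form. The symbols ($\mathbf{w}$, $b$, $\xi_i$, $C$) follow general convention and are an assumption; the chapter's own notation and its unified formulation may differ.

```latex
\begin{aligned}
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad
  & \tfrac{1}{2}\,\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{N} \xi_i \\
\text{subject to} \quad
  & y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1 - \xi_i, \\
  & \xi_i \ge 0, \qquad i = 1, \dots, N,
\end{aligned}
```

Here $y_i \in \{-1, +1\}$ is the class label of training point $\mathbf{x}_i$, the slack variables $\xi_i$ absorb margin violations, and $C > 0$ trades margin width against training error.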

