Identification of Cyberbullying in Instagram Comment Sections Using Support Vector Machine and Semantic Similarity

Instagram is a social media platform for sharing photos and videos. Instagram users typically upload photos, follow one another, and like or comment on posted photos and videos. However, the platform's popularity is accompanied by cyberbullying. Cyberbullying can be defined as the misuse of technology, through mobile phones, e-mail, chat rooms, or social media, to humiliate or threaten other people. Comments that fall into the cyberbullying category can have negative effects, especially on the person targeted. Research on identifying cyberbullying sentences is therefore important. Cyberbullying sentences can be identified with machine learning that draws on corpus knowledge. This final project (Tugas Akhir) uses the Support Vector Machine (SVM) machine learning method to distinguish sentences that contain cyberbullying from those that do not. However, an SVM classifier on its own has a weakness when the test data contain words that do not appear in the training data. Adding information about other semantically related words can improve performance. Semantic relatedness between words, taken from a dictionary, is therefore added to improve identification accuracy. The results show that adding semantic information improves accuracy at the testing stage: accuracy rose by 7 percentage points, from 67% to 74%.
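The out-of-vocabulary problem described in this abstract can be illustrated with a minimal sketch. It assumes a toy synonym table and a hand-written training vocabulary; the names `SYNONYMS`, `expand_tokens`, and `train_vocabulary` are hypothetical, and the abstract does not say which dictionary or similarity measure the thesis actually uses.

```python
from typing import Dict, List, Set

# Hypothetical toy dictionary of semantically related words (an assumption;
# the abstract does not name the dictionary used in the thesis).
SYNONYMS: Dict[str, List[str]] = {
    "dumb": ["stupid"],
    "hideous": ["ugly"],
}


def expand_tokens(tokens: List[str], vocabulary: Set[str],
                  synonyms: Dict[str, List[str]]) -> List[str]:
    """Map each unseen token to a related word from the training vocabulary, if one exists."""
    expanded = []
    for token in tokens:
        if token in vocabulary:
            expanded.append(token)
            continue
        # Take the first related word that the classifier saw during training.
        replacement = next(
            (s for s in synonyms.get(token, []) if s in vocabulary), None
        )
        expanded.append(replacement if replacement is not None else token)
    return expanded


# Hypothetical vocabulary collected from the training comments.
train_vocabulary = {"you", "are", "stupid", "ugly", "nice", "photo"}
print(expand_tokens(["you", "are", "dumb"], train_vocabulary, SYNONYMS))
```

After this step, a test comment containing an unseen word can still contribute a feature the trained classifier recognizes; tokens with no related in-vocabulary word pass through unchanged.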


An Approach for Spam E-mail Detection with Support Vector Machine and n-Gram Indexing

Lecture Notes in Computer Science - Computer and Information Sciences - ISCIS 2004 ◽

10.1007/978-3-540-30182-0_36 ◽

2004 ◽

pp. 351-362 ◽

Cited By ~ 7

Author(s):

Jongsub Moon ◽

Taeshik Shon ◽

Jungtaek Seo ◽

Jongho Kim ◽

Jungwoo Seo

Keyword(s):

Support Vector Machine ◽

Support Vector ◽

N Gram ◽

E Mail


Multi Aspect Sentiment of Beauty Product Reviews using SVM and Semantic Similarity

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v5i3.3078 ◽

2021 ◽

Vol 5 (3) ◽

pp. 520-526

Author(s):

Irbah Salsabila ◽

Yuliant Sibaroni

Keyword(s):

Support Vector Machine ◽

Semantic Similarity ◽

Support Vector ◽

Product Reviews ◽

Important Requirement ◽

Test Result ◽

Beauty Products

Beauty products are an important need for many people, especially women. However, not all beauty products give the expected results. Reviews written as opinions can give consumers an overview of a product. The reviews, taken from femaledaily.com, were analyzed with a multi-aspect approach to determine which aspects of the beauty category they address. Each review first goes through a preprocessing stage to make it easier to process, and is then classified with the Support Vector Machine (SVM) method combined with Semantic Similarity and TF-IDF weighting. Tests using semantic similarity gave an accuracy of 93% on the price aspect, 92% on the packaging aspect, and 86% on the scent aspect.
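As a rough illustration of the TF-IDF weighting mentioned in this abstract, the sketch below computes the plain, unsmoothed variant (term frequency times log of inverse document frequency). The abstract does not state which TF-IDF variant the paper uses, so the formula, the function name `tf_idf`, and the sample reviews are all assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical preprocessed reviews, one token list per review.
reviews = [["cheap", "price"], ["nice", "scent"], ["cheap", "packaging"]]
for row in tf_idf(reviews):
    print(row)
```

Terms that occur in every document receive a weight of zero under this variant, which is why rarer aspect words such as "scent" end up weighted more heavily than common ones.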


Authorship Identification of Chinese E-Mail Based on Form Features for Computer Forensic

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.536-537.578 ◽

2014 ◽

Vol 536-537 ◽

pp. 578-582

Author(s):

Shu Hui Chang ◽

Gui Fa Teng ◽

Jian Bin Ma

Keyword(s):

Support Vector Machine ◽

Support Vector ◽

The Internet ◽

Support Vector Machine Algorithm ◽

Identification Methods ◽

Computer Crimes ◽

Form Features ◽

Authorship Identification ◽

Computer Forensic ◽

E Mail

E-mail has become one of the most important applications on the Internet. At the same time, computer crimes involving e-mail are increasing rapidly. To help counter them, this paper describes authorship identification methods for Chinese e-mail documents, which can provide evidence for computer forensics. Form features of e-mails were extracted to classify authorship, and the support vector machine (SVM) algorithm was adopted to learn each author's features. Experiments gave satisfactory results on a limited dataset: the accuracy on a dataset of four authors was 92.56%. These results show that the approach is feasible for computer forensics.


E-Mail Classification Based Learning Algorithm Using Support Vector Machine

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.268-270.1844 ◽

2012 ◽

Vol 268-270 ◽

pp. 1844-1848

Author(s):

Mu Hee Song

Keyword(s):

Support Vector Machine ◽

High Performance ◽

Learning Algorithm ◽

Vector Model ◽

Support Vector ◽

The Internet ◽

Machine Learning Algorithm ◽

Bayes Classifier ◽

Probability Methods ◽

E Mail

Due to the spread of personal computers and the internet, e-mail has become one of the most widely used means of communication. However, a massive amount of spam is polluting mailboxes every day, taking advantage of the ability to send mail to any number of random people through the internet. This paper introduces an efficient method of classifying e-mails using the Support Vector Machine (SVM) learning algorithm, which has recently shown high performance in document classification. The disposition of the words inside the e-mail documents is extracted, and classification performance is compared and examined as the DF value, which is used to reduce the disposition space at the learning stage, is varied. To assess its performance, the SVM is compared with the Naïve Bayes classifier (which uses probability methods) and a vector model classifier, in order to verify that the SVM learning algorithm performs better.
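The abstract does not define "DF value". Assuming it refers to document frequency, a common way to shrink the feature space is to keep only terms that occur in at least a minimum number of documents. The sketch below shows that idea; the function name `select_by_document_frequency`, the threshold, and the sample e-mails are hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import List, Set


def select_by_document_frequency(documents: List[List[str]], min_df: int) -> Set[str]:
    """Keep only terms that appear in at least `min_df` documents."""
    # Count each term once per document, regardless of repeats inside it.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {term for term, df in doc_freq.items() if df >= min_df}


# Hypothetical tokenized e-mails.
emails = [["free", "offer", "now"], ["meeting", "now"], ["free", "prize"]]
print(sorted(select_by_document_frequency(emails, min_df=2)))
```

Raising `min_df` removes more rare terms, so the classifier trains on fewer dimensions; comparing accuracy across several thresholds is one way to study the trade-off the abstract alludes to.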


E-mail spam classification using S-cuckoo search and support vector machine

International Journal of Bio-Inspired Computation ◽

10.1504/ijbic.2017.10004278 ◽

2017 ◽

Vol 9 (3) ◽

pp. 142

Author(s):

C. Palanisamy ◽

T. Kumaresan

Keyword(s):

Support Vector Machine ◽

Cuckoo Search ◽

Support Vector ◽

E Mail


Application of Machine Learning to the Identification of E-mails as Spam

Colloquium Exactarum ◽

10.5747/ce.2020.v12.n3.e327 ◽

2021 ◽

Vol 12 (3) ◽

pp. 31-38

Author(s):

Michelle Tais Garcia Furuya ◽

Danielle Elis Garcia Furuya

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

The Other ◽

Support Vector ◽

K Nearest Neighbors ◽

Mail Service ◽

E Mail

The e-mail service is one of the main tools in use today and an example of how technology facilitates the exchange of information. On the other hand, one of the biggest obstacles faced by e-mail services is spam, the name given to unsolicited messages received by a user. The application of machine learning has been gaining prominence in recent years as an alternative for the efficient identification of spam. In this area, different algorithms can be evaluated to identify which one performs best. The aim of the study is to assess the ability of machine learning algorithms to correctly classify e-mails and to identify which algorithm achieves the greatest accuracy. The database was taken from the Kaggle platform and the data were processed by the Orange software with four algorithms: Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Machine (SVM) and Naive Bayes (NB). The data were split into 80% for training and 20% for testing. The results show that Random Forest was the best-performing algorithm, with 99% accuracy.
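The 80/20 division described in this abstract can be sketched in a few lines. This is only an illustration of a shuffled hold-out split; the study itself used the Orange software, whose sampling procedure is not described here, and the function name `train_test_split`, the seed, and the sample data are assumptions.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(items: Sequence[T], train_fraction: float = 0.8,
                     seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the data and cut it into training and test parts."""
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical message identifiers standing in for the e-mail records.
messages = list(range(10))
train, test = train_test_split(messages)
print(len(train), len(test))
```

Fixing the seed means every algorithm under comparison is trained and tested on the same partition, which keeps their accuracy figures comparable.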


A hybrid firefly and support vector machine classifier for phishing email detection

Kybernetes ◽

10.1108/k-07-2014-0129 ◽

2016 ◽

Vol 45 (6) ◽

pp. 977-994 ◽

Cited By ~ 10

Author(s):

Oluyinka Aderemi Adewumi ◽

Ayobami Andronicus Akinyelu

Keyword(s):

Support Vector Machine ◽

False Positive Rate ◽

False Negative ◽

False Negative Rate ◽

Support Vector ◽

Data Set ◽

Content Type ◽

Phishing Attacks ◽

Positive Rate ◽

E Mail

Purpose – Phishing is one of the major challenges faced by the world of e-commerce today. Because of phishing attacks, billions of dollars have been lost by many companies and individuals. The global impact of phishing attacks will continue to increase, and thus a more efficient phishing detection technique is required. The purpose of this paper is to investigate and report the use of a nature-inspired, machine learning (ML)-based approach to the classification of phishing e-mails. Design/methodology/approach – ML-based techniques have been shown to be efficient in detecting phishing attacks. In this paper, the firefly algorithm (FFA) was integrated with a support vector machine (SVM) with the primary aim of developing an improved phishing e-mail classifier (known as FFA_SVM), capable of accurately detecting new phishing patterns as they occur. From a data set consisting of 4,000 phishing and ham e-mails, a set of features suitable for phishing e-mail detection was extracted and used to construct the hybrid classifier. Findings – The FFA_SVM was applied to a data set consisting of up to 4,000 phishing and ham e-mails. Simulation experiments were performed to evaluate and compare the performance of the classifier. The tests yielded a classification accuracy of 99.94 percent, a false positive rate of 0.06 percent and a false negative rate of 0.04 percent. Originality/value – To the best of the authors’ knowledge, the hybrid algorithm has not previously been applied, as in this work, to the classification and detection of phishing e-mail.
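For readers unfamiliar with the three figures reported in the findings, the sketch below shows how accuracy, false positive rate, and false negative rate are conventionally computed from true and predicted labels. It assumes phishing is the positive class (1) and ham the negative class (0); the function name `classification_rates` and the sample labels are hypothetical and unrelated to the paper's data.

```python
from typing import Dict, Sequence


def classification_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, FPR and FNR, with 1 = phishing and 0 = ham."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # phishing caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # ham passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # ham flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # phishing missed
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        # Share of ham messages wrongly flagged as phishing.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        # Share of phishing messages that slipped through.
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Hypothetical labels for eight messages.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(classification_rates(y_true, y_pred))
```

The two error rates use different denominators (all ham versus all phishing), so they do not generally sum with accuracy to 100 percent.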


A Unified Approach to Support Vector Machines

Pattern Recognition Technologies and Applications ◽

10.4018/978-1-59904-807-9.ch014 ◽

2008 ◽

pp. 299-324 ◽

Cited By ~ 1

Author(s):

Alistair Shilton ◽

Marimuthu Palaniswami

Keyword(s):

Support Vector Machine ◽

Binary Classification ◽

Support Vector ◽

Dual Formulation ◽

Vector Machines ◽

Risk Approach ◽

Classification And Regression ◽

One Class Classification ◽

General Cost Functions ◽

E Mail

This chapter presents a unified introduction to support vector machine (SVM) methods for binary classification, one-class classification, and regression. The SVM method for binary classification (binary SVC) is introduced first, and then extended to encompass one-class classification (clustering). Next, using the regularized risk approach as a motivation, the SVM method for regression (SVR) is described. These methods are then combined to obtain a single unified SVM formulation that encompasses binary classification, one-class classification, and regression (as well as some extensions of these), and the dual formulation of this unified model is derived. A mechanical analogy for binary and one-class SVCs provides an intuitive explanation of how these two formulations operate. Finally, the unified SVM is extended to implement general cost functions, and an application of SVM classifiers to the problem of spam e-mail detection is considered.
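For orientation, the standard textbook soft-margin formulation of the binary SVC and its dual are reproduced below. This is the widely used form, not necessarily the unified formulation derived in the chapter; the symbols $w$, $b$, $\xi_i$, $C$, $\varphi$, $K$, and $\alpha_i$ follow common convention and are assumptions with respect to the chapter's own notation.

```latex
% Primal problem: training pairs (x_i, y_i), y_i \in \{-1, +1\}, i = 1, \dots, N
\min_{w,\, b,\, \xi} \quad \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{N} \xi_i
\qquad \text{s.t.} \quad
y_i \left( w^{\top} \varphi(x_i) + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0 .

% Dual problem, with kernel K(x_i, x_j) = \varphi(x_i)^{\top} \varphi(x_j)
\max_{\alpha} \quad \sum_{i=1}^{N} \alpha_i
 - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
\qquad \text{s.t.} \quad
0 \le \alpha_i \le C, \quad \sum_{i=1}^{N} \alpha_i y_i = 0 .
```

In the dual, the data enter only through the kernel $K$, which is what allows the same machinery to be reused across classification and regression variants.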
