Cyberbullying identification in twitter using support vector machine and information gain based feature selection

Cyberbullying is an act that violates Indonesia's ITE Law (the Electronic Information and Transactions Law) and is committed on social media applications such as Twitter. It is difficult to detect unless someone reports the tweet. Cyberbullying tweet identification aims to classify tweets as bullying or not. Classification is done with the Support Vector Machine (SVM) method, which seeks the hyperplane that separates the negative and positive classes. Because this is a text classification task, the more data is used, the more features are produced; the study therefore also uses Information Gain as a feature selection step to discard features that are not relevant to the classification. The system starts with text preprocessing: tokenizing, filtering, stemming and term weighting. It then performs Information Gain feature selection by calculating the entropy value of each term. Finally, it classifies tweets based on the selected terms, and the output identifies whether a tweet is bullying or not. Using the SVM method, the system achieves accuracy 75%, precision 70.27%, recall 86.66% and f-measure 77.61% in the experiment with maximum iteration = 20, λ = 0.5, γ = 0.001, ε = 0.000001, and C = 1. The best Information Gain threshold is 90%, which gives accuracy 76.66%, precision 72.22%, recall 86.66% and f-measure 78.78%.
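
The entropy-based feature selection described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy corpus, the function names and the `keep_fraction` parameter are assumptions, and the abstract does not say exactly how its 90% threshold is defined.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting documents on presence of `term`."""
    with_term = [y for tokens, y in zip(docs, labels) if term in tokens]
    without_term = [y for tokens, y in zip(docs, labels) if term not in tokens]
    total = len(labels)
    conditional = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - conditional


def select_terms(docs, labels, keep_fraction):
    """Rank the vocabulary by information gain and keep the top fraction.

    `keep_fraction` is a hypothetical parameter chosen for this sketch.
    """
    vocabulary = sorted({t for tokens in docs for t in tokens})
    ranked = sorted(
        vocabulary,
        key=lambda t: information_gain(docs, labels, t),
        reverse=True,
    )
    keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[:keep]


# Hypothetical, already-preprocessed toy tweets (not data from the study).
docs = [
    {"you", "are", "stupid"},
    {"stupid", "ugly", "loser"},
    {"you", "are", "great"},
    {"great", "game", "today"},
]
labels = ["bully", "bully", "normal", "normal"]

selected = select_terms(docs, labels, keep_fraction=0.25)
print(selected)
```

In this toy corpus, terms that appear in only one class score highest, while terms spread evenly across both classes score zero and are dropped first. The selected terms would then form the feature space passed to the SVM classifier.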


Support Vector Machine VS Information Gain: Analisis Sentimen Cyberbullying di Twitter Indonesia [Sentiment Analysis of Cyberbullying on Indonesian Twitter]

Jurnal ULTIMA InfoSys ◽

10.31937/si.v11i2.1740 ◽

2020 ◽

Vol 11 (2) ◽

pp. 107-111

Author(s):

Christevan Destitus ◽

Wella Wella ◽

Suryasari Suryasari

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Text Mining ◽

Information Gain ◽

Text Processing ◽

Support Vector ◽

Term Weighting ◽

System Process ◽

Research Stage

This study aims to classify tweets on Twitter using the Support Vector Machine and Information Gain methods. The classification itself aims to find a hyperplane that separates the negative and positive classes. The research stage comprises a system process of text mining and text processing, the latter with stages of tokenizing, filtering, stemming, and term weighting. Feature selection is then performed with Information Gain, which calculates the entropy value of each word. Finally, tweets are classified based on the selected features, and the output identifies whether a tweet is bullying or not. The study concludes that the Support Vector Machine and Information Gain methods give sufficiently good results.
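
The text processing stages named above (tokenizing, filtering, stemming, term weighting) can be sketched roughly as below. Everything here is an assumption made for illustration: the stopword list, the naive suffix-stripping stemmer and the `tf * log(N / df)` weighting are stand-ins, and a study on Indonesian tweets would need an Indonesian stopword list and stemmer.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list used for the filtering step.
STOPWORDS = {"the", "a", "is", "are", "you", "so"}


def stem(token):
    """Naive suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    """Tokenize, filter stopwords, then stem each remaining token."""
    tokens = re.findall(r"[a-z]+", tweet.lower())        # tokenizing
    tokens = [t for t in tokens if t not in STOPWORDS]    # filtering
    return [stem(t) for t in tokens]                      # stemming


def tf_idf(corpus):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    n_docs = len(corpus)
    doc_freq = Counter(term for tokens in corpus for term in set(tokens))
    weights = []
    for tokens in corpus:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf})
    return weights


# Hypothetical example tweets (not data from the study).
tweets = [
    "You are so annoying",
    "Playing games is annoying",
    "The games are great",
]
corpus = [preprocess(t) for t in tweets]
weights = tf_idf(corpus)
print(corpus)
print(weights)
```

Terms that occur in fewer documents receive larger weights, which is why term weighting is usually done before feature selection and classification.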


Extraction of built-up areas from Landsat-8 OLI data based on spectral-textural information and feature selection using support vector machine method

Geocarto International ◽

10.1080/10106049.2019.1566406 ◽

2019 ◽

Vol 35 (10) ◽

pp. 1067-1087 ◽

Cited By ~ 2

Author(s):

Vijendra Singh Bramhe ◽

Sanjay Kumar Ghosh ◽

Pradeep Kumar Garg

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Support Vector ◽

Landsat 8 ◽

Landsat 8 Oli ◽

Machine Method ◽

Support Vector Machine Method


Modified multiscale weighted permutation entropy and optimized support vector machine method for rolling bearing fault diagnosis with complex signals

ISA Transactions ◽

10.1016/j.isatra.2020.12.054 ◽

2021 ◽

Author(s):

Zhenya Wang ◽

Ligang Yao ◽

Gang Chen ◽

Jiaxin Ding

Keyword(s):

Support Vector Machine ◽

Fault Diagnosis ◽

Rolling Bearing ◽

Permutation Entropy ◽

Support Vector ◽

Complex Signals ◽

Machine Method ◽

Support Vector Machine Method ◽

Bearing Fault ◽

Bearing Fault Diagnosis


Prediction of the resilient modulus of non-cohesive subgrade soils and unbound subbase materials using a hybrid support vector machine method and colliding bodies optimization algorithm

Construction and Building Materials ◽

10.1016/j.conbuildmat.2020.122140 ◽

2021 ◽

Vol 275 ◽

pp. 122140

Author(s):

Nasrin Heidarabadizadeh ◽

Ali Reza Ghanizadeh ◽

Ali Behnood

Keyword(s):

Support Vector Machine ◽

Optimization Algorithm ◽

Resilient Modulus ◽

Support Vector ◽

Machine Method ◽

Support Vector Machine Method ◽

Subgrade Soils ◽

Colliding Bodies Optimization


Personality Classification of Facebook Users According to Big Five Personality Using SVM (Support Vector Machine) Method

Procedia Computer Science ◽

10.1016/j.procs.2020.12.023 ◽

2021 ◽

Vol 179 ◽

pp. 177-184

Author(s):

Ninda Anggoro Utami ◽

Warih Maharani ◽

Imelda Atastina

Keyword(s):

Support Vector Machine ◽

Big Five ◽

Support Vector ◽

Big Five Personality ◽

Machine Method ◽

Support Vector Machine Method ◽

Personality Classification


A Bayesian least-squares support vector machine method for predicting the remaining useful life of a microwave component

Advances in Mechanical Engineering ◽

10.1177/1687814016685963 ◽

2017 ◽

Vol 9 (1) ◽

pp. 168781401668596 ◽

Cited By ~ 9

Author(s):

Fuqiang Sun ◽

Xiaoyang Li ◽

Haitao Liao ◽

Xiankun Zhang

Keyword(s):

Support Vector Machine ◽

Least Squares ◽

Remaining Useful Life ◽

Lifetime Prediction ◽

Support Vector ◽

Machine Method ◽

Support Vector Machine Method ◽

Microwave Component ◽

Useful Life ◽

Bayesian Least Squares

Rapid and accurate lifetime prediction of critical components in a system is important to maintaining the system’s reliable operation. To this end, many lifetime prediction methods have been developed to handle various failure-related data collected in different situations. Among these methods, machine learning and Bayesian updating are the most popular. In this article, a Bayesian least-squares support vector machine method that combines the least-squares support vector machine with Bayesian inference is developed for predicting the remaining useful life of a microwave component. A degradation model describing the change in the component’s power gain over time is developed, and the point and interval remaining useful life estimates are obtained with respect to a predefined failure threshold. In the case study, the radial basis function neural network approach is also implemented for comparison. The results indicate that the Bayesian least-squares support vector machine method is more precise and stable in predicting the remaining useful life of this type of component.
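
The idea of a point remaining useful life estimate from a degradation trend and a failure threshold can be illustrated with a much simpler model than the one in the article. The sketch below swaps the Bayesian least-squares support vector machine for an ordinary least-squares straight line, and the measurements, units and threshold are hypothetical values invented for the example.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    variance = sum((t - mean_t) ** 2 for t in times)
    slope = covariance / variance
    intercept = mean_v - slope * mean_t
    return slope, intercept


def remaining_useful_life(times, values, failure_threshold):
    """Time from the last measurement until the fitted line hits the threshold.

    Returns None when the fitted trend is not decreasing, because the line
    then never reaches a threshold that lies below the current level.
    """
    slope, intercept = fit_line(times, values)
    if slope >= 0:
        return None
    crossing_time = (failure_threshold - intercept) / slope
    return max(0.0, crossing_time - times[-1])


# Hypothetical power-gain measurements (dB) over operating hours.
hours = [0, 100, 200, 300, 400]
gain_db = [30.0, 29.5, 29.0, 28.5, 28.0]

rul = remaining_useful_life(hours, gain_db, failure_threshold=27.0)
print(rul)  # about 200 hours for this toy data
```

A linear fit gives only a point estimate. The interval estimates mentioned in the abstract require a probabilistic model, which is what the Bayesian treatment in the article provides.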
