Cyberbullying identification in twitter using support vector machine and information gain based feature selection

Author(s):  
Ni Made Gita Dwi Purnamasari ◽  
M. Ali Fauzi ◽  
Indriati Indriati ◽  
Liana Shinta Dewi

Cyberbullying is an action that violates the ITE Law, in which the offence is committed on social media applications such as Twitter. It is difficult to detect unless someone reports the tweet. Cyberbullying tweet identification aims to classify whether tweets contain bullying. Classification is performed with the Support Vector Machine (SVM) method, which seeks the hyperplane that separates the negative and positive classes. Because this is a text classification task, the more data is used, the more features are produced; this research therefore also uses Information Gain as a feature selection step to discard features that are not relevant to the classification. The system begins with text preprocessing: tokenizing, filtering, stemming, and term weighting. Information gain feature selection is then performed by calculating the entropy value of each term. Classification is then carried out on the selected terms, and the system outputs whether the tweet is bullying or not. Using the SVM method, the results are accuracy 75%, precision 70.27%, recall 86.66%, and f-measure 77.61% in the experiment with maximum iteration = 20, λ = 0.5, γ = 0.001, ε = 0.000001, and C = 1. The best information gain threshold is 90%, with accuracy 76.66%, precision 72.22%, recall 86.66%, and f-measure 78.78%.
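The reported f-measure values can be checked against the reported precision and recall. The sketch below assumes that "f-measure" means the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the formula, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F1 score: harmonic mean of precision and recall.

    Inputs and output are in percent. Assumes the abstract's
    "f-measure" is F1; the source does not say so explicitly.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
svm_f = f_measure(70.27, 86.66)          # SVM experiment
selected_f = f_measure(72.22, 86.66)     # best information gain threshold

print(round(svm_f, 2), round(selected_f, 2))
```

Under that assumption, both values round to the figures given in the abstract (77.61% and 78.78%).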

2020 ◽  
Vol 11 (2) ◽  
pp. 107-111
Author(s):  
Christevan Destitus ◽  
Wella Wella ◽  
Suryasari Suryasari

This study aims to classify tweets on Twitter using the Support Vector Machine and Information Gain methods. The classification itself aims to find a hyperplane that separates the negative and positive classes. The system process consists of text mining, with text preprocessing stages of tokenizing, filtering, stemming, and term weighting. Feature selection is then performed with information gain, which calculates the entropy value of each word. Finally, classification is carried out on the selected features, and the output identifies whether the tweet is bullying or not. The study found that the Support Vector Machine and Information Gain methods give sufficiently good results.
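As a rough illustration of entropy-based feature selection, the sketch below scores a word by how much knowing whether it appears in a tweet reduces the entropy of the class labels. It is a minimal sketch, not the authors' implementation: the toy corpus, the label names, and the functions `entropy` and `information_gain` are all hypothetical, and tweets are assumed to be already tokenized into sets of words.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy of the labels minus the entropy left after splitting
    the documents by presence or absence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


# Hypothetical toy corpus: each tweet is a set of preprocessed tokens.
docs = [{"you", "are", "stupid"}, {"nice", "day"},
        {"stupid", "loser"}, {"good", "morning"}]
labels = ["bully", "not", "bully", "not"]

# Rank every word in the vocabulary by its information gain.
vocabulary = set().union(*docs)
ranked = sorted(vocabulary,
                key=lambda t: information_gain(docs, labels, t),
                reverse=True)
print(ranked[0])
```

In this toy corpus "stupid" separates the two classes perfectly, so it ranks first; a selection step would keep only the highest-ranked words before training the classifier.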


2017 ◽  
Vol 9 (1) ◽  
pp. 168781401668596 ◽  
Author(s):  
Fuqiang Sun ◽  
Xiaoyang Li ◽  
Haitao Liao ◽  
Xiankun Zhang

Rapid and accurate lifetime prediction of critical components in a system is important to maintaining the system’s reliable operation. To this end, many lifetime prediction methods have been developed to handle various failure-related data collected in different situations. Among these methods, machine learning and Bayesian updating are the most popular. In this article, a Bayesian least-squares support vector machine method, which combines the least-squares support vector machine with Bayesian inference, is developed for predicting the remaining useful life of a microwave component. A degradation model describing the change in the component’s power gain over time is developed, and point and interval estimates of the remaining useful life are obtained with respect to a predefined failure threshold. In our case study, the radial basis function neural network approach is also implemented for comparison purposes. The results indicate that the Bayesian least-squares support vector machine method is more precise and stable in predicting the remaining useful life of this type of component.
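The idea of reading a remaining useful life off a degradation model and a failure threshold can be sketched with a much simpler stand-in. The example below fits an ordinary least-squares straight line to power gain readings and extrapolates to the threshold; it is not the Bayesian least-squares support vector machine of the article and gives only a point estimate. The data, the threshold value, and the assumption that failure means the gain falling to the threshold are all hypothetical.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def remaining_useful_life(times, values, threshold):
    """Time from the last observation until the fitted line reaches
    `threshold`. Returns None if the fitted trend is not decreasing."""
    slope, intercept = fit_line(times, values)
    if slope >= 0:
        return None
    failure_time = (threshold - intercept) / slope
    return max(0.0, failure_time - times[-1])


# Hypothetical power gain readings (arbitrary units) at equal time steps.
times = [0, 1, 2, 3, 4]
gain = [30.0, 29.5, 29.0, 28.5, 28.0]

# Hypothetical failure threshold: failure when gain drops to 25.0.
rul = remaining_useful_life(times, gain, threshold=25.0)
print(rul)
```

With these made-up readings the fitted line loses 0.5 units per time step and crosses the threshold at time 10, six steps after the last observation. An interval estimate, as described in the abstract, would additionally need a model of the uncertainty in the fitted trend.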

