Reputation Scoring Fake News Using Text Mining

ACMIT Proceedings ◽

10.33555/acmit.v4i1.52 ◽

2017 ◽

Vol 4 (1) ◽

pp. 12-17

Author(s):

Ahmad Firdaus

Keyword(s):

Feature Selection ◽

Decision Tree ◽

Text Categorization ◽

Information Gain ◽

Feature Selection Method ◽

Support Vector ◽

Stable Level ◽

Vector Machines ◽

Selection Of

Classifying hoax news, that is, news containing incorrect information, is one application of text categorization. Like machine text categorization in general, the system consists of preprocessing and the execution of classification models. In this study, experiments were conducted to select the best technique for each sub-process, using 1,200 hoax articles and 600 non-hoax articles collected manually. The research experimented with stop-word removal and stemming to determine the best preprocessing stages. The Decision Tree algorithm achieved an accuracy of 100%, while Naive Bayes showed a more stable level of accuracy across the numbers of datasets used for all candidates. With Information Gain, TF-IDF, and GGA feature selection applied to the Naive Bayes, Support Vector Machine, and Decision Tree algorithms, no significant percentage change occurred across the candidates; accuracy did increase, however, after GGA (Optimize Generation) feature selection was used. In the comparison of Naive Bayes, Decision Tree, and Support Vector Machine classifiers combined with feature selection, the best result came from GGA + Decision Tree on candidate 2 (Paslon2), at 100%, and the lowest from Information Gain + Decision Tree on candidate 3, at 36.67%. Overall, accuracy improved for all algorithms after feature selection, and Naive Bayes showed a more stable level of accuracy across the numbers of datasets used for all candidates.
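The two preprocessing stages named in the abstract, stop-word removal and stemming, can be sketched as below. This is only an illustration: the abstract does not give the stop-word list, the stemmer, or the language of the corpus, so `STOP_WORDS`, `SUFFIXES`, and every function name here are assumptions, and the suffix stripping is a crude stand-in for a real stemmer.

```python
import re

# Assumption: a tiny illustrative stop-word list; the study's list is not given.
STOP_WORDS = {"the", "is", "a", "of", "and", "to", "in", "that"}

# Assumption: crude suffix stripping stands in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text, use_stop_removal=True, use_stemming=True):
    """Run the selected preprocessing stages and return the token list."""
    tokens = tokenize(text)
    if use_stop_removal:
        tokens = remove_stop_words(tokens)
    if use_stemming:
        tokens = [stem(t) for t in tokens]
    return tokens
```

Toggling `use_stop_removal` and `use_stemming` gives the four preprocessing variants that an experiment of this kind would compare.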

Improved Feature-Selection Method Considering the Imbalance Problem in Text Categorization

The Scientific World JOURNAL ◽

10.1155/2014/625342 ◽

2014 ◽

Vol 2014 ◽

pp. 1-17 ◽

Cited By ~ 9

Author(s):

Jieming Yang ◽

Zhaoyang Qu ◽

Zhiying Liu

Keyword(s):

Feature Selection ◽

Text Categorization ◽

Information Gain ◽

Feature Selection Method ◽

Support Vector ◽

Selection Methods ◽

Document Collections ◽

Imbalance Problem ◽

Important Approach ◽

Selection Algorithms

Filter-based feature selection is an important approach to dimensionality reduction in text categorization. Most filter feature-selection algorithms evaluate the significance of a feature for a category on the assumption of a balanced dataset and do not consider the imbalance in the dataset. This paper proposes a new scheme that weakens the adverse effect of imbalance in the corpus. We evaluated the improved versions of nine well-known feature-selection methods (Information Gain, Chi statistic, Document Frequency, Orthogonal Centroid Feature Selection, DIA association factor, Comprehensive Measurement Feature Selection, Deviation from Poisson Feature Selection, improved Gini index, and Mutual Information) using naïve Bayes and support vector machines on three benchmark document collections (20-Newsgroups, Reuters-21578, and WebKB). The experimental results show that the improved scheme can significantly enhance the performance of the feature-selection methods.
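As a point of reference for the filter methods listed above, the standard (unmodified) Information Gain score of a term can be sketched as follows. This is the textbook definition, IG(t) = H(C) - [P(t) H(C|t) + P(not t) H(C|not t)], and not the imbalance-aware scheme the paper proposes, whose details the abstract does not give. The function names and the toy corpus in the usage note are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Information gain of the presence or absence of `term` for the labels.

    `docs` is a list of token sets, `labels` the matching class labels.
    """
    with_term = Counter()
    without_term = Counter()
    for tokens, label in zip(docs, labels):
        if term in tokens:
            with_term[label] += 1
        else:
            without_term[label] += 1
    n = len(docs)
    n_with = sum(with_term.values())
    n_without = n - n_with
    h_class = entropy(list(Counter(labels).values()))
    h_conditional = (n_with / n) * entropy(list(with_term.values())) + (
        n_without / n
    ) * entropy(list(without_term.values()))
    return h_class - h_conditional


# Toy corpus (made up): two documents per class.
docs = [
    {"shocking", "cure"},
    {"shocking", "secret"},
    {"council", "budget"},
    {"council", "vote"},
]
labels = ["hoax", "hoax", "real", "real"]
scores = {t: information_gain(docs, labels, t) for t in ("shocking", "cure")}
```

In this toy corpus "shocking" appears in exactly the hoax documents, so it scores higher than "cure", which appears in only one of them. A filter method would keep the top-scoring terms and discard the rest.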

Design of Text Categorization System Based on SVM

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.532-533.1191 ◽

2012 ◽

Vol 532-533 ◽

pp. 1191-1195 ◽

Cited By ~ 1

Author(s):

Zhen Yan Liu ◽

Wei Ping Wang ◽

Yong Wang

Keyword(s):

Feature Extraction ◽

Feature Selection ◽

Text Categorization ◽

Feature Selection Method ◽

Extraction Methods ◽

Support Vector ◽

Text Representation ◽

Text Feature ◽

Categorization System ◽

Classifier Training

This paper introduces the design of a text categorization system based on the Support Vector Machine (SVM). It analyzes the high-dimensional character of text data and explains why SVM is suitable for text categorization. The system is constructed according to its data flow and consists of three subsystems: text representation, classifier training, and text classification. Classifier training is the core of the system, but text representation directly influences the accuracy of the classifier and the performance of the system. The text feature vector space can be built with different feature selection and feature extraction methods. Because no research indicates which method is best, many feature selection and feature extraction methods are implemented in the system. For a specific classification task, every feature selection method and every feature extraction method is tested, and the best-performing set of methods is adopted.
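The text representation subsystem turns each document into a vector over a shared vocabulary. A minimal sketch using TF-IDF weighting follows. The abstract does not say which weighting the system uses, so TF-IDF, the plain `log(N / df)` inverse document frequency, and the function names are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def build_vocabulary(tokenized_docs):
    """Sorted list of every distinct token in the collection."""
    return sorted({t for doc in tokenized_docs for t in doc})


def tfidf_vectors(tokenized_docs):
    """Return (vocabulary, one TF-IDF vector per document)."""
    vocab = build_vocabulary(tokenized_docs)
    n_docs = len(tokenized_docs)
    # Document frequency: number of documents containing each token.
    doc_freq = Counter(t for doc in tokenized_docs for t in set(doc))
    # Assumption: idf = log(N / df), one of several common variants.
    idf = {t: math.log(n_docs / doc_freq[t]) for t in vocab}
    vectors = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        length = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append([counts[t] / length * idf[t] for t in vocab])
    return vocab, vectors


# Toy collection (made up) of already tokenized documents.
vocab, vectors = tfidf_vectors([["svm", "text", "text"], ["svm", "kernel"]])
```

A token that occurs in every document, such as "svm" here, receives weight zero under this variant, which is one reason the choice of representation affects the classifier trained on it.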

Simplifying Support Vector Machines for classification of hyperspectral imagery and selection of relevant features

2010 2nd Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing ◽

10.1109/whispers.2010.5594937 ◽

2010 ◽

Author(s):

Andreas Rabe ◽

Sebastian van der Linden ◽

Patrick Hostert

Keyword(s):

Support Vector Machines ◽

Hyperspectral Imagery ◽

Support Vector ◽

Vector Machines ◽

Selection Of

FS_SFS: a novel feature selection method for support vector machines

2004 IEEE International Conference on Acoustics, Speech, and Signal Processing ◽

10.1109/icassp.2004.1327231 ◽

2004 ◽

Author(s):

Yi Liu ◽

Y.F. Zheng

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Vector Machines

Classification of ECG beats by using a fast least square support vector machines with a dynamic programming feature selection algorithm

Neural Computing and Applications ◽

10.1007/s00521-005-0466-z ◽

2005 ◽

Vol 14 (4) ◽

pp. 299-309 ◽

Cited By ~ 55

Author(s):

Nurettin Acır

Keyword(s):

Dynamic Programming ◽

Feature Selection ◽

Support Vector Machines ◽

Least Square ◽

Support Vector ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Vector Machines

A genetic algorithm based wrapper feature selection method for classification of hyperspectral images using support vector machine

10.1117/12.813256 ◽

2008 ◽

Cited By ~ 36

Author(s):

Li Zhuo ◽

Jing Zheng ◽

Xia Li ◽

Fang Wang ◽

Bin Ai ◽

...

Keyword(s):

Genetic Algorithm ◽

Support Vector Machine ◽

Feature Selection ◽

Feature Selection Method ◽

Selection Method ◽

Hyperspectral Images ◽

Support Vector ◽

Wrapper Feature Selection

The Application of Multi-Class Support Vector Machines on Intrusion Detection System with the Feature Selection using Information Gain

Proceedings of the 1st Annual International Conference on Mathematics, Science, and Education (ICoMSE 2017) ◽

10.2991/icomse-17.2018.1 ◽

2018 ◽

Author(s):

Jihan Maharani ◽

Zuherman Rustam

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Information Gain ◽

Detection System ◽

Support Vector ◽

Vector Machines

An innovative feature selection method for support vector machines and its test on the estimation of the credit risk of default

Review of Financial Economics ◽

10.1002/rfe.1049 ◽

2018 ◽

Vol 37 (3) ◽

pp. 404-427 ◽

Cited By ~ 1

Author(s):

Eduard Sariev ◽

Guido Germano

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Credit Risk ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Vector Machines

SVM-Based Credit Rating and Feature Selection

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.618.573 ◽

2014 ◽

Vol 618 ◽

pp. 573-577 ◽

Cited By ~ 1

Author(s):

Yu Qiang Qin ◽

Yu Dong Qi ◽

Hui Ying

Keyword(s):

Logistic Regression ◽

Feature Selection ◽

Financial Institutions ◽

Credit Card ◽

Credit Rating ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Vector Machines ◽

Reference Agency

The assessment of risk of default on credit is important for financial institutions. Logistic regression and discriminant analysis are techniques traditionally used in credit rating for determining likelihood to default based on consumer application and credit reference agency data. We test support vector machines (SVM) against these traditional methods on a large credit card database. We find that they are competitive and can be used as the basis of a feature selection method to discover those features that are most significant in determining risk of default.
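One common way to use an SVM as the basis of feature selection is to train a linear SVM and rank features by the absolute value of their weights. The abstract does not say which procedure was used, so the sketch below is an assumption: it trains a linear SVM without a bias term by Pegasos-style subgradient descent, with a deterministic pass over the data in place of random sampling, on made-up toy data. All names and parameter values are hypothetical.

```python
def train_linear_svm(X, y, lam=0.1, epochs=50):
    """Pegasos-style subgradient descent for a linear SVM (no bias term)."""
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):  # deterministic pass, not random sampling
            t += 1
            eta = 1.0 / (lam * t)  # decreasing step size
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Shrink step that comes from the L2 regularizer.
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                # Hinge-loss subgradient step for a margin violation.
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
    return w


def rank_features(w):
    """Feature indices ordered by decreasing absolute weight."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X = [[2.0, 0.1], [1.5, -0.2], [-2.0, 0.2], [-1.5, -0.1]]
y = [1, 1, -1, -1]
w = train_linear_svm(X, y)
ranking = rank_features(w)
```

On this toy data the informative feature receives the larger weight and is ranked first. Weight-based ranking is only meaningful when features are on comparable scales, so real data would normally be standardized first.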

A new feature selection method for text categorization based on information gain and particle swarm optimization

2014 IEEE 3rd International Conference on Cloud Computing and Intelligence Systems ◽

10.1109/ccis.2014.7175792 ◽

2014 ◽

Cited By ~ 5

Author(s):

Ferruh Yigit ◽

Omer Kaan Baykan

Keyword(s):

Feature Selection ◽

Particle Swarm Optimization ◽

Text Categorization ◽

Information Gain ◽

Particle Swarm ◽

Feature Selection Method ◽

Selection Method ◽

Swarm Optimization ◽

New Feature
