A Survey on Feature Selection Techniques and Classification Algorithms for Efficient Text Classification

In this era of the Internet, the security of information is a major concern. One of the main threats in the cyber world is phishing, an email or website fraud method that targets a genuine webpage or email and compromises it without the consent of the end user. Various techniques help to classify whether a website or an email is legitimate or fake. The major contributors to the detection of phishing fraud include classification algorithms, feature selection techniques or dataset preparation methods, and feature extraction, which plays an important role in the detection as well as the prevention of these attacks. This survey studies the effect of these contributors and the approaches applied in recent papers. The classification algorithms implemented include Decision Tree, Random Forest, Support Vector Machines, Logistic Regression, Lazy K Star, Naive Bayes, and J48.


Impact of feature selection techniques in Text Classification: An Experimental study

JOURNAL OF MECHANICS OF CONTINUA AND MATHEMATICAL SCIENCES ◽ 10.26782/jmcms.spl.3/2019.09.00004 ◽ 2019 ◽ Vol 1 (3)

Author(s): S Rahamat Basha

Keyword(s): Experimental Study ◽ Feature Selection ◽ Text Classification ◽ Feature Selection Techniques


Performance Analysis of Feature Selection Techniques for Text Classification

International Research Journal on Advanced Science Hub ◽ 10.47392/irjash.2020.259 ◽ 2020 ◽ Vol 2 (Special Issue ICSTM 12S) ◽ pp. 44-50

Author(s): Hemlata Patel ◽ Dhanraj Verma

Keyword(s): Feature Selection ◽ Performance Analysis ◽ Text Classification ◽ Feature Selection Techniques


A Comparative Study of Recent Feature Selection Techniques Used in Text Classification

IOT with Smart Systems - Smart Innovation, Systems and Technologies ◽ 10.1007/978-981-16-3945-6_41 ◽ 2022 ◽ pp. 423-436

Author(s): Gunjan Singh ◽ Rashmi Priya

Keyword(s): Feature Selection ◽ Comparative Study ◽ Text Classification ◽ Feature Selection Techniques


Different Classification Algorithms Based on Arabic Text Classification: Feature Selection Comparative Study

International Journal of Advanced Computer Science and Applications ◽ 10.14569/ijacsa.2015.060228 ◽ 2015 ◽ Vol 6 (2) ◽ Cited By ~ 6

Author(s): Ghazi Raho ◽ Riyad Al-Shalabi ◽ Ghassan Kanaan ◽ Asmaa Nassar

Keyword(s): Feature Selection ◽ Comparative Study ◽ Text Classification ◽ Classification Algorithms ◽ Arabic Text ◽ Classification Feature ◽ Arabic Text Classification


Efficient n-gram construction for text categorization using feature selection techniques

Intelligent Data Analysis ◽ 10.3233/ida-205154 ◽ 2021 ◽ Vol 25 (3) ◽ pp. 509-525

Author(s): Maximiliano García ◽ Sebastián Maldonado ◽ Carla Vairetti

Keyword(s): Feature Selection ◽ Text Classification ◽ Text Categorization ◽ A Priori ◽ Predictive Performance ◽ Online Reviews ◽ Additional Advantage ◽ Novel Approach ◽ N Gram ◽ Feature Selection Techniques

In this paper, we present a novel approach for n-gram generation in text classification. The a-priori algorithm is adapted to prune word sequences by combining three feature selection techniques. Unlike the traditional two-step approach for text classification in which feature selection is performed after the n-gram construction process, our proposal performs an embedded feature elimination during the application of the a-priori algorithm. The proposed strategy reduces the number of branches to be explored, speeding up the process and making the construction of all the word sequences tractable. Our proposal has the additional advantage of constructing a low-dimensional dataset with only the features that are relevant for classification, that can be used directly without the need for a feature selection step. Experiments on text classification datasets for sentiment analysis demonstrate that our approach yields the best predictive performance when compared with other feature selection approaches, while also facilitating a better understanding of the words and phrases that explain a given task; in our case online reviews and ratings in various domains.
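
The level-wise idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the three feature selection techniques that are combined, so the sketch substitutes a single hypothetical criterion (a minimum document support plus a minimum gap between the per-class document-frequency rates). The names `select_ngrams`, `min_support`, and `min_gap`, as well as the toy reviews, are assumptions made for illustration only.

```python
from collections import defaultdict


def count_document_frequencies(docs, labels, n, allowed=None):
    """Count, per class, how many documents contain each contiguous n-gram.

    When `allowed` is given (the surviving (n-1)-grams), an n-gram is only
    counted if both of its (n-1)-gram sub-sequences survived the previous
    level. This is the a-priori style pruning of the search space.
    """
    counts = defaultdict(lambda: [0, 0])  # gram -> [docs in class 0, docs in class 1]
    for tokens, label in zip(docs, labels):
        seen = set()
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            if allowed is not None and (
                gram[:-1] not in allowed or gram[1:] not in allowed
            ):
                continue  # pruned branch: never extended, never counted
            seen.add(gram)
        for gram in seen:
            counts[gram][label] += 1
    return counts


def select_ngrams(docs, labels, max_n=3, min_support=2, min_gap=0.75):
    """Build n-grams level by level, eliminating weak features at each level.

    Assumed stand-in criterion (not from the paper): keep a gram if it occurs
    in at least `min_support` documents and its document-frequency rate differs
    between the two classes by at least `min_gap`.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    selected = {}
    allowed = None
    for n in range(1, max_n + 1):
        counts = count_document_frequencies(docs, labels, n, allowed)
        survivors = set()
        for gram, (neg, pos) in counts.items():
            gap = abs(pos / n_pos - neg / n_neg)
            if neg + pos >= min_support and gap >= min_gap:
                survivors.add(gram)
                selected[gram] = gap
        if not survivors:
            break  # nothing left to extend
        allowed = survivors
    return selected


# Toy reviews (hypothetical data): label 1 = positive, label 0 = negative.
docs = [
    ["highly", "recommend", "this", "phone"],
    ["highly", "recommend", "this", "laptop"],
    ["waste", "of", "money", "this", "phone"],
    ["total", "waste", "of", "money"],
]
labels = [1, 1, 0, 0]

selected = select_ngrams(docs, labels)
for gram in sorted(selected, key=len):
    print(" ".join(gram), selected[gram])
```

On this toy data the word "this" is discarded at the unigram level, so no n-gram containing it is ever counted, which mirrors the reduction in explored branches that the abstract describes. The gap criterion used here is not guaranteed to be anti-monotone, so pruning on it is a heuristic: a discriminative phrase made up of individually uninformative words would be missed.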
