Text Classification Based on Nonlinear Dimensionality Reduction Techniques and Support Vector Machines

Author(s): Lukui Shi, Jun Zhang, Enhai Liu, Pilian He
2015, Vol. 2015, pp. 1-8

Author(s): Oliver Kramer

Cascade support vector machines have been introduced as an extension of classic support vector machines that allows fast training on large data sets. In this work, we combine cascade support vector machines with dimensionality-reduction-based preprocessing. The cascade principle enables fast learning by dividing the training set into subsets and merging the results of each cascade level based on its support vectors. Combining the cascade with dimensionality reduction as preprocessing yields a significant speedup, often without loss of classifier accuracy, while the high-dimensional pendants of the low-dimensional support vectors are carried into each new cascade level. We analyze and compare various instantiations of dimensionality reduction preprocessing for cascade SVMs with principal component analysis, locally linear embedding, and isometric mapping. The experimental analysis on artificial and real-world benchmark problems covers cascade-specific parameters such as intermediate training set sizes and dimensionalities.
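
To make the cascade principle concrete, below is a minimal sketch of a two-level cascade SVM with PCA preprocessing, assuming scikit-learn and NumPy; the sample size, subset count, and PCA target dimensionality are illustrative choices rather than values from the paper, and for brevity the second level trains in the reduced space instead of on the high-dimensional pendants of the support vectors.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=50, random_state=0)

# Preprocessing: project the data to a low-dimensional space before training.
pca = PCA(n_components=10)
X_low = pca.fit_transform(X)

# Level 1: split the training set into subsets and train one SVM per subset.
n_subsets = 4
rng = np.random.default_rng(0)
subsets = np.array_split(rng.permutation(len(X_low)), n_subsets)

sv_indices = []
for idx in subsets:
    svm = SVC(kernel="rbf").fit(X_low[idx], y[idx])
    # Keep only this subset's support vectors for the next cascade level.
    sv_indices.extend(idx[svm.support_])

# Level 2: train on the union of the support vectors from level 1.
final_svm = SVC(kernel="rbf").fit(X_low[sv_indices], y[sv_indices])
print("level-2 training set size:", len(sv_indices))

The speedup comes from each level training on far fewer points than the full set; repeating the split-and-merge step yields deeper cascades.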


Cutting-edge techniques have added considerable value to Artificial Intelligence (AI) and Machine Learning (ML), which are rapidly attracting interest across numerous areas of research. Clustering and dimensionality reduction techniques are among the most widely used methods in machine learning today. Fundamentally, clustering techniques such as K-means and hierarchical clustering are used to assign data points to the appropriate groups in a cluster structure. Clustering can be applied in recommendation systems, in the analysis of users on social media platforms, and to categorize patients with particular diseases into specific age groups, among other uses. Dimensionality reduction methods such as Principal Component Analysis and Linear Discriminant Analysis resemble clustering in some respects, but they reduce the size of the data before the clusters are plotted. In this paper, a comparative and predictive analysis is performed on three datasets from the UCI machine learning benchmark, namely IRIS, Wine, and Seed, using four distinct techniques. The class prediction analysis of each dataset is carried out through a Flask app. The main aim is to form a good clustering pattern for each dataset under the given techniques. The experimental analysis measures the accuracy of the resulting clusters using different machine learning classifiers, namely Logistic Regression, K-nearest neighbors, Support Vector Machine, Gaussian Naïve Bayes, Decision Tree Classifier, and Random Forest Classifier. Cohen's kappa is used as an additional accuracy indicator to compare the obtained classification results. It is observed that K-means and hierarchical clustering provide a better clustering pattern for the input datasets than the dimensionality reduction techniques. The clustering structure is well formed under all techniques, and the KNN classifier achieves improved accuracy across all techniques and datasets.
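
As a rough illustration of the evaluation pipeline described above, the following sketch, assuming scikit-learn, clusters the Iris data with K-means and scores a KNN classifier with both plain accuracy and Cohen's kappa; the cluster count and neighbor count are illustrative, not values from the paper.

from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Form the clustering pattern (Iris has three classes, hence k=3).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)

# Score how well a classifier recovers the cluster assignments.
X_train, X_test, c_train, c_test = train_test_split(
    X, cluster_labels, test_size=0.3, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, c_train)
pred = knn.predict(X_test)

print("accuracy:   ", accuracy_score(c_test, pred))
print("Cohen kappa:", cohen_kappa_score(c_test, pred))

The same loop can be repeated with the other clustering or reduction techniques and classifiers to reproduce a comparison of this kind.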

