Using the Leader Algorithm with Support Vector Machines for Large Data Sets

Author(s):  
Enrique Romero
2011 ◽  
Vol 383-390 ◽  
pp. 925-930
Author(s):  
Chun Cheng Zhang ◽  
Xiang Guang Chen ◽  
Yuan Qing Xu

In order to improve the forecasting accuracy of indoor thermal comfort, this paper analyzes the basic principles of the fuzzy c-means clustering algorithm (FCM) and support vector machines (SVM) and proposes an SVM forecasting method based on FCM data preprocessing. With the proposed method, a large data set is divided into multiple mixed groups, and each group is represented by its own regression model. Support vector machines based on fuzzy c-means clustering (FCM+SVM) and a BP neural network based on fuzzy c-means clustering (FCM+BPNN) are each applied to forecast the PMV index. The experimental results demonstrate that the FCM+SVM method achieves better forecasting accuracy than the FCM+BPNN method.
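The cluster-then-regress idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: fuzzy c-means is written by hand in NumPy (scikit-learn does not ship it), the data is a synthetic stand-in for PMV measurements, and scikit-learn's `SVR` plays the role of the per-group SVM regressor.

```python
import numpy as np
from sklearn.svm import SVR

def fcm(X, k, m=2.0, iters=100, seed=0):
    """Fuzzy c-means: returns cluster centers and the membership matrix U."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), k))
    U /= U.sum(axis=1, keepdims=True)          # rows of U sum to 1
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))          # standard FCM membership update
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

# Synthetic regression data standing in for indoor-climate measurements.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(300, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

centers, U = fcm(X, k=3)
labels = U.argmax(axis=1)                      # harden memberships into groups

# One SVR per group; new points route through the nearest cluster center.
models = {c: SVR(C=10.0).fit(X[labels == c], y[labels == c])
          for c in np.unique(labels)}

def predict(Xnew):
    nearest = np.linalg.norm(Xnew[:, None, :] - centers[None, :, :],
                             axis=2).argmin(axis=1)
    out = np.empty(len(Xnew))
    for c, model in models.items():
        mask = nearest == c
        if mask.any():
            out[mask] = model.predict(Xnew[mask])
    return out

preds = predict(X)
```

Splitting the data this way keeps each SVR's training set small, which is the source of the scalability benefit the abstract claims.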


With the advent of the digital era, billions of documents are generated every day that need to be managed, processed, and classified. An enormous amount of text data is available on the World Wide Web and other sources, and the first step in managing it is classifying the available documents into the right categories. Supervised machine learning approaches attempt to solve the problem of document classification, but working on large data sets with heterogeneous classes remains a major challenge. Automatic tagging and classification of text documents is a useful task with many potential applications, such as classifying emails as spam or non-spam, or news articles as political, entertainment, stock market, or sports news. This paper proposes a novel approach for classifying text into known classes using an ensemble of refined Support Vector Machines. The advantage of the proposed technique is that it can considerably reduce the size of the training data by applying dimensionality reduction as a pre-training step. The technique has been evaluated on three benchmark data sets, namely the CMU Dataset, the 20 Newsgroups Dataset, and the Classic Dataset. Experimental results show that the proposed approach is more accurate and efficient than other state-of-the-art methods.
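The core pipeline the abstract describes, dimensionality reduction before SVM training, can be sketched with standard scikit-learn components. This is a simplified single-classifier sketch (the paper's ensemble and refinement steps are omitted), using TF-IDF features, truncated SVD for the reduction step, and a linear SVM, on a tiny made-up spam/ham corpus.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy corpus; real experiments would use 20 Newsgroups or similar.
docs = [
    "win cash prize now", "free lottery winner claim prize",
    "cheap pills free offer", "claim your free cash reward",
    "meeting agenda for monday", "quarterly report attached for review",
    "lunch plans with the team", "project deadline moved to friday",
]
labels = ["spam"] * 4 + ["ham"] * 4

clf = make_pipeline(
    TfidfVectorizer(),             # sparse bag-of-words features
    TruncatedSVD(n_components=4),  # dimensionality reduction pre-training step
    LinearSVC(),                   # linear SVM on the reduced features
)
clf.fit(docs, labels)
train_preds = clf.predict(docs)
```

Reducing the feature space before training is what shrinks the effective size of the training problem, which is the efficiency gain the abstract highlights.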


2017 ◽  
Vol 28 (02) ◽  
pp. 1750015 ◽  
Author(s):  
M. Andrecut

The least-squares support vector machine (LS-SVM) is a frequently used kernel method for non-linear regression and classification tasks. Here we discuss several approximation algorithms for the LS-SVM classifier. The proposed methods are based on randomized block kernel matrices, and we show that they provide good accuracy and reliable scaling for multi-class classification problems with relatively large data sets. We also present several numerical experiments illustrating the practical applicability of the proposed methods.
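For context, the exact (non-approximated) LS-SVM classifier that these methods speed up reduces training to a single linear system rather than a quadratic program. A minimal NumPy sketch of that baseline follows; the randomized block kernel approximation from the abstract is not reproduced here, it would replace the full kernel matrix `K` with a low-rank randomized factorization.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def lssvm_fit(X, y, gamma=1.0, C=10.0):
    """Solve the LS-SVM system [[0, 1^T], [1, K + I/C]] [b; alpha] = [0; y]."""
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / C   # ridge term from the least-squares loss
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]          # bias b, dual coefficients alpha

def lssvm_predict(Xnew, Xtrain, b, alpha, gamma=1.0):
    return np.sign(rbf_kernel(Xnew, Xtrain, gamma) @ alpha + b)

# Two well-separated Gaussian blobs with +/-1 labels.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
y = np.concatenate([-np.ones(50), np.ones(50)])

b, alpha = lssvm_fit(X, y)
acc = (lssvm_predict(X, X, b, alpha) == y).mean()
```

The dense solve costs O(n^3), which is exactly why approximations such as randomized block kernel matrices matter for large data sets.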


2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
Oliver Kramer

Cascade support vector machines have been introduced as an extension of classic support vector machines that allows fast training on large data sets. In this work, we combine cascade support vector machines with dimensionality-reduction-based preprocessing. The cascade principle enables fast learning by dividing the training set into subsets and merging the cascade learning results, based on the support vectors found at each cascade level. Combining this with dimensionality reduction as preprocessing yields a significant speedup, often without loss of classifier accuracy, while considering the high-dimensional counterparts of the low-dimensional support vectors at each new cascade level. We analyze and compare various instantiations of dimensionality reduction preprocessing and cascade SVMs with principal component analysis, locally linear embedding, and isometric mapping. The experimental analysis on various artificial and real-world benchmark problems covers cascade-specific parameters such as intermediate training set sizes and dimensionalities.
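A single cascade level as described above, split the training set, train an SVM per subset, keep only each subset's support vectors, and retrain on their union, can be sketched with scikit-learn. This is an illustrative two-level sketch on synthetic data, not the paper's full multi-level cascade, and it omits the dimensionality-reduction preprocessing step.

```python
import numpy as np
from sklearn.svm import SVC

def cascade_level(X, y, n_splits=4, **svm_kw):
    """Train an SVM per subset, keep only support vectors, then retrain
    a final SVM on the union of the surviving support vectors."""
    parts = np.array_split(np.random.default_rng(0).permutation(len(X)),
                           n_splits)
    sv_X, sv_y = [], []
    for part in parts:
        m = SVC(**svm_kw).fit(X[part], y[part])
        sv_X.append(X[part][m.support_])   # only support vectors survive
        sv_y.append(y[part][m.support_])
    Xs, ys = np.vstack(sv_X), np.concatenate(sv_y)
    return SVC(**svm_kw).fit(Xs, ys), len(Xs)

# Two Gaussian blobs as a stand-in for a large benchmark set.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1.0, (200, 2)), rng.normal(2, 1.0, (200, 2))])
y = np.concatenate([np.zeros(200), np.ones(200)])

model, n_kept = cascade_level(X, y, kernel="rbf", C=1.0)
acc = (model.predict(X) == y).mean()
```

Because the final SVM sees only the retained support vectors rather than the full training set, each level trains on a much smaller problem, which is the source of the cascade's speedup.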

