Toward Accelerated Training of Parallel Support Vector Machines Based on Voronoi Diagrams

Typical applications of wireless sensor networks (WSN), such as in Industry 4.0 and smart cities, involves acquiring and processing large amounts of data in federated systems. Important challenges arise for machine learning algorithms in this scenario, such as reducing energy consumption and minimizing data exchange between devices in different zones. This paper introduces a novel method for accelerated training of parallel Support Vector Machines (pSVMs), based on ensembles, tailored to these kinds of problems. To achieve this, the training set is split into several Voronoi regions. These regions are small enough to permit faster parallel training of SVMs, reducing computational payload. Results from experiments comparing the proposed method with a single SVM and a standard ensemble of SVMs demonstrate that this approach can provide comparable performance while limiting the number of regions required to solve classification tasks. These advantages facilitate the development of energy-efficient policies in WSN.

Download Full-text

Research on Parallel Support Vector Machine Based on Spark Big Data Platform

Scientific Programming ◽

10.1155/2021/7998417 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Yao Huimin

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Big Data ◽

Support Vector Machines ◽

Cross Validation ◽

Machine Learning Algorithms ◽

Support Vector ◽

Lambda Architecture ◽

Vector Machines ◽

Data Platform

With the development of cloud computing and distributed cluster technology, the concept of big data has been expanded and extended in terms of capacity and value, and machine learning technology has also received unprecedented attention in recent years. Traditional machine learning algorithms cannot solve the problem of effective parallelization, so a parallelization support vector machine based on Spark big data platform is proposed. Firstly, the big data platform is designed with Lambda architecture, which is divided into three layers: Batch Layer, Serving Layer, and Speed Layer. Secondly, in order to improve the training efficiency of support vector machines on large-scale data, when merging two support vector machines, the “special points” other than support vectors are considered, that is, the points where the nonsupport vectors in one subset violate the training results of the other subset, and a cross-validation merging algorithm is proposed. Then, a parallelized support vector machine based on cross-validation is proposed, and the parallelization process of the support vector machine is realized on the Spark platform. Finally, experiments on different datasets verify the effectiveness and stability of the proposed method. Experimental results show that the proposed parallelized support vector machine has outstanding performance in speed-up ratio, training time, and prediction accuracy.

Download Full-text

TEXTURE SEGMENTATION USING SEMI-SUPERVISED SUPPORT VECTOR MACHINES

International Journal of Computational Intelligence and Applications ◽

10.1142/s1469026804001197 ◽

2004 ◽

Vol 04 (02) ◽

pp. 131-142 ◽

Cited By ~ 1

Author(s):

SAEID SANEI

Keyword(s):

Support Vector Machines ◽

High Spatial Frequency ◽

Misclassification Error ◽

Support Vector ◽

Frequency Noise ◽

Multiple Constraints ◽

Training Set ◽

Vector Machines ◽

Segmentation Algorithms ◽

A Minor

Segmentation of natural textures has been investigated by developing a novel semi-supervised support vector machines (S3VM) algorithm with multiple constraints. Unlike conventional segmentation algorithms the proposed method does not classify the textures but classifies the uniform-texture regions and the regions of boundaries. Also the overall algorithm does not use any training set as used by all other learning algorithms such as conventional SVMs. During the process, the images are restored from high spatial frequency noise. Then various-order statistics of the textures within a sliding two-dimensional window are measured. K-mean algorithm is used to initialise the clustering procedure by labelling part of the class members and the classifier parameters. Therefore at this stage we have both the training and the working sets. A non-linear S3VM is then developed to exploit both sets to classify all the regions. The convex algorithm maximises a defined cost function by incorporating a number of constraints. The algorithm has been applied to combinations of a number of natural textures. It is demonstrated that the algorithm is robust, with negligible misclassification error. However, for complex textures there may be a minor misplacement of the edges.

Download Full-text

Cascade Support Vector Machines with Dimensionality Reduction

Applied Computational Intelligence and Soft Computing ◽

10.1155/2015/216132 ◽

2015 ◽

Vol 2015 ◽

pp. 1-8 ◽

Cited By ~ 3

Author(s):

Oliver Kramer

Keyword(s):

Support Vector Machines ◽

Dimensionality Reduction ◽

Principal Component ◽

Large Data ◽

Locally Linear Embedding ◽

Benchmark Problems ◽

Support Vector ◽

Training Set ◽

Support Vectors ◽

Vector Machines

Cascade support vector machines have been introduced as extension of classic support vector machines that allow a fast training on large data sets. In this work, we combine cascade support vector machines with dimensionality reduction based preprocessing. The cascade principle allows fast learning based on the division of the training set into subsets and the union of cascade learning results based on support vectors in each cascade level. The combination with dimensionality reduction as preprocessing results in a significant speedup, often without loss of classifier accuracies, while considering the high-dimensional pendants of the low-dimensional support vectors in each new cascade level. We analyze and compare various instantiations of dimensionality reduction preprocessing and cascade SVMs with principal component analysis, locally linear embedding, and isometric mapping. The experimental analysis on various artificial and real-world benchmark problems includes various cascade specific parameters like intermediate training set sizes and dimensionalities.

Download Full-text

Combination with Machine Learning Algorithms for the Classification in E-Bussiness

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.230-232.625 ◽

2011 ◽

Vol 230-232 ◽

pp. 625-628

Author(s):

Lei Shi ◽

Xin Ming Ma ◽

Xiao Hong Hu

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Set Theory ◽

Rough Set ◽

Rough Set Theory ◽

Machine Learning Algorithms ◽

Classification Model ◽

Support Vector ◽

Mathematical Tool ◽

Vector Machines

E-bussiness has grown rapidly in the last decade and massive amount of data on customer purchases, browsing pattern and preferences has been generated. Classification of electronic data plays a pivotal role to mine the valuable information and thus has become one of the most important applications of E-bussiness. Support Vector Machines are popular and powerful machine learning techniques, and they offer state-of-the-art performance. Rough set theory is a formal mathematical tool to deal with incomplete or imprecise information and one of its important applications is feature selection. In this paper, rough set theory and support vector machines are combined to construct a classification model to classify the data of E-bussiness effectively.

Download Full-text

Linear Support Vector Machines for Prediction of Student Performance in School-Based Education

Mathematical Problems in Engineering ◽

10.1155/2020/4761468 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7

Author(s):

Nalindren Naicker ◽

Timothy Adeliyi ◽

Jeanette Wing

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Student Performance ◽

State Of The Art ◽

Learning Algorithms ◽

The State ◽

Machine Learning Algorithms ◽

Superior Performance ◽

Support Vector ◽

Vector Machines

Educational Data Mining (EDM) is a rich research field in computer science. Tools and techniques in EDM are useful to predict student performance which gives practitioners useful insights to develop appropriate intervention strategies to improve pass rates and increase retention. The performance of the state-of-the-art machine learning classifiers is very much dependent on the task at hand. Investigating support vector machines has been used extensively in classification problems; however, the extant of literature shows a gap in the application of linear support vector machines as a predictor of student performance. The aim of this study was to compare the performance of linear support vector machines with the performance of the state-of-the-art classical machine learning algorithms in order to determine the algorithm that would improve prediction of student performance. In this quantitative study, an experimental research design was used. Experiments were set up using feature selection on a publicly available dataset of 1000 alpha-numeric student records. Linear support vector machines benchmarked with ten categorical machine learning algorithms showed superior performance in predicting student performance. The results of this research showed that features like race, gender, and lunch influence performance in mathematics whilst access to lunch was the primary factor which influences reading and writing performance.

Download Full-text

Mobile Money Fraud Prediction—A Cross-Case Analysis on the Efficiency of Support Vector Machines, Gradient Boosted Decision Trees, and Naïve Bayes Algorithms

Information ◽

10.3390/info11080383 ◽

2020 ◽

Vol 11 (8) ◽

pp. 383

Author(s):

Francis Effirim Botchey ◽

Zhen Qin ◽

Kwesi Hughes-Lartey

Keyword(s):

Developing Countries ◽

Support Vector Machines ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Support Vector ◽

Mobile Money ◽

Vector Machines ◽

Boosted Decision Tree

The onset of COVID-19 has re-emphasized the importance of FinTech especially in developing countries as the major powers of the world are already enjoying the advantages that come with the adoption of FinTech. Handling of physical cash has been established as a means of transmitting the novel corona virus. Again, research has established that, been unbanked raises the potential of sinking one into abject poverty. Over the years, developing countries have been piloting the various forms of FinTech, but the very one that has come to stay is the Mobile Money Transactions (MMT). As mobile money transactions attempt to gain a foothold, it faces several problems, the most important of them is mobile money fraud. This paper seeks to provide a solution to this problem by looking at machine learning algorithms based on support vector machines (kernel-based), gradient boosted decision tree (tree-based) and Naïve Bayes (probabilistic based) algorithms, taking into consideration the imbalanced nature of the dataset. Our experiments showed that the use of gradient boosted decision tree holds a great potential in combating the problem of mobile money fraud as it was able to produce near perfect results.

Download Full-text

LOCAL DESCRIPTOR MATCHING WITH SUPPORT VECTOR MACHINES

International Journal of Information Acquisition ◽

10.1142/s0219878910002051 ◽

2010 ◽

Vol 07 (01) ◽

pp. 59-80

Author(s):

D. CHENG ◽

S. Q. XIE ◽

E. HÄMMERLE

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Local Descriptor ◽

Local Descriptors ◽

Vector Machines ◽

Three Stages ◽

Image Transformations

Local descriptor matching is the most overlooked stage of the three stages of the local descriptor process, and this paper proposes a new method for matching local descriptors based on support vector machines. Results from experiments show that the developed method is more robust for matching local descriptors for all image transformations considered. The method is able to be integrated with different local descriptor methods, and with different machine learning algorithms and this shows that the approach is sufficiently robust and versatile.

Download Full-text

Properties of Support Vector Machines

Neural Computation ◽

10.1162/089976698300017575 ◽

1998 ◽

Vol 10 (4) ◽

pp. 955-974 ◽

Cited By ~ 97

Author(s):

Massimiliano Pontil ◽

Alessandro Verri

Keyword(s):

Support Vector Machines ◽

Finite Dimension ◽

Regularization Parameter ◽

Feature Space ◽

Support Vector ◽

Training Set ◽

Mathematical Properties ◽

Support Vectors ◽

Vector Machines ◽

Almost All

Support vector machines (SVMs) perform pattern recognition between two point classes by finding a decision surface determined by certain points of the training set, termed support vectors (SV). This surface, which in some feature space of possibly infinite dimension can be regarded as a hyperplane, is obtained from the solution of a problem of quadratic programming that depends on a regularization parameter. In this article, we study some mathematical properties of support vectors and show that the decision surface can be written as the sum of two orthogonal terms, the first depending on only the margin vectors (which are SVs lying on the margin), the second proportional to the regularization parameter. For almost all values of the parameter, this enables us to predict how the decision surface varies for small parameter changes. In the special but important case of feature space of finite dimension m, we also show that there are at most m + 1 margin vectors and observe that m + 1 SVs are usually sufficient to determine the decision surface fully. For relatively small m, this latter result leads to a consistent reduction of the SV number.

Download Full-text

Sentiment Analysis of Student’s Opinion on Programming Assessment: Evaluation of Naïve Bayes over Support Vector Machines

International Journal of Innovative Computing ◽

10.11113/ijic.v10n2.278 ◽

2020 ◽

Vol 10 (2) ◽

Author(s):

Mahmood Umar ◽

Nor Bahiah Ahmad ◽

Anazida Zainal

Keyword(s):

Support Vector Machines ◽

Sentiment Analysis ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Experimental Result ◽

Support Vector ◽

Small Data ◽

Data Set ◽

Vector Machines

This study investigates the performance of machine learning algorithms for sentiment analysis of students’ opinions on programming assessment. Previous researches show that Support Vector Machines (SVM) performs the best among all techniques, followed by Naïve Bayes (NB) in sentiment analysis. This study proposes a framework for classifying sentiments, as positive or negative using NB algorithm and Lexicon-based approach on small data set. The performance of NB algorithm was evaluated using SVM. NB and SVM conquer the Lexicon-based approach opinion lexicon technique in terms of accuracy in the specific area for which it is trained. The Lexicon-based technique, on the other hand, avoids difficult steps needed to train the classifier. Data was analyzed from 75 first year undergraduate students in School of Computing, Universiti Teknologi Malaysia taking programming subject. The student’s sentiments were gathered based on their opinions for the zero-score policy for unsuccessful compilation of program during skill-based test. The result of the study reveals that the students tend to have negative sentiments on programming assessment as it gives them scary emotions. The experimental result of applying NB algorithm yields a prediction accuracy of 85% which outperform both the SVM with 70% and Lexicon-based approach with 60% accuracy. The result shows that NB works better than SVM and Lexicon-based approach on small dataset.

Download Full-text

Machine Learning Prediction of Hospitalization due to COVID-19 based on Self-Reported Symptoms: A Study for Brazil*

10.36227/techrxiv.13736698 ◽

2021 ◽

Author(s):

Igor Miranda ◽

Gildeberto Cardoso ◽

Madhurananda Pahar ◽

Gabriel Oliveira ◽

Thomas Niesler

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Support Vector Machines ◽

Health Professionals ◽

Human Development Index ◽

Mobile App ◽

Machine Learning Algorithms ◽

Support Vector ◽

Vector Machines ◽

Timely Treatment

Predicting the need for hospitalization due to COVID-19 may help patients to seek timely treatment and assist health professionals to monitor cases and allocate resources. We investigate the use of machine learning algorithms to predict the risk of hospitalization due to COVID-19 using the patient's medical history and self-reported symptoms, regardless of the period in which they occurred. Three datasets containing information regarding 217,580 patients from three different states in Brazil have been used. Decision trees, neural networks, and support vector machines were evaluated, achieving accuracies between 79.1% to 84.7%. Our analysis shows that better performance is achieved in Brazilian states ranked more highly in terms of the official human development index (HDI), suggesting that health facilities with better infrastructure generate data that is less noisy. One of the models developed in this study has been incorporated into a mobile app that is available for public use.

Download Full-text