Parallel Implementation on FPGA of Support Vector Machines Using Stochastic Gradient Descent

Electronics ◽  
2019 ◽  
Vol 8 (6) ◽  
pp. 631 ◽  
Author(s):  
Felipe F. Lopes ◽  
João Canas Ferreira ◽  
Marcelo A. C. Fernandes

Sequential Minimal Optimization (SMO) is the traditional training algorithm for Support Vector Machines (SVMs). However, SMO does not scale well with the size of the training set. For that reason, Stochastic Gradient Descent (SGD) algorithms, which have better scalability, are a better option for massive data mining applications. Even with SGD, however, training times can become extremely long depending on the data set, which motivates the use of accelerators such as Field-Programmable Gate Arrays (FPGAs). This work describes a fully parallel FPGA hardware implementation of an SVM trained with Stochastic Gradient Descent. The proposed implementation achieves speedups of more than 10,000× relative to software implementations running on a quad-core processor and up to 319× compared to state-of-the-art FPGA implementations, while requiring fewer hardware resources. The results show that the proposed architecture is a viable solution for highly demanding problems such as those present in big data analysis.
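The abstract does not reproduce the training rule the hardware parallelizes, but SGD training of a linear SVM is compact enough to sketch. The following is a minimal NumPy sketch of Pegasos-style hinge-loss SGD; the function name, hyperparameters, and bias handling are illustrative assumptions, not details taken from the paper's FPGA design:

```python
import numpy as np

def train_svm_sgd(X, y, lam=0.01, epochs=20, lr=0.1, seed=0):
    """Train a linear SVM by SGD on the regularized hinge loss.
    Labels y must be in {-1, +1}. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):
            margin = y[i] * (X[i] @ w + b)
            # Subgradient of (lam/2)*||w||^2 + max(0, 1 - margin)
            if margin < 1:
                w -= lr * (lam * w - y[i] * X[i])
                b += lr * y[i]
            else:
                w -= lr * lam * w
    return w, b
```

Each inner-loop update touches only one sample, which is what makes the algorithm amenable to the kind of fully parallel, streaming hardware datapath the paper describes.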

2020 ◽  
Vol 15 (3) ◽  
pp. 393-409
Author(s):  
Raluca Dana Caplescu ◽  
Ana-Maria Panaite ◽  
Daniel Traian Pele ◽  
Vasile Alecsandru Strat

Abstract: The recent increase in peer-to-peer lending has prompted the development of models that separate good and bad clients, to mitigate risks both for lenders and for the platforms. The rapidly growing body of literature provides several comparisons between various models; among the most frequently employed are logistic regression, Support Vector Machines, neural networks, and decision-tree-based models. Logistic regression has proved a strong candidate, both because of its good performance and because of its high explainability. The present paper compares four pairs of models (for imbalanced and under-sampled data) meant to predict charged-off clients by optimizing the F1 score. We found that, if the data is balanced, Logistic Regression, both plain and trained with Stochastic Gradient Descent, outperforms LightGBM and K-Nearest Neighbors in optimizing the F1 score. We chose this metric because it balances the interests of the lenders and those of the platform. Loan term, debt-to-income ratio, and number of accounts were found to be important, positively related predictors of the risk of charge-off. At the other end of the spectrum, by far the strongest impact on charge-off probability is that of the FICO score. The final number of features retained by the two models differs considerably because, although both models use Lasso for feature selection, the Stochastic Gradient Descent Logistic Regression uses stronger regularization. The analysis was performed in Python (numpy, pandas, sklearn and imblearn).
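The two ingredients the study optimizes around, the F1 score and under-sampling of the majority class, are simple to state in code. Below is a minimal NumPy sketch (the paper itself used sklearn and imblearn; these hand-rolled helpers are stand-ins for `sklearn.metrics.f1_score` and imblearn's random under-sampler, with hypothetical names):

```python
import numpy as np

def f1_score(y_true, y_pred):
    """F1 for the positive (charged-off) class:
    harmonic mean of precision and recall."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def undersample(X, y, seed=0):
    """Randomly drop majority-class (y == 0) rows down to a 1:1 ratio,
    mirroring the idea behind imblearn's RandomUnderSampler."""
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    keep = rng.choice(neg, size=len(pos), replace=False)
    idx = np.concatenate([pos, keep])
    rng.shuffle(idx)
    return X[idx], y[idx]
```

Because F1 ignores true negatives, it is insensitive to the large pool of fully repaid loans, which is why it suits a rare-event problem like charge-off prediction.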


2020 ◽  
Vol 4 (2) ◽  
pp. 329-335
Author(s):  
Rusydi Umar ◽  
Imam Riadi ◽  
Purwono

The failure of most startups in Indonesia is caused by teams that are not solid and competent. Programmers are an integral profession in a startup team. The growth of social media can be turned into a strategic tool for recruiting the best programmer candidates for a company: an automatic system that classifies the social media posts of prospective programmers. The classification results are expected to predict the performance pattern of each candidate as good or bad. The classification method with the best accuracy must be chosen to obtain an effective strategic tool, so a comparison of several methods is needed. This study compares classification methods including the Support Vector Machine (SVM), Random Forest (RF), and Stochastic Gradient Descent (SGD) algorithms. With k = 10 cross-validation, the SVM algorithm reaches 81.3% accuracy, RF 74.4%, and SGD 80.1%, so the SVM method is chosen as the model for classifying programmer performance from social media activity.
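The comparison protocol above, k = 10 cross-validation over competing classifiers, can be sketched in a few lines. This is a generic NumPy illustration with a toy nearest-centroid classifier standing in for SVM/RF/SGD (the study itself would use the sklearn estimators); the `model_fit_predict` interface is a hypothetical simplification:

```python
import numpy as np

def kfold_accuracy(model_fit_predict, X, y, k=10, seed=0):
    """Mean accuracy over k shuffled folds.
    model_fit_predict(Xtr, ytr, Xte) -> predicted labels for Xte."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        pred = model_fit_predict(X[train], y[train], X[test])
        accs.append(np.mean(pred == y[test]))
    return float(np.mean(accs))

def nearest_centroid(Xtr, ytr, Xte):
    """Toy stand-in classifier: assign each test point to the
    closer of the two class centroids."""
    c0 = Xtr[ytr == 0].mean(axis=0)
    c1 = Xtr[ytr == 1].mean(axis=0)
    d0 = np.linalg.norm(Xte - c0, axis=1)
    d1 = np.linalg.norm(Xte - c1, axis=1)
    return (d1 < d0).astype(int)
```

Running `kfold_accuracy` once per candidate model on the same folds gives directly comparable accuracy figures like the 81.3% / 74.4% / 80.1% reported above.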


Author(s):  
Mohammad Reza Daliri

Abstract: In this article, we propose a feature selection strategy using a binary particle swarm optimization algorithm for the diagnosis of different medical diseases. Support vector machines were used as the fitness function of the binary particle swarm optimization. We evaluated the proposed method on four databases from the machine learning repository: the single proton emission computed tomography heart database, the Wisconsin breast cancer data set, the Pima Indians diabetes database, and the Dermatology data set. The results indicate that, with fewer selected features, we obtained higher accuracy in diagnosing heart, cancer, diabetes, and erythemato-squamous diseases. The results were compared with traditional feature selection methods, namely the F-score and information gain, and our method obtained superior accuracy. Compared to a genetic algorithm for feature selection, the proposed method achieves higher accuracy on all of the data sets except one. In addition, in comparison with other methods that used the same data, our approach achieves higher performance while using fewer features.
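Binary PSO encodes each candidate feature subset as a 0/1 mask and moves a swarm of such masks toward the best one found so far, with velocities passed through a sigmoid to give bit-flip probabilities. A minimal NumPy sketch follows; the inertia/acceleration constants and the `fitness(mask)` interface (in the paper, SVM classification accuracy on the masked features) are illustrative assumptions, not the article's exact settings:

```python
import numpy as np

def bpso_select(fitness, n_feats, n_particles=20, iters=30, seed=0):
    """Binary PSO over feature masks. fitness(mask) -> score to maximize."""
    rng = np.random.default_rng(seed)
    pos = rng.integers(0, 2, (n_particles, n_feats))     # 0/1 feature masks
    vel = rng.normal(0, 0.1, (n_particles, n_feats))
    pbest = pos.copy()                                   # per-particle bests
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()             # swarm-wide best
    gbest_fit = pbest_fit.max()
    w, c1, c2 = 0.7, 1.5, 1.5                            # inertia, accel. terms
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        prob = 1.0 / (1.0 + np.exp(-vel))                # sigmoid transfer
        pos = (rng.random(pos.shape) < prob).astype(int)
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        if fit.max() > gbest_fit:
            gbest_fit = fit.max()
            gbest = pos[fit.argmax()].copy()
    return gbest, gbest_fit
```

Plugging in a cross-validated SVM accuracy as `fitness` reproduces the article's wrapper-style selection, where the classifier itself scores each candidate subset.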


Kybernetes ◽  
2014 ◽  
Vol 43 (8) ◽  
pp. 1150-1164 ◽  
Author(s):  
Bilal M’hamed Abidine ◽  
Belkacem Fergani ◽  
Mourad Oussalah ◽  
Lamya Fergani

Purpose – The task of identifying activity classes from sensor information in a smart home is very challenging because of the imbalanced nature of such data sets, where some activities occur more frequently than others. Probabilistic models such as Hidden Markov Models (HMM) and Conditional Random Fields (CRF) are commonly employed for this purpose. The paper aims to discuss these issues. Design/methodology/approach – In this work, the authors propose a robust strategy combining the Synthetic Minority Over-sampling Technique (SMOTE) with Cost-Sensitive Support Vector Machines (CS-SVM) and an adaptive tuning of the cost parameter in order to handle the imbalanced-data problem. Findings – The results demonstrate the usefulness of the approach through comparison with state-of-the-art approaches, including HMM, CRF, traditional C-Support Vector Machines (C-SVM), and Cost-Sensitive SVM (CS-SVM), for classifying activities using binary and ubiquitous sensors. Originality/value – Performance metrics in the experiment/simulation include Accuracy, Precision/Recall, and F-measure.
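The SMOTE half of the strategy is easy to sketch: new minority samples are synthesized by interpolating between a minority point and one of its k nearest minority-class neighbours. The NumPy sketch below captures that idea only; the function name and defaults are assumptions, and a real pipeline would use imblearn's SMOTE followed by a class-weighted SVM for the cost-sensitive part:

```python
import numpy as np

def smote(X_min, n_new, k=5, seed=0):
    """Synthesize n_new minority samples by interpolating between each
    chosen minority point and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    n = len(X_min)
    # Pairwise distances among minority samples only
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]        # k nearest neighbours per point
    synth = np.empty((n_new, X_min.shape[1]))
    for j in range(n_new):
        i = rng.integers(n)
        nb = X_min[nn[i, rng.integers(min(k, n - 1))]]
        gap = rng.random()                   # random point on the segment
        synth[j] = X_min[i] + gap * (nb - X_min[i])
    return synth
```

Because each synthetic point lies on a segment between two real minority samples, SMOTE densifies the minority region rather than duplicating points, which combines well with a cost-sensitive classifier that further penalizes minority-class errors.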

