Second-Order SMO Improves SVM Online and Active Learning

2008 ◽  
Vol 20 (2) ◽  
pp. 374-382 ◽  
Author(s):  
Tobias Glasmachers ◽  
Christian Igel

Iterative learning algorithms that approximate the solution of support vector machines (SVMs) have two potential advantages. First, they allow online and active learning. Second, for large data sets, computing the exact SVM solution may be too time-consuming, and an efficient approximation can be preferable. The powerful LASVM iteratively approaches the exact SVM solution using sequential minimal optimization (SMO). It allows efficient online and active learning. Here, this algorithm is considerably improved in speed and accuracy by replacing the working set selection in the SMO steps. A second-order working set selection strategy, which greedily aims at maximizing the progress in each single step, is incorporated.
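The greedy second-order selection mentioned above can be illustrated with a minimal sketch. It follows the second-order rule familiar from LIBSVM-style SMO solvers, which may differ in detail from the variant used inside LASVM. It assumes the dual problem is written as minimizing f(α) = ½ αᵀQα − eᵀα subject to yᵀα = 0 and 0 ≤ α ≤ C. The function name `select_working_set` and all of its parameters are hypothetical.

```python
def select_working_set(K, y, alpha, grad, C, tau=1e-12):
    """Pick a pair (i, j) for one SMO step, or None if no violating pair exists.

    Assumptions (not taken from the paper): K is the kernel matrix as a list
    of lists, y holds labels in {+1, -1}, grad is the gradient of the dual
    objective f(alpha) = 0.5 * a^T Q a - e^T a, and C is the box bound.
    """
    n = len(y)

    def in_up(t):
        # alpha[t] can still move in the direction +y[t]
        return (y[t] == 1 and alpha[t] < C) or (y[t] == -1 and alpha[t] > 0)

    def in_low(t):
        # alpha[t] can still move in the direction -y[t]
        return (y[t] == 1 and alpha[t] > 0) or (y[t] == -1 and alpha[t] < C)

    up = [t for t in range(n) if in_up(t)]
    if not up:
        return None
    # First index: largest first-order violation.
    i = max(up, key=lambda t: -y[t] * grad[t])
    g_i = -y[i] * grad[i]

    # Second index: largest estimated decrease b^2 / a of the objective,
    # where a is the curvature along the (i, t) search direction.
    best_j, best_gain = None, 0.0
    for t in range(n):
        if not in_low(t):
            continue
        b = g_i + y[t] * grad[t]
        if b <= 0:
            continue  # (i, t) is not a violating pair
        a = K[i][i] + K[t][t] - 2.0 * K[i][t]
        if a <= 0:
            a = tau  # guard against non-positive curvature
        gain = b * b / a
        if gain > best_gain:
            best_j, best_gain = t, gain
    if best_j is None:
        return None
    return i, best_j
```

When two candidates for the second index have the same gradient violation, the one with the smaller curvature along the search direction is chosen, because the estimated progress b²/a is larger for it.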

2021 ◽  
Vol 7 ◽  
pp. e799
Author(s):  
Zhenlong Sun ◽  
Jing Yang ◽  
Xiaoye Li ◽  
Jianpei Zhang

The support vector machine (SVM) is a robust machine learning method that is widely used in classification. However, traditional SVM training methods may leak private information when the training data contain sensitive information. In SVM training, working set selection is a vital step of sequential minimal optimization-type decomposition methods. To avoid complex sensitivity analysis and the influence of high-dimensional data on the noise added by existing privacy-preserving SVM classifiers, we propose a new differentially private working set selection algorithm (DPWSS), which uses the exponential mechanism to select working sets privately. We prove theoretically that the proposed algorithm satisfies differential privacy. The extended experiments show that DPWSS achieves almost the same classification capability as the original non-private SVM under different parameters. The error in the optimized objective value between the two algorithms is less than two in nearly all cases, and DPWSS needs fewer iterations than the original non-private SVM on different datasets. To the best of our knowledge, DPWSS is the first private working set selection algorithm based on differential privacy.
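As an illustration of the exponential mechanism referred to above, the following minimal sketch samples one candidate with probability proportional to exp(ε · score / (2 · sensitivity)). It is not the DPWSS algorithm itself: the scoring of candidate working sets and the sensitivity value are placeholders, and the function names are hypothetical.

```python
import math
import random


def selection_probabilities(scores, epsilon, sensitivity):
    """Exponential-mechanism probabilities for a list of candidate scores."""
    top = max(scores)  # subtracting the maximum avoids overflow in exp()
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(candidates, scores, epsilon, sensitivity, rng=random):
    """Sample one candidate; higher scores are exponentially more likely."""
    probs = selection_probabilities(scores, epsilon, sensitivity)
    r = rng.random()
    acc = 0.0
    for cand, p in zip(candidates, probs):
        acc += p
        if r < acc:
            return cand
    return candidates[-1]  # fallback for floating-point round-off


# Hypothetical usage: candidates are index pairs, scores are made-up values.
pairs = [(0, 1), (0, 2), (1, 2)]
chosen = exponential_mechanism(pairs, [0.9, 0.5, 0.1], epsilon=1.0,
                               sensitivity=1.0, rng=random.Random(0))
```

A smaller ε flattens the distribution toward uniform, which gives stronger privacy but less informed choices. A larger ε concentrates the probability on the best-scoring candidate.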


2008 ◽  
Vol 23 (4) ◽  
pp. 533-549 ◽  
Author(s):  
Yongqiao Wang ◽  
Xun Zhang ◽  
Souyang Wang ◽  
K.K. Lai

2021 ◽  
Author(s):  
Alberto Carlevaro

This paper addresses how Support Vector Data Description (SVDD) can be used to detect safety regions with zero statistical error. It provides a detailed methodology for applying SVDD in real-life applications, such as vehicle platooning, by addressing common machine learning problems such as parameter tuning and the handling of large data sets. It also presents intelligible analytics for knowledge extraction with rules, aimed at understanding the safety regions of system parameters. Results are obtained by feeding simulation data to train different rule extraction mechanisms.
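For readers unfamiliar with SVDD, its decision rule can be sketched as follows: a point is accepted when its squared distance to the sphere centre in kernel space does not exceed the squared radius. The sketch assumes the coefficients `alphas` have already been obtained from an SVDD solver and sum to one. The function names and the toy data are hypothetical and not taken from the paper.

```python
def linear_kernel(u, v):
    return sum(a * b for a, b in zip(u, v))


def svdd_sq_distance(z, support_vectors, alphas, kernel=linear_kernel):
    """Squared kernel-space distance from z to the SVDD centre sum_i alpha_i * phi(x_i)."""
    cross = sum(a * kernel(x, z) for a, x in zip(alphas, support_vectors))
    centre_norm = sum(
        ai * aj * kernel(xi, xj)
        for ai, xi in zip(alphas, support_vectors)
        for aj, xj in zip(alphas, support_vectors)
    )
    return kernel(z, z) - 2.0 * cross + centre_norm


def svdd_is_inside(z, support_vectors, alphas, sq_radius, kernel=linear_kernel):
    """Accept z when it lies inside or on the description boundary."""
    return svdd_sq_distance(z, support_vectors, alphas, kernel) <= sq_radius


# Toy example: two boundary points give a centre at (1, 0).
svs = [(0.0, 0.0), (2.0, 0.0)]
alphas = [0.5, 0.5]
sq_radius = svdd_sq_distance(svs[0], svs, alphas)  # distance of a boundary point
```

Swapping `linear_kernel` for a Gaussian kernel gives the flexible, non-spherical regions in input space that SVDD is usually applied with.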


2019 ◽  
Vol 97 (Supplement_3) ◽  
pp. 52-53
Author(s):  
Ignacy Misztal

Early application of genomic selection relied on SNP estimation with phenotypes or de-regressed proofs (DRP). Chips of 50k SNP seemed sufficient. Estimated breeding value was an index with parent average and deduction to eliminate double counting. Use of SNP selection or weighting increased accuracy with small data sets but less or none with large data sets. Use of DRP with female information required ad hoc modifications. As BLUP is biased by genomic selection, use of DRP under genomic selection required adjustments. Efforts to include potentially causative SNP derived from sequence analysis showed limited or no gain. Genomic selection was greatly simplified using single-step GBLUP (ssGBLUP) because the procedure automatically creates the index, can use any combination of male and female genotypes, and accounts for preselection. ssGBLUP requires careful scaling for compatibility between pedigree and genomic relationships to avoid biases, especially under strong selection. Large data computations in ssGBLUP were solved by exploiting limited dimensionality of SNP due to limited effective population size. With such dimensionality ranging from 4k in chicken to about 15k in Holsteins, the inverse of GRM can be created directly (e.g., by the APY algorithm) in linear cost. Due to its simplicity and accuracy, ssGBLUP is routinely used for genomic selection by major companies in chicken, pigs and beef. ssGBLUP can be used to derive SNP effects for indirect prediction, and for GWAS, including computations of the P-values. An alternative single-step method called ssBR exists that uses SNP effects instead of GRM. As BLUP is affected by pre-selection, there is a need for new validation procedures unaffected by selection, and for parameter estimation that accounts for all the genomic data used in selection. Another issue is the reduction of variances due to the Bulmer effect.
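The way ssGBLUP combines pedigree and genomic relationships can be sketched with the standard expression for the inverse of the joint relationship matrix: H⁻¹ equals A⁻¹ plus (G⁻¹ − A₂₂⁻¹) added to the block of genotyped animals. The sketch below assumes all three inverses are already computed and compatibly scaled, and it does not show the scaling or the APY algorithm mentioned in the abstract. The function name and toy matrices are hypothetical.

```python
def assemble_h_inverse(A_inv, G_inv, A22_inv, genotyped):
    """Return H^-1 = A^-1 with (G^-1 - A22^-1) added to the genotyped block.

    A_inv:     inverse pedigree relationship matrix for all animals
    G_inv:     inverse genomic relationship matrix for genotyped animals
    A22_inv:   inverse of the pedigree relationships among genotyped animals
    genotyped: row indices of the genotyped animals in A_inv, in the same
               order as the rows of G_inv and A22_inv
    """
    H_inv = [row[:] for row in A_inv]  # copy so the input is left unchanged
    for r, gi in enumerate(genotyped):
        for c, gj in enumerate(genotyped):
            H_inv[gi][gj] += G_inv[r][c] - A22_inv[r][c]
    return H_inv


# Toy example with three animals, of which animals 1 and 2 are genotyped.
A_inv = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
G_inv = [[2.0, 0.0], [0.0, 2.0]]
A22_inv = [[1.0, 0.0], [0.0, 1.0]]
H_inv = assemble_h_inverse(A_inv, G_inv, A22_inv, genotyped=[1, 2])
```

Only the genotyped block changes, so non-genotyped animals keep their pedigree-based entries.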

