Second-Order SMO Improves SVM Online and Active Learning

Tobias Glasmachers; Christian Igel

doi:10.1162/neco.2007.10-06-354

Second-Order SMO Improves SVM Online and Active Learning

Neural Computation ◽

10.1162/neco.2007.10-06-354 ◽

2008 ◽

Vol 20 (2) ◽

pp. 374-382 ◽

Cited By ~ 16

Author(s):

Tobias Glasmachers ◽

Christian Igel

Keyword(s):

Active Learning ◽

Large Data ◽

Single Step ◽

Second Order ◽

Selection Strategy ◽

Support Vector ◽

Data Sets ◽

Working Set Selection ◽

Working Set ◽

Efficient Approximation

Iterative learning algorithms that approximate the solution of support vector machines (SVMs) have two potential advantages. First, they allow online and active learning. Second, for large data sets, computing the exact SVM solution may be too time-consuming, and an efficient approximation can be preferable. The powerful LASVM iteratively approaches the exact SVM solution using sequential minimal optimization (SMO). It allows efficient online and active learning. Here, this algorithm is considerably improved in speed and accuracy by replacing the working set selection in the SMO steps. A second-order working set selection strategy, which greedily aims at maximizing the progress in each single step, is incorporated.
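As a rough illustration of what a second-order working set selection can look like, the sketch below follows the formulation commonly used in SMO solvers for the dual problem min ½ αᵀQα − eᵀα with 0 ≤ α ≤ C. This is an assumption, not the LASVM-specific variant described in the abstract. The function name `select_working_set`, the argument layout, and the guard value `tau` are hypothetical. The first index is the most violating one; the second is chosen to maximize the estimated gain b²/a of a single SMO step instead of the gradient difference alone.

```python
def select_working_set(grad, y, alpha, K, C, tau=1e-12):
    """Pick a pair (i, j) for one SMO step, or None if no violating pair exists.

    grad  : gradient of the dual objective, grad[t] = (Q alpha)[t] - 1
    y     : labels in {+1, -1}
    alpha : current dual variables, 0 <= alpha[t] <= C
    K     : kernel matrix as a list of lists
    """
    n = len(y)

    # Step 1: i is the first-order choice, the largest -y_t * grad_t over I_up.
    i, g_max = -1, float("-inf")
    for t in range(n):
        in_up = (y[t] == 1 and alpha[t] < C) or (y[t] == -1 and alpha[t] > 0)
        if in_up and -y[t] * grad[t] > g_max:
            i, g_max = t, -y[t] * grad[t]
    if i < 0:
        return None

    # Step 2: j maximizes the second-order gain estimate b^2 / a over I_low.
    j, best_gain = -1, 0.0
    for t in range(n):
        in_low = (y[t] == 1 and alpha[t] > 0) or (y[t] == -1 and alpha[t] < C)
        if not in_low:
            continue
        b = g_max + y[t] * grad[t]          # violation of the pair (i, t)
        if b <= 0:
            continue                         # not a violating pair
        a = K[i][i] + K[t][t] - 2.0 * K[i][t]  # curvature along the pair direction
        if a <= 0:
            a = tau                          # guard for non-positive curvature
        gain = b * b / a
        if gain > best_gain:
            j, best_gain = t, gain

    return None if j < 0 else (i, j)


# Toy data: index 1 has the larger gradient violation, index 2 the larger gain.
K = [[1.0, 0.0, 0.9],
     [0.0, 1.0, 0.0],
     [0.9, 0.0, 1.0]]
pair = select_working_set([-1.0, -1.0, -0.8], [1, -1, -1], [0.0, 0.0, 0.0], K, C=1.0)
```

In the toy data, a purely first-order rule would pair index 0 with index 1 (violation 2.0 versus 1.8), while the gain estimate favors index 2 because the curvature along that direction is much smaller.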

Support vector machine classification for large data sets via minimum enclosing ball clustering

Neurocomputing ◽

10.1016/j.neucom.2007.07.028 ◽

2008 ◽

Vol 71 (4-6) ◽

pp. 611-619 ◽

Cited By ~ 59

Author(s):

Jair Cervantes ◽

Xiaoou Li ◽

Wen Yu ◽

Kang Li

Keyword(s):

Support Vector Machine ◽

Large Data ◽

Large Data Sets ◽

Support Vector ◽

Data Sets ◽

Support Vector Machine Classification ◽

Minimum Enclosing Ball

Support Vector Machines on Large Data Sets: Simple Parallel Approaches

Studies in Classification, Data Analysis, and Knowledge Organization - Data Analysis, Machine Learning and Knowledge Discovery ◽

10.1007/978-3-319-01595-8_10 ◽

2013 ◽

pp. 87-95 ◽

Cited By ~ 5

Author(s):

Oliver Meyer ◽

Bernd Bischl ◽

Claus Weihs

Keyword(s):

Support Vector Machines ◽

Large Data ◽

Large Data Sets ◽

Support Vector ◽

Data Sets ◽

Vector Machines

DPWSS: differentially private working set selection for training support vector machines

PeerJ Computer Science ◽

10.7717/peerj-cs.799 ◽

2021 ◽

Vol 7 ◽

pp. e799

Author(s):

Zhenlong Sun ◽

Jing Yang ◽

Xiaoye Li ◽

Jianpei Zhang

Keyword(s):

Differential Privacy ◽

Training Data ◽

Support Vector ◽

Sensitive Information ◽

Training Methods ◽

Selection Algorithm ◽

Personal Privacy ◽

Training Support ◽

Working Set Selection ◽

Working Set

The support vector machine (SVM) is a robust machine learning method that is widely used in classification. However, traditional SVM training methods may reveal private personal information when the training data contain sensitive information. In SVM training, working set selection is a vital step of sequential minimal optimization-type decomposition methods. To avoid complex sensitivity analysis and the influence of high-dimensional data on the noise in existing privacy-preserving SVM classifiers, we propose a new differentially private working set selection algorithm (DPWSS), which uses the exponential mechanism to select working sets privately. We theoretically prove that the proposed algorithm satisfies differential privacy. Extended experiments show that the DPWSS algorithm achieves almost the same classification capability as the original non-private SVM under different parameters. The errors in the optimized objective value between the two algorithms are almost all less than two. Meanwhile, a comparison of iteration counts on different datasets shows that the DPWSS algorithm has a higher execution efficiency than the original non-private SVM. To the best of our knowledge, DPWSS is the first private working set selection algorithm based on differential privacy.
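The exponential mechanism mentioned in this abstract can be sketched generically: a candidate r is drawn with probability proportional to exp(ε·u(r) / (2Δu)), where u is a score and Δu its sensitivity. The code below is a minimal sketch under that standard definition only. The score values, the function names, and the parameters `epsilon` and `sensitivity` are illustrative assumptions and do not reproduce the scoring function or candidate set used by DPWSS.

```python
import math
import random


def selection_probabilities(scores, epsilon, sensitivity):
    """Probabilities assigned by the exponential mechanism to each candidate."""
    top = max(scores)
    # Subtracting the maximum score keeps exp() from overflowing and
    # does not change the normalized probabilities.
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(scores, epsilon, sensitivity, rng=random):
    """Draw one candidate index at random, favoring high scores."""
    probs = selection_probabilities(scores, epsilon, sensitivity)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]


# Hypothetical scores for three candidate working sets (for example, violation sizes).
scores = [0.1, 5.0, 0.3]
rng = random.Random(0)
probs = selection_probabilities(scores, epsilon=1.0, sensitivity=1.0)
near_greedy = exponential_mechanism(scores, epsilon=1000.0, sensitivity=1.0, rng=rng)
```

A large ε makes the draw behave almost like a greedy argmax, while a small ε pushes the distribution toward uniform, which is the privacy and utility trade-off the mechanism exposes.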

Support Vector Machine Classification Based on Fuzzy Clustering for Large Data Sets

Lecture Notes in Computer Science - MICAI 2006: Advances in Artificial Intelligence ◽

10.1007/11925231_54 ◽

2006 ◽

pp. 572-582 ◽

Cited By ~ 18

Author(s):

Jair Cervantes ◽

Xiaoou Li ◽

Wen Yu

Keyword(s):

Support Vector Machine ◽

Fuzzy Clustering ◽

Large Data ◽

Large Data Sets ◽

Support Vector ◽

Data Sets ◽

Support Vector Machine Classification

Nonlinear clustering-based support vector machine for large data sets

Optimization Methods and Software ◽

10.1080/10556780802102453 ◽

2008 ◽

Vol 23 (4) ◽

pp. 533-549 ◽

Cited By ~ 1

Author(s):

Yongqiao Wang ◽

Xun Zhang ◽

Souyang Wang ◽

K.K. Lai

Keyword(s):

Support Vector Machine ◽

Large Data ◽

Large Data Sets ◽

Support Vector ◽

Data Sets

Multi-Class Support Vector Machines for Large Data Sets via Minimum Enclosing Ball Clustering

2007 4th International Conference on Electrical and Electronics Engineering ◽

10.1109/iceee.2007.4344994 ◽

2007 ◽

Cited By ~ 2

Author(s):

Jair Cervantes ◽

Xiaoou Li ◽

Wen Yu ◽

Javier Bejarano

Keyword(s):

Support Vector Machines ◽

Large Data ◽

Large Data Sets ◽

Support Vector ◽

Data Sets ◽

Vector Machines ◽

Minimum Enclosing Ball

Reliable AI through SVDD and rule extraction

10.36227/techrxiv.14618088 ◽

2021 ◽

Author(s):

Alberto Carlevaro

Keyword(s):

Real Life ◽

Statistical Error ◽

Large Data ◽

Parameter Tuning ◽

Rule Extraction ◽

Learning Problems ◽

Support Vector ◽

Support Vector Data Description ◽

Data Sets ◽

Vehicle Platooning

The proposed paper addresses how Support Vector Data Description (SVDD) can be used to detect safety regions with zero statistical error. It provides a detailed methodology for applying SVDD in real-life applications, such as Vehicle Platooning, by addressing common machine learning problems such as parameter tuning and handling large data sets. Intelligible analytics for rule-based knowledge extraction are also presented, aimed at understanding the safety regions of system parameters. Results are shown by feeding simulated data to the training of different rule extraction mechanisms.

Fast Support Vector Machine Classification for Large Data Sets

International Journal of Computational Intelligence Systems ◽

10.1080/18756891.2013.868148 ◽

2013 ◽

Vol 7 (2) ◽

pp. 197-212 ◽

Cited By ~ 4

Author(s):

Xiaoou Li ◽

Wen Yu

Keyword(s):

Support Vector Machine ◽

Large Data ◽

Large Data Sets ◽

Support Vector ◽

Data Sets ◽

Support Vector Machine Classification

Two-stage incremental working set selection for fast support vector training on large datasets

2008 IEEE International Conference on Research, Innovation and Vision for the Future in Computing and Communication Technologies ◽

10.1109/rivf.2008.4586359 ◽

2008 ◽

Cited By ~ 2

Author(s):

DucDung Nguyen ◽

Kazunori Matsumoto ◽

Yasuhiro Takishima ◽

Kazuo Hashimoto ◽

Masahiro Terabe

Keyword(s):

Large Datasets ◽

Support Vector ◽

Two Stage ◽

Selection For ◽

Working Set Selection ◽

Working Set

49 Current status of genomic selection

Journal of Animal Science ◽

10.1093/jas/skz258.105 ◽

2019 ◽

Vol 97 (Supplement_3) ◽

pp. 52-53

Author(s):

Ignacy Misztal

Keyword(s):

Genomic Selection ◽

Ad Hoc ◽

Large Data ◽

Single Step ◽

Breeding Value ◽

Current Status ◽

Small Data ◽

Data Sets ◽

Effective Population ◽

Early Application

Early application of genomic selection relied on SNP estimation with phenotypes or de-regressed proofs (DRP). Chips of 50k SNP seemed sufficient. Estimated breeding value was an index with parent average and a deduction to eliminate double counting. Use of SNP selection or weighting increased accuracy with small data sets, but less so or not at all with large data sets. Use of DRP with female information required ad hoc modifications. As BLUP is biased by genomic selection, use of DRP under genomic selection required adjustments. Efforts to include potentially causative SNP derived from sequence analysis showed limited or no gain. Genomic selection was greatly simplified using single-step GBLUP (ssGBLUP) because the procedure automatically creates the index, can use any combination of male and female genotypes, and accounts for preselection. ssGBLUP requires careful scaling for compatibility between pedigree and genomic relationships to avoid biases, especially under strong selection. Large data computations in ssGBLUP were solved by exploiting the limited dimensionality of SNP due to limited effective population size. With such dimensionality ranging from 4k in chicken to about 15k in Holsteins, the inverse of GRM can be created directly (e.g., by the APY algorithm) at linear cost. Due to its simplicity and accuracy, ssGBLUP is routinely used for genomic selection by major companies in chicken, pigs and beef. ssGBLUP can be used to derive SNP effects for indirect prediction, and for GWAS, including computations of the P-values. An alternative single-step method called ssBR exists that uses SNP effects instead of GRM. As BLUP is affected by preselection, there is a need for new validation procedures unaffected by selection, and for parameter estimation that accounts for all the genomic data used in selection. Another issue is the reduction of variances due to the Bulmer effect.