A HYBRID SVM BASED ON NEAREST NEIGHBOR RULE

Author(s):  
JIE JI ◽  
QIANGFU ZHAO

This paper proposes a hybrid learning method to speed up the classification procedure of Support Vector Machines (SVM). Unlike most algorithms, which try to reduce the number of support vectors in an SVM classifier, we focus on reducing the number of data points that require SVM-based classification, as well as the number of support vectors involved in each SVM decision. The system uses a Nearest Neighbor Classifier (NNC) to triage data points. In the training phase, the NNC selects data near parts of the decision boundary and then trains a sub-SVM for each Voronoi pair. For classification, most non-boundary data points are classified by the NNC directly, while the remaining boundary data points are passed to a corresponding local expert SVM. We also propose a data selection method for training reliable expert SVMs. Experimental results on several generated and public machine learning data sets show that the proposed method significantly accelerates the testing speed.
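A minimal sketch of the routing logic described above, assuming scikit-learn; the class name, the choice of prototypes, and the RBF kernel are illustrative placeholders, not the authors' implementation. A query's two nearest prototypes define its Voronoi pair: if both prototypes carry the same label, the NNC answers directly; otherwise a local expert SVM trained on that pair's region decides.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

class HybridNNSVM:
    """NNC front end with local expert SVMs on label-mixed Voronoi pairs."""

    def __init__(self, proto_X, proto_y):
        self.proto_y = proto_y
        self.nn = NearestNeighbors(n_neighbors=2).fit(proto_X)
        self.experts = {}                    # (i, j) Voronoi pair -> local SVM

    def fit(self, X, y):
        # Each point's two nearest prototypes define its Voronoi pair;
        # pairs whose prototypes disagree straddle the decision boundary.
        _, idx = self.nn.kneighbors(X)
        pairs = [tuple(sorted(row)) for row in idx]
        for pair in set(pairs):
            i, j = pair
            if self.proto_y[i] == self.proto_y[j]:
                continue                     # homogeneous region: NNC suffices
            mask = np.array([p == pair for p in pairs])
            if mask.sum() >= 2 and len(set(y[mask])) == 2:
                self.experts[pair] = SVC(kernel="rbf").fit(X[mask], y[mask])
        return self

    def predict(self, X):
        _, idx = self.nn.kneighbors(X)
        out = np.empty(len(X), dtype=self.proto_y.dtype)
        for k, row in enumerate(idx):
            expert = self.experts.get(tuple(sorted(row)))
            if expert is None:               # non-boundary: 1-NN answers directly
                out[k] = self.proto_y[row[0]]
            else:                            # boundary: local expert SVM decides
                out[k] = expert.predict(X[k : k + 1])[0]
        return out
```

In the simplest case the prototypes can be a condensed subset of the training data, so that most queries are settled by the cheap nearest-neighbor lookup and never touch an SVM.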

2021 ◽  
Vol 87 (6) ◽  
pp. 445-455
Author(s):  
Yi Ma ◽  
Zezhong Zheng ◽  
Yutang Ma ◽  
Mingcang Zhu ◽  
Ran Huang ◽  
...  

Many manifold learning algorithms conduct an eigenvector analysis on a data-similarity matrix of size N×N, where N is the number of data points. Thus, the memory complexity of the analysis is no less than O(N²). We present in this article an incremental manifold learning approach to handle large hyperspectral data sets for land use identification. In our method, the number of dimensions for the high-dimensional hyperspectral-image data set is obtained with the training data set. A local curvature variation algorithm is utilized to sample a subset of data points as landmarks. Then a manifold skeleton is identified based on the landmarks. Our method is validated on three AVIRIS hyperspectral data sets, outperforming the comparison algorithms with a k-nearest-neighbor classifier and achieving the second-best performance with a support vector machine.
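A hedged sketch of the landmark idea, assuming scikit-learn. The curvature proxy below (residual variance off a local PCA plane) stands in for the paper's local-curvature-variation sampler, and the nearest-landmark placement is a crude substitute for the manifold-skeleton extension; the point is the memory saving, since the eigen-analysis runs on an m×m problem for m ≪ N landmarks instead of the full N×N matrix.

```python
import numpy as np
from sklearn.manifold import Isomap
from sklearn.neighbors import NearestNeighbors

def curvature_proxy(X, k=10):
    # Energy off the local 2-D tangent plane: the manifold bends more
    # where the trailing singular values of a neighborhood are larger.
    nn = NearestNeighbors(n_neighbors=k).fit(X)
    _, idx = nn.kneighbors(X)
    scores = np.empty(len(X))
    for i, nbrs in enumerate(idx):
        local = X[nbrs] - X[nbrs].mean(axis=0)
        sv2 = np.linalg.svd(local, compute_uv=False) ** 2
        scores[i] = sv2[2:].sum() / sv2.sum()
    return scores

def landmark_embed(X, m=200, n_components=2):
    # Embed only the m landmarks (an m x m eigenproblem instead of N x N),
    # then place every remaining point at its nearest landmark's coordinates.
    landmarks = np.argsort(curvature_proxy(X))[-m:]
    Z_land = Isomap(n_components=n_components).fit_transform(X[landmarks])
    nn = NearestNeighbors(n_neighbors=1).fit(X[landmarks])
    _, nearest = nn.kneighbors(X)
    return Z_land[nearest[:, 0]]
```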


2015 ◽  
Vol 11 (1) ◽  
pp. 25 ◽  
Author(s):  
Padmavathi Janardhanan ◽  
Heena L. ◽  
Fathima Sabika

The idea of medical data mining is to extract hidden knowledge in the medical field using data mining techniques. One of its positive aspects is the discovery of important patterns. It is possible to identify patterns even when we do not fully understand the causal mechanisms behind them; in such cases, data mining provides capabilities for research and discovery that might not otherwise have been evident. This paper analyzes the effectiveness of SVM, one of the most popular classification techniques, in classifying medical datasets, comparing the performance of the Naïve Bayes classifier, the RBF network, and the SVM classifier. The performance of each predictive model in predicting diseases is recorded and compared across different medical datasets. The datasets are of binary class, and each has a different number of attributes; they include heart, cancer, and diabetes datasets. It is observed that the SVM classifier produces a better percentage of accuracy in classification. The work has been implemented in the WEKA environment, and the obtained results show that SVM is the most robust and effective classifier for medical data sets.
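A rough scikit-learn analogue of the comparison described (the paper itself works in WEKA, where the classifiers would be NaiveBayes, RBFNetwork, and SMO; scikit-learn has no direct RBF-network counterpart, so only two of the three are sketched, with load_breast_cancer standing in for the binary medical datasets):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Binary medical dataset: 569 tumor samples, 30 attributes.
X, y = load_breast_cancer(return_X_y=True)
models = {
    "Naive Bayes": GaussianNB(),
    "SVM (RBF kernel)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}
for name, model in models.items():
    # 10-fold cross-validated accuracy, as is typical in WEKA comparisons.
    acc = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    print(f"{name}: {acc.mean():.3f} +/- {acc.std():.3f}")
```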


Author(s):  
Diana Benavides-Prado ◽  
Yun Sing Koh ◽  
Patricia Riddle

In our research, we consider transfer learning scenarios where a target learner does not have access to the source data, but instead to hypotheses or models induced from it. This is called the Hypothesis Transfer Learning (HTL) problem. Previous approaches concentrated on transferring source hypotheses as a whole. We introduce a novel method for selectively transferring elements of previous hypotheses learned with Support Vector Machines. Representing an SVM hypothesis as a set of support vectors allows us to treat this information as privileged knowledge that aids learning in a new task. Given a possibly large number of source hypotheses, our approach selects the source support vectors that most closely resemble the target data and transfers their learned coefficients as constraints on the coefficients to be learned. This strategy increases the importance of relevant target data points based on their similarity to source support vectors, while still learning from the target data. Our method shows important improvements in the convergence rate on three classification datasets of varying sizes, decreasing the number of iterations by up to 56% on average compared to learning with no transfer and by up to 92% compared to regular HTL, while maintaining similar accuracy levels.
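A hedged sketch of the selection step, assuming scikit-learn; the function and parameter names are illustrative. The paper constrains the dual coefficients directly, whereas this simplification merely up-weights target points by their similarity to the selected source support vectors:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

def transfer_fit(source_svms, X_tgt, y_tgt, top_k=50, gamma=0.1):
    # Pool the support vectors from all fitted source hypotheses.
    sv_pool = np.vstack([svm.support_vectors_ for svm in source_svms])
    # Keep the source SVs that most closely resemble the target sample.
    sim_to_target = rbf_kernel(sv_pool, X_tgt, gamma=gamma).mean(axis=1)
    selected = sv_pool[np.argsort(sim_to_target)[-top_k:]]
    # Up-weight target points similar to the selected source SVs,
    # then learn on the target data as usual.
    weights = 1.0 + rbf_kernel(X_tgt, selected, gamma=gamma).mean(axis=1)
    return SVC(kernel="rbf", gamma=gamma).fit(X_tgt, y_tgt, sample_weight=weights)
```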


Author(s):  
Tiffany Elsten ◽  
Mark de Rooij

Nearest Neighbor classification is an intuitive distance-based classification method. It has, however, two drawbacks: (1) it is sensitive to the number of features, and (2) it does not give information about the importance of single features or pairs of features. In stacking, a set of base learners is combined into one overall ensemble classifier by means of a meta-learner. In this manuscript we combine univariate and bivariate nearest neighbor classifiers that are by themselves easily interpretable. Furthermore, we combine these classifiers using a Lasso method, which results in a sparse ensemble of nonlinear main and pairwise interaction effects. We christened the new method SUBiNN: Stacked Uni- and Bivariate Nearest Neighbors. SUBiNN overcomes the two drawbacks of simple nearest neighbor methods. In extensive simulations and on benchmark data sets, we evaluate the predictive performance of SUBiNN and compare it to other nearest neighbor ensemble methods as well as Random Forests and Support Vector Machines. Results indicate that SUBiNN often outperforms other nearest neighbor methods, that SUBiNN is well capable of identifying noise features, but that Random Forests is often, though not always, the best classifier.
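A minimal sketch of the stacking recipe, assuming scikit-learn and binary 0/1 labels: each univariate and bivariate feature subset gets its own k-NN base learner, out-of-fold probabilities form the meta-features, and a nonnegative Lasso selects a sparse ensemble. The CV scheme and hyperparameters are placeholders, not necessarily the authors' exact setup.

```python
from itertools import combinations
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier

def subinn_fit(X, y, k=5, alpha=0.01):
    # One base learner per single feature and per feature pair.
    subsets = [[j] for j in range(X.shape[1])]
    subsets += [list(c) for c in combinations(range(X.shape[1]), 2)]
    # Meta-features: out-of-fold P(y = 1) from each k-NN base learner.
    Z = np.column_stack([
        cross_val_predict(KNeighborsClassifier(n_neighbors=k),
                          X[:, s], y, cv=5, method="predict_proba")[:, 1]
        for s in subsets
    ])
    # Nonnegative Lasso meta-learner yields a sparse ensemble; base
    # learners built on noise features should receive zero weight.
    meta = Lasso(alpha=alpha, positive=True).fit(Z, y)
    kept = [s for s, w in zip(subsets, meta.coef_) if w > 0]
    return meta, kept      # 'kept' names the informative features and pairs
```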


2017 ◽  
Vol 28 (02) ◽  
pp. 1750015 ◽  
Author(s):  
M. Andrecut

The least-squares support vector machine (LS-SVM) is a frequently used kernel method for non-linear regression and classification tasks. Here we discuss several approximation algorithms for the LS-SVM classifier. The proposed methods are based on randomized block kernel matrices, and we show that they provide good accuracy and reliable scaling for multi-class classification problems with relatively large data sets. Also, we present several numerical experiments that illustrate the practical applicability of the proposed methods.
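A hedged sketch of the flavor of approximation involved, using a randomly subsampled (Nyström-style) kernel block as a stand-in for the paper's randomized block construction: the regularized least-squares system is solved over m ≪ N basis points instead of the full N×N kernel matrix. Labels are assumed to be ±1.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def lssvm_block_fit(X, y, m=200, gamma=0.1, lam=1e-3, seed=0):
    # Random basis block: an N x m slice of the kernel matrix.
    rng = np.random.default_rng(seed)
    basis = rng.choice(len(X), size=m, replace=False)
    K_nm = rbf_kernel(X, X[basis], gamma=gamma)          # N x m
    K_mm = rbf_kernel(X[basis], X[basis], gamma=gamma)   # m x m
    # Regularized least-squares system over the m basis points only.
    A = K_nm.T @ K_nm + lam * K_mm + 1e-8 * np.eye(m)
    alpha = np.linalg.solve(A, K_nm.T @ y)
    return X[basis], alpha

def lssvm_block_predict(X_new, basis_X, alpha, gamma=0.1):
    # Decision function evaluated against the m basis points.
    return np.sign(rbf_kernel(X_new, basis_X, gamma=gamma) @ alpha)
```

For multi-class problems of the kind the paper targets, one such system can be solved per class in a one-vs-rest scheme, reusing the same random basis.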


2020 ◽  
Author(s):  
Lewis Mervin ◽  
Avid M. Afzal ◽  
Ola Engkvist ◽  
Andreas Bender

In the context of bioactivity prediction, the question of how to calibrate a score produced by a machine learning method into a reliable probability of binding to a protein target has not yet been satisfactorily addressed. In this study, we compared the performance of three such methods, namely Platt Scaling, Isotonic Regression and Venn-ABERS, in calibrating prediction scores for ligand-target prediction models based on the Naïve Bayes, Support Vector Machine and Random Forest algorithms, using bioactivity data available at AstraZeneca (40 million data points (compound-target pairs) across 2112 targets). Performance was assessed using Stratified Shuffle Split (SSS) and Leave 20% of Scaffolds Out (L20SO) validation.
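Two of the three calibration methods can be sketched directly with scikit-learn's CalibratedClassifierCV ('sigmoid' is Platt Scaling); Venn-ABERS has no built-in scikit-learn implementation and is omitted here. The synthetic data stands in for the proprietary AstraZeneca bioactivity set.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

# Placeholder binary task: compound-target pairs labeled active/inactive.
X, y = make_classification(n_samples=5000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for method in ("sigmoid", "isotonic"):
    clf = CalibratedClassifierCV(RandomForestClassifier(random_state=0),
                                 method=method, cv=3).fit(X_tr, y_tr)
    p = clf.predict_proba(X_te)[:, 1]
    # Lower Brier score means better-calibrated probabilities.
    print(method, "Brier:", round(brier_score_loss(y_te, p), 4))
```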


2012 ◽  
Vol 24 (4) ◽  
pp. 1047-1084 ◽  
Author(s):  
Xiao-Tong Yuan ◽  
Shuicheng Yan

We investigate Newton-type optimization methods for solving piecewise linear systems (PLSs) with a nondegenerate coefficient matrix. Such systems arise, for example, from the numerical solution of the linear complementarity problem, which is useful for modeling several learning and optimization problems. In this letter, we propose an effective damped Newton method, PLS-DN, to find the exact (up to machine precision) solution of nondegenerate PLSs. PLS-DN exhibits a provable semi-iterative property; that is, the algorithm converges globally to the exact solution in a finite number of iterations. The rate of convergence is shown to be at least linear before termination. We emphasize the applications of our method in modeling, from the novel perspective of PLSs, some statistical learning problems such as box-constrained least squares, elitist Lasso (Kowalski & Torrésani, 2008), and support vector machines (Cortes & Vapnik, 1995). Numerical results on synthetic and benchmark data sets are presented to demonstrate the effectiveness and efficiency of PLS-DN on these problems.
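A hedged sketch of a damped semismooth Newton iteration for the LCP-style reformulation F(x) = min(x, Ax + b) = 0; it illustrates the family of piecewise-linear solvers discussed, not the exact PLS-DN algorithm. Nondegeneracy of the generalized Jacobian is assumed.

```python
import numpy as np

def damped_newton_pls(A, b, tol=1e-12, max_iter=100, beta=0.5):
    def F(x):
        # Residual of the piecewise linear system.
        return np.minimum(x, A @ x + b)

    n = len(b)
    x = np.zeros(n)
    for _ in range(max_iter):
        r = F(x)
        if np.linalg.norm(r) < tol:
            break
        # Generalized Jacobian: row i of I where x_i <= (A x + b)_i,
        # otherwise row i of A.
        active = x <= A @ x + b
        J = np.where(active[:, None], np.eye(n), A)
        d = np.linalg.solve(J, -r)
        # Damping: backtrack until the residual norm decreases.
        t = 1.0
        while np.linalg.norm(F(x + t * d)) >= np.linalg.norm(r) and t > 1e-10:
            t *= beta
        x = x + t * d
    return x
```

Because F is piecewise linear, once the iterate lands in the correct linear piece a full Newton step solves the system exactly, which is the intuition behind finite termination.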

