A Selective Sampling Method for Imbalanced Data Learning on Support Vector Machines

Imbalanced data learning is one of the most active and important fields in machine learning research. Although existing class imbalance learning methods can make Support Vector Machines (SVMs) less sensitive to class imbalance, they still suffer from the disturbance of outliers and noise present in the datasets. A family of Fuzzy Smooth Support Vector Machines (FSSVMs) is proposed, based on the Smooth Support Vector Machine (SSVM) of O. L. Mangasarian. The SSVM can be solved easily by the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm or the Newton-Armijo algorithm. Two kinds of fuzzy memberships and three smooth functions can be chosen in the algorithms. The fuzzy memberships account for the contribution of each sample to the optimal separating hyperplane. The polynomial smooth functions make the optimization problem more accurate at the inflection point. These changes have a positive effect in the trials. The experimental results show that the FSSVMs achieve better accuracy in less time than the SSVMs and some of the other methods.
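
To illustrate the smoothing idea behind the SSVM, the sketch below compares the non-differentiable plus function max(x, 0) with the smooth approximation p(x, alpha) = x + (1/alpha) log(1 + exp(-alpha x)) used in Mangasarian's SSVM. It is a minimal illustration only: the function names and the default alpha are assumptions made here, and the three smooth functions and two fuzzy memberships of the FSSVMs are not specified in the abstract and are not reproduced.

```python
import math


def plus(x):
    """Plus function (x)_+ = max(x, 0); not differentiable at x = 0."""
    return max(x, 0.0)


def smooth_plus(x, alpha=5.0):
    """Smooth approximation p(x, alpha) = x + log(1 + exp(-alpha * x)) / alpha.

    alpha > 0 is the smoothing parameter (the default is an arbitrary choice
    for this sketch); larger alpha gives a tighter approximation of plus(x).
    """
    z = -alpha * x
    # Numerically stable log(1 + exp(z)): max(z, 0) + log1p(exp(-|z|)).
    log_term = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return x + log_term / alpha


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, plus(x), smooth_plus(x))
```

The gap between the two functions is largest at x = 0, where it equals log(2) / alpha, and it vanishes as |x| grows. This differentiability is what allows gradient-based solvers such as BFGS or Newton-Armijo to be applied to the unconstrained problem.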

Imbalanced data classification via support vector machines and genetic algorithms

Connection Science, Vol 26 (4), pp. 335-348, 2014

DOI: 10.1080/09540091.2014.924902

Cited By ~ 7

Author(s): Jair Cervantes, Xiaoou Li, Wen Yu

Keyword(s): Genetic Algorithms, Support Vector Machines, Imbalanced Data, Data Classification, Support Vector, Imbalanced Data Classification, Vector Machines

A new adaptive weighted imbalanced data classifier via improved support vector machines with high-dimension nature

Knowledge-Based Systems, Vol 185, pp. 104933, 2019

DOI: 10.1016/j.knosys.2019.104933

Cited By ~ 2

Author(s): Kai Qi, Hu Yang, Qingyu Hu, Dongjun Yang

Keyword(s): Support Vector Machines, High Dimension, Imbalanced Data, Support Vector, Vector Machines

The Application of Problems Concerning the Imbalanced Data Classification by Means of Support Vector Machines

2011 Fourth International Symposium on Knowledge Acquisition and Modeling, 2011

DOI: 10.1109/kam.2011.84

Author(s): Chen Qing

Keyword(s): Support Vector Machines, Imbalanced Data, Data Classification, Support Vector, Imbalanced Data Classification, Vector Machines

A new classification strategy for human activity recognition using cost sensitive support vector machines for imbalanced data

Kybernetes, Vol 43 (8), pp. 1150-1164, 2014

DOI: 10.1108/k-07-2014-0138

Cited By ~ 9

Author(s): Bilal M’hamed Abidine, Belkacem Fergani, Mourad Oussalah, Lamya Fergani

Keyword(s): Support Vector Machines, Probabilistic Models, Conditional Random Fields, Performance Metrics, Imbalanced Data, Sampling Technique, Support Vector, Data Set, Content Type, Vector Machines

Purpose – Identifying activity classes from sensor information in a smart home is very challenging because such data sets are imbalanced: some activities occur more frequently than others. Probabilistic models such as the Hidden Markov Model (HMM) and Conditional Random Fields (CRF) are commonly employed for this purpose. The paper aims to discuss these issues. Design/methodology/approach – The authors propose a robust strategy that combines the Synthetic Minority Over-sampling Technique (SMOTE) with Cost Sensitive Support Vector Machines (CS-SVM), with adaptive tuning of the cost parameter, to handle the imbalanced data problem. Findings – The results demonstrate the usefulness of the approach through comparison with state-of-the-art approaches, including HMM, CRF, the traditional C-Support Vector Machine (C-SVM) and the Cost-Sensitive SVM (CS-SVM), for classifying activities using binary and ubiquitous sensors. Originality/value – Performance metrics in the experiment/simulation include Accuracy, Precision/Recall and F-measure.
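
The two ingredients named in the abstract can be sketched roughly as follows: SMOTE creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, and a cost-sensitive SVM penalises errors on the rare class more heavily. The code is a minimal sketch, not the authors' implementation: the function names, the default k, and the inverse-frequency cost heuristic are assumptions made for illustration, and the paper's adaptive tuning of the cost parameter is not described in the abstract and is not reproduced.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points from a list of minority samples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def class_costs(labels):
    """Hypothetical per-class misclassification costs (inverse frequency).

    cost[c] = n_samples / (n_classes * count[c]), so rarer classes
    receive a larger cost. This heuristic is an assumption of the sketch.
    """
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n = len(labels)
    return {y: n / (len(counts) * c) for y, c in counts.items()}


if __name__ == "__main__":
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(smote(minority, n_new=3))
    print(class_costs([0] * 9 + [1]))
```

In practice the per-class costs would be passed to an SVM trainer as class weights, with the oversampled minority set added to the training data beforehand.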
