Imbalanced Support Vector Machine Classification Based on Hyper-Sphere

2013 ◽  
Vol 339 ◽  
pp. 384-388
Author(s):  
Cun He Li ◽  
Rui Xue Chen ◽  
Yi Zhao Ouyang

In classification, when the training data are unevenly distributed between classes, the learning algorithm is generally dominated by the features of the majority classes, and features of the minority classes are difficult to recognize fully. The hyper-sphere support vector machine is an important method for imbalanced classification, but this algorithm has a defect. To improve classification performance on imbalanced datasets, we propose a new method based on the Generalized Hyper-sphere Support Vector Machine that enhances classification accuracy for the minority classes. A support vector machine (SVM) is then used as the base classifier to train on the reprocessed dataset. Our experimental results demonstrate that the proposed selection technique improves the classification rate of rare events, and it also improves the overall accuracy of SVM without data pre-processing.

2020 ◽  
Vol 8 (5) ◽  
pp. 1557-1560

Support vector machine (SVM) is a well-known, efficient supervised learning algorithm for classification problems. However, the classification accuracy of an SVM classifier depends on its training parameters as well as on the training data set. The main objective of this paper is to optimize the SVM's parameters and feature weighting simultaneously in order to improve its performance. An Imperialist Competitive Algorithm based Support Vector Machine (ICA-SVM) classifier is proposed for efficient weed detection. The enhanced ICA-SVM classifier selects appropriate input features and optimizes the SVM parameters, thereby improving classification accuracy. Experimental results show that the ICA-SVM classification algorithm reduces computational complexity tremendously and improves classification accuracy.


2018 ◽  
Vol 21 (62) ◽  
pp. 1
Author(s):  
Jorge E. Camargo ◽  
Vladimir Vargas-Calderon ◽  
Nelson Vargas ◽  
Liliana Calderón-Benavides

With the purpose of classifying text by its sentiment polarity (positive or negative), we proposed an extension of a corpus of 68,000 tweets through the inclusion of word definitions from a dictionary of the Real Academia Española de la Lengua (RAE). A set of 28,000 combinations of 6 Word2Vec and support vector machine parameters was considered in order to evaluate how positively the inclusion of the RAE's dictionary definitions would affect classification performance. We found that such a corpus extension significantly improves classification accuracy. Therefore, we conclude that the inclusion of the RAE's dictionary increases the semantic relations learned by Word2Vec, allowing better classification accuracy.


2012 ◽  
Vol 461 ◽  
pp. 818-821
Author(s):  
Shi Hu Zhang

Real estate prices are a current focus of public concern. The support vector machine (SVM) is a new machine learning algorithm that is now used in many areas because of its excellent learning performance and its unique advantages in small-sample recognition. Determining real estate prices is a complicated problem because of its non-linearity and the small quantity of training data. In this study, an SVM is proposed to forecast real estate prices in China. The experimental results indicate that the SVM method achieves greater accuracy than the grey model and artificial neural networks when training data are scarce. It was also found that the predictive ability of the SVM exceeded that of some traditional pattern recognition methods for the data set used here.


Author(s):  
Boyang Li ◽  
Jinglu Hu ◽  
Kotaro Hirasawa

We propose an improved support vector machine (SVM) classifier that introduces a new offset to solve the real-world unbalanced classification problem. The new offset is calculated from the unbalanced support vectors that result from the unbalanced training data. We developed a weighted harmonic mean (WHM) algorithm to further reduce the effects of noise on the offset calculation. We apply the proposed approach to classify real-world data. Simulation results demonstrate the effectiveness of the proposed approach.


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Xin Wang ◽  
Yue Yang ◽  
Mingsong Chen ◽  
Qin Wang ◽  
Qin Qin ◽  
...  

Aiming at the low classification accuracy on imbalanced datasets, an oversampling algorithm, AGNES-SMOTE (Agglomerative Nesting-Synthetic Minority Oversampling Technique), based on hierarchical clustering and improved SMOTE, is proposed. Its key procedures are: hierarchically clustering majority samples and minority samples, respectively; dividing minority subclusters on the basis of the obtained majority subclusters; selecting a “seed sample” based on the sampling weight and probability distribution of the minority subcluster; and restricting the generation of new samples to a certain area by the centroid method during sampling. The combination of AGNES-SMOTE and SVM (Support Vector Machine) is presented to deal with the classification of imbalanced datasets. Experiments on UCI datasets are conducted to compare the performance of different algorithms mentioned in the literature. Experimental results indicate that AGNES-SMOTE excels in synthesizing new samples and improves SVM classification performance on imbalanced datasets.
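
The interpolation step that SMOTE-style oversampling relies on can be sketched as follows. This is plain SMOTE interpolation between a seed sample and its nearest minority neighbor, not the AGNES-SMOTE procedure itself: the clustering, sampling weights, and centroid restriction described in the abstract are omitted, and the function names and toy data are assumptions made for illustration.

```python
import random

def nearest_neighbor(seed, candidates):
    """Pick the candidate closest to seed by squared Euclidean distance."""
    others = [c for c in candidates if c is not seed]
    return min(others, key=lambda c: sum((a - b) ** 2 for a, b in zip(seed, c)))

def smote_sample(seed, neighbor, rng):
    """Return a synthetic point on the segment between seed and neighbor."""
    gap = rng.random()  # interpolation factor in [0, 1)
    return [s + gap * (n - s) for s, n in zip(seed, neighbor)]

# Toy minority class (hypothetical data, two features per sample).
minority = [[1.0, 1.0], [1.2, 0.9], [3.0, 3.1]]
rng = random.Random(0)

seed = minority[0]
neighbor = nearest_neighbor(seed, minority)
synthetic = smote_sample(seed, neighbor, rng)
print(neighbor, synthetic)
```

Because the synthetic point always lies between two existing minority samples, restricting which pairs are eligible (as the abstract's subcluster and centroid steps do) directly controls where new samples can appear.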


Author(s):  
B. Abbasi ◽  
H. Arefi ◽  
B. Bigdeli ◽  
S. Roessner

An image classification method based on Support Vector Machine (SVM) is proposed for hyperspectral and 3K DSM data. To obtain training data, we applied an automatic method covering four classes, namely building, grass, tree, and ground pixels. First, initial segments for building, tree, grass, and ground pixels are produced using different feature descriptors. The feature descriptors are generated from optical (hyperspectral) as well as range (3K DSM) images. The initial building regions are created using DSM segmentation. Fusion of NDVI and elevation information provides the initial segments for the grass and tree areas. The initial segment for ground pixels is created after geodesic-based filtering of the DSM and elimination of the non-ground pixels. To improve classification accuracy, the hyperspectral image and the 3K DSM were used simultaneously to perform image classification. To obtain test data, the labelled pixels were divided into two parts: test and training. The experimental result shows a final classification accuracy of about 90% using Support Vector Machine. The range data were provided by the 3K camera, and both datasets correspond to the Munich area in Germany.
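
The fusion of NDVI and elevation for the initial grass and tree segments could look roughly like the sketch below. NDVI is the standard (NIR - Red) / (NIR + Red) index; everything else is an assumption for illustration: the threshold values, the use of height above ground in metres, and the function names are hypothetical and are not taken from the paper.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom

def initial_label(nir, red, height_m, ndvi_min=0.3, tree_height_m=2.0):
    """Assign an initial vegetation label, or None for non-vegetation.

    ndvi_min and tree_height_m are hypothetical thresholds.
    """
    if ndvi(nir, red) < ndvi_min:
        return None  # not vegetation; left to the building/ground steps
    return "tree" if height_m >= tree_height_m else "grass"

# Hypothetical pixels: (NIR reflectance, red reflectance, height above ground)
print(initial_label(0.6, 0.1, 8.0))   # vegetated and tall
print(initial_label(0.5, 0.1, 0.2))   # vegetated and low
print(initial_label(0.2, 0.2, 5.0))   # tall but not vegetated
```

Pixels labelled this way would serve only as automatically generated training samples; the final per-pixel decision is still left to the SVM.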


2014 ◽  
Vol 687-691 ◽  
pp. 2693-2697
Author(s):  
Li Ding ◽  
Li Mao ◽  
Xiao Feng Wang

A single machine learning algorithm shows shortcomings when the data environment changes during application. This article puts forward a heteromorphic ensemble learning model made up of Bayes, support vector machine (SVM), and decision tree classifiers, which classifies P2P traffic by the voting principle. The experiment shows that the model can significantly improve classification accuracy and has good stability.
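
The voting principle can be illustrated with a minimal majority-vote sketch. The three prediction lists stand in for the outputs of already trained Bayes, SVM, and decision tree classifiers; the labels, variable names, and the unweighted vote are assumptions, since the abstract does not specify the voting scheme in detail.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the base classifiers' predictions."""
    label, _count = Counter(predictions).most_common(1)[0]
    return label

def ensemble_predict(per_classifier_labels):
    """Combine one label list per classifier into a single list, flow by flow."""
    return [majority_vote(votes) for votes in zip(*per_classifier_labels)]

# Hypothetical predictions for four traffic flows.
bayes_out = ["p2p", "web", "p2p", "web"]
svm_out = ["p2p", "p2p", "web", "web"]
tree_out = ["web", "p2p", "p2p", "web"]

combined = ensemble_predict([bayes_out, svm_out, tree_out])
print(combined)
```

With three voters and two labels a tie cannot occur; with more labels, this sketch would fall back to the label seen first among the tied ones, which a real system would probably want to handle explicitly.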


Author(s):  
XULEI YANG ◽  
QING SONG ◽  
YUE WANG

This paper presents a weighted support vector machine (WSVM) to address the outlier sensitivity problem of the standard support vector machine (SVM) for two-class data classification. The basic idea is to assign different weights to different data points so that the WSVM training algorithm learns the decision surface according to the relative importance of the data points in the training data set. The weights used in WSVM are generated by a robust fuzzy clustering algorithm, the kernel-based possibilistic c-means (KPCM) algorithm, whose partition generates relatively high values for important data points but low values for outliers. Experimental results indicate that the proposed method reduces the effect of outliers and yields a higher classification rate than standard SVM does when outliers exist in the training data set.
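
One way to see how per-point weights reduce the influence of outliers is to weight each point's hinge loss, as in the sketch below. This is an illustration of the general idea for a linear decision function, not the paper's training algorithm: the weights here are hand-picked rather than produced by KPCM, and all names and data are hypothetical.

```python
def weighted_hinge_loss(w, b, X, y, weights):
    """Sum of per-sample hinge losses, each scaled by its sample weight."""
    total = 0.0
    for x_i, y_i, s_i in zip(X, y, weights):
        margin = y_i * (sum(wj * xj for wj, xj in zip(w, x_i)) + b)
        total += s_i * max(0.0, 1.0 - margin)
    return total

# Linear decision function f(x) = w . x + b (hypothetical values).
w, b = [1.0, 0.0], 0.0

# Two well-separated points plus one outlier labelled +1 on the negative side.
X = [[2.0, 0.0], [-2.0, 0.0], [-3.0, 0.0]]
y = [1, -1, 1]

uniform = weighted_hinge_loss(w, b, X, y, [1.0, 1.0, 1.0])
downweighted = weighted_hinge_loss(w, b, X, y, [1.0, 1.0, 0.1])
print(uniform, downweighted)
```

With uniform weights the outlier contributes the whole loss, so training would be pushed to move the boundary toward it; giving it a small weight shrinks that pressure proportionally.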


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 3844 ◽  
Author(s):  
Zhao ◽  
Li ◽  
Xiao ◽  
Meng ◽  
Han ◽  
...  

Drift is an important issue that impairs the reliability of sensors, especially gas sensors. The conventional method usually adopts a reference gas to compensate for the drift. However, its classification accuracy is not high. In this paper, we propose a supervised learning algorithm based on multi-classifier integration for drift compensation, which incorporates drift compensation into the classification process, motivated by the fact that the goal of drift compensation is to improve classification performance. In our method, using the obtained characteristics of the sensors and the advantage of the Support Vector Machine (SVM) in few-shot classification, an improved Long Short-Term Memory (LSTM) is integrated to build the multi-class classifier model. We tested the proposed approach on a publicly available time series dataset collected over three years by metal-oxide gas sensors. The results clearly indicate the superiority of the multiple-classifier approach, which, with an ensemble of classifiers, achieves higher classification accuracy than other approaches during the testing period in the presence of sensor drift over time.


2020 ◽  
Vol 10 (17) ◽  
pp. 5955 ◽  
Author(s):  
Muhammad Hussain Khan ◽  
Zainab Saleem ◽  
Muhammad Ahmad ◽  
Ahmed Sohaib ◽  
Hamail Ayaz ◽  
...  

The quality of red chili is characterized by its color and pungency. Several factors, such as humidity, temperature, light, and storage conditions, affect these characteristic qualities, so preservation requires several measures. Instead of ensuring these measures, traders use oil and Sudan dye in red chili to increase the value of an inferior product. This work therefore presents the feasibility of using a hyperspectral camera to detect oil and Sudan dye in red chili. The study describes the important wavelengths (500–700 nm) at which different adulterants affect the response of the reflected spectrum. These wavelengths are then used to train a Support Vector Machine (SVM) algorithm to detect pure, oil-adulterated, and Sudan dye-adulterated red chili. The classification performance reaches 97% with the reduced dimensionality and 100% with complete validation data. The trained algorithm is further tested on separate data with adulteration levels different from those in the training data. Results show that the algorithm successfully classifies pure, oil-adulterated, and Sudan-adulterated red chili with an accuracy of 100%.

