Linguistic Rule Extraction from Support Vector Machine Classifiers

Author(s):  
Xiuju Fu ◽  
Lipo Wang ◽  
GihGuang Hung ◽  
Liping Goh

Classification decisions expressed as linguistic rules are preferable to the complex mathematical formulas of support vector machine (SVM) classifiers because linguistic rules explain themselves explicitly. Linguistic rule extraction has attracted much attention as a way to explain the knowledge hidden in data. In this chapter, we show that the decisions of an SVM classifier can be decoded into linguistic rules based on the information provided by the support vectors and the decision function. Given a support vector of a certain class, we first search for the cross points between the SVM decision hyper-curve and the lines extended from the support vector along each axis. A hyper-rectangular rule is derived from these cross points. The hyper-rectangle is then adjusted in a tuning phase to exclude out-class data points. Finally, redundant rules are merged to produce a compact rule set. At the same time, important attributes are highlighted in the extracted rules. The rules extracted by our method follow the SVM classifier's decisions very well. We compare the rules extracted from SVMs with an RBF kernel and with a linear kernel. Experimental results show that rules extracted from the SVM with the nonlinear RBF kernel are more accurate than those extracted from the SVM with the linear kernel. We also compare our method with other rule extraction methods on several benchmark data sets. Our method obtains higher rule accuracy with fewer premises in each rule.
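The cross-point search described in the abstract can be illustrated with a minimal sketch. It assumes a toy RBF decision function with a single kernel centre, and all names (`rbf_decision`, `cross_point`, `hyper_rectangle`) and parameter values are hypothetical choices for illustration, not the authors' implementation. The tuning and rule-merging phases are not shown.

```python
import math

def rbf_decision(x, support_vectors, coeffs, bias, gamma):
    """Decision value f(x) = sum_i coeff_i * exp(-gamma * ||x - sv_i||^2) + bias."""
    total = bias
    for sv, c in zip(support_vectors, coeffs):
        dist2 = sum((a - b) ** 2 for a, b in zip(x, sv))
        total += c * math.exp(-gamma * dist2)
    return total

def cross_point(f, start, axis, limit, steps=200, tol=1e-9):
    """Step from `start` toward `limit` along one axis; bisect at the first sign change.

    Returns the coordinate on `axis` where f changes sign, or `limit` if the
    decision boundary is not crossed within the data range.
    """
    inside = f(start) > 0
    probe = list(start)
    prev = start[axis]
    for k in range(1, steps + 1):
        cur = start[axis] + (limit - start[axis]) * k / steps
        probe[axis] = cur
        if (f(probe) > 0) != inside:
            lo, hi = prev, cur  # the boundary lies between the last two probes
            while abs(hi - lo) > tol:
                mid = (lo + hi) / 2
                probe[axis] = mid
                if (f(probe) > 0) == inside:
                    lo = mid
                else:
                    hi = mid
            return (lo + hi) / 2
        prev = cur
    return limit

def hyper_rectangle(f, start, lower, upper):
    """One (low, high) interval per attribute, taken from the axis-parallel cross points."""
    return [
        (cross_point(f, start, i, lower[i]), cross_point(f, start, i, upper[i]))
        for i in range(len(start))
    ]

# Toy decision function (assumed values): one kernel centre at the origin.
f = lambda x: rbf_decision(x, [(0.0, 0.0)], [1.0], -0.5, 1.0)
rect = hyper_rectangle(f, (0.2, 0.0), (-2.0, -2.0), (2.0, 2.0))
```

Each interval in `rect` reads as one premise of a rule of the form "IF low_i <= x_i <= high_i for every attribute i THEN class". The corners of such a rectangle can fall outside the decision region, which is consistent with the abstract's separate tuning phase.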

2008 ◽  
pp. 1269-1279


Processes ◽  
2019 ◽  
Vol 7 (5) ◽  
pp. 263 ◽  
Author(s):  
Tao Xie ◽  
Jun Yao ◽  
Zhiwei Zhou

As is well known, correct cancer diagnosis is critical to saving patients’ lives. The support vector machine (SVM) has already made an important contribution to the field of cancer classification. However, the choice of kernel function and its parameters significantly affects the performance of the SVM classifier. To improve the classification accuracy of the SVM classifier for cancer diagnosis, this paper proposes a novel cancer classification algorithm based on the dragonfly algorithm and an SVM with a combined kernel function (DA-CKSVM), where the combined kernel is constructed from a radial basis function (RBF) kernel and a polynomial kernel. Experiments were performed on six cancer data sets from the University of California, Irvine (UCI) machine learning repository and two cancer data sets from Cancer Program Legacy Publication Resources to evaluate the validity of the proposed algorithm. Compared with four well-known algorithms, dragonfly algorithm-SVM (DA-SVM), particle swarm optimization-SVM (PSO-SVM), bat algorithm-SVM (BA-SVM), and genetic algorithm-SVM (GA-SVM), the proposed algorithm was able to find the optimal parameters of the SVM classifier and achieved better classification accuracy on the cancer data sets.
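The abstract does not say how the RBF and polynomial kernels are combined. One common construction is a convex combination, sketched below under that assumption. The function names, the `weight` parameter, and the default `coef0` are hypothetical.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel: exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def poly_kernel(x, z, degree, coef0=1.0):
    """Polynomial kernel: (<x, z> + coef0)^degree."""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree

def combined_kernel(x, z, weight, gamma, degree):
    """Assumed form: weight * RBF + (1 - weight) * polynomial, with weight in [0, 1]."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * rbf_kernel(x, z, gamma) + (1.0 - weight) * poly_kernel(x, z, degree)

# For identical inputs the RBF part is 1 and the polynomial part is (5 + 1)^2 = 36.
value = combined_kernel((1.0, 2.0), (1.0, 2.0), weight=0.5, gamma=0.1, degree=2)
```

A convex combination of two valid kernels is itself a valid kernel, so the result can be used wherever a single kernel would be. In a scheme like the one described, a metaheuristic search would tune `weight`, `gamma`, `degree`, and the SVM penalty parameter together; that search is not shown here.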


Author(s):  
B. Yekkehkhany ◽  
A. Safari ◽  
S. Homayouni ◽  
M. Hasanlou

In this paper, a framework based on Support Vector Machines (SVM) is developed for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) images. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates than single-date data. Several kernel functions are employed and compared in this study for mapping the input space to a higher-dimensional Hilbert space. These kernel functions include linear, polynomial, and Radial Basis Function (RBF) kernels.

The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of the H/A/α decomposition method are used in classification. The experimental tests show that an SVM classifier with an RBF kernel for three dates of data increases the Overall Accuracy (OA) by up to 3% in comparison to a linear kernel function, and by up to 1% in comparison to a 3rd-degree polynomial kernel function.
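For readers unfamiliar with the metric, Overall Accuracy is the fraction of correctly classified samples, that is, the diagonal of the confusion matrix divided by the total count. The sketch below uses made-up counts purely for illustration; they are not results from the paper.

```python
def overall_accuracy(confusion):
    """Overall Accuracy = correctly classified samples / all samples.

    `confusion[i][j]` counts samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total

# Hypothetical three-class confusion matrix (illustrative numbers only).
example = [
    [50, 2, 3],
    [4, 45, 1],
    [1, 2, 42],
]
oa = overall_accuracy(example)  # 137 correct out of 150 samples
```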


Author(s):  
Suhas S ◽  
Dr. C. R. Venugopal

This paper presents an enhanced system for classifying MR images using a support vector machine combined with various kernels, along with the design and development of a content-based image retrieval (CBIR) system. Content-based image retrieval is the process of finding relevant images in a large image database using visual queries. The growing use of medical imaging has produced large image collections. An Oriented Rician Noise Reduction Anisotropic Diffusion filter is used for image denoising. A modified hybrid Otsu algorithm is used for image segmentation. Texture features are extracted using the GLCM method. A genetic algorithm with joint entropy is adopted for feature selection. Classification is done by a support vector machine with various kernels, and the performance is validated. A classification accuracy of 98.83% is obtained using the SVM with the GRBF kernel. The extracted features are used to classify MR images into five categories. The performance of the MC-SVM classifier is compared across different kernel functions. From the analysis and from performance measures such as classification accuracy, it is inferred that brain and spinal cord MRI classification is done better by the MC-SVM with the Gaussian RBF kernel function than with linear or polynomial kernel functions. The proposed system provides the best classification performance, with high accuracy and a low error rate.
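As a rough illustration of the GLCM step, the sketch below builds a gray-level co-occurrence matrix for one pixel offset and derives the contrast feature from it. It assumes a non-symmetric, unnormalized count matrix and a tiny hand-made image; the paper's actual offsets, gray-level quantization, and feature set are not given in the abstract.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix

def contrast(matrix):
    """Contrast = sum over (i, j) of (i - j)^2 * p(i, j), with p the normalized counts."""
    total = sum(sum(row) for row in matrix)
    n = len(matrix)
    return sum((i - j) ** 2 * matrix[i][j] for i in range(n) for j in range(n)) / total

# Hypothetical 3x3 image with gray levels 0..2.
image = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
counts = glcm(image, levels=3)     # horizontal neighbour pairs
texture_contrast = contrast(counts)
```

Other texture features commonly derived from the same matrix, such as energy or homogeneity, follow the same pattern of a weighted sum over the normalized counts.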


2012 ◽  
Vol 198-199 ◽  
pp. 1280-1285 ◽  
Author(s):  
Shang Fu Gong ◽  
Juan Chen

The wide use of P2P (Peer-to-Peer) technology has caused problems such as excessive resource consumption and security risks, so it is necessary to detect and control P2P traffic. After analyzing current P2P detection methods, a new method called TCBDM (Traffic Characters Based Detection Method) is put forward, which combines P2P traffic characteristics with a support vector machine to detect P2P traffic. By choosing P2P traffic characteristics that differ from those of other network traffic, such as Round-Trip Time (RTT), the method creates an SVM classifier and uses a package named LIBSVM to classify P2P traffic in the Moore_Set data sets. The results show that TCBDM can detect P2P traffic effectively; the accuracy can reach 98%.


2014 ◽  
Vol 687-691 ◽  
pp. 3897-3900 ◽  
Author(s):  
Ping An Wang ◽  
Xu Sheng Gan ◽  
Deng Kai Yao

The selection of the kernel function in a Support Vector Machine (SVM) has a great influence on model performance. In this paper, the Mexican hat wavelet kernel is introduced as the kernel function of the SVM, and it is proven theoretically that the Mexican hat wavelet kernel satisfies the Mercer condition, which is the necessary condition for a kernel function of an SVM. Simulation on anomaly detection shows that the capability of the SVM based on the Mexican hat wavelet kernel is better than that of the SVM based on the RBF kernel, with a satisfactory result for anomaly intrusion detection.
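The abstract does not give the kernel formula. A commonly used translation-invariant wavelet kernel takes the product over dimensions of the mother wavelet evaluated at the scaled coordinate difference; the sketch below assumes that construction with the Mexican hat mother wavelet h(u) = (1 - u^2) exp(-u^2 / 2). The function names and the dilation parameter `a` are hypothetical.

```python
import math

def mexican_hat(u):
    """Mexican hat mother wavelet (up to normalization): (1 - u^2) * exp(-u^2 / 2)."""
    return (1.0 - u * u) * math.exp(-u * u / 2.0)

def mexican_hat_kernel(x, z, a=1.0):
    """Assumed kernel: product over dimensions of h((x_i - z_i) / a), with dilation a > 0."""
    if a <= 0:
        raise ValueError("dilation a must be positive")
    value = 1.0
    for xi, zi in zip(x, z):
        value *= mexican_hat((xi - zi) / a)
    return value

k_same = mexican_hat_kernel((0.3, -1.2), (0.3, -1.2))  # identical points give 1.0
k_unit = mexican_hat_kernel((0.0,), (1.0,))            # h(1) = 0, so the kernel is 0.0
```

Unlike the RBF kernel, this kernel takes negative values when a coordinate difference exceeds the dilation `a`, which is one way its behaviour can differ in practice.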


Heart arrhythmias are different types of irregular heartbeat. In tachycardia the heart beats too fast, and in bradycardia it beats too slowly. In the study of cardiac conditions, automatic detection of heart arrhythmia is done by feature extraction from and classification of electrocardiogram (ECG) data. Various Support Vector Machine (SVM) based methods are used to analyze and classify ECG signals for arrhythmia detection, such as one-against-all, one-against-one, and the fuzzy decision function. This classification detects the existence of an arrhythmia and helps physicians treat heart patients more accurately. To train the SVM, the MIT-BIH Arrhythmia database is used, which covers heart disorders such as sinus bradycardia, old inferior myocardial infarction, coronary artery disease, and right bundle branch block. All three methods are implemented, and the SVM classifier achieves its best accuracy with the one-against-all method. ECG arrhythmia data sets are usually complex in nature, so for SVM-based classification the one-against-all method has a great impact and yields better results.
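The two standard multi-class schemes named in the abstract differ only in how binary SVM outputs are aggregated. The sketch below shows that aggregation step under simple assumptions: the decision values are made up, the class labels are hypothetical, and the fuzzy decision function variant is not shown.

```python
from collections import Counter

def one_against_all(scores):
    """`scores` maps each class to the decision value of its 'class vs. rest' SVM.

    The predicted class is the one whose classifier is most confident.
    """
    return max(scores, key=scores.get)

def one_against_one(pairwise):
    """`pairwise` maps (class_a, class_b) to a decision value; positive favours class_a.

    Each pairwise classifier casts one vote, and the class with most votes wins.
    """
    votes = Counter()
    for (class_a, class_b), value in pairwise.items():
        votes[class_a if value > 0 else class_b] += 1
    return votes.most_common(1)[0][0]

# Hypothetical decision values for a single heartbeat.
ova_prediction = one_against_all(
    {"normal": -0.4, "rbbb": 1.2, "bradycardia": 0.1}
)
ovo_prediction = one_against_one(
    {
        ("normal", "rbbb"): -0.7,
        ("normal", "bradycardia"): 0.3,
        ("rbbb", "bradycardia"): 0.9,
    }
)
```

For k classes, one-against-all trains k binary classifiers while one-against-one trains k(k - 1)/2, which is the main practical trade-off between the two.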


2018 ◽  
Vol 61 (1) ◽  
pp. 64-76 ◽  
Author(s):  
Susan (Sixue) Jia

Fitness clubs have never ceased searching for quality improvement opportunities to better serve their exercisers, whereas exercisers have been posting online ratings and reviews regarding fitness clubs. Studied together, the quantitative rating and qualitative review can provide a comprehensive depiction of exercisers’ perception of fitness clubs. However, the typological and dimensional discrepancies between online ratings and reviews have hindered the joint study of the two data sets to fully exploit their business value. This study bridges the gap by examining 53,979 pairs of exerciser online ratings and reviews from 100 fitness clubs in Shanghai, China. Using latent Dirichlet allocation (LDA) based text mining, we identified the 17 major topics on which the exercisers were writing. A support vector machine (SVM) classifier was then employed to establish the rating-review relations, with an accuracy rate of up to 86%. Finally, the relative impact of each topic on exerciser satisfaction was computed and compared by introducing virtual reviews. The significance of this study is that it systematically creates a standardized protocol for mining and correlating the massive structured/quantitative and unstructured/qualitative data available online, which is readily transferable to other service and product sectors.


Author(s):  
Intisar Shadeed Al-Mejibli ◽  
Jwan K. Alwan ◽  
Dhafar Hamed Abd

Currently, the support vector machine (SVM) is regarded as one of the supervised machine learning algorithms that analyze data for classification and regression. This technique is implemented in many fields, such as bioinformatics, face recognition, text and hypertext categorization, generalized predictive control, and many other areas. The performance of an SVM is affected by the parameters used in the training phase, and the parameter settings can have a profound impact on the resulting engine’s implementation. This paper investigates SVM performance as a function of the gamma parameter for the kernels used. It studies the impact of the gamma value on the efficiency of the SVM classifier using different kernels on datasets with various descriptions. The SVM classifier was implemented in Python. The kernel functions investigated are polynomial, radial basis function (RBF), and sigmoid. All datasets come from the UC Irvine machine learning repository. In general, the results show an uneven effect on the classification accuracy of the three kernels across the datasets. Changing the gamma value influences the polynomial and sigmoid kernels, depending on the dataset used, while the performance of the RBF kernel function is more stable across gamma values, with only slight changes in accuracy.
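For reference, gamma enters the three kernels as follows in the parameterization used by LIBSVM and scikit-learn. The abstract does not state which library was used, so this parameterization is an assumption; r denotes the constant offset term and d the polynomial degree.

```latex
\begin{aligned}
K_{\text{poly}}(x, x') &= \left(\gamma \langle x, x' \rangle + r\right)^{d} \\
K_{\text{RBF}}(x, x') &= \exp\left(-\gamma \lVert x - x' \rVert^{2}\right) \\
K_{\text{sigmoid}}(x, x') &= \tanh\left(\gamma \langle x, x' \rangle + r\right)
\end{aligned}
```

In the polynomial and sigmoid kernels gamma scales the inner product, whereas in the RBF kernel it scales the squared distance and so controls the width of each support vector's influence.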

