An Algebraic Approach to Clustering and Classification with Support Vector Machines

Mathematics ◽  
2022 ◽  
Vol 10 (1) ◽  
pp. 128
Author(s):  
Güvenç Arslan ◽  
Uğur Madran ◽  
Duygu Soyoğlu

In this note, we propose a novel classification approach by introducing a new clustering method, which is used as an intermediate step to discover the structure of a data set. The proposed clustering algorithm uses similarities and the concept of a clique to obtain clusters, which can be used with different strategies for classification. This approach also reduces the size of the training data set. In this study, we apply support vector machines (SVMs) after obtaining clusters with the proposed clustering algorithm. The proposed clustering algorithm is applied with different strategies for applying SVMs. The results for several real data sets show that the performance is comparable with the standard SVM while reducing the size of the training data set and also the number of support vectors.
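The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of one way similarities and cliques could yield clusters. It assumes a precomputed similarity matrix `sim`, a hypothetical cut-off `threshold`, and a greedy rule that repeatedly peels off the largest clique. None of these choices come from the paper.

```python
def maximal_cliques(adj, candidates):
    """Yield the maximal cliques of the graph induced on `candidates`
    (Bron-Kerbosch recursion without pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield r          # r cannot be extended: it is a maximal clique
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}      # v is fully explored
            x = x | {v}      # exclude v from later cliques at this level
    yield from expand(set(), set(candidates), set())


def clique_clusters(sim, threshold):
    """Greedy clustering: connect points whose similarity is at least
    `threshold`, then repeatedly remove the largest remaining clique."""
    n = len(sim)
    adj = {i: {j for j in range(n) if j != i and sim[i][j] >= threshold}
           for i in range(n)}
    remaining = set(range(n))
    clusters = []
    while remaining:
        # Largest clique first; ties are broken by the smallest member index.
        best = max(maximal_cliques(adj, remaining),
                   key=lambda c: (len(c), -min(c)))
        clusters.append(sorted(best))
        remaining -= best
    return clusters


# Hypothetical similarity matrix for five points.
sim = [
    [1.0, 0.9, 0.8, 0.1, 0.2],
    [0.9, 1.0, 0.85, 0.2, 0.1],
    [0.8, 0.85, 1.0, 0.3, 0.2],
    [0.1, 0.2, 0.3, 1.0, 0.9],
    [0.2, 0.1, 0.2, 0.9, 1.0],
]
clusters = clique_clusters(sim, threshold=0.7)
print(clusters)
```

Enumerating maximal cliques is exponential in the worst case, so a sketch like this is only practical for small or sparse similarity graphs.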

Author(s):  
Clyde Coelho ◽  
Aditi Chattopadhyay

This paper proposes a computationally efficient methodology for classifying damage in structural hotspots. Data collected from a sensor-instrumented lug joint subjected to fatigue loading were preprocessed using linear discriminant analysis (LDA) to extract features that are relevant for classification and to reduce the dimensionality of the data. The data are then reduced in the feature space by analyzing the structure of the mapped clusters and removing the data points that do not affect the construction of the interclass separating hyperplanes. The reduced data set is used to train a support vector machine (SVM) based classifier, and the classification results are compared to those obtained when the entire data set is used for training. To further improve the efficiency of the classification scheme, the SVM classifiers are arranged in a binary tree to reduce the number of comparisons that are necessary. The experimental results show that the data reduction does not reduce the ability of the classifier to distinguish between classes while providing a nearly fourfold decrease in the amount of training data processed.


2019 ◽  
Vol 5 (1) ◽  
pp. 285-287
Author(s):  
Jannis Hagenah ◽  
Sascha Leymann ◽  
Floris Ernst

Inference from medical image data using machine learning still suffers from the disregard of label uncertainty. Usually, medical images are labeled by multiple experts. However, the uncertainty of this training data, assessable as the unity of opinions of the observers, is neglected because training is commonly performed on binary decision labels. In this work, we present a novel method to incorporate this label uncertainty into the learning problem using weighted Support Vector Machines (wSVM). The idea is to assign an uncertainty score to each data point. The score u lies between 0 and 1 and is calculated from the unity of opinions of all observers, where u = 1 if all observers have the same opinion and u = 0 if the observers' opinions are split exactly 50/50, with linear interpolation in between. This score is integrated into the Support Vector Machine (SVM) optimization as a weighting of the errors made for the corresponding data point. For evaluation, we asked 15 observers to label 48 2D ultrasound images of aortic roots according to whether the images show a healthy or a pathologically dilated anatomy, where the ground truth was known. As the observers were not trained experts, a high diversity of opinions was present in the data set. We performed image classification using both approaches, i.e. classical SVM and wSVM with integrated uncertainty weighting, each with 10-fold cross-validation (linear kernel, C = 7). By incorporating the observer uncertainty, the classification accuracy could be improved by 3.1 percentage points (SVM: 83.5%, wSVM: 86.6%). This indicates that integrating information on the observers' unity of opinions increases the generalization performance of the classifier and that uncertainty-weighted wSVM could be a promising method for machine learning in the medical domain.
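The score described above can be sketched in a few lines. This is one reading of the abstract, not the authors' code: it assumes binary votes and takes u = |2p - 1|, where p is the fraction of observers voting for class 1. That formula gives u = 1 for full agreement, u = 0 for an exact 50/50 split, and is linear in between. The function names and the tie rule in `majority_label` are hypothetical.

```python
def unity_score(votes):
    """Map binary observer votes (0 or 1) to a unity score u in [0, 1].

    u = 1 when all observers agree, u = 0 for an exact 50/50 split,
    with linear interpolation in between (assumed form: |2p - 1|).
    """
    if not votes:
        raise ValueError("at least one vote is required")
    p = sum(votes) / len(votes)  # fraction of observers voting 1
    return abs(2.0 * p - 1.0)


def majority_label(votes):
    """Binary training label by majority vote (ties go to 0, an assumption)."""
    return 1 if 2 * sum(votes) > len(votes) else 0


# Hypothetical votes of 15 observers for three images.
votes_per_image = [
    [1] * 15,            # unanimous
    [1] * 12 + [0] * 3,  # 12 against 3
    [1] * 8 + [0] * 7,   # nearly split
]
weights = [unity_score(v) for v in votes_per_image]
labels = [majority_label(v) for v in votes_per_image]
print(weights, labels)
```

Per-sample weights of this kind could then be passed to an SVM implementation that supports them. For example, the `fit` method of scikit-learn's `SVC` accepts a `sample_weight` argument that rescales the penalty C for each sample.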


2010 ◽  
Vol 39 ◽  
pp. 247-252
Author(s):  
Sheng Xu ◽  
Zhi Juan Wang ◽  
Hui Fang Zhao

A two-stage neural network architecture that combines potential support vector machines (P-SVM) with a genetic algorithm (GA) and gray correlation coefficient analysis (GCCA) is proposed for studying the evolution of patent innovation factors. Enterprise patent innovation is difficult to analyze because the factors that influence it interact nonlinearly. When some of these factors conflict, a trade-off among them must first be made. This paper presents a nonlinear regression model based on P-SVM. During model development, the genetic algorithm is employed to optimize the selection of the P-SVM parameters. After the key factors have been selected by the P-SVM with GA model, the main factors that affect patent innovation generation are studied quantitatively using gray correlation coefficient analysis. Results on a set of real data from China show that the methods developed in this paper can provide valuable information for patent innovation management and related municipal planning projects.


Author(s):  
Ribana Roscher ◽  
Jan Behmann ◽  
Anne-Katrin Mahlein ◽  
Jan Dupuis ◽  
Heiner Kuhlmann ◽  
...  

We analyze the benefit of combining hyperspectral image information with 3D geometry information for the detection of Cercospora leaf spot disease symptoms on sugar beet plants. Besides the commonly used One-Class Support Vector Machines, we utilize an unsupervised sparse representation-based approach with a group sparsity prior. Geometry information is incorporated by representing each sample of interest with an inclination-sorted dictionary, which can be seen as a 1D topographic dictionary. We compare this approach with a sparse representation-based approach without geometry information and with One-Class Support Vector Machines. One-Class Support Vector Machines are applied to hyperspectral data without geometry information as well as to hyperspectral images with additional pixelwise inclination information. Our results show a gain in accuracy when geometry information is used alongside spectral information, regardless of the approach. However, the two methods place different demands on the data when applied to new test data sets. One-Class Support Vector Machines require full inclination information for both test and training data, whereas the topographic dictionary approach needs only spectral information to reconstruct test data once the dictionary has been built from spectra with inclination.


Author(s):  
Hsien-Chung Lin ◽  
Eugen Solowjow ◽  
Masayoshi Tomizuka ◽  
Edwin Kreuzer

This contribution presents a method to estimate environmental boundaries with mobile agents. The agents sample a concentration field of interest at their respective positions and infer a level curve of the unknown field. The presented method is based on support vector machines (SVMs), whereby the concentration level of interest serves as the decision boundary. The field itself does not have to be estimated in order to obtain the level curve, which makes the method computationally very appealing. A myopic strategy is developed to pick locations that yield the most informative concentration measurements. Cooperative operation of multiple agents is demonstrated by dividing the domain into a Voronoi tessellation. Numerical studies demonstrate the feasibility of the method on a real data set of the California coastal area. The exploration strategy is benchmarked against a random walk, which it clearly outperforms.


Author(s):  
Mohammad Reza Daliri

In this article, we propose a feature selection strategy using a binary particle swarm optimization algorithm for the diagnosis of different medical diseases. Support vector machines were used for the fitness function of the binary particle swarm optimization. We evaluated our proposed method on four databases from the machine learning repository, including the single photon emission computed tomography heart database, the Wisconsin breast cancer data set, the Pima Indians diabetes database, and the Dermatology data set. The results indicate that, with fewer selected features, we obtained a higher accuracy in diagnosing heart, cancer, diabetes, and erythematosquamous diseases. The results were compared with the traditional feature selection methods, namely, the F-score and the information gain, and a superior accuracy was obtained with our method. Compared to the genetic algorithm for feature selection, the proposed method shows a higher accuracy on all but one of the data sets. In addition, in comparison with other methods that used the same data, our approach achieves a higher performance using fewer features.
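A binary particle swarm optimizer for feature selection can be sketched as follows. This is a generic textbook-style variant with a sigmoid transfer function, not the article's implementation. The hyperparameters (`w`, `c1`, `c2`, the velocity clamp) are illustrative assumptions, and `toy_fitness` is a hypothetical stand-in for the SVM-based fitness, which in practice would be something like the cross-validated accuracy of an SVM trained on the selected features.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=10, n_iter=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `fitness` over 0/1 feature masks with a basic binary PSO."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal best masks
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global best mask

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-4.0, min(4.0, v))      # clamp the velocity
                vel[i][d] = v
                # The sigmoid turns the velocity into P(bit = 1).
                prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def toy_fitness(mask):
    """Hypothetical fitness: reward features 0 and 3, penalize mask size."""
    informative = (0, 3)
    hits = sum(mask[i] for i in informative)
    return hits - 0.1 * sum(mask)


best_mask, best_fit = binary_pso(toy_fitness, n_features=6)
print(best_mask, best_fit)
```

The small penalty on the number of selected features in `toy_fitness` mirrors the goal stated in the abstract of reaching high accuracy with fewer features. How the article balances the two is not specified here.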


Kybernetes ◽  
2014 ◽  
Vol 43 (8) ◽  
pp. 1150-1164 ◽  
Author(s):  
Bilal M’hamed Abidine ◽  
Belkacem Fergani ◽  
Mourad Oussalah ◽  
Lamya Fergani

Purpose – The task of identifying activity classes from sensor information in a smart home is very challenging because of the imbalanced nature of such data sets, in which some activities occur more frequently than others. Probabilistic models such as the Hidden Markov Model (HMM) and Conditional Random Fields (CRF) are commonly employed for this purpose. The paper aims to discuss these issues. Design/methodology/approach – In this work, the authors propose a robust strategy combining the Synthetic Minority Over-sampling Technique (SMOTE) with Cost-Sensitive Support Vector Machines (CS-SVM), with adaptive tuning of the cost parameter, to handle the imbalanced data problem. Findings – The results demonstrate the usefulness of the approach through comparison with state-of-the-art approaches, including HMM, CRF, the traditional C-Support Vector Machine (C-SVM) and the Cost-Sensitive SVM (CS-SVM), for classifying activities using binary and ubiquitous sensors. Originality/value – Performance metrics in the experiment/simulation include Accuracy, Precision/Recall and F-measure.
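The over-sampling half of this strategy can be illustrated with a minimal sketch of the core SMOTE idea: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. The function name, the default `k`, and the toy data are assumptions for illustration. The paper's adaptive tuning of the cost parameter is not reproduced here.

```python
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate `n_synthetic` points by interpolating between minority
    samples and their k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # Sort the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Hypothetical minority-class samples (corners of the unit square).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote(minority, n_synthetic=5)
print(synthetic)
```

For the cost-sensitive half, SVM libraries commonly expose class-specific penalties. For example, scikit-learn's `SVC` has a `class_weight` parameter that scales C per class.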

