Multiclass Contour-Preserving Classification with Support Vector Machine (SVM)

2017 ◽  
Vol 26 (2) ◽  
pp. 323-334 ◽  
Author(s):  
Piyabute Fuangkhon

Multiclass contour-preserving classification (MCOV) preserves the contour of a data set and improves the classification accuracy of a feed-forward neural network. It synthesizes two types of new instances, the fundamental multiclass outpost vector (FMCOV) and the additional multiclass outpost vector (AMCOV), in the middle of the decision boundary between consecutive classes of data. This paper compares the generalization of support vector machine (SVM) on final training sets that include FMCOVs, AMCOVs, or both. The experiments were carried out using MATLAB R2015a and LIBSVM v3.20 on seven types of final training sets generated from each of the synthetic and real-world data sets from the University of California Irvine machine learning repository and the ELENA project. The experimental results confirm that including FMCOVs in final training sets containing raw data can significantly improve SVM classification accuracy.
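The core idea of synthesizing new instances in the middle of the decision boundary can be sketched with a minimal numpy example. This is an illustrative simplification, not the paper's full FMCOV/AMCOV procedure: for each instance of one class, it finds the nearest instance of the other class and emits the midpoint of the pair. The function name and the toy data are assumptions for demonstration only.

```python
import numpy as np

def midpoint_outpost_vectors(class_a, class_b):
    """For each instance in class_a, find its nearest neighbour in class_b
    and synthesize a new vector at the midpoint of the pair (one synthetic
    vector per instance of class_a)."""
    synth = []
    for a in class_a:
        dists = np.linalg.norm(class_b - a, axis=1)
        b = class_b[np.argmin(dists)]
        synth.append((a + b) / 2.0)
    return np.array(synth)

# Two 2-D clusters separated along the x axis
rng = np.random.default_rng(0)
a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2))
b = rng.normal(loc=[2.0, 0.0], scale=0.3, size=(20, 2))
mid = midpoint_outpost_vectors(a, b)
print(mid.shape)  # (20, 2)
# The synthesized vectors lie between the two clusters
print(0.0 < mid[:, 0].mean() < 2.0)  # True
```

In the paper, such synthesized vectors are labeled and added to the final training set so the learner places its decision surface along the preserved contour.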

2021 ◽  
Vol 5 (2) ◽  
pp. 62-70
Author(s):  
Ömer KASIM

Cardiotocography (CTG) is used to monitor fetal heart rate signals during pregnancy. Evaluation of these signals by specialists provides information about fetal status. Introducing a clinical decision support system that can automatically classify these signals makes it easier for experts to examine CTG data. In this study, CTG data were analysed with the Extreme Learning Machine (ELM) algorithm and classified as normal, suspicious, and pathological, as well as benign and malicious. The proposed method is validated with the University of California, Irvine CTG data set. Its performance is evaluated with the accuracy, F1 score, Cohen's kappa, precision, and recall metrics. In the experiments, a binary classification accuracy of 99.29% was obtained, with only one false positive. For multi-class classification, the accuracy was 98.12%, with two false positives. The training and testing times of the ELM algorithm were considerably shorter than those of the support vector machine and multi-layer perceptron. These results show that high classification accuracy can be obtained by analysing CTG data in both binary and multi-class settings.
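The speed advantage reported for ELM comes from its training procedure: the input-to-hidden weights are random and never trained, so only the output weights are fitted, in closed form via a pseudoinverse. A minimal numpy sketch of this idea (toy data and function names are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def train_elm(X, y_onehot, n_hidden=50, seed=0):
    """Minimal Extreme Learning Machine: random input weights, sigmoid
    hidden layer, output weights solved in closed form by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random, never trained
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))       # hidden activations
    beta = np.linalg.pinv(H) @ y_onehot          # least-squares output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.argmax(H @ beta, axis=1)

# Toy two-class problem with well-separated clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(3, 1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
W, b, beta = train_elm(X, np.eye(2)[y])
acc = (predict_elm(X, W, b, beta) == y).mean()
print(acc > 0.9)
```

Because training reduces to one pseudoinverse instead of iterative optimization, ELM trains far faster than an SVM or a back-propagated multi-layer perceptron on the same data, which matches the timing result reported above.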


2019 ◽  
Vol 47 (3) ◽  
pp. 154-170
Author(s):  
Janani Balakumar ◽  
S. Vijayarani Mohan

Purpose
Owing to the huge volume of documents available on the internet, text classification becomes a necessary task for handling them. To achieve optimal text classification results, feature selection, an important stage, is used to curtail the dimensionality of text documents by choosing suitable features. The main purpose of this research work is to classify personal computer documents based on their content.

Design/methodology/approach
This paper proposes a new feature selection algorithm based on the artificial bee colony (ABCFS) to enhance text classification accuracy. The proposed algorithm (ABCFS) is evaluated on real and benchmark data sets and compared against existing feature selection approaches such as information gain and the χ2 statistic. To demonstrate the efficiency of the proposed algorithm, the support vector machine (SVM) and an improved SVM classifier are used in this paper.

Findings
The experiment was conducted on real and benchmark data sets. The real data set was collected in the form of documents stored on a personal computer, and the benchmark data sets were collected from the Reuters and 20 Newsgroups corpora. The results demonstrate the performance of the proposed feature selection algorithm by enhancing text document classification accuracy.

Originality/value
This paper proposes a new ABCFS algorithm for feature selection, evaluates its efficiency, and improves the support vector machine. Here, the ABCFS algorithm selects features from unstructured text documents, whereas existing work applies artificial bee colony only to structured data features; no such text feature selection algorithm exists in prior work.
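An artificial-bee-colony wrapper search over feature subsets can be sketched compactly. This is a simplified illustration, not the paper's ABCFS: the onlooker phase is omitted, and a nearest-centroid classifier stands in for the SVM as the wrapper fitness. All names, parameters, and the toy data are assumptions.

```python
import numpy as np

def fitness(mask, X, y):
    """Wrapper fitness: nearest-centroid accuracy on the selected features
    (a stand-in for the SVM used in the paper)."""
    if mask.sum() == 0:
        return 0.0
    Xs = X[:, mask.astype(bool)]
    c0, c1 = Xs[y == 0].mean(0), Xs[y == 1].mean(0)
    pred = np.linalg.norm(Xs - c1, axis=1) < np.linalg.norm(Xs - c0, axis=1)
    return (pred == y).mean()

def abc_feature_select(X, y, n_sources=10, n_iter=30, limit=5, seed=0):
    """Simplified ABC over binary feature masks: each employed bee flips one
    bit of its food source, keeps the change greedily, and a scout
    re-initializes any source that stagnates past `limit` trials."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    sources = rng.integers(0, 2, size=(n_sources, d))
    fits = np.array([fitness(s, X, y) for s in sources])
    trials = np.zeros(n_sources, dtype=int)
    for _ in range(n_iter):
        for i in range(n_sources):
            cand = sources[i].copy()
            cand[rng.integers(d)] ^= 1          # neighbour: flip one feature bit
            f = fitness(cand, X, y)
            if f > fits[i]:
                sources[i], fits[i], trials[i] = cand, f, 0
            else:
                trials[i] += 1
            if trials[i] > limit:               # scout phase: abandon the source
                sources[i] = rng.integers(0, 2, size=d)
                fits[i] = fitness(sources[i], X, y)
                trials[i] = 0
    best = np.argmax(fits)
    return sources[best], fits[best]

# 2 informative features (0 and 1) plus 4 noise features
rng = np.random.default_rng(2)
X = rng.normal(size=(120, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
mask, best_fit = abc_feature_select(X, y)
print(best_fit > 0.8)
```

The same loop applies to text data once documents are vectorized (e.g. bag-of-words), with each bit of the mask toggling one term feature.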


2017 ◽  
Vol 26 (1) ◽  
pp. 109-121 ◽  
Author(s):  
Piyabute Fuangkhon

Serial multi-class contour-preserving classification improves the representation of the contour of the data to raise the classification accuracy of a feed-forward neural network (FFNN). The algorithm synthesizes fundamental multi-class outpost vectors (FMCOVs) and additional multi-class outpost vectors (AMCOVs) at the decision boundary between consecutive classes of data to narrow the space of the data. Both FMCOVs and AMCOVs help the FFNN place its hyperplanes so that the data are classified more accurately. However, the technique was designed to utilize only one processor, so its execution time is significantly long. This article presents an improved version of serial multi-class contour-preserving classification that overcomes this time deficiency by using thread-level parallelism on multi-processor or multi-core systems. The parallel algorithm distributes the data set and the processing of the FMCOV and AMCOV generators across the available threads to increase CPU utilization and the speedup factors of both generators. The technique has been carefully designed to avoid data dependency issues. Experiments were conducted on both synthetic and real-world data sets. The results confirm that the parallel multi-class contour-preserving classification clearly outperforms the serial version in terms of CPU utilization and speedup factor.


2021 ◽  
Vol 11 (1) ◽  
pp. 29-49
Author(s):  
Amit Kumar ◽  
Bikash Kanti Sarkar

Research in disease diagnosis is a challenging task owing to the inconsistency, class imbalance, conflicting instances, and high dimensionality of medical data sets. Selecting the best features of each data set plays an important role in improving the performance of classifiers, which may follow either iterative or non-iterative approaches. The present study compares the performance of iterative and non-iterative classifiers combined with a genetic algorithm (GA)-based feature selection approach over several widely used medical data sets. The experiment helps identify the clinical data sets for which feature reduction is necessary to improve classifier performance. For the iterative approaches, two popular classifiers, C4.5 and RIPPER, are chosen, whereas k-NN and naïve Bayes are taken as the non-iterative learners. Fourteen real-world medical domain data sets are selected from the University of California, Irvine (UCI) repository for the experiments.
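A GA-based wrapper for feature selection of the kind described here can be sketched in a few dozen lines. This is an illustrative sketch under assumed settings, not the study's implementation: a 1-NN hold-out accuracy serves as the wrapper fitness (echoing the k-NN learner above), and all names, operators, and the toy data are assumptions.

```python
import numpy as np

def knn1_accuracy(mask, Xtr, ytr, Xte, yte):
    """Wrapper fitness: 1-NN hold-out accuracy on the selected features."""
    if mask.sum() == 0:
        return 0.0
    A, B = Xtr[:, mask.astype(bool)], Xte[:, mask.astype(bool)]
    nearest = np.argmin(((B[:, None, :] - A[None]) ** 2).sum(-1), axis=1)
    return (ytr[nearest] == yte).mean()

def ga_feature_select(Xtr, ytr, Xte, yte, pop=16, gens=25, pmut=0.1, seed=0):
    """Minimal generational GA over binary feature masks: size-two tournament
    selection, uniform crossover, bit-flip mutation, one elite survivor."""
    rng = np.random.default_rng(seed)
    d = Xtr.shape[1]
    P = rng.integers(0, 2, size=(pop, d))
    for _ in range(gens):
        f = np.array([knn1_accuracy(m, Xtr, ytr, Xte, yte) for m in P])
        nxt = [P[np.argmax(f)].copy()]                    # elitism
        while len(nxt) < pop:
            i, j = rng.integers(pop, size=2)              # tournament of two
            a = P[i] if f[i] >= f[j] else P[j]
            i, j = rng.integers(pop, size=2)
            b = P[i] if f[i] >= f[j] else P[j]
            child = np.where(rng.random(d) < 0.5, a, b)   # uniform crossover
            child = child ^ (rng.random(d) < pmut)        # bit-flip mutation
            nxt.append(child)
        P = np.array(nxt)
    f = np.array([knn1_accuracy(m, Xtr, ytr, Xte, yte) for m in P])
    return P[np.argmax(f)], f.max()

# Toy set: only feature 0 is informative, features 1-4 are noise
rng = np.random.default_rng(3)
Xtr, Xte = rng.normal(size=(80, 5)), rng.normal(size=(40, 5))
ytr, yte = (Xtr[:, 0] > 0).astype(int), (Xte[:, 0] > 0).astype(int)
mask, best = ga_feature_select(Xtr, ytr, Xte, yte)
print(best > 0.8)
```

The same loop works unchanged with any of the four learners above as the fitness function; only `knn1_accuracy` would be swapped out.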


2015 ◽  
Vol 46 (4) ◽  
pp. 138 ◽  
Author(s):  
Roberto Romaniello ◽  
Alessandro Leone ◽  
Giorgio Peri

The aim of this work is to evaluate the potential of least squares support vector machine (LS-SVM) regression for developing an efficient method to measure the colour of food materials in L*a*b* units by means of a computer vision system (CVS). A laboratory CVS based on a colour digital camera (CDC) was implemented, and three LS-SVM models were trained and validated, one for each output variable (L*, a*, and b*), using the RGB signals generated by the CDC as input variables. The colour-target-based approach was used for camera characterization, and a standard reference target of 242 colour samples was acquired with the CVS and a colorimeter. This data set was split into two sets of equal size for training and validating the LS-SVM models. An effective two-stage grid search over the parameter space was performed in MATLAB to tune the regularization parameter γ and the kernel parameter σ<sup>2</sup> of the three LS-SVM models. A 3-8-3 multilayer feed-forward neural network (MFNN), following the research conducted by León <em>et al.</em> (2006), was also trained to compare its performance with that of the LS-SVM models. The LS-SVM models developed in this research showed better generalization capability than the MFNN and yielded high correlations between the L*a*b* data acquired with the colorimeter and the corresponding data obtained by transforming the RGB data acquired by the CVS. In particular, for the validation set, R<sup>2</sup> values of 0.9989, 0.9987, and 0.9994 were obtained for the L*, a*, and b* parameters. The root mean square error values were 0.6443, 0.3226, and 0.2702 for L*, a*, and b*, respectively, and the average colour difference ΔE<sub>ab</sub> was 0.8232±0.5033 units. Thus, LS-SVM regression appears to be a useful tool for measuring food colour with a low-cost CVS.
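Unlike a standard SVM, LS-SVM regression replaces the quadratic program with a single linear system, which is why training is fast and a grid search over (γ, σ<sup>2</sup>) is cheap. A minimal numpy sketch of that system with an RBF kernel (the 1-D toy target and all parameter values here are illustrative assumptions, not the paper's colour data):

```python
import numpy as np

def rbf(A, B, sigma2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma2)

def lssvm_fit(X, y, gamma=100.0, sigma2=0.5):
    """LS-SVM regression: the dual reduces to one linear system
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, sigma2) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]                      # bias b, dual weights alpha

def lssvm_predict(Xq, X, b, alpha, sigma2=0.5):
    return rbf(Xq, X, sigma2) @ alpha + b

# One-dimensional toy regression: y = sin(x)
X = np.linspace(0, 3, 40).reshape(-1, 1)
y = np.sin(X).ravel()
b, alpha = lssvm_fit(X, y)
rmse = np.sqrt(((lssvm_predict(X, X, b, alpha) - y) ** 2).mean())
print(rmse < 0.05)
```

In the paper's setting, three such models map the (R, G, B) inputs to L*, a*, and b* respectively, with γ and σ<sup>2</sup> tuned per model by the two-stage grid search.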


2021 ◽  
Vol 5 (1) ◽  
pp. 11-20
Author(s):  
Wahyu Hidayat ◽  
Mursyid Ardiansyah ◽  
Arief Setyanto ◽  
...  

Traveling activities are increasingly carried out by people around the world. Hotels are difficult to find near some tourist attractions because those attractions are far from the city center; Airbnb is a platform that provides home- or apartment-based rentals. Among lodging offers there are two types of hosts, non-super hosts and super hosts. The super-host badge is awarded when an innkeeper has a good reputation and meets the platform's requirements. Being a super host carries advantages such as more visibility, increased earning potential, and exclusive rewards. The Support Vector Machine (SVM) algorithm was used to classify hosts by these criteria. The data set is unbalanced: the super-host population is smaller than the non-super-host population. To overcome the imbalance, oversampling was carried out using ADASYN and SMOTE. The research goal was to determine the performance of the ADASYN and SMOTE oversampling techniques with the SVM algorithm. Oversampling was applied to handle the unbalanced data set, and a confusion matrix was used to measure precision, recall, F1-score, and accuracy. The research shows that SMOTE SVM increases the accuracy rate by one percentage point, from 80% to 81%, driven by improved results on the True (minority) label and fewer errors on the False (majority) label; SMOTE SVM outperforms both ADASYN SVM and SVM without oversampling.
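The core of SMOTE is simple: each synthetic minority sample is a random interpolation between a minority instance and one of its k nearest minority neighbours. A minimal numpy sketch (function name, k, and the toy data are illustrative assumptions):

```python
import numpy as np

def smote(X_min, n_new, k=5, seed=0):
    """Minimal SMOTE: generate n_new synthetic minority samples, each a
    random interpolation between a minority instance and one of its
    k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    n = X_min.shape[0]
    d2 = ((X_min[:, None, :] - X_min[None]) ** 2).sum(-1)
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]   # k nearest neighbours, self excluded
    out = []
    for _ in range(n_new):
        i = rng.integers(n)
        j = nn[i, rng.integers(k)]
        lam = rng.random()                     # interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

# Imbalanced toy case: oversample 10 minority instances up to 100
rng = np.random.default_rng(4)
minority = rng.normal(loc=2.0, scale=0.2, size=(10, 3))
synth = smote(minority, n_new=90)
print(synth.shape)  # (90, 3)
```

ADASYN differs mainly in where it generates samples: instead of drawing minority seeds uniformly as above, it generates more synthetic points around minority instances that have many majority-class neighbours, i.e. the harder-to-learn regions.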


2020 ◽  
Vol 8 (5) ◽  
pp. 1557-1560

Support vector machine (SVM) is a well-known, efficient supervised learning algorithm for classification problems. However, the classification accuracy of an SVM classifier depends on both its training parameters and the training data set. The main objective of this paper is to optimize the SVM's parameters and feature weighting simultaneously in order to improve its strength. The Imperialist Competitive Algorithm based Support Vector Machine (ICA-SVM) classifier is proposed for efficient weed detection. The enhanced ICA-SVM classifier is able to select the appropriate input features and optimize the SVM parameters, thereby improving classification accuracy. Experimental results show that the ICA-SVM classification algorithm reduces computational complexity tremendously and improves classification accuracy.


2006 ◽  
Vol 18 (6) ◽  
pp. 1472-1510 ◽  
Author(s):  
Sepp Hochreiter ◽  
Klaus Obermayer

We describe a new technique for the analysis of dyadic data, where two sets of objects (row and column objects) are characterized by a matrix of numerical values that describe their mutual relationships. The new technique, called potential support vector machine (P-SVM), is a large-margin method for the construction of classifiers and regression functions for the column objects. Contrary to standard support vector machine approaches, the P-SVM minimizes a scale-invariant capacity measure and requires a new set of constraints. As a result, the P-SVM method leads to a usually sparse expansion of the classification and regression functions in terms of the row rather than the column objects and can handle data and kernel matrices that are neither positive definite nor square. We then describe two complementary regularization schemes. The first scheme improves generalization performance for classification and regression tasks; the second scheme leads to the selection of a small, informative set of row support objects and can be applied to feature selection. Benchmarks for classification, regression, and feature selection tasks are performed with toy data as well as with several real-world data sets. The results show that the new method is at least competitive with but often performs better than the benchmarked standard methods for standard vectorial as well as true dyadic data sets. In addition, a theoretical justification is provided for the new approach.


2017 ◽  
Vol 26 (2) ◽  
pp. 335-358 ◽  
Author(s):  
Piyabute Fuangkhon

Instance selection endeavors to decide which instances from a data set should be retained for further use during the learning process. It can increase the generalization of the learning model, shorten the learning process, or scale up to large data sources. This paper presents a parallel distance-based instance selection approach for a feed-forward neural network (FFNN), which can utilize all available processing power to reduce the data set while obtaining levels of classification accuracy similar to those achieved with the original data set. The algorithm identifies the instances at the decision boundary between consecutive classes of data, which are essential for placing hyperplane decision surfaces, and retains these instances in the reduced data set (subset). Each identified instance, called a prototype, is a representative of the decision boundary of its class and contributes to the shape or distribution model of the data set. No feature or dimension is sacrificed in the reduction process. Regarding reduction capability, the algorithm obtains approximately 85% reduction power on non-overlapping two-class synthetic data sets, 70% on highly overlapping two-class synthetic data sets, and 77% on multiclass real-world data sets. Regarding generalization, the reduced data sets obtain levels of classification accuracy similar to those of the original data set on both FFNN and support vector machine. Regarding execution time, the speedup of the parallel algorithm over the serial algorithm is proportional to the number of threads the processor can run concurrently.
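The distance-based selection idea can be illustrated with a small sketch: rank every instance by its distance to the nearest instance of another class and keep only the fraction closest to the decision boundary. This is a simplification of the paper's prototype selection, with an assumed `keep_frac` parameter and toy data; note the O(n²) distance matrix is for clarity, not scale.

```python
import numpy as np

def select_near_boundary(X, y, keep_frac=0.3):
    """Distance-based instance selection sketch: keep the instances closest
    to an opposite-class instance (i.e. near the decision boundary).
    All features are retained; only instances are dropped."""
    d2 = ((X[:, None, :] - X[None]) ** 2).sum(-1)
    cross = np.where(y[None, :] != y[:, None], d2, np.inf)
    margin = cross.min(axis=1)            # distance to nearest other-class point
    k = max(1, int(keep_frac * len(X)))
    keep = np.argsort(margin)[:k]         # smallest margins = boundary prototypes
    return X[keep], y[keep]

# Two overlapping 2-D classes, 200 instances in total
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 0.5, (100, 2)), rng.normal(2, 0.5, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
Xr, yr = select_near_boundary(X, y)
print(len(Xr))  # 60 of 200 instances kept (70% reduction)
```

Because each instance's margin is computed independently, the loop over instances parallelizes naturally across threads, which is the property the paper's parallel algorithm exploits.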


2021 ◽  
Author(s):  
Mehrnaz Ahmadi ◽  
Mehdi Khashei

Support vector machines (SVMs) are among the most popular and widely used modeling approaches. Various kinds of SVM models have been developed in the prediction and classification literature to cover different purposes. Crisp and fuzzy support vector machines are well-known branches of modeling, frequently applied to certain and uncertain modeling, respectively. However, each of these models can only be used efficiently in its specified domain and cannot yield appropriate, accurate results when the opposite situation occurs, whereas real-world systems and data sets often contain both certain and uncertain patterns that are intricately mixed together and need to be modeled simultaneously. In this paper, a generalized support vector machine (GSVM) is proposed that can simultaneously benefit from the unique advantages of the certain and uncertain versions of the traditional support vector machine in their own specialized categories. In the proposed model, the underlying data set is first categorized into two classes of certain and uncertain patterns. Then, the certain patterns are modeled by a support vector machine, and the uncertain patterns are modeled by a fuzzy support vector machine. After that, the relationship function, as well as the relative importance of each component, is estimated by another support vector machine, and the final forecasts of the proposed model are calculated. Empirical results for wind speed forecasting indicate that the proposed method not only achieves more accurate results than support vector machines (SVMs) and fuzzy support vector machines (FSVMs) but also yields better forecasting performance than traditional fuzzy and nonfuzzy single models and traditional preprocessing-based hybrid models of SVMs.

