BVDT: A Boosted Vector Decision Tree Algorithm for Multi-Class Classification Problems

Author(s):  
Kaiyuan Wu ◽  
Zhiming Zheng ◽  
Shaoting Tang

In this paper, we propose a powerful weak learner, the Vector Decision Tree (VDT), and a new Boosted Vector Decision Tree (BVDT) algorithm framework for multi-class classification. Unlike traditional scalar-valued boosting algorithms, the BVDT algorithm directly maps the feature space to the decision space in the multi-class setting, which makes it convenient to implement multi-class classification algorithms with diverse loss functions. By viewing the explicit hard threshold on the leaf node value applied in LogitBoost as a constrained optimization problem, we further develop two new variants of the BVDT algorithm: the [Formula: see text]-BVDT and the [Formula: see text]-BVDT. The performance of the proposed algorithm is evaluated on different datasets and compared with three state-of-the-art boosting algorithms, K-Nearest Neighbor (KNN) and Support Vector Machine (SVM). The results show that the proposed algorithm ranks first on all but one dataset and reduces the test error rate by 4% to 58% relative to the state-of-the-art boosting algorithms based on scalar-valued weak learners. Furthermore, we present a case study on the Abalone dataset in which we design a new loss function that combines the negative log-likelihood loss of the classification problem with the square loss of the regression problem.
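The abstract does not give the form of the combined loss used in the Abalone case study. The following is only a minimal sketch of one way a negative log-likelihood term and a square-loss term could be mixed. The weighting parameter `alpha`, the use of the probability-weighted expected value for the regression term, and all function names are assumptions made for illustration, not the authors' definition.

```python
import math


def softmax(scores):
    """Convert a vector of class scores into class probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combined_loss(scores, true_class, class_values, alpha=0.5):
    """Hypothetical mix of classification and regression losses.

    alpha weights the negative log-likelihood; (1 - alpha) weights the
    squared error between the expected class value and the true value.
    """
    probs = softmax(scores)
    nll = -math.log(probs[true_class])
    expected = sum(p * v for p, v in zip(probs, class_values))
    square = (expected - class_values[true_class]) ** 2
    return alpha * nll + (1 - alpha) * square
```

With ordered targets such as ring counts, the regression term penalizes probability mass placed on classes far from the true value more heavily than mass placed on neighboring classes, which a plain classification loss does not do.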

2012 ◽  
Vol 2012 ◽  
pp. 1-24 ◽  
Author(s):  
Lei La ◽  
Qiao Guo ◽  
Dequan Yang ◽  
Qimin Cao

AdaBoost is an excellent committee-based tool for classification. However, its effectiveness and efficiency in multi-class categorization are challenged by methods based on support vector machines (SVM), neural networks (NN), naïve Bayes, and k-nearest neighbor (kNN). This paper uses a novel multi-class AdaBoost algorithm that avoids reducing the multi-class classification problem to multiple two-class classification problems. The method is more effective and keeps the accuracy advantage of existing AdaBoost. An adaptive group-based kNN method is proposed to build more accurate weak classifiers and thereby keep the number of base classifiers within an acceptable range. To further enhance performance, the weak classifiers are combined into a strong classifier through a double iterative weighting scheme, yielding an adaptive group-based kNN boosting algorithm (AGkNN-AdaBoost). We implement AGkNN-AdaBoost in a Chinese text categorization system. Experimental results show that the proposed classification algorithm achieves better precision and recall than many other text categorization methods, including traditional AdaBoost. In addition, it is significantly faster than the original AdaBoost and many other classic categorization algorithms.


2018 ◽  
Vol 30 (03) ◽  
pp. 1850019
Author(s):  
Fatemeh Alimardani ◽  
Reza Boostani

Fingerprint verification systems have attracted much attention in secure organizations; however, conventional methods still suffer from unconvincing recognition rates for noisy fingerprint images. To design a robust verification system, this paper suggests wavelet and contourlet transforms (CTS) as efficient feature extraction techniques to elicit a comprehensive set of descriptive features that characterize fingerprint images. Contourlet coefficients capture the smooth contours of fingerprints, while wavelet coefficients reveal their rough details. Because the elicited features are high-dimensional, across group variance (AGV), greedy overall relevancy (GOR) and Davis–Bouldin fast feature reduction (DB-FFR) methods were adopted to remove redundant features. The remaining features were applied to three different classifiers: Boosting Direct Linear Discriminant Analysis (BDLDA), Support Vector Machine (SVM) and Modified Nearest Neighbor (MNN). The proposed method and state-of-the-art methods were evaluated on the FVC2004 dataset in terms of genuine acceptance rate (GAR), false acceptance rate (FAR) and equal error rate (EER). The features selected by AGV were the most significant ones and provided 95.12% GAR. Applying the features selected by the GOR method to the modified nearest neighbor classifier resulted in an average EER of [Formula: see text]%, which outperformed the compared methods. The comparative results imply the statistical superiority ([Formula: see text]) of the proposed approach over its counterparts.
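The three evaluation measures named above can be illustrated with a small sketch. It assumes that each comparison yields a similarity score and that a pair is accepted when its score is at least the threshold. The function names and the threshold sweep over observed scores are illustrative choices, not taken from the paper.

```python
def rates_at(threshold, genuine, impostor):
    """Return (GAR, FAR) when scores >= threshold are accepted."""
    gar = sum(s >= threshold for s in genuine) / len(genuine)
    far = sum(s >= threshold for s in impostor) / len(impostor)
    return gar, far


def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping thresholds over the observed scores.

    The EER is the operating point where the false acceptance rate equals
    the false rejection rate (FRR = 1 - GAR). On finite data the two rarely
    match exactly, so the closest point is used and the two rates averaged.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        gar, far = rates_at(t, genuine, impostor)
        frr = 1.0 - gar
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0)
    return best[1]
```

For example, with genuine scores `[0.9, 0.8, 0.7, 0.4]` and impostor scores `[0.6, 0.3, 0.2, 0.1]`, a threshold of 0.6 accepts three of four genuine pairs and one of four impostor pairs, so FAR and FRR are both 0.25.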


2014 ◽  
Vol 519-520 ◽  
pp. 644-650
Author(s):  
Mian Shui Yu ◽  
Yu Xie ◽  
Xiao Meng Xie

Age classification based on facial images is attracting wide attention because of its broad application to human-computer interaction (HCI). Since human senescence is a tremendously complex process, age classification remains a highly challenging problem. In our study, Local Directional Pattern (LDP) and Gabor wavelet transform were used to extract global and local facial features, respectively, which were then fused based on information fusion theory. Principal Component Analysis (PCA) was used to reduce the dimensionality of the fused features and obtain a lower-dimensional age characteristic vector. A Support Vector Machine (SVM) multi-class classifier with Error Correcting Output Codes (ECOC) was proposed for multi-class classification problems such as age classification. Experiments on the public FG-NET age database demonstrated the effectiveness of our method.
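To make the ECOC idea concrete: each class is assigned a binary codeword, one binary classifier (here it would be an SVM) is trained per bit, and a test sample is assigned to the class whose codeword is closest in Hamming distance to the predicted bits. The sketch below covers only the decoding step. The four age-group labels and the 7-bit code matrix are hypothetical examples, not the code used in the paper.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(bits, code_matrix):
    """Return the label whose codeword is nearest to the predicted bits."""
    return min(code_matrix, key=lambda label: hamming(bits, code_matrix[label]))


# Hypothetical exhaustive code for four classes. Every pair of codewords
# differs in 4 positions, so any single wrong binary classifier is corrected.
CODE_MATRIX = {
    "child":  (1, 1, 1, 1, 1, 1, 1),
    "teen":   (0, 0, 0, 0, 1, 1, 1),
    "adult":  (0, 0, 1, 1, 0, 0, 1),
    "senior": (0, 1, 0, 1, 0, 1, 0),
}
```

If the seven binary classifiers output `(0, 0, 1, 1, 1, 0, 1)`, which is the "adult" codeword with its fifth bit flipped, `ecoc_decode` still returns `"adult"` because that codeword is at distance 1 while every other codeword is at distance 3 or more.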


2020 ◽  
Author(s):  
Hoda Heidari ◽  
Zahra Einalou ◽  
Mehrdad Dadgostar ◽  
Hamidreza Hosseinzadeh

Brain-Computer Interface (BCI) studies based on electroencephalography have a wide range of applications. Extracting the Steady State Visual Evoked Potential (SSVEP) is regarded as one of the most useful tools in BCI systems. In this study, methods for each stage of the signal processing pipeline were compared: feature extraction with different spectral methods (Shannon entropy, skewness, kurtosis, mean, variance) (bank of filters, narrow-band IIR filters, and wavelet transform magnitude); feature selection by various methods (decision tree, principal component analysis (PCA), t-test, Wilcoxon, receiver operating characteristic (ROC)); and classification with k-nearest neighbor (k-NN), perceptron, support vector machines (SVM), Bayesian, and multilayer perceptron (MLP) classifiers. Combining these methods gave an overview of the accuracy of classical methods. In addition, the study relied on a rather new feature selection approach based on decision tree and PCA, applied to BCI-SSVEP systems. Finally, the accuracies were calculated for the four recorded frequencies, which represent four directions: right, left, up, and down.


2018 ◽  
Vol 8 (12) ◽  
pp. 2574 ◽  
Author(s):  
Qinghua Mao ◽  
Hongwei Ma ◽  
Xuhui Zhang ◽  
Guangming Zhang

The Skewness Decision Tree Support Vector Machine (SDTSVM) algorithm is widely known as a supervised learning model for multi-class classification problems. However, its classification accuracy depends on the proper selection of its parameters and of the classification order. Therefore, an improved SDTSVM (ISDTSVM) algorithm is proposed to improve the classification accuracy for steel cord conveyor belt defects. In the proposed model, the classification order is determined by the sum of the Euclidean distances between multi-class sample centers, and the parameters are optimized by the inertia weight Particle Swarm Optimization (PSO) algorithm. To verify the effectiveness of the ISDTSVM algorithm with different feature spaces, experiments were conducted on multiple UCI (University of California Irvine) data sets and on steel cord conveyor belt defects, using the proposed ISDTSVM algorithm and the conventional SDTSVM algorithm respectively. The average classification accuracies of five-fold cross-validation were obtained for two kinds of kernel functions. For the Vowel, Zoo, and Wine data sets of the UCI data sets, as well as the steel cord conveyor belt defects, the ISDTSVM algorithm improved the classification accuracy by 3%, 3%, 1% and 4% respectively, compared to the SDTSVM algorithm. The classification accuracy of the radial basis function kernel was higher than that of the polynomial kernel. The results indicated that the proposed ISDTSVM algorithm improved the classification accuracy significantly compared to the conventional SDTSVM algorithm.
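The ordering rule can be sketched as follows: compute the center (mean vector) of each class, sum each center's Euclidean distances to all other centers, and rank the classes by that sum. The abstract does not say which direction the ranking takes. This sketch assumes the class with the largest sum, meaning the one farthest from the rest, is separated first. The function names are illustrative.

```python
import math


def class_centers(samples):
    """samples maps each label to a list of equal-length feature vectors."""
    centers = {}
    for label, rows in samples.items():
        dim = len(rows[0])
        centers[label] = [sum(r[i] for r in rows) / len(rows) for i in range(dim)]
    return centers


def classification_order(samples):
    """Rank classes by summed distance from their center to all other centers.

    Assumption: the largest sum comes first (most isolated class first).
    """
    centers = class_centers(samples)

    def total_distance(label):
        return sum(
            math.dist(centers[label], centers[other])
            for other in centers
            if other != label
        )

    return sorted(centers, key=total_distance, reverse=True)
```

In a skewed decision tree of binary SVMs, each node separates one class from all remaining ones, so an order like this decides which class is split off at each level.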


Sebatik ◽  
2020 ◽  
Vol 24 (2) ◽  
Author(s):  
Anifuddin Azis

Indonesia is the country with the second-largest biodiversity in the world after Brazil. It has around 25,000 plant species and 400,000 species of animals and fish. An estimated 8,500 fish species live in Indonesian waters, or 45% of the species in the world, of which around 7,000 are marine fish species. Determining the number of these species requires expertise in taxonomy. In practice, identifying a fish species is not easy, because it requires particular methods and equipment as well as taxonomic literature. Automatic processing of video or images of aquatic ecosystem data has begun to be developed. In this development, detecting and identifying fish species is a challenge compared with detecting and identifying other objects. Deep learning methods, which have been successful at classifying objects in images, can analyze data directly without dedicated feature extraction. Such a system has parameters, or weights, that serve both as feature extractor and as classifier. The processed data produce an output that is expected to be as close as possible to the true output. CNN is a deep learning architecture that can reduce the dimensionality of data without losing the data's characteristics or features. In this study, a hybrid model will be developed that uses a CNN (Convolutional Neural Network) to extract features and several classification algorithms to identify fish species. The classification algorithms used in this study are: Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree, K-Nearest Neighbor (KNN), Random Forest, and Backpropagation.

