A New SVM Multi-Class Classification Algorithm Based on Sample Scale and Distribution Area

An improved binary tree SVM multi-class classification algorithm is proposed. First, the minimum hyper-ellipsoid enclosing the samples of each class is constructed in the feature space. An optimal binary tree is then generated according to the hyper-ellipsoid volumes, while a sub-classifier is trained for every non-leaf node of the tree. To classify a sample, the sub-classifiers are applied from the root node downward until a leaf node is reached; the class associated with that leaf node is the class of the sample. Experiments on the Statlog database show that the algorithm improves both classification precision and classification speed, particularly when the number of classes is large and their distribution areas are approximately equal.
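The root-to-leaf classification step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Node` class, the `classify` function, and the threshold rules standing in for trained SVM sub-classifiers are all assumptions, and the construction of the tree from hyper-ellipsoid volumes is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Node:
    # Hypothetical tree node: leaves carry a class label, non-leaf nodes
    # carry a binary decision function (a stand-in for an SVM sub-classifier).
    label: Optional[str] = None
    decide: Optional[Callable[[Sequence[float]], bool]] = None  # True -> left
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def classify(root: Node, x: Sequence[float]) -> str:
    """Apply sub-classifiers from the root until a leaf node is reached."""
    node = root
    while node.label is None:
        node = node.left if node.decide(x) else node.right
    return node.label


# Toy three-class tree; the thresholds are illustrative, not learned.
root = Node(
    decide=lambda x: x[0] < 1.0,
    left=Node(label="A"),
    right=Node(
        decide=lambda x: x[1] < 0.0,
        left=Node(label="B"),
        right=Node(label="C"),
    ),
)

print(classify(root, [0.5, 3.0]))
print(classify(root, [2.0, -1.0]))
print(classify(root, [2.0, 1.0]))
```

With k classes, a sample passes through at most k - 1 sub-classifiers in such a tree, and fewer when the tree is balanced, which is one reason the tree shape affects classification speed.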

Color Quantization Based on Hierarchical Frequency Sensitive Competitive Learning

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2010.p0375 ◽

2010 ◽

Vol 14 (4) ◽

pp. 375-381

Author(s):

Jun Zhang ◽

Jinglu Hu

Keyword(s):

Binary Tree ◽

Competitive Learning ◽

Tree Structure ◽

Experimental Results ◽

Color Quantization ◽

Adaptive Procedure ◽

Root Node ◽

Binary Tree Structure

In this paper, we propose a Hierarchical Frequency Sensitive Competitive Learning (HFSCL) method for Color Quantization (CQ). In HFSCL, the appropriate number of quantized colors and the palette are obtained by an adaptive procedure that follows a binary tree structure with nodes and layers. Starting from the root node, which contains all colors in an image, the binary tree is grown until every node has been examined against the split conditions. In each node of the tree, a Frequency Sensitive Competitive Learning (FSCL) network performs a two-way division. To avoid over-splitting, a merging condition is defined to merge clusters that are close enough to each other at each layer. Experimental results show that the proposed HFSCL has the desired capability for CQ.
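The two-way division performed at each node can be illustrated with a small FSCL sketch. It uses the usual frequency-sensitive rule, in which each unit's distance is scaled by its win count so that frequently winning units are handicapped. The function name `fscl_two_way`, the learning rate, the epoch count, and the initialization from the first and last sample are assumptions for illustration; the split and merge conditions of HFSCL are not modeled.

```python
def fscl_two_way(colors, epochs=20, lr=0.1):
    """Split a list of RGB tuples into two groups with a 2-unit FSCL network."""
    # Assumed initialization: prototypes start at the first and last sample.
    w = [list(colors[0]), list(colors[-1])]
    wins = [1, 1]  # win counters that make the competition frequency sensitive

    def sqdist(c, p):
        return sum((a - b) ** 2 for a, b in zip(c, p))

    for _ in range(epochs):
        for c in colors:
            # Winner minimizes win_count * squared distance.
            scores = [wins[i] * sqdist(c, w[i]) for i in range(2)]
            k = 0 if scores[0] <= scores[1] else 1
            # Move only the winning prototype toward the sample.
            w[k] = [p + lr * (a - p) for a, p in zip(c, w[k])]
            wins[k] += 1

    # Final assignment uses the plain nearest prototype.
    groups = ([], [])
    for c in colors:
        k = 0 if sqdist(c, w[0]) <= sqdist(c, w[1]) else 1
        groups[k].append(c)
    return w, groups


dark = [(10, 10, 10), (20, 15, 12), (5, 8, 9)]
bright = [(240, 250, 245), (230, 235, 240), (250, 245, 250)]
prototypes, groups = fscl_two_way(dark + bright)
print(groups[0])
print(groups[1])
```

In a hierarchical scheme, each of the two resulting groups would become a child node and be examined for further splitting.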

Classification Algorithm for Naïve Bayes Based on Validity and Correlation

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.303-306.1609 ◽

2013 ◽

Vol 303-306 ◽

pp. 1609-1612

Author(s):

Huai Lin Dong ◽

Xiao Dan Zhu ◽

Qing Feng Wu ◽

Juan Juan Huang

Keyword(s):

Classification Accuracy ◽

Naive Bayes ◽

Classification Performance ◽

Naïve Bayes ◽

Classification Algorithm ◽

Experimental Results ◽

Training Data ◽

Improved Method ◽

Naive Bayes Classification

The Naïve Bayes classification algorithm based on validity (NBCABV) optimizes the training data by using validity to eliminate noise samples, which improves classification, but it ignores the associations between properties. To take these associations into account, an improved method, the classification algorithm for Naïve Bayes based on validity and correlation (CANBBVC), is proposed. It uses both validity and correlation to delete more noise samples, resulting in better classification performance. Experimental results show that this model achieves higher classification accuracy than the one based solely on validity.

Evolutionary Computation for Feature Selection in Classification

10.26686/wgtn.17134145 ◽

2021 ◽

Author(s):

Hoai Nguyen

Keyword(s):

Feature Selection ◽

Selection Process ◽

Fitness Function ◽

PSO Algorithm ◽

Surrogate Models ◽

Classification Performance ◽

Classification Algorithm ◽

Experimental Results ◽

Multi Objective ◽

Binary PSO

Classification aims to identify a class label of an instance according to the information from its characteristics or features. Unfortunately, many classification problems have a large feature set containing irrelevant and redundant features, which reduce the classification performance. To address this problem, feature selection is proposed to select a small subset of relevant features. There are three main types of feature selection methods, i.e., wrapper, embedded and filter approaches. Wrappers use a classification algorithm to evaluate candidate feature subsets. In embedded approaches, the selection process is embedded in the training process of a classification algorithm. Unlike the other two approaches, filters do not involve any classification algorithm during the selection process. Feature selection is an important process, but it is not an easy task due to its large search space and complex feature interactions. Because of its potential global search ability, Evolutionary Computation (EC), especially Particle Swarm Optimization (PSO), has been widely and successfully applied to feature selection. However, there is potential to improve the effectiveness and efficiency of EC-based feature selection.

The overall goal of this thesis is to investigate and improve the capability of EC for feature selection to select small feature subsets while maintaining or even improving the classification performance compared to using all features. Different aspects of feature selection are considered in this thesis, such as the number of objectives (single-objective/multi-objective), the fitness function (filter/wrapper), and the searching mechanism.

This thesis introduces a new fitness function based on mutual information, which is calculated by an estimation approach instead of the traditional counting approach. Results show that the estimation approach works well on both continuous and discrete data. More importantly, mutual information calculated by the estimation approach can capture feature interactions better than the traditional counting approach.

This thesis develops a novel binary PSO algorithm, which is the first work to redefine some core concepts of PSO, such as velocity and momentum, to suit the characteristics of binary search spaces. Experimental results show that the proposed binary PSO algorithm evolves better solutions than other binary EC algorithms when the search spaces are large and complex. Specifically, on feature selection, the proposed binary PSO algorithm can select smaller feature subsets with similar or better classification accuracies, especially when there are a large number of features.

This thesis proposes surrogate models for wrapper-based feature selection. The surrogate models use surrogate training sets, which are subsets of informative instances selected from the training set. Experimental results show that the proposed surrogate models help PSO reduce the computational cost while maintaining or even improving the classification performance compared to using only the original training set.

The thesis develops the first wrapper-based multi-objective feature selection algorithm using MOEA/D. A new decomposition strategy using multiple reference points for MOEA/D is designed, which can deal with different characteristics of multi-objective feature selection such as highly discontinuous Pareto fronts and complex relationships between objectives. The experimental results show that the proposed algorithm can evolve more diverse non-dominated sets than other multi-objective algorithms.

This thesis introduces the first PSO-based feature selection algorithm for transfer learning. In the proposed algorithm, the fitness function uses classification performance to reduce the differences between domains while maintaining the discriminative ability on the target domain. The experimental results show that the proposed algorithm can select feature subsets which achieve better classification performance than four state-of-the-art feature-based transfer learning algorithms.
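For reference, the traditional counting approach to mutual information mentioned in the abstract can be sketched as below for a single discrete feature and a class label. This illustrates only the baseline, not the thesis's estimation approach or its fitness function; the function name `mutual_information` is an assumption.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Counting-based mutual information I(X; Y) in nats for discrete data."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        # p(x, y) / (p(x) * p(y)) written with raw counts
        mi += p_xy * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


labels = [0, 0, 1, 1]
informative = [0, 0, 1, 1]   # feature that determines the label
irrelevant = [0, 1, 0, 1]    # feature that is independent of the label

print(mutual_information(informative, labels))
print(mutual_information(irrelevant, labels))
```

A filter-style fitness function would favor features with high mutual information with the label. Counting requires discrete values, which is one motivation for estimation approaches on continuous data.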

Application of Binary Tree Multi-class Classification Algorithm Based on SVM in Shift Decision for Engineering Vehicle

2007 IEEE International Conference on Control and Automation ◽

10.1109/icca.2007.4376678 ◽

2007 ◽

Author(s):

Shunjie Han ◽

Wen You ◽

Hui Li

Keyword(s):

Binary Tree ◽

Classification Algorithm ◽

Multi Class Classification

AR-Tri-Training: Tri-Training with Assistant Strategy

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.513-517.1840 ◽

2014 ◽

Vol 513-517 ◽

pp. 1840-1844 ◽

Cited By ~ 1

Author(s):

Long Jie Cui ◽

Hong Li Wang ◽

Rong Yi Cui

Keyword(s):

Learning Strategy ◽

Voice Recognition ◽

Classification Performance ◽

Experimental Results ◽

Training Algorithm ◽

Information Strategy ◽

Testing Rate ◽

Rich Information ◽

Validation Set

The classification performance of the classifier is weakened in Tri-training because the use of unlabeled samples introduces noise samples. In this paper, a new Tri-training style algorithm named AR-Tri-training (Tri-training with assistant and rich strategy) is proposed. First, the assistant learning strategy is introduced. Then a supporting learner is designed by combining the assistant learning strategy with the rich information strategy. The supporting learner reduces the number of mislabeled samples produced when the three classifiers label samples for one another across iterations; moreover, the unlabeled samples and the misclassified samples of the validation set can be fully used. The proposed algorithm is applied to voice recognition. The experimental results show that the AR-Tri-training algorithm can compensate for the shortcomings of the Tri-training algorithm and further improve the testing rate.

A Generic Algorithm to Determine Maximum Bottleneck Node Weight-based Data Gathering Trees for Wireless Sensor Networks

Network Protocols and Algorithms ◽

10.5296/npa.v7i3.7961 ◽

2015 ◽

Vol 7 (3) ◽

pp. 18 ◽

Cited By ~ 7

Author(s):

Natarajan Meghanathan

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Data Gathering ◽

Leaf Node ◽

Wireless Sensor ◽

Generic Algorithm ◽

Intermediate Node ◽

Root Node ◽

Bottleneck Node ◽

Node Weight

We propose a generic algorithm to determine maximum bottleneck node weight-based data gathering (MaxBNW-DG) trees for wireless sensor networks (WSNs) and compare the performance of the MaxBNW-DG trees with that of maximum and minimum link weight-based data gathering trees (MaxLW-DG and MinLW-DG trees). Assuming each node in a WSN graph has a weight, the bottleneck weight for the path from a node u to the root node of the DG tree is the minimum of the node weights on the path (inclusive of the weights of the end nodes). The MaxBNW-DG tree algorithm determines a DG tree such that each node has a path of the largest bottleneck weight to the root node. We observe that the MaxBNW-DG trees have a lower height, a larger percentage of nodes as leaf nodes, and a larger weight per intermediate node compared to the leaf nodes; the tradeoff is a larger network-wide data aggregation delay due to the larger number of child nodes per intermediate node. The MaxBNW-DG algorithm could be used to determine DG trees with a larger trust score or larger energy (or other such criteria for node weight) per intermediate node compared to the leaf nodes.
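The bottleneck-weight definition above lends itself to a Dijkstra-style search that always expands the node with the largest bottleneck weight found so far. The sketch below is one way to obtain such a tree under that definition; it is not necessarily the authors' algorithm, and the function name `max_bottleneck_tree` and the toy graph are assumptions.

```python
import heapq


def max_bottleneck_tree(adj, weight, root):
    """Return (parent, best), where best[v] is the largest achievable minimum
    node weight on a path from v to root, end nodes included."""
    best = {root: weight[root]}
    parent = {root: None}
    heap = [(-weight[root], root)]  # max-heap via negated bottleneck weights
    done = set()
    while heap:
        neg_b, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v in adj[u]:
            # Extending the path to v can only lower the bottleneck.
            cand = min(-neg_b, weight[v])
            if v not in done and cand > best.get(v, float("-inf")):
                best[v] = cand
                parent[v] = u
                heapq.heappush(heap, (-cand, v))
    return parent, best


# Toy WSN graph: node "c" can reach root "r" through "a" (weight 2) or "b" (weight 8).
adj = {"r": ["a", "b"], "a": ["r", "c"], "b": ["r", "c"], "c": ["a", "b"]}
weight = {"r": 9, "a": 2, "b": 8, "c": 5}
parent, best = max_bottleneck_tree(adj, weight, "r")
print(parent)
print(best)
```

In the toy graph, node "c" attaches to "b" rather than "a", because the path through "b" has bottleneck weight min(9, 8, 5) = 5 while the path through "a" has min(9, 2, 5) = 2.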

Automatic Ship Routing with High Reliability and Efficiency between Two Arbitrary Points at Sea

Journal of Navigation ◽

10.1017/s0373463318000814 ◽

2018 ◽

Vol 72 (2) ◽

pp. 430-446

Author(s):

Shuaidong Jia ◽

Zeyuan Dai ◽

Lihua Zhang

Keyword(s):

Binary Tree ◽

High Reliability ◽

Spatial Database ◽

Experimental Results ◽

Ship Routing ◽

Shortest Distance ◽

Tree Index ◽

Tree Method

Due to the limitations of the existing methods (for example, the route binary tree method) that can only automatically generate routes based on a single chart, a method for automatically generating the shortest distance route based on an obstacle spatial database is proposed. Using this proposed method, the route between two arbitrary points at sea can be automatically generated. First, the differences in accuracy and updating time of charts are quantitatively analysed. Next, the mechanism for updating obstacles is designed, an obstacle spatial database is constructed, and the obstacle data extracted from multiple charts are fused. Finally, considering the effect of efficiency on the amount of obstacle data, a route window and an improved R-tree index are designed for quickly extracting and querying the obstacle database. The experimental results demonstrate that compared with existing methods, the proposed method can generate the shortest distance between two arbitrary points at sea and eliminates the limitation of the area of the chart. In addition, with data from multiple charts, the route generated by the proposed method is more reliable than that of the existing methods, and it is more efficient.

Efficient pan-cancer whole-slide image classification and outlier detection using convolutional neural networks

10.1101/633123 ◽

2019 ◽

Cited By ~ 1

Author(s):

Seda Bilaloglu ◽

Joyce Wu ◽

Eduardo Fierro ◽

Raul Delgado Sanchez ◽

Paolo Santiago Ocampo ◽

...

Keyword(s):

Visual Analysis ◽

Classification Problem ◽

Classification Performance ◽

Neoplastic Tissue ◽

Multiple Tumor ◽

Slide Image ◽

Prediction Systems ◽

Multi Class Classification ◽

Whole Slide Images

Visual analysis of solid tissue mounted on glass slides is currently the primary method used by pathologists for determining the stage, type and subtypes of cancer. Although whole slide images are usually large (tens to hundreds of thousands of pixels wide), an exhaustive though time-consuming assessment is necessary to reduce the risk of misdiagnosis. In an effort to address the many diagnostic challenges faced by trained experts, recent research has focused on developing automatic prediction systems for this multi-class classification problem. Typically, complex convolutional neural network (CNN) architectures, such as Google’s Inception, are used to tackle this problem. Here, we introduce a greatly simplified CNN architecture, PathCNN, which allows for more efficient use of computational resources and better classification performance. Using this improved architecture, we trained simultaneously on whole-slide images from multiple tumor sites and corresponding non-neoplastic tissue. Dimensionality reduction analysis of the weights of the last layer of the network captures groups of images that faithfully represent the different types of cancer, while also highlighting differences in staining and capturing outliers, artifacts and misclassification errors. Our code is available online at: https://github.com/sedab/PathCNN.
