High-Dimensional Classification

Author(s):  
Jianqing Fan ◽  
Yingying Fan ◽  
Yichao Wu
Biometrics ◽  
2010 ◽  
Vol 66 (4) ◽  
pp. 1096-1106 ◽  
Author(s):  
Song Huang ◽  
Tiejun Tong ◽  
Hongyu Zhao

2014 ◽  
Vol 5 (3) ◽  
pp. 82-96 ◽  
Author(s):  
Marijana Zekić-Sušac ◽  
Sanja Pfeifer ◽  
Nataša Šarlija

Abstract. Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and post-processing stages. However, such reduction usually discards information and lowers the accuracy of the model. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing students' entrepreneurial intentions using machine learning methods. Methods/Approach: Four methods were tested on the same dataset and compared by classification accuracy: artificial neural networks, CART classification trees, support vector machines, and k-nearest neighbour. The performance of each method was compared on ten subsamples in a 10-fold cross-validation procedure in order to assess the sensitivity and specificity of each model. Results: The artificial neural network model based on a multilayer perceptron yielded a higher classification rate than the models produced by the other methods. The pairwise t-test showed a statistically significant difference between the artificial neural network and the k-nearest neighbour model, while the differences among the other methods were not statistically significant. Conclusions: The tested machine learning methods are able to learn fast and achieve high classification accuracy. However, further improvement can be achieved by testing a few additional methodological refinements in machine learning methods.
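The pairwise comparison described in this abstract can be illustrated with a minimal sketch of a paired t statistic computed over per-fold accuracies. The function name `paired_t_statistic` and its inputs are assumptions made for illustration; the abstract does not say how the authors computed their test, and no values from the paper are reproduced here.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic for two models scored on the same folds.

    scores_a and scores_b are per-fold accuracies (hypothetical inputs),
    listed in the same fold order. The statistic has len(scores_a) - 1
    degrees of freedom, i.e. 9 for a 10-fold procedure.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must be scored on the same folds")
    # Work on the fold-by-fold differences, since the samples are paired.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))
```

The resulting statistic would then be compared against a t distribution with the stated degrees of freedom to obtain a p-value.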


2020 ◽  
pp. 1-20
Author(s):  
Hong Chen ◽  
Changying Guo ◽  
Huijuan Xiong ◽  
Yingjie Wang

Sparse additive machines (SAMs) have attracted increasing attention in high-dimensional classification due to their representation flexibility and interpretability. However, most existing methods are formulated under a Tikhonov regularization scheme with the hinge loss, which is susceptible to outliers. To circumvent this problem, we propose a sparse additive machine with ramp loss (called ramp-SAM) to tackle classification and variable selection simultaneously. A misclassification error bound is established for ramp-SAM with the help of a detailed error decomposition and constructive hypothesis error analysis. To solve the nonsmooth and nonconvex ramp-SAM, a proximal block coordinate descent method is presented with convergence guarantees. The empirical effectiveness of our model is confirmed on simulated and benchmark datasets.
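To illustrate why a ramp loss is less sensitive to outliers than the hinge loss, here is a minimal sketch that assumes the common truncated-hinge form of the ramp loss, capped at 1. The abstract does not give the exact definition used in ramp-SAM, so the cap value and the function names are assumptions.

```python
def hinge_loss(margin: float) -> float:
    """Hinge loss on the margin y * f(x); grows without bound as the margin falls."""
    return max(0.0, 1.0 - margin)


def ramp_loss(margin: float) -> float:
    """Truncated hinge loss (assumed form): the hinge loss capped at 1.

    The cap bounds the penalty of any single badly misclassified point,
    at the cost of making the loss nonconvex.
    """
    return min(1.0, hinge_loss(margin))
```

For a point with margin -5, the hinge loss is 6 while this ramp loss is 1, so one extreme outlier contributes no more than any other misclassified point.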


Symmetry ◽  
2020 ◽  
Vol 12 (11) ◽  
pp. 1782
Author(s):  
Supailin Pichai ◽  
Khamron Sunat ◽  
Sirapat Chiewchanwattana

This paper presents a method for feature selection in a high-dimensional classification context. The proposed method finds a candidate solution based on quality criteria using subset searching. In this study, the competitive swarm optimization (CSO) algorithm was implemented to solve feature selection problems in high-dimensional data. A new asymmetric chaotic function, whose histogram is right-skewed, was proposed and used to generate the population and search for a CSO solution. The proposed method is named the asymmetric chaotic competitive swarm optimization algorithm (ACCSO). Owing to the asymmetrical property of the proposed chaotic map, ACCSO prefers zero to one. Therefore, the solution is very compact and can achieve high classification accuracy with a minimal feature subset for high-dimensional datasets. The proposed method was evaluated on 12 datasets, with dimensions ranging from 4 to 10,304. ACCSO was compared to the original CSO algorithm and other metaheuristic algorithms. Experimental results show that the proposed method can increase accuracy while reducing the number of selected features. Compared to different optimization algorithms with other wrappers, the proposed method exhibits excellent performance.
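As a rough illustration of how a competitive swarm optimizer can search for a feature subset, the sketch below implements the standard pairwise-competition update, in which the loser of each random pair learns from the winner and from the swarm mean. It is not ACCSO: the asymmetric chaotic map is replaced by uniform random numbers, and `toy_fitness` is a hypothetical stand-in for a wrapper classifier's error. All names and parameter values are assumptions.

```python
import random


def decode(position, threshold=0.5):
    """Map a continuous position in [0, 1]^d to a binary feature mask."""
    return [x > threshold for x in position]


def toy_fitness(mask, relevant):
    """Hypothetical objective (lower is better): number of relevant features
    missed, plus a small penalty per selected feature."""
    missed = sum(1 for j in relevant if not mask[j])
    return missed + 0.01 * sum(mask)


def cso_select(n_features, relevant, swarm_size=20, iters=100, phi=0.1, seed=0):
    """Plain CSO with uniform randomness; swarm_size is assumed to be even."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(n_features)] for _ in range(swarm_size)]
    vel = [[0.0] * n_features for _ in range(swarm_size)]
    for _ in range(iters):
        # Swarm mean position, computed once per iteration.
        mean = [sum(p[j] for p in pos) / swarm_size for j in range(n_features)]
        order = list(range(swarm_size))
        rng.shuffle(order)
        # Pair particles at random; only the loser of each pair is updated.
        for a, b in zip(order[::2], order[1::2]):
            fa = toy_fitness(decode(pos[a]), relevant)
            fb = toy_fitness(decode(pos[b]), relevant)
            win, lose = (a, b) if fa <= fb else (b, a)
            for j in range(n_features):
                r1, r2, r3 = rng.random(), rng.random(), rng.random()
                vel[lose][j] = (r1 * vel[lose][j]
                                + r2 * (pos[win][j] - pos[lose][j])
                                + phi * r3 * (mean[j] - pos[lose][j]))
                # Clamp so the position stays a valid point in [0, 1].
                pos[lose][j] = min(1.0, max(0.0, pos[lose][j] + vel[lose][j]))
    best = min(pos, key=lambda p: toy_fitness(decode(p), relevant))
    return decode(best)
```

A call such as `cso_select(10, {0, 3})` returns a boolean mask over ten features. In a real wrapper setting the toy objective would be replaced by a cross-validated classification error, and a generator biased toward small values would push more coordinates below the threshold, which matches the compactness the abstract describes.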

