Bayesian Trigonometric Support Vector Classifier

This letter describes Bayesian techniques for support vector classification. In particular, we propose a novel differentiable loss function, called the trigonometric loss function, which has the desirable characteristic of natural normalization in the likelihood function, and then follow standard Gaussian process techniques to set up a Bayesian framework. In this framework, Bayesian inference is used to implement model adaptation while keeping the merits of the support vector classifier, such as sparseness and convex programming. This differs from standard Gaussian processes for classification. Moreover, the framework provides class probabilities when making predictions. Experimental results on benchmark data sets indicate the usefulness of this approach.
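To make the "natural normalization" property concrete, the sketch below uses one form of the trigonometric loss that is consistent with the description above. The abstract itself does not state the formula, so the exact expression, and the function names `trig_likelihood` and `trig_loss`, are assumptions for illustration only. The likelihood is taken as cos²(π/4 · (1 − y·f)) for margins in (−1, 1), and the loss is its negative logarithm.

```python
import math


def trig_likelihood(margin):
    """Assumed likelihood P(y | f) as a function of margin = y * f."""
    if margin <= -1.0:
        return 0.0
    if margin >= 1.0:
        return 1.0
    return math.cos(math.pi / 4.0 * (1.0 - margin)) ** 2


def trig_loss(margin):
    """Negative log-likelihood: zero beyond the margin, infinite at or below -1."""
    if margin >= 1.0:
        return 0.0
    if margin <= -1.0:
        return math.inf
    return -math.log(trig_likelihood(margin))


if __name__ == "__main__":
    # The two class likelihoods sum to one for any latent value f,
    # so no extra normalizing constant is needed.
    for f in (-0.9, -0.3, 0.0, 0.5, 0.99):
        total = trig_likelihood(f) + trig_likelihood(-f)
        print(f"f={f:+.2f}  P(+1|f)+P(-1|f)={total:.12f}")
```

Under this assumed form, the sum is one because cos²(π/4 − a) + cos²(π/4 + a) = 1 for every a, and the loss is exactly zero for margins of at least 1, which is what allows sparseness to be preserved.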

Artificial bee colony algorithm for feature selection and improved support vector machine for text classification

Information Discovery and Delivery ◽

10.1108/idd-09-2018-0045 ◽

2019 ◽

Vol 47 (3) ◽

pp. 154-170

Author(s):

Janani Balakumar ◽

S. Vijayarani Mohan

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Text Classification ◽

Support Vector ◽

Data Sets ◽

Selection Algorithm ◽

Data Set ◽

Content Type ◽

Benchmark Data ◽

Bee Colony

Purpose: Owing to the huge volume of documents available on the internet, text classification has become a necessary task for handling these documents. To achieve optimal text classification results, feature selection, an important stage, is used to reduce the dimensionality of text documents by choosing suitable features. The main purpose of this research work is to classify the documents on a personal computer based on their content. Design/methodology/approach: This paper proposes a new algorithm for feature selection based on artificial bee colony (ABCFS) to enhance text classification accuracy. The proposed algorithm (ABCFS) is evaluated on real and benchmark data sets and compared with other existing feature selection approaches such as information gain and the χ2 statistic. To justify the efficiency of the proposed algorithm, the support vector machine (SVM) and an improved SVM classifier are used in this paper. Findings: The experiment was conducted on real and benchmark data sets. The real data set was collected in the form of documents that were stored on a personal computer, and the benchmark data set was collected from the Reuters and 20 Newsgroups corpora. The results demonstrate the performance of the proposed feature selection algorithm through improved text document classification accuracy. Originality/value: This paper proposes a new ABCFS algorithm for feature selection, evaluates the efficiency of the ABCFS algorithm and improves the support vector machine. In this paper, the ABCFS algorithm is used to select the features from text (unstructured) documents. Although there is no text feature selection algorithm in the existing work, the ABCFS algorithm is used to select the data (structured) features. The proposed algorithm will classify the documents automatically based on their content.

A Robust Regression Framework with Laplace Kernel-Induced Loss

Neural Computation ◽

10.1162/neco_a_01002 ◽

2017 ◽

Vol 29 (11) ◽

pp. 3014-3039 ◽

Cited By ~ 9

Author(s):

Liming Yang ◽

Zhuo Ren ◽

Yidan Wang ◽

Hongwei Dong

Keyword(s):

Loss Function ◽

Near Infrared ◽

Robust Regression ◽

Continuous Optimization ◽

Optimization Method ◽

Support Vector ◽

Data Sets ◽

Difference Of Convex Functions ◽

Noisy Input ◽

Regression Framework

This work proposes a robust regression framework with a nonconvex loss function. Two regression formulations are presented based on the Laplace kernel-induced loss (LK-loss). Moreover, we illustrate that the LK-loss function is a good approximation of the zero-norm. However, the nonconvexity of the LK-loss makes it difficult to optimize. A continuous optimization method is developed to solve the proposed framework. The problems are formulated as DC (difference of convex functions) programs. The corresponding DC algorithms (DCAs) converge linearly. Furthermore, the proposed algorithms are applied directly to determine the hardness of licorice seeds using near-infrared spectral data with noisy input. Experiments in eight spectral regions show that the proposed methods improve generalization compared with traditional support vector regression (SVR), especially in high-frequency regions. Experiments on several benchmark data sets demonstrate that the proposed methods achieve better results than traditional regression methods on most of the data sets considered.
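A small sketch may help show why a Laplace kernel-induced loss behaves like the zero-norm. It assumes the usual kernel-induced distance construction, in which the loss on a residual u is proportional to 1 − k(u, 0) with the Laplace kernel k(u, 0) = exp(−|u|/σ). The abstract does not give the formula, so the scaling, the parameter name `sigma`, and the function name `lk_loss` are assumptions.

```python
import math


def lk_loss(residual, sigma=1.0):
    """Assumed LK-loss: 1 - exp(-|u| / sigma), bounded in [0, 1)."""
    return 1.0 - math.exp(-abs(residual) / sigma)


if __name__ == "__main__":
    residuals = [0.0, 0.01, 0.5, 5.0, 500.0]
    for sigma in (1.0, 0.1, 0.001):
        values = [round(lk_loss(u, sigma), 4) for u in residuals]
        print(f"sigma={sigma}: {values}")
```

As `sigma` shrinks, the loss tends to 0 for a zero residual and to 1 for any nonzero residual, which is the indicator that the zero-norm counts. The loss is also bounded, so a single large outlier cannot dominate the objective, but it is nonconvex, which is why a method such as DC programming is needed.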

Reduction from Cost-Sensitive Ordinal Ranking to Weighted Binary Classification

Neural Computation ◽

10.1162/neco_a_00265 ◽

2012 ◽

Vol 24 (5) ◽

pp. 1329-1367 ◽

Cited By ~ 55

Author(s):

Hsuan-Tien Lin ◽

Ling Li

Keyword(s):

Binary Classification ◽

Upper Bounds ◽

Support Vector ◽

Data Sets ◽

Binary Classifier ◽

Generalization Bounds ◽

Ordinal Ranking ◽

Benchmark Data ◽

Ranking Algorithms ◽

Ranking Performance

We present a reduction framework from ordinal ranking to binary classification. The framework consists of three steps: extracting extended examples from the original examples, learning a binary classifier on the extended examples with any binary classification algorithm, and constructing a ranker from the binary classifier. Based on the framework, we show that a weighted 0/1 loss of the binary classifier upper-bounds the mislabeling cost of the ranker, both error-wise and regret-wise. Our framework allows not only the design of good ordinal ranking algorithms based on well-tuned binary classification approaches, but also the derivation of new generalization bounds for ordinal ranking from known bounds for binary classification. In addition, our framework unifies many existing ordinal ranking algorithms, such as perceptron ranking and support vector ordinal regression. When compared empirically on benchmark data sets, some of our newly designed algorithms enjoy advantages in terms of both training speed and generalization performance over existing algorithms. In addition, the newly designed algorithms lead to better cost-sensitive ordinal ranking performance, as well as improved listwise ranking performance.
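The three steps of the reduction can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes ranks 1..K, a per-example cost vector, extended examples of the form (x, k) labeled by whether the true rank exceeds k, and weights equal to the absolute difference of adjacent costs. The names `extend_example` and `rank_from_binary` are hypothetical.

```python
def extend_example(x, y, num_ranks, cost):
    """Step 1: turn one ranked example into num_ranks - 1 weighted binary examples.

    cost[k - 1] is the assumed cost of predicting rank k for this example.
    """
    extended = []
    for k in range(1, num_ranks):
        label = 1 if y > k else -1                 # "is the rank greater than k?"
        weight = abs(cost[k - 1] - cost[k])        # importance of getting threshold k right
        extended.append(((x, k), label, weight))
    return extended


def rank_from_binary(g, x, num_ranks):
    """Step 3: build a ranker by counting thresholds the binary classifier g says are exceeded."""
    return 1 + sum(1 for k in range(1, num_ranks) if g(x, k) > 0)


if __name__ == "__main__":
    num_ranks = 4
    true_rank = 3
    # Absolute cost |true_rank - k| for k = 1..4.
    cost = [abs(true_rank - k) for k in range(1, num_ranks + 1)]
    print(extend_example("doc", true_rank, num_ranks, cost))

    # Step 2 would train any weighted binary learner on the extended examples;
    # here a perfect classifier stands in for the learned one.
    def perfect_g(x, k):
        return 1 if true_rank > k else -1

    print(rank_from_binary(perfect_g, "doc", num_ranks))
```

Step 2 is left to any weighted binary classification algorithm, which is the point of the reduction: improvements in binary learners carry over to ordinal ranking.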

A hierarchical classification scheme for computationally efficient damage classification

Proceedings of the Institution of Mechanical Engineers Part G Journal of Aerospace Engineering ◽

10.1243/09544100jaero428 ◽

2009 ◽

Vol 223 (5) ◽

pp. 497-505 ◽

Cited By ~ 3

Author(s):

C K Coelho ◽

S Das ◽

A Chattopadhyay

Keyword(s):

Matching Pursuit ◽

Classification Tree ◽

Hierarchical Classification ◽

Large Data ◽

Sensor Data ◽

Support Vector ◽

Data Sets ◽

Computationally Efficient ◽

Damage Classification ◽

Set Up

This article presents a methodology for data mining of sensor signals in a structural health monitoring (SHM) framework for damage classification using a machine-learning-based approach called support vector machines (SVMs). A hierarchical decision tree structure is constructed for damage classification and experiments were conducted on metallic and composite test specimens with surface mounted piezoelectric transducers. Damage was induced in the specimens by fatigue, impact, and tensile loading; in addition, specimens with seeded delaminations were also considered. Data were collected from the surface mounted sensors at different severities of induced damage. A matching pursuit decomposition (MPD) algorithm was used as a feature extraction technique to preprocess the sensor data and extract the input vectors used in classification. Using this binary tree framework, the computational intensity of each successive classifier is reduced and the efficiency of the algorithm as a whole is increased. The results obtained using this classification show that this type of architecture works well for large data sets because a reduced number of comparisons are required. Due to the hierarchical set-up of the classifiers, performance of the classifier as a whole is heavily dependent on the performance of the classifier at higher levels in the classification tree.

Faculty Opinions recommendation of Benchmark data sets for structure-based computational target prediction.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718516631.793500133 ◽

2014 ◽

Author(s):

Vytas Bankaitis ◽

Ashutosh Tripathi

Keyword(s):

Target Prediction ◽

Data Sets ◽

Benchmark Data

Identification of Candidate Genetic Markers and a Novel 4-genes Diagnostic Model in Osteoarthritis through Integrating Multiple Microarray Data

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207323666200428120310 ◽

2020 ◽

Vol 23 (8) ◽

pp. 805-813

Author(s):

Ai Jiang ◽

Peng Xu ◽

Zhenda Zhao ◽

Qizhao Tan ◽

Shang Sun ◽

...

Keyword(s):

Signaling Pathway ◽

Microarray Data ◽

Differential Expression Analysis ◽

Enrichment Analysis ◽

Mapk Signaling ◽

Functional Enrichment ◽

Joint Disease ◽

Support Vector ◽

Diagnostic Model ◽

Data Sets

Background: Osteoarthritis (OA) is a joint disease that leads to a high disability rate and a low quality of life. With the development of modern molecular biology techniques, some key genes and diagnostic markers have been reported. However, the etiology and pathogenesis of OA are still unknown. Objective: To develop a gene signature in OA. Method: In this study, five microarray data sets were integrated to conduct a comprehensive network and pathway analysis of the biological functions of OA-related genes, which can provide valuable information and further explore the etiology and pathogenesis of OA. Results and Discussion: Differential expression analysis identified 180 genes with significantly differential expression in OA. Functional enrichment analysis showed that the up-regulated genes were associated with rheumatoid arthritis (p < 0.01). Down-regulated genes regulate the biological processes of negative regulation of kinase activity and some signaling pathways such as the MAPK signaling pathway (p < 0.001) and the IL-17 signaling pathway (p < 0.001). In addition, the OA-specific protein-protein interaction (PPI) network was constructed based on the differentially expressed genes. The analysis of network topological attributes showed that the differentially upregulated VEGFA, MYC, ATF3 and JUN genes were hub genes of the network, which may influence the occurrence and development of OA through regulating cell cycle or apoptosis, and were potential biomarkers of OA. Finally, the support vector machine (SVM) method was used to establish the diagnosis model of OA, which not only had excellent predictive power in internal and external data sets (AUC > 0.9), but also had high predictive performance on different chip platforms (AUC > 0.9) and was effective in blood samples (AUC > 0.8). Conclusion: The 4-genes diagnostic model may be of great help to the early diagnosis and prediction of OA.

Classification of jujube defects in small data sets based on transfer learning

Neural Computing and Applications ◽

10.1007/s00521-021-05715-2 ◽

2021 ◽

Author(s):

Jianping Ju ◽

Hong Zheng ◽

Xiaohang Xu ◽

Zhongyuan Guo ◽

Zhaohui Zheng ◽

...

Keyword(s):

Transfer Learning ◽

Loss Function ◽

Training Model ◽

Parameter Distribution ◽

Test Accuracy ◽

Small Data ◽

Data Sets ◽

Data Set ◽

Small Data Sets

Although convolutional neural networks have achieved success in the field of image classification, there are still challenges in the field of agricultural product quality sorting such as machine vision-based jujube defects detection. The performance of jujube defect detection mainly depends on the feature extraction and the classifier used. Due to the diversity of the jujube materials and the variability of the testing environment, the traditional method of manually extracting the features often fails to meet the requirements of practical application. In this paper, a jujube sorting model in small data sets based on convolutional neural network and transfer learning is proposed to meet the actual demand of jujube defects detection. Firstly, the original images collected from the actual jujube sorting production line were pre-processed, and the data were augmented to establish a data set of five categories of jujube defects. The original CNN model is then improved by embedding the SE module and using the triplet loss function and the center loss function to replace the softmax loss function. Finally, the depth pre-training model on the ImageNet image data set was used to conduct training on the jujube defects data set, so that the parameters of the pre-training model could fit the parameter distribution of the jujube defects image, and the parameter distribution was transferred to the jujube defects data set to complete the transfer of the model and realize the detection and classification of the jujube defects. The classification results are visualized by heatmap through the analysis of classification accuracy and confusion matrix compared with the comparison models. The experimental results show that the SE-ResNet50-CL model optimizes the fine-grained classification problem of jujube defect recognition, and the test accuracy reaches 94.15%. The model has good stability and high recognition accuracy in complex environments.

Pinball Loss Twin Support Vector Clustering

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3409264 ◽

2021 ◽

Vol 17 (2s) ◽

pp. 1-23

Author(s):

M. Tanveer ◽

Tarun Gupta ◽

Miten Shah ◽

Keyword(s):

Loss Function ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Structural Mri ◽

Twin Support Vector Machine ◽

Support Vector ◽

Support Vector Clustering ◽

Hinge Loss ◽

Pinball Loss ◽

Vector Clustering

Twin Support Vector Clustering (TWSVC) is a clustering algorithm inspired by the principles of Twin Support Vector Machine (TWSVM). TWSVC has already outperformed other traditional plane-based clustering algorithms. However, TWSVC uses hinge loss, which maximizes the shortest distance between clusters and hence suffers from noise sensitivity and low re-sampling stability. In this article, we propose Pinball loss Twin Support Vector Clustering (pinTSVC) as a clustering algorithm. The proposed pinTSVC model incorporates the pinball loss function in the plane clustering formulation. The pinball loss function introduces favorable properties such as noise insensitivity and re-sampling stability. The time complexity of the proposed pinTSVC remains equivalent to that of TWSVC. Extensive numerical experiments on noise-corrupted benchmark UCI and artificial datasets are provided. Results of the proposed pinTSVC model are compared with TWSVC, Twin Bounded Support Vector Clustering (TBSVC) and Fuzzy c-means clustering (FCM). Detailed and exhaustive comparisons demonstrate the better performance and generalization of the proposed pinTSVC for noise-corrupted datasets. Further experiments and analysis of the above-mentioned clustering algorithms on structural MRI (sMRI) images taken from the ADNI database, face clustering, and facial expression clustering demonstrate the effectiveness and feasibility of the proposed pinTSVC model.
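The contrast between hinge loss and pinball loss can be illustrated with the generic margin-based definitions commonly used for pinball-loss SVMs. This sketch does not reproduce the pinTSVC plane clustering formulation, which the abstract does not spell out; the parameter name `tau` and the function names are assumptions.

```python
def hinge_loss(margin):
    """Hinge loss: penalizes only points with margin below 1."""
    return max(0.0, 1.0 - margin)


def pinball_loss(margin, tau=0.5):
    """Pinball loss: also penalizes points far beyond the margin, scaled by tau."""
    u = 1.0 - margin
    return u if u >= 0 else -tau * u


if __name__ == "__main__":
    for m in (-1.0, 0.0, 1.0, 2.0, 5.0):
        print(f"margin={m:+.1f}  hinge={hinge_loss(m):.2f}  pinball={pinball_loss(m):.2f}")
```

Because the hinge loss is zero for every point with margin above 1, the solution depends only on the points nearest the boundary, so noise around them shifts the result. The pinball loss assigns a nonzero penalty on both sides, so all points contribute and the fit is tied to a quantile-like distance rather than the shortest one. With `tau=0` the pinball loss reduces to the hinge loss.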
