STRUCTURE-EMBEDDED AUC-SVM

AUC-SVM directly maximizes the area under the ROC curve (AUC) by minimizing its hinge loss relaxation, and its decision function is determined by support vector sample pairs, which play the same role as the support vector samples in SVM. Such a learning paradigm places more emphasis on the local discriminative information associated with these support vectors and hardly takes the overall view of the data into account, so it may lose global distribution information that is favorable for classification. Moreover, because the number of training sample pairs is quadratic in the number of samples, AUC-SVM has high computational complexity, so sampling is usually adopted, which incurs a further loss of distribution information. To compensate for this loss and simultaneously boost AUC-SVM performance, in this paper we develop a novel structure-embedded AUC-SVM (SAUC-SVM for short) by embedding the global structure information of the whole data into AUC-SVM. With such an embedding, the proposed SAUC-SVM incorporates the local discriminative information and the global structure information of the data into a uniform formulation and consequently guarantees better generalization performance. Comparative experiments on both synthetic and real datasets confirm its effectiveness.
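
The pairwise hinge relaxation mentioned in this abstract can be illustrated with a minimal sketch. It is not the SAUC-SVM formulation itself: the function name and the toy scores are assumptions made here, and the regularizer and structure term are left out. It only shows that the empirical AUC and its hinge surrogate are both averages over all positive-negative pairs, which is why the pair count grows quadratically.

```python
def auc_and_pairwise_hinge(scores_pos, scores_neg):
    """Return (empirical AUC, mean pairwise hinge loss) for two score lists.

    Hypothetical helper for illustration only; every positive score is
    compared with every negative score, so the cost is
    len(scores_pos) * len(scores_neg).
    """
    n_pairs = len(scores_pos) * len(scores_neg)
    correct = 0.0   # pairs ranked correctly (ties count one half)
    hinge = 0.0     # sum of max(0, 1 - (s_pos - s_neg)) over pairs
    for sp in scores_pos:
        for sn in scores_neg:
            diff = sp - sn
            if diff > 0:
                correct += 1.0
            elif diff == 0:
                correct += 0.5
            hinge += max(0.0, 1.0 - diff)
    return correct / n_pairs, hinge / n_pairs


# Toy decision values f(x) for two positive and two negative samples.
auc, hinge = auc_and_pairwise_hinge([2.0, 0.5], [0.0, 1.0])
print(auc, hinge)
```

Because the hinge term is at least 1 whenever a pair is misordered or tied, the mean pairwise hinge loss upper-bounds 1 - AUC, so driving it down pushes the AUC up.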

Prediction of Antisense Oligonucleotide Efficacy Using Local and Global Structure Information with Support Vector Machines

2006 5th International Conference on Machine Learning and Applications (ICMLA'06) ◽

10.1109/icmla.2006.39 ◽

2006 ◽

Author(s):

Roger Craig ◽

Li Liao

Keyword(s):

Support Vector Machines ◽

Antisense Oligonucleotide ◽

Global Structure ◽

Support Vector ◽

Structure Information ◽

Vector Machines

KERNEL METHODS FOR CLUSTERING: COMPETITIVE LEARNING AND c-MEANS

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488506004138 ◽

2006 ◽

Vol 14 (04) ◽

pp. 481-493 ◽

Cited By ~ 4

Author(s):

RYO INOKUCHI ◽

SADAAKI MIYAMOTO

Keyword(s):

Computational Complexity ◽

Kernel Methods ◽

Clustering Algorithms ◽

Competitive Learning ◽

Machine Learning Algorithms ◽

Support Vector ◽

Alternative Formulation ◽

Self Organizing Map ◽

Data Set ◽

High Computational Complexity

Recently, kernel methods from support vector machines have been widely used in machine learning algorithms to obtain nonlinear models. Clustering is an unsupervised learning method that divides a whole data set into subgroups, and popular clustering algorithms such as c-means now employ kernel methods. Other kernel-based clustering algorithms have been inspired by kernel c-means. However, the formulation of kernel c-means has high computational complexity. This paper gives an alternative formulation of kernel-based clustering algorithms derived from competitive learning clustering. This new formulation uses sequential updating, or on-line learning, to avoid the high computational complexity. We apply kernel methods to related algorithms: learning vector quantization and the self-organizing map. We also consider kernel methods for sequential c-means and its fuzzy version under the proposed formulation.
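
To make the cost of batch kernel c-means concrete, here is a small sketch of the standard feature-space distance between a point and a cluster mean, written only in terms of kernel evaluations. It is not the sequential formulation proposed in the paper; the function names (`rbf`, `kernel_dist2`, `assign`) and the toy data are assumptions made for illustration.

```python
import math


def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length tuples."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_dist2(x, cluster, kernel):
    """Squared feature-space distance from phi(x) to the mean of phi(cluster).

    ||phi(x) - m||^2 = K(x,x) - (2/n) sum_j K(x,z_j) + (1/n^2) sum_jl K(z_j,z_l)
    The last term is a double sum over the cluster, which is the source
    of the high cost in the batch formulation.
    """
    n = len(cluster)
    cross = sum(kernel(x, z) for z in cluster) / n
    within = sum(kernel(z, w) for z in cluster for w in cluster) / (n * n)
    return kernel(x, x) - 2.0 * cross + within


def assign(x, clusters, kernel):
    """Index of the non-empty cluster whose feature-space mean is nearest."""
    candidates = [k for k, c in enumerate(clusters) if c]
    return min(candidates, key=lambda k: kernel_dist2(x, clusters[k], kernel))


clusters = [[(0.1,), (0.2,)], [(5.0,)]]
print(assign((0.0,), clusters, rbf))
```

With a linear kernel the same expression reduces to the ordinary squared Euclidean distance to the cluster mean, which is a convenient sanity check.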

Distributed Kernel Extreme Learning Machines for Aircraft Engine Failure Diagnostics

Applied Sciences ◽

10.3390/app9081707 ◽

2019 ◽

Vol 9 (8) ◽

pp. 1707 ◽

Cited By ~ 2

Author(s):

Junjie Lu ◽

Jinquan Huang ◽

Feng Lu

Keyword(s):

Computational Complexity ◽

Real Time ◽

Evidence Theory ◽

Aircraft Engine ◽

Training Sample ◽

Performance Estimation ◽

Support Vector ◽

Extreme Learning Machines ◽

The Real ◽

Learning Machines

Kernel extreme learning machine (KELM) has been widely studied in the field of aircraft engine fault diagnostics due to its easy implementation. However, because its computational complexity is proportional to the training sample size, its application in time-sensitive scenarios is limited. With large-scale samples, the original KELM therefore struggles to meet the real-time requirements of aircraft engine onboard conditions. To address this shortcoming, a novel distributed kernel extreme learning machines (DKELMs) algorithm is proposed in this paper. Distributed subnetworks are adopted to reduce the computational complexity, and the likelihood probability and Dempster-Shafer (DS) evidence theory are then used to design a fusion scheme that ensures accuracy is not reduced after fusion. Verification on benchmark datasets shows that the algorithm can greatly reduce the computational complexity and improve the real-time performance of the original KELM algorithm without sacrificing model accuracy. Finally, performance estimation and fault pattern recognition experiments on an aircraft engine show that, compared with the original KELM algorithm and the support vector machine (SVM) algorithm, the proposed algorithm performs best when both real-time capability and model accuracy are considered.
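
The abstract does not spell out the fusion scheme, so the following is only a sketch of Dempster's rule of combination, the core operation of DS evidence theory, applied to two hypothetical subnetwork outputs. The function name, the class labels, and the mass values are all assumptions made here and do not come from the paper.

```python
def dempster_combine(m1, m2):
    """Combine two basic probability assignments with Dempster's rule.

    Each assignment maps a frozenset of hypotheses to its mass.
    Mass falling on an empty intersection is treated as conflict and
    normalized away.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                combined[common] = combined.get(common, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


FAULT = frozenset({"fault"})
NORMAL = frozenset({"normal"})
EITHER = frozenset({"fault", "normal"})

# Hypothetical beliefs from two subnetworks about one engine sample.
sub1 = {FAULT: 0.6, EITHER: 0.4}
sub2 = {FAULT: 0.5, NORMAL: 0.3, EITHER: 0.2}
print(dempster_combine(sub1, sub2))
```

In this toy case both sources lean toward "fault", so the combined mass on "fault" ends up higher than either source assigned on its own.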

A Learning Framework of Nonparallel Hyperplanes Classifier

The Scientific World Journal ◽

10.1155/2015/497617 ◽

2015 ◽

Vol 2015 ◽

pp. 1-12

Author(s):

Zhi-Xia Yang ◽

Yuan-Hai Shao ◽

Yao-Lin Jiang

Keyword(s):

Binary Classification ◽

Computational Cost ◽

Classification Problem ◽

Multiclass Classification ◽

Decision Function ◽

Support Vector ◽

Learning Framework ◽

Hinge Loss ◽

Benchmark Datasets ◽

Nonparallel Hyperplanes

A novel learning framework of nonparallel hyperplanes support vector machines (NPSVMs) is proposed for binary and multiclass classification. Depending on the parameters or loss functions chosen, this framework not only includes twin SVM (TWSVM) and many of its variants but also extends them to the multiclass classification problem. Concretely, we discuss the linear and nonlinear cases of the framework, taking the hinge loss function as an example. Moreover, we give the primal problems of several extensions of the TWSVM variants. It is worth mentioning that, in the decision function, the Euclidean distance is replaced by the absolute value |w^T x + b|, which keeps the decision function consistent with the optimization problem and reduces the computational cost, particularly when a kernel function is introduced. Numerical experiments on several artificial and benchmark datasets indicate that our framework is not only fast but also generalizes well.
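
The decision rule described here can be sketched as follows: each class has its own hyperplane, and a sample goes to the class whose hyperplane gives the smallest |w^T x + b|. The function name `predict`, the `normalize` flag, and the two toy hyperplanes are assumptions for illustration; the flag switches to the Euclidean distance |w^T x + b| / ||w|| so the two rules can be compared.

```python
import math


def predict(x, planes, normalize=False):
    """Return the index of the class whose hyperplane is 'closest' to x.

    planes is a list of (w, b) pairs, one per class.
    normalize=False uses |w^T x + b| (the rule described in the abstract);
    normalize=True divides by ||w||, giving the Euclidean distance.
    """
    def score(w, b):
        value = abs(sum(wi * xi for wi, xi in zip(w, x)) + b)
        if normalize:
            value /= math.sqrt(sum(wi * wi for wi in w))
        return value

    return min(range(len(planes)), key=lambda k: score(*planes[k]))


# Two hypothetical class hyperplanes: x1 = 0 and 4*x2 - 4 = 0.
planes = [((1.0, 0.0), 0.0), ((0.0, 4.0), -4.0)]
x = (0.5, 0.8)
print(predict(x, planes), predict(x, planes, normalize=True))
```

The unnormalized rule skips computing ||w||, which in the kernel case would require an extra quadratic form over the training kernel matrix. The two rules can disagree when the hyperplanes have different norms, as in this toy example.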

Robust relative margin support vector machines

Journal of Algorithms & Computational Technology ◽

10.1177/1748301816680503 ◽

2016 ◽

Vol 11 (2) ◽

pp. 186-191 ◽

Cited By ~ 1

Author(s):

Yunyan Song ◽

Wenxin Zhu ◽

Yingyuan Xiao ◽

Ping Zhong

Keyword(s):

Computational Complexity ◽

Binary Classification ◽

Support Vector ◽

Decision Boundary ◽

Large Margin ◽

Hinge Loss ◽

Shortest Distance ◽

Vector Machines ◽

Pinball Loss ◽

Real World Problems

Recently, a class of classifiers called relative margin machines has been developed. Relative margin machines have shown significant improvements over their large margin counterparts on real-world problems. In binary classification, the most widely used loss function is the hinge loss, which results in the hinge loss relative margin machine. The hinge loss relative margin machine is sensitive to outliers. In this article, we propose replacing the maximization of the shortest distance used in the relative margin machine with maximization of the quantile distance, using the pinball loss, which is related to quantiles, for classification. The proposed method is less sensitive to noise, especially feature noise around the decision boundary. Meanwhile, the computational complexity of the proposed method is similar to that of the relative margin machine.
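
For reference, the two losses contrasted in this abstract can be written in a few lines. This sketch uses the usual classification form of the pinball loss with u = 1 - y f(x); the parameter name `tau` and the sample margins are assumptions made here, and the relative margin constraints of the paper are not modeled.

```python
def hinge_loss(margin):
    """Hinge loss as a function of the margin y * f(x)."""
    return max(0.0, 1.0 - margin)


def pinball_loss(margin, tau=0.5):
    """Pinball loss with u = 1 - margin.

    u >= 0: penalty u (same as hinge).
    u <  0: penalty -tau * u, so points far on the correct side are
            also penalized, which ties the solution to a quantile of
            the margins rather than to the closest points only.
    """
    u = 1.0 - margin
    return u if u >= 0 else -tau * u


for m in (-1.0, 0.5, 1.0, 2.0, 4.0):
    print(m, hinge_loss(m), pinball_loss(m, tau=0.5))
```

Setting `tau=0` recovers the hinge loss, so the pinball loss can be read as a one-parameter generalization of it.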

FUZZY TRANSDUCTIVE SUPPORT VECTOR MACHINES FOR HYPERTEXT CLASSIFICATION

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s021848850400262x ◽

2004 ◽

Vol 12 (01) ◽

pp. 21-36 ◽

Cited By ~ 5

Author(s):

HONG LIU ◽

SHANG-TENG HUANG

Keyword(s):

Support Vector Machines ◽

Classification Performance ◽

Decision Function ◽

Experimental Results ◽

Support Vector ◽

Structure Information ◽

Plain Text ◽

Vector Machines ◽

Text Information ◽

Fuzzy Labels

A method for assigning fuzzy labels to unlabeled hypertext documents based on hyperlink structure information is first proposed. Then, the construction of fuzzy transductive support vector machines is described, and an algorithm to train them is presented. Whereas transductive support vector machines treat all test examples equally, fuzzy transductive support vector machines treat test examples differently according to their fuzzy labels, which yields a more reliable decision function. Experimental results on the WebKB corpus show that, by fusing the plain text information and the hyperlink structure information, much better classification performance can be achieved.

Influence maximization based on partial network structure information: A comparative analysis on seed selection heuristics

International Journal of Modern Physics C ◽

10.1142/s0129183117501224 ◽

2017 ◽

Vol 28 (10) ◽

pp. 1750122 ◽

Cited By ~ 1

Author(s):

Şirag Erkol ◽

Gönenç Yücel

Keyword(s):

Computational Complexity ◽

Simulation Model ◽

Optimization Problem ◽

Partial Information ◽

Complete Information ◽

Seed Selection ◽

Structure Information ◽

Network Information ◽

The Times ◽

High Computational Complexity

In this study, the problem of seed selection is investigated. This problem is mainly treated as an optimization problem, which has been proved to be NP-hard. There are several heuristic approaches in the literature, most of which use algorithmic heuristics. These approaches mainly focus on the trade-off between computational complexity and accuracy. Although the accuracy of algorithmic heuristics is high, they also have high computational complexity. Furthermore, the literature generally assumes that complete information on the structure and features of a network is available, which is often not the case. For the study, a simulation model is constructed that is capable of creating networks, performing seed selection heuristics, and simulating diffusion models. Novel metric-based seed selection heuristics that rely only on partial information are proposed and tested using the simulation model. These heuristics use local information available from nodes in the synthetically created networks. The performance of the heuristics is comparatively analyzed on three different network types. The results clearly show that the performance of a heuristic depends on the structure of the network, so a heuristic should be selected after investigating the properties of the network at hand. More importantly, the partial-information approach provided promising results. In certain cases, selection heuristics that rely only on partial network information perform very close to similar heuristics that require complete network data.
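
The abstract does not name the metrics its heuristics use, so the sketch below uses node degree purely as a stand-in for "local information available from nodes". The function name, the adjacency-list representation, and the toy network are assumptions made for illustration, not the authors' heuristics.

```python
def select_seeds_by_degree(adjacency, k):
    """Pick k seed nodes using only each node's own neighbor count.

    adjacency maps a node to the list of its neighbors. No global
    structure (paths, communities, spread simulation) is consulted.
    Ties are broken by node name so the result is deterministic.
    """
    ranked = sorted(adjacency, key=lambda node: (-len(adjacency[node]), node))
    return ranked[:k]


# Hypothetical undirected toy network.
network = {
    "a": ["b", "c", "d"],
    "b": ["a"],
    "c": ["a", "d"],
    "d": ["a", "c"],
    "e": [],
}
print(select_seeds_by_degree(network, 2))
```

A heuristic of this kind costs one sort over the nodes, in contrast to algorithmic heuristics that repeatedly simulate diffusion over the whole network.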

Pinball Loss Twin Support Vector Clustering

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3409264 ◽

2021 ◽

Vol 17 (2s) ◽

pp. 1-23

Author(s):

M. Tanveer ◽

Tarun Gupta ◽

Miten Shah ◽

Keyword(s):

Loss Function ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Structural MRI ◽

Twin Support Vector Machine ◽

Support Vector ◽

Support Vector Clustering ◽

Hinge Loss ◽

Pinball Loss ◽

Vector Clustering

Twin Support Vector Clustering (TWSVC) is a clustering algorithm inspired by the principles of the Twin Support Vector Machine (TWSVM). TWSVC has already outperformed other traditional plane-based clustering algorithms. However, TWSVC uses the hinge loss, which maximizes the shortest distance between clusters and hence suffers from noise sensitivity and low re-sampling stability. In this article, we propose Pinball loss Twin Support Vector Clustering (pinTSVC) as a clustering algorithm. The proposed pinTSVC model incorporates the pinball loss function into the plane clustering formulation. The pinball loss function introduces favorable properties such as noise insensitivity and re-sampling stability. The time complexity of the proposed pinTSVC remains equivalent to that of TWSVC. Extensive numerical experiments on noise-corrupted benchmark UCI and artificial datasets are provided. Results of the proposed pinTSVC model are compared with those of TWSVC, Twin Bounded Support Vector Clustering (TBSVC), and fuzzy c-means clustering (FCM). Detailed comparisons demonstrate the better performance and generalization of the proposed pinTSVC on noise-corrupted datasets. Further experiments on structural MRI (sMRI) images taken from the ADNI database, face clustering, and facial expression clustering demonstrate the effectiveness and feasibility of the proposed pinTSVC model.

Lightweight Anomaly Detection Scheme Using Incremental Principal Component Analysis and Support Vector Machine

Sensors ◽

10.3390/s21238017 ◽

2021 ◽

Vol 21 (23) ◽

pp. 8017

Author(s):

Nurfazrina M. Zamry ◽

Anazida Zainal ◽

Murad A. Rassam ◽

Eman H. Alkhammash ◽

Fuad A. Ghaleb ◽

...

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Sensor Networks ◽

Computational Complexity ◽

Anomaly Detection ◽

Principal Component ◽

Support Vector ◽

Communication Overhead ◽

Detection Scheme ◽

Memory Utilization

Wireless sensor networks (WSNs) have been the focus of significant attention from research and development because they are used to collect data in fields such as smart cities, power grids, transportation systems, medical sectors, the military, and rural areas. Accurate and reliable measurements for insightful data analysis and decision-making are the ultimate goals of sensor networks in critical domains. However, the raw data collected by WSNs are usually unreliable and inaccurate due to the imperfect nature of WSNs. Identifying misbehaviours or anomalies in the network is important for providing reliable and secure functioning of the network. However, due to resource constraints, designing a lightweight detection scheme is a major challenge in sensor networks. This paper aims at designing and developing a lightweight anomaly detection scheme that improves efficiency by reducing computational complexity and communication overhead and improving memory utilization while maintaining high accuracy. To achieve this aim, one-class learning and dimension reduction concepts were used in the design. The One-Class Support Vector Machine (OCSVM) with hyper-ellipsoid variance was used for anomaly detection due to its advantage in classifying unlabelled and multivariate data. Various One-Class Support Vector Machine formulations were investigated, and Centred-Ellipsoid was adopted in this study because it was the most effective among the studied formulations. To decrease the computational complexity and improve memory utilization, the dimensions of the data were reduced using the Candid Covariance-Free Incremental Principal Component Analysis (CCIPCA) algorithm. Extensive experiments were conducted to evaluate the proposed lightweight anomaly detection scheme. Results in terms of detection accuracy, memory utilization, computational complexity, and communication overhead show that the proposed scheme is effective and efficient compared with the few existing schemes evaluated. The proposed anomaly detection scheme achieved an accuracy higher than 98%, with O(nd) memory utilization and no communication overhead.
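
As a rough illustration of why CCIPCA suits memory-constrained nodes, the sketch below implements the basic CCIPCA update for the first principal direction only. It assumes zero-mean samples, a non-zero first sample, and no amnesic parameter; the function name and the toy readings are hypothetical, and the OCSVM stage and further components of the paper's scheme are not shown.

```python
import math


def ccipca_first_component(samples):
    """Estimate the first principal direction one sample at a time.

    Update for the n-th zero-mean sample u (n >= 2):
        v <- ((n - 1) / n) * v + (1 / n) * u * (u . v) / ||v||
    Only the current estimate v is stored; no covariance matrix is
    ever formed, hence "covariance-free".
    Assumes at least one sample and a non-zero first sample.
    """
    v = None
    for n, u in enumerate(samples, start=1):
        if v is None:
            v = [float(c) for c in u]   # initialize with the first sample
            continue
        norm_v = math.sqrt(sum(c * c for c in v))
        projection = sum(a * b for a, b in zip(u, v)) / norm_v
        v = [((n - 1) / n) * vc + (1.0 / n) * projection * uc
             for vc, uc in zip(v, u)]
    norm_v = math.sqrt(sum(c * c for c in v))
    return [c / norm_v for c in v]      # unit-length direction


# Hypothetical zero-mean sensor readings varying mostly along the first axis.
readings = [(2.0, 0.1), (-2.0, -0.1), (1.0, -0.05), (-1.0, 0.05)]
print(ccipca_first_component(readings))
```

Each update touches a single d-dimensional vector, so the per-component state stays linear in the data dimension.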
