Twin Support Vector Machine for Multiple Instance Learning Based on Bag Dissimilarities

In the multiple instance learning (MIL) framework, an object is represented by a set of instances referred to as a bag. A bag is labeled positive if it contains at least one positive instance; otherwise it is labeled negative. The task of MIL is therefore to learn a classifier at the bag level rather than at the instance level, and traditional supervised learning approaches cannot be applied directly in this setting. In this study, we represent each bag by a vector of its dissimilarities to the other bags in the training dataset and propose a multiple instance learning based Twin Support Vector Machine (MIL-TWSVM) classifier. We use several ways to represent the dissimilarity between two bags and compare them. Experimental results on ten benchmark MIL datasets demonstrate that the proposed MIL-TWSVM classifier is computationally inexpensive and competitive with state-of-the-art approaches. The significance of the experimental results is tested using the Friedman statistic and Nemenyi post hoc tests.
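
The bag-dissimilarity representation described in this abstract can be illustrated with a minimal sketch. The abstract does not name the dissimilarity measures the authors compared, so the two measures below (minimum instance distance and mean of minimum distances) are assumptions chosen for illustration, and all function names are hypothetical. For simplicity the sketch also measures a bag against every training bag, including itself.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two instances given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def min_min_dissimilarity(bag_a, bag_b):
    """Smallest distance between any instance of bag_a and any instance of bag_b."""
    return min(euclidean(a, b) for a in bag_a for b in bag_b)

def mean_min_dissimilarity(bag_a, bag_b):
    """Average, over the instances of bag_a, of the distance to the closest instance of bag_b."""
    return sum(min(euclidean(a, b) for b in bag_b) for a in bag_a) / len(bag_a)

def dissimilarity_vector(bag, training_bags, measure=mean_min_dissimilarity):
    """Map a variable-size bag to a fixed-length vector with one entry per training bag."""
    return [measure(bag, other) for other in training_bags]

# Hypothetical toy data: three bags of 2-D instances.
bags = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 3.0)],
    [(4.0, 0.0), (0.0, 0.0)],
]
features = [dissimilarity_vector(bag, bags, min_min_dissimilarity) for bag in bags]
```

Each bag becomes a vector whose length equals the number of training bags, so any classifier that expects fixed-length vectors (a twin SVM in the paper) can then be trained on `features` with the bag labels.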

An Improvement Of Least Square - Twin Support Vector Machine

Research and Development on Information and Communication Technology ◽

10.32913/mic-ict-research.v2021.n1.956 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Thanh Vi Nguyen ◽

Thế Cường Nguyễn

Keyword(s):

Support Vector Machine ◽

Binary Classification ◽

Least Square ◽

Experimental Results ◽

Twin Support Vector Machine ◽

Support Vector ◽

Classification Problems ◽

Training Time ◽

Data Points ◽

Better Than

In binary classification problems, the two classes of data tend to differ from each other. The problem becomes more complicated when the number of data points in the clusters of each class also differs. Traditional algorithms such as Support Vector Machine (SVM), Twin Support Vector Machine (TSVM), and Least Square Twin Support Vector Machine (LSTSVM) cannot sufficiently exploit information about the number of data points in each cluster of the data, which may affect classification accuracy. In this paper, we propose a new Improved Least Square - Support Vector Machine (called ILS-SVM) for binary classification problems with a class-vs-clusters strategy. Experimental results show that ILS-SVM trains faster than TSVM and that its accuracy is better than that of LSTSVM and TSVM in most cases.

DNS Tunneling Detection Method Based on Multilabel Support Vector Machine

Security and Communication Networks ◽

10.1155/2018/6137098 ◽

2018 ◽

Vol 2018 ◽

pp. 1-9 ◽

Cited By ~ 8

Author(s):

Ahmed Almusawi ◽

Haleh Amintoosi

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Detection Method ◽

Binary Classification ◽

Experimental Results ◽

Classification Method ◽

Support Vector ◽

Class Label ◽

Svm Classification ◽

Different Types

DNS tunneling is a method used by malicious users who intend to bypass the firewall to send or receive commands and data, which can lead to classified information being revealed or released. Several researchers have examined the use of machine learning for detecting DNS tunneling. However, these studies have treated the problem as a binary classification in which the class label is either legitimate or tunnel. In fact, there are different types of DNS tunneling, such as FTP-DNS tunneling, HTTP-DNS tunneling, HTTPS-DNS tunneling, and POP3-DNS tunneling. There is therefore a vital demand not only to detect DNS tunneling but also to classify the type of tunnel. This study proposes a multilabel support vector machine to detect and classify DNS tunneling. The proposed method has been evaluated on a benchmark dataset that contains numerous DNS queries and is compared with a multilabel Bayesian classifier based on the number of correctly classified DNS tunneling instances. Experimental results demonstrate the efficacy of the proposed SVM classification method, which obtains an f-measure of 0.80.
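
As a rough illustration of how an f-measure can be computed when the label is a tunnel type rather than a binary legitimate/tunnel flag, the sketch below scores each class separately and averages the results. The abstract does not state how its f-measure of 0.80 was averaged, so the macro average here is an assumption, and the function names and labels are hypothetical.

```python
def f_measure(y_true, y_pred, label):
    """F-measure (harmonic mean of precision and recall) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def macro_f_measure(y_true, y_pred):
    """Unweighted mean of the per-class f-measures (assumed averaging scheme)."""
    labels = sorted(set(y_true))
    return sum(f_measure(y_true, y_pred, label) for label in labels) / len(labels)

# Hypothetical labels for four DNS queries.
y_true = ["http", "http", "ftp", "legitimate"]
y_pred = ["http", "ftp", "ftp", "legitimate"]
score = macro_f_measure(y_true, y_pred)
```

Scoring per class makes it visible when a classifier detects tunneling in general but confuses one tunnel type with another, which a single binary score would hide.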

Multiple Instance Learning Based on Twin Support Vector Machine

Advances in Computer and Computational Sciences - Advances in Intelligent Systems and Computing ◽

10.1007/978-981-10-3770-2_46 ◽

2017 ◽

pp. 497-507

Author(s):

Divya Tomar ◽

Sonali Agarwal

Keyword(s):

Support Vector Machine ◽

Multiple Instance Learning ◽

Twin Support Vector Machine ◽

Support Vector

A Computational Method for the Identification of Endolysins and Autolysins

Protein and Peptide Letters ◽

10.2174/0929866526666191002104735 ◽

2020 ◽

Vol 27 (4) ◽

pp. 329-336 ◽

Cited By ~ 1

Author(s):

Lei Xu ◽

Guangmin Liang ◽

Baowen Chen ◽

Xu Tan ◽

Huaikun Xiang ◽

...

Keyword(s):

Support Vector Machine ◽

Cell Wall ◽

Experimental Results ◽

Computational Method ◽

Lytic Enzyme ◽

Support Vector ◽

Lytic Enzymes ◽

Data Set ◽

Optimal Feature ◽

Better Than

Background: Cell lytic enzymes are highly evolved proteins that can destroy the cell structure and kill bacteria. Unlike antibiotics, cell lytic enzymes do not cause serious drug resistance in pathogenic bacteria. The study of cell wall lytic enzymes therefore aims to find an efficient way to cure bacterial infections. As drug resistance to antibiotics becomes more serious, cell lytic enzymes are a good choice for curing bacterial infections. Cell lytic enzymes include endolysins and autolysins, which differ in the purpose for which they break the cell wall. Identifying the type of a cell lytic enzyme is meaningful for the study of cell wall enzymes. Objective: Our motivation in this article is to predict the type of cell lytic enzyme. Because cell lytic enzymes help kill bacteria, studying their type is meaningful. However, detecting the type by experimental methods is time-consuming, so we propose an efficient computational method for predicting it. Method: We propose a computational method for predicting endolysins and autolysins. First, a data set containing 27 endolysins and 41 autolysins is built. Each protein is then represented by its tripeptide composition, and the features with the larger confidence degree are selected. Finally, a support vector machine classifier is trained on the labeled vectors and used to predict the type of cell lytic enzyme. Results: The experimental results show that the overall accuracy reaches 97.06% when 44 features are selected. Compared with Ding's method, our method improves the overall accuracy by nearly 4.5% ((97.06-92.9)/92.9). The performance of the proposed method is stable when the number of selected features is between 40 and 70. The overall accuracy of the tripeptide optimal feature set is 94.12%, and the overall accuracy of Chou's amphiphilic PseAAC method is 76.2%, so using the tripeptide optimal feature set improves the overall accuracy by nearly 18%. Conclusion: The paper proposes an efficient method for identifying endolysins and autolysins, using a support vector machine to predict the type of cell lytic enzyme. The experimental results show that the overall accuracy of the proposed method is 94.12%, which is better than some existing methods. In conclusion, the selected 44 features can improve the overall accuracy of identifying the type of cell lytic enzyme. The support vector machine performs better than other classifiers when using the selected feature set on the benchmark data set.
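
The tripeptide composition encoding and the relative-improvement arithmetic quoted in this abstract can be sketched as follows. This is a generic illustration, not the authors' code: it assumes the standard 20-letter amino acid alphabet (giving 20^3 = 8000 features), skips any window containing a non-standard residue, and uses hypothetical function names. The confidence-based feature selection and the SVM training step are not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # standard 20 residues (assumption)
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
INDEX = {tri: i for i, tri in enumerate(TRIPEPTIDES)}

def tripeptide_composition(sequence):
    """Return an 8000-dimensional vector of tripeptide frequencies for a protein sequence."""
    counts = [0] * len(TRIPEPTIDES)
    total = 0
    for i in range(len(sequence) - 2):
        idx = INDEX.get(sequence[i:i + 3])
        if idx is not None:  # ignore windows with non-standard residues
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIPEPTIDES)
    return [c / total for c in counts]

def relative_improvement(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0

vector = tripeptide_composition("ACDAC")       # windows: ACD, CDA, DAC
gain = relative_improvement(97.06, 92.9)       # roughly 4.48, the "nearly 4.5%" above
```

Note that the "nearly 4.5%" figure is a relative improvement, whereas the "nearly 18%" figure matches the absolute difference 94.12 - 76.2 = 17.92 percentage points.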

ABC-Gly: identifying protein lysine glycation sites with artificial bee colony algorithm

Current Proteomics ◽

10.2174/1570164617666191227120136 ◽

2019 ◽

Vol 17 ◽

Author(s):

Yanqiu Yao ◽

Xiaosa Zhao ◽

Qiao Ning ◽

Junping Zhou

Keyword(s):

Support Vector Machine ◽

Amino Acid ◽

Artificial Bee Colony Algorithm ◽

Artificial Bee Colony ◽

Training Dataset ◽

Support Vector ◽

Supplementary File ◽

Feature Subset ◽

Lipid Molecule ◽

Bee Colony

Background: Glycation is a nonenzymatic post-translational modification in which a sugar molecule attaches to a protein or lipid molecule. It may impair the function and change the characteristics of proteins, which may lead to metabolic diseases. To understand the underlying molecular mechanisms of glycation, computational prediction methods have been developed because of their convenience and high speed. However, building a more effective computational tool remains a challenging task in computational biology. Methods: In this study, we present an accurate identification tool named ABC-Gly for predicting lysine glycation sites. First, we used three informative features to encode the peptides: position-specific amino acid propensity, secondary structure, and the composition of k-spaced amino acid pairs. Next, to exploit discriminative features sufficiently and thereby improve the prediction and generalization ability of the model, we developed a two-step feature selection that combines the Fisher score with an improved binary artificial bee colony algorithm based on support vector machine. Finally, based on the optimal feature subset, we constructed the model using a Support Vector Machine on the training dataset. Results: In 10-fold cross-validation on the training dataset, ABC-Gly achieved a sensitivity of 76.43%, a specificity of 91.10%, a balanced accuracy of 83.76%, an area under the receiver operating characteristic curve (AUC) of 0.9313, and a Matthews correlation coefficient (MCC) of 0.6861; on the independent dataset it achieved a balanced accuracy of 59.05%. Compared to the state-of-the-art predictors on the training dataset, the proposed predictor achieved significant improvements of 0.156 in AUC and 0.336 in MCC. Conclusion: The detailed analysis results indicate that our predictor may serve as a powerful complementary tool to other existing methods for predicting protein lysine glycation. The source code and datasets of ABC-Gly are provided in Supplementary File 1.
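
The evaluation metrics named in this abstract follow standard definitions, sketched below from the four confusion-matrix counts. The function names and the example counts are hypothetical; only the 76.43% sensitivity and 91.10% specificity come from the abstract.

```python
import math

def sensitivity(tp, fn):
    """True positive rate: fraction of actual positives predicted positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: fraction of actual negatives predicted negative."""
    return tn / (tn + fp)

def balanced_accuracy(sens, spec):
    """Mean of sensitivity and specificity."""
    return (sens + spec) / 2.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Consistency check against the figures reported above.
reported = balanced_accuracy(76.43, 91.10)   # 83.765, reported as 83.76%

# Hypothetical confusion-matrix counts, for illustration only.
example_mcc = mcc(tp=8, tn=9, fp=1, fn=2)
```

Balanced accuracy and MCC are commonly preferred over plain accuracy when the classes are imbalanced, which may be why the abstract reports them alongside the AUC.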

Automated Diagnosis system for detection of the pathological brain using Fast version of Simplified Pulse-Coupled Neural Network and Twin Support Vector Machine

Multimedia Tools and Applications ◽

10.1007/s11042-021-10937-6 ◽

2021 ◽

Author(s):

Ravi Shanker ◽

Mahua Bhattacharya

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Twin Support Vector Machine ◽

Support Vector ◽

Automated Diagnosis ◽

Diagnosis System ◽

Pulse Coupled Neural Network

Robust truncated L2-norm twin support vector machine

International Journal of Machine Learning and Cybernetics ◽

10.1007/s13042-021-01368-8 ◽

2021 ◽

Author(s):

Linxi Yang ◽

Guoquan Li ◽

Zhiyou Wu ◽

Changzhi Wu

Keyword(s):

Support Vector Machine ◽

Twin Support Vector Machine ◽

Support Vector

A robust projection twin support vector machine with a generalized correntropy-based loss

Applied Intelligence ◽

10.1007/s10489-021-02480-6 ◽

2021 ◽

Author(s):

Qiangqiang Ren ◽

Liming Yang

Keyword(s):

Support Vector Machine ◽

Twin Support Vector Machine ◽

Support Vector

Intuitionistic Fuzzy Laplacian Twin Support Vector Machine for Semi-supervised Classification

Journal of the Operations Research Society of China ◽

10.1007/s40305-021-00354-9 ◽

2021 ◽

Author(s):

Jia-Bin Zhou ◽

Yan-Qin Bai ◽

Yan-Ru Guo ◽

Hai-Xiang Lin

Keyword(s):

Support Vector Machine ◽

Negative Impact ◽

Twin Support Vector Machine ◽

Fuzzy Membership ◽

Support Vector ◽

Membership Functions ◽

Fuzzy Membership Functions ◽

Intuitionistic Fuzzy ◽

Benchmark Datasets ◽

The Impact

In general, data contain noise arising from faulty instruments, flawed measurements, or faulty communication. Learning from data in the context of classification or regression is inevitably affected by this noise. To remove or greatly reduce its impact, we introduce the ideas of fuzzy membership functions and the Laplacian twin support vector machine (Lap-TSVM). A formulation of the linear intuitionistic fuzzy Laplacian twin support vector machine (IFLap-TSVM) is presented. Moreover, we extend the linear IFLap-TSVM to the nonlinear case using a kernel function. The proposed IFLap-TSVM resolves the negative impact of noise and outliers by using fuzzy membership functions, and it is a more accurate and reasonable classifier because it uses the geometric distribution information of labeled and unlabeled data through manifold regularization. Experiments on constructed artificial datasets, several UCI benchmark datasets, and the MNIST dataset show that IFLap-TSVM has better classification accuracy than the state-of-the-art twin support vector machine (TSVM), intuitionistic fuzzy twin support vector machine (IFTSVM), and Lap-TSVM.

Detection and Recognition of RF Devices Using Support Vector Machine

International Journal of Interdisciplinary Telecommunications and Networking ◽

10.4018/ijitn.2013100102 ◽

2013 ◽

Vol 5 (4) ◽

pp. 13-20

Author(s):

Shikhar P. Acharya ◽

Ivan G. Guardiola

Keyword(s):

Support Vector Machine ◽

Radio Frequency ◽

Experimental Results ◽

Support Vector ◽

Noise Band ◽

Detection And Identification ◽

Electromagnetic Emissions ◽

Rf Devices ◽

Unintended Electromagnetic Emissions ◽

Detection And Recognition

Radio Frequency (RF) devices produce some amount of Unintended Electromagnetic Emissions (UEEs). UEEs are generally unique to a device and can be used as a signature for detection and identification. The problem with UEEs is that they are very low in power and are often buried deep inside the noise band. This research applies a Support Vector Machine (SVM) to the detection and identification of RF devices using their UEEs. Experimental results show that the SVM can detect RF devices within the noise band and can also identify RF devices using their UEEs.
