A Parallel Mixture of SVMs for Very Large Scale Problems

2002 ◽  
Vol 14 (5) ◽  
pp. 1105-1114 ◽  
Author(s):  
Ronan Collobert ◽  
Samy Bengio ◽  
Yoshua Bengio

Support vector machines (SVMs) are the state-of-the-art models for many classification problems, but they suffer from the complexity of their training algorithm, which is at least quadratic with respect to the number of examples. Hence, it is hopeless to try to solve real-life problems having more than a few hundred thousand examples with SVMs. This article proposes a new mixture of SVMs that can be easily implemented in parallel and where each SVM is trained on a small subset of the whole data set. Experiments on a large benchmark data set (Forest) yielded significant time improvement (time complexity appears empirically to locally grow linearly with the number of examples). In addition, and surprisingly, a significant improvement in generalization was observed.
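The cost argument in this abstract can be sketched with a little arithmetic. The snippet below is only an idealized illustration: it assumes training cost is exactly quadratic in the number of examples (the abstract says "at least quadratic") and ignores the cost of combining the experts. The function names and the example sizes are hypothetical and do not come from the article.

```python
def single_svm_cost(n_examples: int) -> int:
    """Idealized cost of one SVM trained on all examples (assumes exactly quadratic scaling)."""
    return n_examples ** 2


def mixture_cost(n_examples: int, n_experts: int) -> int:
    """Idealized total cost of n_experts SVMs, each trained on an equal share of the data."""
    subset_size = n_examples // n_experts
    return n_experts * subset_size ** 2


def speedup(n_examples: int, n_experts: int) -> float:
    """Ratio of the single-SVM cost to the summed cost of the experts."""
    return single_svm_cost(n_examples) / mixture_cost(n_examples, n_experts)


if __name__ == "__main__":
    # Hypothetical sizes: 100,000 examples split across 50 experts of 2,000 examples each.
    print(speedup(100_000, 50))
```

Under these assumptions, splitting the data across M experts cuts the total training cost by a factor of M even before any parallelism. If the subset size is held fixed while the data set grows, the summed cost grows linearly with the number of examples, which is consistent with the empirical trend the abstract reports.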

Author(s):  
Hesham M. Al-Ammal

Detecting anomalies in a data set is a vital step in several cybersecurity applications, including intrusion detection, fraud detection, and social network analysis. Many of these techniques detect anomalies by examining graph-based data. Analyzing graphs makes it possible to capture relationships and communities as well as anomalies. The advantage of using graphs is that many real-life situations can be easily modeled by a graph that captures their structure and interdependencies. Although anomaly detection in graphs dates back to the 1990s, recent research has applied machine learning methods to anomaly detection over graphs. This chapter concentrates on static graphs (both labeled and unlabeled) and summarizes some of these recent studies in machine learning for anomaly detection in graphs, covering methods such as support vector machines, neural networks, generative neural networks, and deep learning. The chapter reflects on the successes and challenges of using these methods in the context of graph-based anomaly detection.


Author(s):  
Onur Seref ◽  
O. Erhun Kundakcioglu ◽  
Michael Bewernitz

The underlying optimization problem for the maximal margin classifier is feasible only if the two classes of pattern vectors are linearly separable. However, most real-life classification problems are not linearly separable. Nevertheless, the maximal margin classifier encompasses the fundamental methods used in standard SVM classifiers. The solution to the optimization problem in the maximal margin classifier minimizes the bound on the generalization error (Vapnik, 1998). The basic premise of this method is the solution of a convex optimization problem with linear inequality constraints, which can be solved efficiently by many alternative methods (Bennett & Campbell, 2000).
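For reference, the maximal margin problem described here is usually written in the following standard form. The notation is an assumption, since the abstract does not fix any symbols: $x_i$ are the pattern vectors, $y_i \in \{-1, +1\}$ their class labels, $w$ the normal vector of the separating hyperplane, and $b$ its offset.

```latex
\begin{aligned}
\min_{w,\,b} \quad & \frac{1}{2}\,\lVert w \rVert^{2} \\
\text{subject to} \quad & y_i \left( \langle w, x_i \rangle + b \right) \ge 1, \qquad i = 1, \dots, n .
\end{aligned}
```

The objective is convex and the constraints are linear in $(w, b)$. The constraint set is non-empty exactly when some hyperplane separates the two classes, which is why the problem is feasible only for linearly separable data.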


2018 ◽  
Vol 462 ◽  
pp. 114-131 ◽  
Author(s):  
Zhen Wang ◽  
Yuan-Hai Shao ◽  
Lan Bai ◽  
Chun-Na Li ◽  
Li-Ming Liu ◽  
...  

2011 ◽  
Vol 216 ◽  
pp. 738-741
Author(s):  
Yue E Chen ◽  
Bai Li Ren

SVMs have achieved very good results on classification, regression, and density estimation problems in machine learning and have been successfully applied to practical problems such as text recognition and speech classification, but their long training time is a major drawback. A new reduction strategy is proposed for training support vector machines. The method converges quickly without degrading the learning machine's generalization performance, and simulation experiments show its feasibility and effectiveness.


2004 ◽  
Vol 16 (7) ◽  
pp. 1345-1351 ◽  
Author(s):  
Xiaomei Liu ◽  
Lawrence O. Hall ◽  
Kevin W. Bowyer

Collobert, Bengio, and Bengio (2002) recently introduced a novel approach to using a neural network to provide a class prediction from an ensemble of support vector machines (SVMs). This approach has the advantage that the required computation scales well to very large data sets. Experiments on the Forest Cover data set show that this parallel mixture is more accurate than a single SVM, with 90.72% accuracy reported on an independent test set. Although this accuracy is impressive, their article does not consider alternative types of classifiers. We show that a simple ensemble of decision trees results in a higher accuracy, 94.75%, and is computationally efficient. This result is somewhat surprising and illustrates the general value of experimental comparisons using different types of classifiers.


2021 ◽  
Author(s):  
M. Tanveer ◽  
A. Tiwari ◽  
R. Choudhary ◽  
M. A. Ganaie

2013 ◽  
Vol 2013 ◽  
pp. 1-6 ◽  
Author(s):  
Ersen Yılmaz

A two-stage expert system is proposed for cardiac arrhythmia diagnosis. In the first stage, the Fisher score is used for feature selection to reduce the dimension of the feature space of a data set. The second stage is a classification stage in which a least squares support vector machine classifier is applied to the feature subset selected in the first stage to diagnose cardiac arrhythmia. The performance of the proposed expert system is evaluated on an arrhythmia data set taken from the UCI machine learning repository.
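The first stage can be illustrated with a minimal sketch of Fisher-score feature ranking. It uses one common definition of the score (between-class scatter of the feature means divided by the within-class variance, both weighted by class size), which may differ from the exact variant used in the article. The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(samples, labels):
    """Return one Fisher score per feature column.

    score_j = sum_k n_k * (mean_kj - mean_j)^2 / sum_k n_k * var_kj
    where k runs over classes and n_k is the size of class k.
    """
    by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        by_class[label].append(row)

    scores = []
    for j in range(len(samples[0])):
        overall_mean = fmean(row[j] for row in samples)
        between = 0.0
        within = 0.0
        for rows in by_class.values():
            column = [row[j] for row in rows]
            class_mean = fmean(column)
            between += len(column) * (class_mean - overall_mean) ** 2
            within += len(column) * pvariance(column, mu=class_mean)
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: treat a perfectly separating feature as unbounded.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_top_features(samples, labels, k):
    """Return the indices of the k highest-scoring features, in ascending index order."""
    scores = fisher_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 does not.
    toy_samples = [[1.0, 5.0], [2.0, 4.0], [9.0, 5.0], [10.0, 4.0]]
    toy_labels = [0, 0, 1, 1]
    print(fisher_scores(toy_samples, toy_labels))
    print(select_top_features(toy_samples, toy_labels, 1))
```

The selected column indices would then be used to slice the data set before it is passed to the second-stage classifier.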


2017 ◽  
Vol 9 (3) ◽  
pp. 334-339
Author(s):  
Rokas Semėnas

Face recognition programs have many practical uses in various fields, such as security and entertainment. Existing recognition algorithms must deal with various real-life problems, mainly illumination. In practice, illumination normalization models are often used only for small-scale feature extraction, ignoring large-scale features. This article offers a new and more direct approach to this problem and presents the algorithms used and the test results.


2016 ◽  
Vol 2016 ◽  
pp. 1-9 ◽  
Author(s):  
Abbas Akkasi ◽  
Ekrem Varoğlu ◽  
Nazife Dimililer

Named Entity Recognition (NER) from text constitutes the first step in many text mining applications. The most important preliminary step for NER systems using machine learning approaches is tokenization, where raw text is segmented into tokens. This study proposes an enhanced rule-based tokenizer, ChemTok, which utilizes rules extracted mainly from the training data set. The main novelty of ChemTok is the use of the extracted rules to merge the tokens split in the previous steps, thus producing longer and more discriminative tokens. ChemTok is compared to the tokenization methods utilized by ChemSpot and tmChem. Support Vector Machines and Conditional Random Fields are employed as the learning algorithms. The experimental results show that the classifiers trained on the output of ChemTok outperform all classifiers trained on the output of the other two tokenizers in terms of classification performance and the number of incorrectly segmented entities.

