On the Evaluation of Outlier Detection and One-Class Classification Methods

Author(s):  
Lorne Swersky ◽  
Henrique O. Marques ◽  
Joerg Sander ◽  
Ricardo J.G.B. Campello ◽  
Arthur Zimek

Author(s):  
Wenjuan An ◽  
Mangui Liang ◽  
He Liu

Outlier detection, as a type of one-class classification problem, is one of the important research topics in data mining and machine learning. Its task is to identify sample points that deviate markedly from the normal data. A reliable outlier detector needs to build a model that encloses the normal data tightly. In this paper, an improved one-class SVM (OC-SVM) classifier is proposed for outlier detection problems. We name this method OC-SVM with minimum within-class scatter (OC-WCSSVM); it exploits the inner-class structure of the training set by minimizing the within-class scatter of the training data. This yields a more accurate hyperplane for outlier detection, such that the margin between the training data and the origin in a higher-dimensional space is as large as possible, while at the same time the decision boundary around the normal data is as tight as possible. Experimental results on a synthetic dataset and 10 real-world datasets demonstrate that the proposed OC-WCSSVM algorithm is effective and superior to the compared algorithms.
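For orientation, the baseline that this abstract builds on can be written down. The block below is the standard one-class SVM primal in its usual textbook form, not the OC-WCSSVM objective: the abstract says only that a within-class scatter term is minimized in addition, and its exact form is not given here. The symbols (feature map \phi, slack variables \xi_i, offset \rho, trade-off parameter \nu, sample count n) are the conventional ones and are assumptions with respect to the paper's own notation.

```latex
% Standard one-class SVM primal (baseline only; the within-class
% scatter term of OC-WCSSVM is not reproduced here).
\begin{aligned}
\min_{w,\,\xi,\,\rho}\quad & \frac{1}{2}\lVert w\rVert^{2}
    + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho \\
\text{s.t.}\quad & \langle w, \phi(x_i)\rangle \ge \rho - \xi_i,
    \quad \xi_i \ge 0, \quad i = 1,\dots,n
\end{aligned}
% Decision rule: a point x is treated as normal when
% f(x) = \operatorname{sign}\bigl(\langle w, \phi(x)\rangle - \rho\bigr) = +1.
```

Maximizing \rho / \lVert w \rVert corresponds to the "margin between the training data and the origin" mentioned in the abstract.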



2007 ◽  
Vol 6, April 2007, joint... ◽  
Author(s):  
Oleksiy Mazhelis

One-class classifiers, which are trained using only data from one class, are justified when data from other classes are difficult to obtain. In particular, their use is justified in mobile-masquerader detection, where user characteristics are classified as belonging to the legitimate user class or to the impostor class, and where collecting data originating from impostors is problematic. This paper systematically reviews various one-class classification methods and analyses their suitability in the context of mobile-masquerader detection. For each classification method, its sensitivity to errors in the training set, its computational requirements, and other characteristics are considered. After that, for each category of features used in masquerader detection, suitable classifiers are identified.



2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Naeem Seliya ◽  
Azadeh Abdollah Zadeh ◽  
Taghi M. Khoshgoftaar

In severely imbalanced datasets, using traditional binary or multi-class classification typically leads to bias towards the class(es) with the much larger number of instances. Under such conditions, modeling and detecting instances of the minority class is very difficult. One-class classification (OCC) is an approach to detecting abnormal data points relative to the instances of the known class, and it can help address issues related to severely imbalanced datasets, which are especially common in big data. We present a detailed survey of OCC-related works published over approximately the last decade. We group the different works into three categories: outlier detection, novelty detection, and deep learning and OCC. We closely examine and evaluate selected works on OCC such that a good cross section of approaches, methods, and application domains is represented in the survey. Commonly used techniques in OCC for outlier detection and for novelty detection, respectively, are discussed. We observed that one area largely omitted in OCC-related literature is its application context for big data and its inherently associated problems, such as severe class imbalance, class rarity, noisy data, feature selection, and data reduction. We feel the survey will be appreciated by researchers working in these areas of big data.



Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
P. S. Szczepaniak ◽  
A. Duraj

The present paper applies the case-based reasoning (CBR) technique to the problem of outlier detection. Although CBR is a widely investigated method with a variety of successful applications in the academic domain, it has so far not been explored from an outlier detection perspective. This study seeks to address this research gap by defining the outlier case and the underlying specificity of the outlier detection process within the CBR approach. Moreover, the case-based classification (CBC) method is discussed as a task type of CBR. This is followed by a computational illustration of the approach using selected classification methods, namely linear regression, a distance-based classifier, and the Bayes classifier.
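As a rough illustration of the distance-based idea mentioned in this abstract, the sketch below flags a new case as an outlier when it lies far from every stored case. It is not the paper's method: the function names, the Euclidean distance, the toy case base, and the threshold value are all hypothetical choices made for this example.

```python
import math

def nearest_case_distance(query, case_base):
    """Euclidean distance from `query` to the closest stored case."""
    return min(math.dist(query, case) for case in case_base)

def is_outlier_case(query, case_base, threshold):
    """Flag `query` as an outlier if no stored case lies within `threshold`.

    `threshold` is a hypothetical, user-chosen cut-off; the abstract does
    not say how such a value would be set.
    """
    return nearest_case_distance(query, case_base) > threshold

# Toy case base of previously seen (normal) cases; values are made up.
case_base = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(is_outlier_case((0.5, 0.5), case_base, threshold=1.5))
print(is_outlier_case((5.0, 5.0), case_base, threshold=1.5))
```

The first query sits inside the stored cases and is not flagged; the second is far from all of them and is flagged. In a CBR setting the retrieval step would typically use a domain-specific similarity measure rather than plain Euclidean distance.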






