Outlier Detection Strategy Using the Self-Organizing Map

Real-world datasets often contain various types of anomalous or exceptional entries, commonly referred to as outliers. Detecting outliers and distinguishing noise from true exceptions is important for effective data mining. This chapter presents two methods for outlier detection and analysis using the self-organizing map (SOM): one is more suitable for categorical data and the other for continuous data. Both are based on filtering out the instances that are not captured by, or that contradict, the concept hierarchy obtained for the domain. We demonstrate how the dimension of the output space plays an important role in the kind of patterns that will be detected as outlying. Furthermore, the concept hierarchy itself provides extra criteria for distinguishing noise from true exceptions. The effectiveness of the proposed outlier detection and analysis strategy is demonstrated through experiments on publicly available real-world datasets.
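As a rough illustration of SOM-based outlier scoring, the sketch below trains a small SOM in NumPy and flags samples whose quantization error (distance to the best-matching unit) is unusually large. This is a common, simpler heuristic and not the concept-hierarchy filtering that the chapter proposes; the function names, map size, learning schedule, and the mean-plus-k-standard-deviations threshold are all assumptions made for the example.

```python
import numpy as np

def train_som(data, rows=3, cols=3, epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim).

    All hyperparameters are illustrative assumptions, not values from the chapter.
    """
    rng = np.random.default_rng(seed)
    # Grid coordinates of each map unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise each unit's weight vector from a randomly chosen training sample.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.1    # shrinking neighbourhood radius
            # Best-matching unit: the unit whose weights are closest to x.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighbourhood measured on the map grid, centred on the BMU.
            grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            # Move every unit toward x, weighted by its neighbourhood value.
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def quantization_errors(data, weights):
    """Distance from each sample to its best-matching unit."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return dists.min(axis=1)

def flag_outliers(data, weights, k=3.0):
    """Flag samples whose quantization error exceeds mean + k * std (assumed rule)."""
    qe = quantization_errors(data, weights)
    return qe > qe.mean() + k * qe.std()
```

One might call `train_som` on reference data and then `flag_outliers` on the data to be screened. The map size (`rows`, `cols`) controls how finely the map covers the data, which in turn changes which samples score as outlying.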

Intrusion Detection Based on Self-Organizing Map and Artificial Immunisation Algorithm

Key Engineering Materials ◽ 10.4028/www.scientific.net/kem.439-440.29 ◽ 2010 ◽ Vol 439-440 ◽ pp. 29-34 ◽ Cited By ~ 1

Author(s): Zhen Guo Chen ◽ Guang Hua Zhang ◽ Li Qin Tian ◽ Zi Lin Geng

Keyword(s): Intrusion Detection ◽ Computer Security ◽ Real World ◽ User Behavior ◽ Experimental Result ◽ System Call ◽ Self Organizing Map ◽ System Calls ◽ Real World Datasets ◽ Self Organizing

The rate of false positives caused by the variability of the environment and of user behavior limits the application of intrusion detection systems in the real world. Intrusion detection is an important technique in the defense-in-depth network security framework and has been a hot topic in computer security in recent years. To address the intrusion detection problem, we introduce the self-organizing map and an artificial immunisation algorithm into intrusion detection. In this paper, we present a rule extraction method based on the self-organizing map and the artificial immunisation algorithm and apply it to intrusion detection. We illustrate our model with a representative dataset and apply it to the real-world MIT lpr system call datasets. We also propose the idea of learning different representations for system call arguments. The experimental results indicate that this information can be used effectively to detect more attacks with reasonable space and time overhead, so the approach is feasible and effective for intrusion detection.

SA-SOM algorithm for detecting communities in complex networks

Modern Physics Letters B ◽ 10.1142/s0217984917502621 ◽ 2017 ◽ Vol 31 (29) ◽ pp. 1750262 ◽ Cited By ~ 2

Author(s): Luogeng Chen ◽ Yanran Wang ◽ Xiaoming Huang ◽ Mengyu Hu ◽ Fang Hu

Keyword(s): Complex Networks ◽ Community Detection ◽ Real World ◽ The Self ◽ Self Organizing Map ◽ Som Algorithm ◽ Self Adaptation ◽ Experimental Findings ◽ Novel Algorithm ◽ Self Organizing

Community detection is currently a hot topic. Building on the self-organizing map (SOM) algorithm, this paper introduces the idea of self-adaptation (SA), by which the number of communities is identified automatically, and proposes a novel algorithm, SA-SOM, for detecting communities in complex networks. Several representative real-world networks and a set of computer-generated networks from the LFR benchmark are used to verify the accuracy and efficiency of the algorithm. The experimental findings demonstrate that the algorithm identifies communities automatically, accurately, and efficiently. Furthermore, it achieves higher values of modularity, NMI, and density than the SOM algorithm.
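The abstract does not state which definition of modularity it uses. Assuming the standard Newman-Girvan modularity for an undirected, unweighted network, the quantity being compared would be:

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

Here $A_{ij}$ is the adjacency matrix, $k_i$ is the degree of node $i$, $m$ is the total number of edges, $c_i$ is the community assigned to node $i$, and $\delta$ is the Kronecker delta. Higher $Q$ means more edges fall within communities than would be expected if edges were placed at random with the same node degrees.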

Predicting the risk of exotic pest species establishing in a new area using the self-organizing map, a community ecology-based approach

10.1603/ice.2016.105612 ◽ 2016

Author(s): Irene Vänninen

Keyword(s): Community Ecology ◽ The Self ◽ Pest Species ◽ Self Organizing Map ◽ Exotic Pest ◽ Self Organizing

The Spread of the COVID-19 Outbreak in Brazil: An Overview by Kohonen Self-Organizing Map Networks

Medicina ◽ 10.3390/medicina57030235 ◽ 2021 ◽ Vol 57 (3) ◽ pp. 235

Author(s): Diego Galvan ◽ Luciane Effting ◽ Hágata Cremasco ◽ Carlos Adam Conte-Junior

Keyword(s): The Self ◽ Self Organizing Map ◽ Virus Spread ◽ Self Organizing Maps ◽ The North ◽ Som Clustering ◽ Unsupervised Neural Networks ◽ Mining Tools ◽ Temporal Spread ◽ Self Organizing

Background and objective: In the current pandemic scenario, data mining tools are fundamental to evaluating the measures adopted to contain the spread of COVID-19. In this study, unsupervised neural networks of the Self-Organizing Map (SOM) type were used to assess the spatial and temporal spread of COVID-19 in Brazil, according to the number of cases and deaths in regions, states, and cities. Materials and methods: The SOM applied in this context does not evaluate which of the applied measures helped contain the spread of the disease, but the datasets reflect the repercussions of the measures the country implemented to contain the virus' spread. Results: This approach demonstrated that the spread of the disease in Brazil does not follow a standard behavior, changing according to the region, state, or city. The analyses showed that cities and states in the north and northeast regions of the country were the most affected by the disease, with the highest numbers of cases and deaths registered per 100,000 inhabitants. Conclusions: The SOM clustering was able to spatially group cities, states, and regions with similar behavior according to their coronavirus cases. Thus, similar strategies can be used to deal with the virus' spread in the cities, states, and regions grouped together.

OFCOD: On the Fly Clustering Based Outlier Detection Framework

Data ◽ 10.3390/data6010001 ◽ 2020 ◽ Vol 6 (1) ◽ pp. 1

Author(s): Ahmed Elmogy ◽ Hamada Rizk ◽ Amany M. Sarhan

Keyword(s): Data Mining ◽ Image Processing ◽ Intrusion Detection ◽ Real Time ◽ Outlier Detection ◽ Real World ◽ Medical Data ◽ Experimental Results ◽ Real Time Applications ◽ Real World Datasets

In data mining, outlier detection is a major challenge, as it plays an important role in many applications such as medical data analysis, image processing, fraud detection, and intrusion detection. A wide variety of clustering-based approaches have been developed to detect outliers. However, they are by nature time-consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering-based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to find outliers promptly on request, even within huge datasets. The proposed framework has been tested and evaluated using two real-world datasets with different features and applications: one with 699 records and another with five million records. The experimental results show that the proposed framework outperforms other existing approaches across several evaluation metrics.

Clustering and Visualization of Large Protein Sequence Databases by Means of an Extension of the Self-Organizing Map

Discovery Science - Lecture Notes in Computer Science ◽ 10.1007/3-540-44418-1_7 ◽ 2000 ◽ pp. 76-85 ◽ Cited By ~ 10

Author(s): Panu Somervuo ◽ Teuvo Kohonen

Keyword(s): Protein Sequence ◽ The Self ◽ Self Organizing Map ◽ Large Protein ◽ Sequence Databases ◽ Self Organizing

VISUAL APPROACH TO SUPERVISED VARIABLE SELECTION BY SELF-ORGANIZING MAP

International Journal of Neural Systems ◽ 10.1142/s0129065705000098 ◽ 2005 ◽ Vol 15 (01n02) ◽ pp. 101-110 ◽ Cited By ~ 1

Author(s): TIMO SIMILÄ ◽ SAMPSA LAINE

Keyword(s): Variable Selection ◽ The Self ◽ Data Sets ◽ Self Organizing Map ◽ Robust Method ◽ Relevant Variables ◽ Visual Approach ◽ Predefined Criterion ◽ Target Data ◽ Self Organizing

Practical data analysis often encounters data sets with both relevant and useless variables. Supervised variable selection is the task of selecting the relevant variables based on some predefined criterion. We propose a robust method for this task. The user manually selects a set of target variables and trains a Self-Organizing Map with these data. This sets a criterion for variable selection and provides an illustrative description of the user's problem, even for multivariate target data. The user also defines another set of variables that are potentially related to the problem. Our method returns the subset of these variables that best corresponds to the description provided by the Self-Organizing Map and thus agrees with the user's understanding of the problem. The method is conceptually simple and, based on experiments, offers an accessible approach to supervised variable selection.
