A New Self-Organizing Map for Dissimilarity Data

Author(s):  
Tien Ho-Phuoc ◽  
Anne Guerin-Dugue

The Self-Organizing Map (Kohonen, 1997) is an effective and very popular tool for data clustering and visualization. With this method, the input samples are projected into a low-dimensional space while preserving their topology. The samples are described by a set of features, and the input space is generally a high-dimensional space R^d. 2D or 3D maps are very often used for visualization. For many applications, usually in psychology, biology, genetics, and image and signal processing, such a vector description is not available; only pairwise dissimilarity data is provided. For instance, applications in text mining or DNA exploration are very important in this field, and the observations are usually described through their proximities, expressed by the “Levenshtein” or “String Edit” distance (Levenshtein, 1966). A first approach consists of transforming the dissimilarity matrix into a true Euclidean distance matrix. A straightforward strategy is to use “Multidimensional Scaling” techniques (Borg & Groenen, 1997) to provide a feature space, so that the initial vector SOM algorithm can be used directly. If this transformation involves great distortions, the initial vector model for SOM is no longer valid, and the analysis of dissimilarity data requires specific techniques (Jain & Dubes, 1988; Van Cutsem, 1994), of which the Dissimilarity Self-Organizing Map (DSOM) is a new one. Consequently, adapting the Self-Organizing Map (SOM) to dissimilarity data is of growing interest. During the last decade, different proposals have emerged to extend the vector SOM model to pairwise dissimilarity data. The main motivation is to cope with large proximity databases for data mining. In this article, we present a new adaptation of the SOM algorithm, which is compared with two existing ones.
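As an illustration of the kind of input the abstract describes, the sketch below builds a pairwise dissimilarity matrix from strings using the Levenshtein distance. It is not the DSOM algorithm itself, which the abstract does not detail. The function names `levenshtein` and `dissimilarity_matrix` and the unit edit costs are assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution (assumed costs)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def dissimilarity_matrix(items):
    """Symmetric matrix of pairwise Levenshtein distances with a zero diagonal."""
    n = len(items)
    d = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = levenshtein(items[i], items[j])
    return d


if __name__ == "__main__":
    for row in dissimilarity_matrix(["ACGT", "ACGA", "TCGA"]):
        print(row)
```

A matrix of this form is the only information a dissimilarity-based SOM variant would receive; no feature vectors are needed.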

1998 ◽  
Vol 10 (4) ◽  
pp. 807-814 ◽  
Author(s):  
Siming Lin ◽  
Jennie Si

Some insights on the convergence of the weight values of the self-organizing map (SOM) to a stationary state in the case of discrete input are provided. The convergence result is obtained by applying the Robbins-Monro algorithm and is applicable to input-output maps of any dimension.
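The setting can be sketched as a one-dimensional SOM trained on a finite set of input values with a decreasing step size. The abstract does not give a schedule, so the choice of step 1/t here is an assumption; it is a standard schedule satisfying the Robbins-Monro conditions (the steps sum to infinity while their squares have a finite sum). The fixed Gaussian neighbourhood width `sigma` and the function name `train_som_1d` are also assumptions for illustration.

```python
import math
import random


def train_som_1d(inputs, n_units=4, n_steps=5000, sigma=1.0, seed=0):
    """Online SOM on scalar inputs drawn from a finite (discrete) set."""
    rng = random.Random(seed)
    lo, hi = min(inputs), max(inputs)
    weights = [rng.uniform(lo, hi) for _ in range(n_units)]
    for t in range(1, n_steps + 1):
        x = rng.choice(inputs)  # discrete input distribution
        # Best-matching unit: the weight closest to the sample.
        winner = min(range(n_units), key=lambda k: abs(x - weights[k]))
        step = 1.0 / t  # assumed Robbins-Monro step size
        for k in range(n_units):
            # Gaussian neighbourhood on the 1-D lattice of unit indices.
            h = math.exp(-((k - winner) ** 2) / (2.0 * sigma ** 2))
            weights[k] += step * h * (x - weights[k])
    return weights


if __name__ == "__main__":
    print(train_som_1d([0.0, 1.0, 2.0, 5.0]))
```

Because each update moves a weight part of the way toward the sample, the weights stay within the range of the inputs, and the shrinking step makes successive changes progressively smaller.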


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Ratih

Outpatient and inpatient visits by insured patients at Class C hospitals are increasing from year to year. This increase affects the inpatient and outpatient health services provided, and it also means that the hospital holds an increasingly large volume of data. The data can be used to explore knowledge and find patterns. To explore knowledge about insured inpatients and outpatients, a data mining clustering technique is applied using the Self-Organizing Map (SOM) algorithm in RStudio. Clustering with the SOM algorithm groups data based on certain characteristics, which are then mapped into areas that resemble a map. The CRISP-DM method is used in this study to carry out the stages of the data mining process. The SOM clustering yields 2 clusters, representing crowded and non-crowded areas. Crowded areas are represented by the Internal Medicine Clinic, Surgery Clinic, Eye Clinic, Hemodialysis, Melati Room, Orchid Room, Bougenville Room, and Flamboyan Room. Non-crowded areas are represented by the General Clinics, Dental Clinics, Obstetrics and Gynecology Clinics, Children's Clinics, Mawar Room, and Soka Room.


2011 ◽  
Vol 2011 ◽  
pp. 1-8 ◽  
Author(s):  
Massimo La Rosa ◽  
Riccardo Rizzo ◽  
Alfonso Urso

The Self-Organizing Map (SOM) algorithm is widely used for building topographic maps of data represented in a vector space, but it does not operate on dissimilarity data. The Soft Topographic Map (STM) algorithm is an extension of SOM to arbitrary distance measures; it creates a map using a set of units, organized in a rectangular lattice, that define neighbourhood relationships among the data. In recent years, a new standard for identifying bacteria using genotypic information has begun to be developed. In this approach, phylogenetic relationships among bacteria can be determined by comparing a stable part of the bacterial genetic code, the so-called “housekeeping genes.” The goal of this work is to build a topographic representation of bacteria clusters, by means of self-organizing maps, starting from genotypic features of the housekeeping genes.


2017 ◽  
Vol 31 (29) ◽  
pp. 1750262 ◽  
Author(s):  
Luogeng Chen ◽  
Yanran Wang ◽  
Xiaoming Huang ◽  
Mengyu Hu ◽  
Fang Hu

Currently, community detection is a hot topic. Building on the self-organizing map (SOM) algorithm, this paper introduces the idea of self-adaptation (SA), by which the number of communities is identified automatically, and proposes a novel algorithm, SA-SOM, for detecting communities in complex networks. Several representative real-world networks and a set of computer-generated networks from the LFR benchmark are used to verify the accuracy and efficiency of the algorithm. The experimental findings demonstrate that the algorithm identifies communities automatically, accurately and efficiently. Furthermore, it achieves higher values of modularity, NMI and density than the SOM algorithm does.
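For readers unfamiliar with the modularity score mentioned above, the following sketch computes the standard Newman modularity of a given partition of an undirected, unweighted graph. It is not the SA-SOM algorithm, only one of the evaluation measures the abstract names. The function name `modularity`, the edge-list input format and the example graph are assumptions for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / (2 m))^2 ].

    edges: list of (u, v) pairs, each undirected edge listed once, no self-loops.
    community_of: dict mapping each node to its community label.
    """
    m = len(edges)
    internal = defaultdict(int)  # L_c: edges with both ends in community c
    degree = defaultdict(int)    # d_c: total degree of nodes in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (hypothetical example graph).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(edges, partition))
```

Placing every node in a single community gives a score of zero, so a positive score indicates more within-community edges than expected from the node degrees alone.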

