Probabilistic Self-Organizing Map for Clustering and Visualizing non-i.i.d Data

Author(s):  
Mustapha Lebbah ◽  
Rakia Jaziri ◽  
Younès Bennani ◽  
Jean-Hugues Chenot

We present a generative approach to train a new probabilistic self-organizing map (PrSOMS) for dependent and nonidentically distributed data sets. Our model defines a low-dimensional manifold allowing user-friendly visualizations. To yield topology-preserving maps, our model combines SOM-like learning behavior with the advantages of probabilistic models. This new paradigm uses the hidden Markov model (HMM) formalism and introduces relationships between the states. This allows us to take advantage of all the known classical views associated with topographic maps. The objective function optimization has a clear interpretation, which allows us to propose an expectation-maximization (EM) algorithm, based on the forward–backward algorithm, to train the model. We demonstrate our approach on two data sets: real-world data issued from the "French National Audiovisual Institute" and handwriting data captured using a WACOM tablet.
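The abstract above relies on the forward–backward algorithm inside EM. As a rough illustration only, the sketch below runs a scaled forward–backward pass on a generic discrete-emission HMM. It is not the PrSOMS model: the function name `forward_backward`, the discrete emission matrix `B`, and the unconstrained transition matrix `A` are assumptions made for this example, and the topology-dependent relationships between states described in the abstract are not reproduced.

```python
import math

def forward_backward(pi, A, B, obs):
    """Scaled forward-backward pass for a discrete HMM (illustrative only).

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    obs     : list of observed symbol indices

    Returns (gamma, log_likelihood), where gamma[t][i] is the posterior
    probability of being in state i at time t given the whole sequence.
    """
    n, T = len(pi), len(obs)

    # Forward pass, normalizing each step to avoid numerical underflow.
    alpha = [[0.0] * n for _ in range(T)]
    scale = [0.0] * T
    alpha[0] = [pi[i] * B[i][obs[0]] for i in range(n)]
    scale[0] = sum(alpha[0])
    alpha[0] = [a / scale[0] for a in alpha[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]

    # Backward pass, reusing the forward scaling factors.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n)
            ) / scale[t + 1]

    # With this scaling, the product of the two passes is the state posterior.
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    log_likelihood = sum(math.log(s) for s in scale)
    return gamma, log_likelihood
```

In an EM loop, posteriors such as `gamma` would feed the M-step that re-estimates the model parameters; how the paper's model ties those parameters to the map topology is not specified in the abstract.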

2005 ◽  
Vol 15 (01n02) ◽  
pp. 101-110 ◽  
Author(s):  
TIMO SIMILÄ ◽  
SAMPSA LAINE

Practical data analysis often encounters data sets with both relevant and useless variables. Supervised variable selection is the task of selecting the relevant variables based on some predefined criterion. We propose a robust method for this task. The user manually selects a set of target variables and trains a Self-Organizing Map with these data. This sets a criterion for variable selection and is an illustrative description of the user's problem, even for multivariate target data. The user also defines another set of variables that are potentially related to the problem. Our method returns a subset of these variables, which best corresponds to the description provided by the Self-Organizing Map and, thus, agrees with the user's understanding of the problem. The method is conceptually simple and, based on experiments, allows an accessible approach to supervised variable selection.


2011 ◽  
pp. 24-32 ◽  
Author(s):  
Nicoleta Rogovschi ◽  
Mustapha Lebbah ◽  
Younès Bennani

Most traditional clustering algorithms are limited to handling data sets that contain either continuous or categorical variables. However, data sets with mixed types of variables are common in the data mining field. In this paper we introduce a weighted self-organizing map for clustering, analysis and visualization of mixed data (continuous/binary). The weights and prototypes are learned simultaneously, ensuring an optimized data clustering. The higher the weight of a variable, the more the clustering algorithm takes into account the information carried by that variable. The learning of these topological maps is combined with a weighting process of different variables by computing weights which influence the quality of clustering. We illustrate the power of this method with data sets taken from a public data set repository: a handwritten digit data set, the Zoo data set and three other mixed data sets. The results show a good quality of the topological ordering and homogeneous clustering.


2016 ◽  
pp. 203-214 ◽  
Author(s):  
Ahmad Al-Khasawneh

Breast cancer is the second leading cause of cancer deaths in women worldwide. Early diagnosis of this illness can increase the chances of long-term survival of cancer patients. To aid in this, computerized breast cancer diagnosis systems are being developed. Machine learning algorithms and data mining techniques play a central role in the diagnosis. This paper describes neural network based approaches to breast cancer diagnosis. The aim of this research is to investigate and compare the performance of supervised and unsupervised neural networks in diagnosing breast cancer. A multilayer perceptron has been implemented as a supervised neural network and a self-organizing map as an unsupervised one. Both models were simulated using a variety of parameters and tested using several combinations of those parameters in independent experiments. It was concluded that the multilayer perceptron neural network outperforms Kohonen's self-organizing maps in diagnosing breast cancer even with small data sets.


2016 ◽  
Vol 78 (6-13) ◽  
Author(s):  
Azlin Ahmad ◽  
Rubiyah Yusof

The Kohonen Self-Organizing Map (KSOM) is one of the unsupervised learning algorithms for neural networks. This algorithm is used to solve problems in various areas, especially in clustering complex data sets. Despite its advantages, the KSOM algorithm has a few drawbacks, such as overlapping clusters and non-linearly separable problems. Therefore, this paper proposes a modified KSOM that is inspired by the pheromone approach in Ant Colony Optimization. The modification focuses on the distance calculation amongst objects. The proposed algorithm has been tested on four real categorical data sets obtained from the UCI Machine Learning Repository: Iris, Seeds, Glass and Wisconsin Breast Cancer Database. The results show that the modified KSOM produced accurate clustering results and that all clusters can be clearly identified.
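For readers unfamiliar with the baseline that this abstract modifies, the following is a minimal sketch of the classical online Kohonen update, written in plain Python. It does not include the pheromone-based distance modification, whose details the abstract does not give. The function names, the Gaussian neighborhood, and the linear decay schedules are assumptions chosen for illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )

def som_step(weights, grid, x, lr, sigma):
    """One online Kohonen update, performed in place.

    Every prototype moves toward x, scaled by the learning rate and by a
    Gaussian neighborhood centered on the best matching unit (BMU).
    grid[j] holds the map coordinates of unit j.
    """
    bmu = best_matching_unit(weights, x)
    for j, w in enumerate(weights):
        d2 = sum((a - b) ** 2 for a, b in zip(grid[j], grid[bmu]))
        h = math.exp(-d2 / (2.0 * sigma ** 2))
        weights[j] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]
    return bmu

def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols map with linearly decaying rate and radius (assumed schedule)."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.3)  # keep the radius strictly positive
        for x in rng.sample(data, len(data)):    # visit samples in random order
            som_step(weights, grid, x, lr, sigma)
    return weights, grid
```

A modified distance, such as the one the paper proposes, would replace the squared Euclidean distance inside `best_matching_unit`.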


Author(s):  
MUSTAPHA LEBBAH ◽  
YOUNÈS BENNANI ◽  
NICOLETA ROGOVSCHI

This paper introduces a probabilistic self-organizing map for topographic clustering, analysis and visualization of multivariate binary data or categorical data using binary coding. We propose a probabilistic formalism dedicated to binary data in which cells are represented by a Bernoulli distribution. Each cell is characterized by a prototype with the same binary coding as used in the data space and the probability of being different from this prototype. The learning algorithm that we propose, Bernoulli on self-organizing map, is an application of the standard EM algorithm. We illustrate the power of this method with six data sets taken from a public data set repository. The results show a good quality of the topological ordering and homogeneous clustering.
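One way to read the cell model in this abstract is sketched below: each cell holds a binary prototype and a single probability `eps` that any given bit differs from it, with bits assumed independent. That parametrization, and the names `bernoulli_log_likelihood` and `most_likely_cell`, are assumptions made for this example. The sketch scores a vector against isolated cells only and leaves out the neighborhood mixture and the EM updates of the actual method.

```python
import math

def hamming(x, prototype):
    """Number of positions where the binary vector x differs from the prototype."""
    return sum(1 for a, b in zip(x, prototype) if a != b)

def bernoulli_log_likelihood(x, prototype, eps):
    """Log-probability of x under one cell, assuming independent bits.

    Each bit differs from the prototype with probability eps (0 < eps < 1),
    so p(x) = eps**h * (1 - eps)**(d - h), where h is the Hamming distance
    and d is the number of bits.
    """
    h = hamming(x, prototype)
    d = len(x)
    return h * math.log(eps) + (d - h) * math.log(1.0 - eps)

def most_likely_cell(x, prototypes, eps_per_cell):
    """Index of the cell under which x has the highest likelihood."""
    return max(
        range(len(prototypes)),
        key=lambda c: bernoulli_log_likelihood(x, prototypes[c], eps_per_cell[c]),
    )
```

With `eps` below 0.5 in every cell and equal across cells, picking the most likely cell reduces to picking the prototype at the smallest Hamming distance.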


2005 ◽  
Vol 4 (1) ◽  
pp. 22-31 ◽  
Author(s):  
Timo Similä

One of the main tasks in exploratory data analysis is to create an appropriate representation for complex data. In this paper, the problem of creating a representation for observations lying on a low-dimensional manifold embedded in high-dimensional coordinates is considered. We propose a modification of the Self-organizing map (SOM) algorithm that is able to learn the manifold structure in the high-dimensional observation coordinates. Any manifold learning algorithm may be incorporated into the proposed training strategy to guide the map onto the manifold surface instead of becoming trapped in local minima. In this paper, the Locally linear embedding algorithm is adopted. We use the proposed method successfully on several data sets with manifold geometry including an illustrative example of a surface as well as image data. We also show with other experiments that the advantage of the method over the basic SOM is restricted to this specific type of data.


2014 ◽  
pp. 68-75 ◽  
Author(s):  
Oles Hodych ◽  
Yuriy Shcherbyna ◽  
Michael Zylan

In this article the authors propose an approach to forecasting the direction of share price fluctuation, based on a Feedforward Neural Network used in conjunction with a Self-Organizing Map. The Self-Organizing Map is used to filter the share price data set, whereas the Feedforward Neural Network is used to forecast the direction of the share price fluctuation based on the filtered data set. Comparison results are presented for filtered and non-filtered share price data sets.


2021 ◽  
Author(s):  
Artur Oliva Gonsales

In this work, a new approach to gesture recognition using the properties of the Spherical Self-Organizing Map (SSOM) is investigated. Bounded mapping of data onto an SSOM creates a powerful tool not only for visualization but also for modeling spatiotemporal information of gesture data. The SSOM allows for the automated decomposition of a variety of gestures into a set of distinct postures. The decomposition naturally organizes this set into a spatial map that preserves associations between postures, upon which we formalize the notion of a gesture as a trajectory through learned posture space. Trajectories from different gestures may share postures. However, the path traversed through posture space is relatively unique. Different variations of posture transitions occurring within a gesture trajectory are used to classify new unknown gestures. Four mechanisms for detecting the occurrence of a trajectory of an unknown gesture are proposed and evaluated on two data sets involving both hand gestures (public sign language database) and full body gestures (Microsoft Kinect database collected in-house), showing the effectiveness of the proposed approach.


2021 ◽  
pp. 1-33 ◽  
Author(s):  
Nicolas P. Rougier ◽  
Georgios Is. Detorakis

We propose a variation of the self-organizing map algorithm by considering the random placement of neurons on a two-dimensional manifold, following a blue noise distribution from which various topologies can be derived. These topologies possess random (but controllable) discontinuities that allow for a more flexible self-organization, especially with high-dimensional data. The proposed algorithm is tested on one-, two- and three-dimensional tasks, as well as on the MNIST handwritten digits data set and validated using spectral analysis and topological data analysis tools. We also demonstrate the ability of the randomized self-organizing map to gracefully reorganize itself in case of neural lesion and/or neurogenesis.

