HDGSOMr: A High Dimensional Growing Self-Organizing Map Using Randomness for Efficient Web and Text Mining

Author(s):  
R. Amarasiri ◽  
D. Alahakoon ◽  
K. Smith ◽  
M. Premaratne
2014 ◽  
Vol 41 (3) ◽  
pp. 341-355 ◽  
Author(s):  
Yi Xiao ◽  
Rui-Bin Feng ◽  
Zi-Fa Han ◽  
Chi-Sing Leung

2020 ◽  
Vol 92 (15) ◽  
pp. 10450-10459 ◽  
Author(s):  
Wil Gardner ◽  
Ruqaya Maliki ◽  
Suzanne M. Cutts ◽  
Benjamin W. Muir ◽  
Davide Ballabio ◽  
...  

2014 ◽  
Vol 13 (02) ◽  
pp. 387-406 ◽  
Author(s):  
Hsin-Chang Yang ◽  
Chung-Hong Lee

Social bookmarking websites are popular because they provide platforms on which Web pages are easy to browse and organize. Users can add tags to Web pages to make them easier to understand and retrieve. However, spam tags can also be added to increase the chance that a Web page is referenced, which leads users to pages they are not interested in. In this work, we propose a scheme to automatically detect such tag spam using a text mining approach based on the self-organizing map (SOM) model. We use the SOM to find associations among Web pages as well as among tags. These associations are then used to discover the relationships between Web pages and tags, and tag spam is detected according to those relationships. Experiments conducted on a set of Web pages collected from a social bookmarking site obtained promising results.


2005 ◽  
Vol 4 (1) ◽  
pp. 22-31 ◽  
Author(s):  
Timo Similä

One of the main tasks in exploratory data analysis is to create an appropriate representation for complex data. In this paper, the problem of creating a representation for observations lying on a low-dimensional manifold embedded in high-dimensional coordinates is considered. We propose a modification of the self-organizing map (SOM) algorithm that is able to learn the manifold structure in the high-dimensional observation coordinates. Any manifold learning algorithm may be incorporated into the proposed training strategy to guide the map onto the manifold surface instead of becoming trapped in local minima. In this paper, the locally linear embedding algorithm is adopted. We use the proposed method successfully on several data sets with manifold geometry, including an illustrative example of a surface as well as image data. We also show with other experiments that the advantage of the method over the basic SOM is restricted to this specific type of data.


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Wenqi Hua ◽  
Lingfei Mo

This paper proposes a clustering ensemble method that introduces a cascade structure into the self-organizing map (SOM) to address the poor performance of a single clusterer. Cascaded SOM extends the classical SOM with a cascaded structure: the outputs of multiple SOM networks are combined and used as the input to another SOM network. The method also exploits the insensitivity of high-dimensional data to changes in the values of a small number of dimensions, which lets it ignore part of the erroneous output of the SOM networks. Because the initial parameters of each SOM network and the sample training order are randomly generated, the model does not need different training samples for each SOM network to produce differentiated SOM clusterers. Tests on several classical datasets show that the model can improve the accuracy of pattern recognition by 4% to 10%.
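The cascade idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`bmu`, `train_som`, `train_cascade`), the one-hot encoding of each first-level winner, the grid sizes, and the learning-rate and radius schedules are all assumptions chosen for brevity. The only points taken from the abstract are that every first-level SOM sees the same samples, that the SOMs differ only through random initialization and random training order, and that their combined outputs feed a further SOM.

```python
import math
import random


def bmu(units, x):
    """Index of the best-matching unit (closest weight vector) for sample x."""
    return min(
        range(len(units)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(units[j][1], x)),
    )


def train_som(data, rows, cols, epochs=20, seed=0):
    """Train a small rectangular SOM; returns a list of ((row, col), weights)."""
    rng = random.Random(seed)
    dim = len(data[0])
    units = [((r, c), [rng.random() for _ in range(dim)])
             for r in range(rows) for c in range(cols)]
    order = list(range(len(data)))
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        rng.shuffle(order)  # random training order differs per seed
        for i in order:
            decay = 1.0 - step / total          # assumed linear schedule
            lr = 0.5 * decay
            radius = 0.5 + max(rows, cols) / 2.0 * decay
            br, bc = units[bmu(units, data[i])][0]
            for (r, c), w in units:
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (data[i][k] - w[k])
            step += 1
    return units


def train_cascade(data, n_first=3, rows=2, cols=2):
    """First-level SOMs share the data; their winners feed a second SOM."""
    first = [train_som(data, rows, cols, seed=s) for s in range(n_first)]

    def encode(x):
        # Assumed encoding: concatenate a one-hot vector per first-level SOM.
        out = []
        for som in first:
            hot = [0.0] * (rows * cols)
            hot[bmu(som, x)] = 1.0
            out.extend(hot)
        return out

    second = train_som([encode(x) for x in data], rows, cols, seed=99)
    return first, second, encode


data = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.95], [0.85, 0.9]]
first, second, encode = train_cascade(data)
labels = [bmu(second, encode(x)) for x in data]  # final cluster per sample
```

With a one-hot encoding, a wrong winner in one first-level SOM changes only two entries of the concatenated vector, which is one way to read the abstract's remark about insensitivity to changes in a small number of dimensions.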


2021 ◽  
pp. 1-33
Author(s):  
Nicolas P. Rougier ◽  
Georgios Is. Detorakis

We propose a variation of the self-organizing map algorithm by considering the random placement of neurons on a two-dimensional manifold, following a blue noise distribution from which various topologies can be derived. These topologies possess random (but controllable) discontinuities that allow for a more flexible self-organization, especially with high-dimensional data. The proposed algorithm is tested on one-, two- and three-dimensional tasks, as well as on the MNIST handwritten digits data set and validated using spectral analysis and topological data analysis tools. We also demonstrate the ability of the randomized self-organizing map to gracefully reorganize itself in case of neural lesion and/or neurogenesis.
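A rough sketch of the random-placement idea follows. It is a simplification under stated assumptions, not the algorithm from the paper: neuron positions are drawn by naive dart throwing (rejecting points closer than `min_dist` to an accepted one) as a stand-in for a blue noise sampler, and the neighborhood is a Gaussian over plain Euclidean distance between neuron positions rather than a topology derived from the point set. The names `place_neurons`, `neighborhood`, and `update`, and all parameter values, are hypothetical.

```python
import math
import random


def place_neurons(n, min_dist, seed=0, max_tries=100_000):
    """Dart throwing in the unit square: keep points at least min_dist apart."""
    rng = random.Random(seed)
    points = []
    tries = 0
    while len(points) < n and tries < max_tries:
        tries += 1
        p = (rng.random(), rng.random())
        if all(math.dist(p, q) >= min_dist for q in points):
            points.append(p)
    return points


def neighborhood(positions, winner, sigma):
    """Gaussian influence of the winner on every neuron, by map-space distance."""
    wp = positions[winner]
    return [math.exp(-math.dist(p, wp) ** 2 / (2 * sigma ** 2))
            for p in positions]


def update(weights, positions, x, lr=0.5, sigma=0.2):
    """One SOM step: find the winner, then pull each neuron toward x."""
    winner = min(range(len(weights)), key=lambda j: math.dist(weights[j], x))
    h = neighborhood(positions, winner, sigma)
    for j, w in enumerate(weights):
        for k in range(len(x)):
            w[k] += lr * h[j] * (x[k] - w[k])
    return winner


positions = place_neurons(16, 0.15)
rng = random.Random(1)
weights = [[rng.random() for _ in range(3)] for _ in positions]
winner = update(weights, positions, [0.5, 0.5, 0.5])
```

Because the neighborhood depends only on `positions`, removing or adding entries in `positions` and `weights` changes the map without any grid bookkeeping, which is the property the abstract relates to lesion and neurogenesis experiments.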

