Mining Text Documents for Thematic Hierarchies Using Self-Organizing Maps

Recently, many approaches have been devised for mining various kinds of knowledge from texts. One important application of text mining is to identify themes and the semantic relations among these themes for text categorization. Traditionally, these themes were arranged in a hierarchical manner to achieve effective searching and indexing as well as easy comprehension for human beings. The determination of category themes and their hierarchical structures was mostly done by human experts. In this work, we developed an approach to automatically generate category themes and reveal the hierarchical structure among them. We also used the generated structure to categorize text documents. The document collection was trained by a self-organizing map to form two feature maps. We then analyzed these maps and obtained the category themes and their structure. Although the test corpus contains documents written in Chinese, the proposed approach can be applied to documents written in any language, provided that such documents can be transformed into a list of separated terms.
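
As a rough illustration of the training step mentioned in the abstract above, the following sketch trains a small rectangular self-organizing map on term-weight vectors using only the Python standard library. It is not the authors' implementation: the function names `train_som` and `best_matching_unit`, the linear decay schedules, the Gaussian neighbourhood, and the default parameters are all assumptions chosen for brevity, and the sketch builds a single map rather than the two feature maps the abstract describes.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid cell whose weight vector is closest (squared Euclidean) to x."""
    return min(weights,
               key=lambda cell: sum((a - b) ** 2 for a, b in zip(weights[cell], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on equal-length numeric vectors (hypothetical helper)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # One weight vector per map cell, initialised uniformly at random in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # assumed: linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # assumed: shrinking neighbourhood radius
        for x in rng.sample(data, len(data)):     # present the vectors in random order
            bmu = best_matching_unit(weights, x)
            for cell, w in weights.items():
                grid_d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    # Pull the cell's weights toward the input, scaled by lr and h.
                    w[i] += lr * h * (x[i] - w[i])
    return weights
```

Given a list `docs` of term-weight vectors, `train_som(docs, rows=3, cols=3)` returns a dictionary from grid cells to weight vectors, and `best_matching_unit(som, doc)` gives the cell a document lands on. Documents that share a cell or fall on neighbouring cells are the natural candidates for a common theme.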

Similarity search of text documents using self-organizing neural networks and the k-means method (Tekstinių dokumentų panašumų paieška naudojant saviorganizuojančius neuroninius tinklus ir k vidurkių metodą)

Informacijos mokslai ◽

10.15388/im.2013.0.2058 ◽

2013 ◽

Vol 65 ◽

pp. 24-33

Author(s):

Pavel Stefanovič ◽

Olga Kurasova

Keyword(s):

The Self ◽

Self Organizing Map ◽

Text Documents ◽

Self Organizing Maps ◽

Text Data ◽

Document Database ◽

Cluster A ◽

The Republic ◽

Sum Of Distances ◽

Self Organizing

The article examines the search for document similarities using two popular methods: self-organizing neural networks (SOM) and the k-means method. One of the goals of these methods is to divide data into clusters according to their similarity. The influence of the factors used in constructing the text document matrix on the results obtained was analysed. To evaluate SOM quality, two new measures for classified data are proposed, whose values indicate how the resulting clusters are arranged on the SOM map. The first measure shows how well data of the same class lie next to one another on the map; the second shows how far apart the centres of different classes are. To evaluate the quality of the results obtained by the k-means method, the sum of distances from the cluster centre to the cluster members was calculated and the mismatch between classes and clusters was assessed. Text documents taken from the document database of the Seimas of the Republic of Lithuania were chosen for the experimental studies.

Similarity analysis of text documents by self-organizing maps and k-means

Pavel Stefanovič, Olga Kurasova

Summary

In this paper, we try to find similarities between different text documents using the self-organizing map (SOM) and the k-means method. One of the main goals of these methods is to cluster a dataset. Using SOM, the similarities of documents can be observed visually. Both methods can be used only for numerical information, so we analyse different options for converting text data into numerical form in order to get better results. To estimate the SOM quality when classified data are analysed, we propose two new measures: the distances between SOM cells corresponding to data items assigned to the same class, and the distance between the centres of SOM cells corresponding to different classes. We also analyse the results of visualization by self-organizing maps. To estimate the k-means quality, we calculate the sum of distances between cluster centres and class members, and we also estimate the assignment of the data from particular classes to the clusters. The experiments have been carried out using three datasets acquired from the document database of the Seimas of the Republic of Lithuania.
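
The measures described in this abstract can be sketched in a few lines. The code below is one plausible reading of them, not the authors' exact formulas: it assumes each data item has already been mapped to a SOM grid cell `(row, col)`, uses Euclidean distance on the grid, and averages over pairs. The function names are hypothetical.

```python
import math
from itertools import combinations


def within_class_spread(cells, labels):
    """Mean grid distance between the map cells of items that share a class label."""
    dists = [math.dist(cells[i], cells[j])
             for i, j in combinations(range(len(cells)), 2)
             if labels[i] == labels[j]]
    return sum(dists) / len(dists) if dists else 0.0


def class_centres(cells, labels):
    """Average grid position of the cells occupied by each class."""
    groups = {}
    for cell, label in zip(cells, labels):
        groups.setdefault(label, []).append(cell)
    return {label: tuple(sum(axis) / len(members) for axis in zip(*members))
            for label, members in groups.items()}


def between_class_distance(cells, labels):
    """Mean grid distance between the centres of different classes."""
    centres = list(class_centres(cells, labels).values())
    dists = [math.dist(a, b) for a, b in combinations(centres, 2)]
    return sum(dists) / len(dists) if dists else 0.0


def sum_of_distances(points, assignments, centres):
    """k-means quality: total distance from each point to its assigned cluster centre."""
    return sum(math.dist(p, centres[k]) for p, k in zip(points, assignments))
```

Under this reading, a good map has a small within-class spread (same-class documents sit close together) and a large between-class distance (class centres are far apart), while a smaller sum of distances indicates tighter k-means clusters.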

The Spread of the COVID-19 Outbreak in Brazil: An Overview by Kohonen Self-Organizing Map Networks

Medicina ◽

10.3390/medicina57030235 ◽

2021 ◽

Vol 57 (3) ◽

pp. 235

Author(s):

Diego Galvan ◽

Luciane Effting ◽

Hágata Cremasco ◽

Carlos Adam Conte-Junior

Keyword(s):

The Self ◽

Self Organizing Map ◽

Virus Spread ◽

Self Organizing Maps ◽

The North ◽

Som Clustering ◽

Unsupervised Neural Networks ◽

Mining Tools ◽

Temporal Spread ◽

Self Organizing

Background and objective: In the current pandemic scenario, data mining tools are fundamental to evaluate the measures adopted to contain the spread of COVID-19. In this study, unsupervised neural networks of the Self-Organizing Maps (SOM) type were used to assess the spatial and temporal spread of COVID-19 in Brazil, according to the number of cases and deaths in regions, states, and cities. Materials and methods: The SOM applied in this context does not evaluate which measures applied have helped contain the spread of the disease, but these datasets represent the repercussions of the country’s measures, which were implemented to contain the virus’ spread. Results: This approach demonstrated that the spread of the disease in Brazil does not have a standard behavior, changing according to the region, state, or city. The analyses showed that cities and states in the north and northeast regions of the country were the most affected by the disease, with the highest number of cases and deaths registered per 100,000 inhabitants. Conclusions: The SOM clustering was able to spatially group cities, states, and regions according to their coronavirus cases, with similar behavior. Thus, it is possible to benefit from the use of similar strategies to deal with the virus’ spread in these cities, states, and regions.

Geospatial Analysis of Extreme Weather Events in Nigeria (1985–2015) Using Self-Organizing Maps

Advances in Meteorology ◽

10.1155/2017/8576150 ◽

2017 ◽

Vol 2017 ◽

pp. 1-11 ◽

Cited By ~ 9

Author(s):

Adeoluwa Akande ◽

Ana Cristina Costa ◽

Jorge Mateu ◽

Roberto Henriques

Keyword(s):

Extreme Precipitation ◽

Large Scale ◽

Northern Region ◽

Extreme Weather Events ◽

Self Organizing Map ◽

Spatial And Temporal Patterns ◽

Self Organizing Maps ◽

The North ◽

Precipitation Regimes ◽

Self Organizing

The explosion of data in the information age has provided an opportunity to explore the possibility of characterizing the climate patterns using data mining techniques. Nigeria has a unique tropical climate with two precipitation regimes: low precipitation in the north leading to aridity and desertification and high precipitation in parts of the southwest and southeast leading to large scale flooding. In this research, four indices have been used to characterize the intensity, frequency, and amount of rainfall over Nigeria. A type of Artificial Neural Network called the self-organizing map has been used to reduce the multiplicity of dimensions and produce four unique zones characterizing extreme precipitation conditions in Nigeria. This approach allowed for the assessment of spatial and temporal patterns in extreme precipitation in the last three decades. Precipitation properties in each cluster are discussed. The cluster closest to the Atlantic has high values of precipitation intensity, frequency, and duration, whereas the cluster closest to the Sahara Desert has low values. A significant increasing trend has been observed in the frequency of rainy days at the center of the northern region of Nigeria.

Dynamic Gesture Recognition System with Gesture Spotting Based on Self-Organizing Maps

Applied Sciences ◽

10.3390/app11041933 ◽

2021 ◽

Vol 11 (4) ◽

pp. 1933

Author(s):

Hiroomi Hikawa ◽

Yuta Ichikawa ◽

Hidetaka Ito ◽

Yutaka Maeda

Keyword(s):

Real Time ◽

Gesture Recognition ◽

Weight Vector ◽

Recognition System ◽

Self Organizing Map ◽

Self Organizing Maps ◽

Time Dynamic ◽

System Input ◽

Self Organizing ◽

Gesture Spotting

In this paper, a real-time dynamic hand gesture recognition system with gesture spotting function is proposed. In the proposed system, input video frames are converted to feature vectors, and they are used to form a posture sequence vector that represents the input gesture. Then, gesture identification and gesture spotting are carried out in the self-organizing map (SOM)-Hebb classifier. The gesture spotting function detects the end of the gesture by using the vector distance between the posture sequence vector and the winner neuron’s weight vector. The proposed gesture recognition method was tested by simulation and real-time gesture recognition experiment. Results revealed that the system could recognize nine types of gesture with an accuracy of 96.6%, and it successfully outputted the recognition result at the end of gesture using the spotting result.
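
The spotting idea in this abstract, comparing the posture sequence vector with the winner neuron's weight vector, can be illustrated as follows. This is a minimal sketch under stated assumptions: the abstract does not say how the distance is turned into a decision, so the fixed `threshold` and the function names `winner` and `spot_gesture` are hypothetical, and the Hebb-learning part of the SOM-Hebb classifier is not modelled.

```python
import math


def winner(weights, x):
    """Return (index, distance) of the neuron whose weight vector is closest to x."""
    distances = [math.dist(w, x) for w in weights]
    idx = min(range(len(weights)), key=distances.__getitem__)
    return idx, distances[idx]


def spot_gesture(weights, sequence_vector, threshold):
    """Return the winner's index if the input is close enough to it, else None.

    Assumption: a small distance to the winner is taken to mean that a complete
    gesture has been observed; a large distance means the gesture is still in progress.
    """
    idx, distance = winner(weights, sequence_vector)
    return idx if distance <= threshold else None
```

In a streaming setting, `spot_gesture` would be called on each updated posture sequence vector, and a recognition result would be emitted only when it returns an index rather than `None`.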

A New Method for Emulating Self-Organizing Maps for Visualization of Datasets

International Journal of Computational Intelligence and Applications ◽

10.1142/s1469026818500141 ◽

2018 ◽

Vol 17 (03) ◽

pp. 1850014 ◽

Cited By ~ 1

Author(s):

Macario O. Cordel ◽

Arnulfo P. Azcarraga

Keyword(s):

Disaster Response ◽

Dimensional Space ◽

Main Idea ◽

Cost Effective ◽

Fine Tuning ◽

Visualization Tool ◽

Self Organizing Map ◽

Text Complexity ◽

Self Organizing Maps ◽

Self Organizing

Several time-critical problems relying on large amounts of data, e.g., business trends, disaster response and disease outbreak, require cost-effective, timely and accurate data summary and visualization in order to reach an efficient and effective decision. The self-organizing map (SOM) is a very effective data clustering and visualization tool, as it provides an intuitive display of data in a lower-dimensional space. However, with [Formula: see text] complexity, SOM becomes inappropriate for large datasets. In this paper, we propose a force-directed visualization method that emulates SOM's capability to display the data clusters with [Formula: see text] complexity. The main idea is to perform a force-directed fine-tuning of the 2D representation of data. To demonstrate the efficiency and the vast potential of the proposed method as a fast visualization tool, the methodology is used to do a 2D projection of the MNIST handwritten digits dataset.

On building early-warning systems for preventing the deterioration of financial institutions’ performance

Proceedings of the International Conference on Applied Statistics ◽

10.2478/icas-2019-0017 ◽

2019 ◽

Vol 1 (1) ◽

pp. 194-202

Author(s):

Adrian Costea

Keyword(s):

Financial Institutions ◽

Early Warning ◽

Warning System ◽

Early Warning Systems ◽

Capital Adequacy ◽

Self Organizing Map ◽

Self Organizing Maps ◽

Performance Dimensions ◽

Som Model ◽

Self Organizing

This paper assesses the financial performance of Romania’s non-banking financial institutions (NFIs) using a neural network training algorithm proposed by Kohonen, namely the Self-Organizing Maps algorithm. The algorithm takes the financial dataset and positions each observation on a self-organizing map (a two-dimensional map), which can later be used to visualize the trajectory of an individual NFI and explain it in terms of different performance dimensions, such as capital adequacy, asset quality and profitability. Further, we use the map as an early-warning system intended to forecast the NFIs’ future performance (whether they would stay in or be eliminated from the NFIs’ Special Register three quarters into the future). The results are promising: the model is able to correctly predict NFIs’ performance movements. Finally, we compared the results of our SOM-based model with those obtained by applying a multivariate logit-based model. The SOM model performed worse in discriminating the NFIs’ performance: the performance classes were not clearly defined and the results were harder to interpret. On the contrary, the multivariate logit coefficients are readily interpretable, and an individual default probability estimate is obtained for each new observation. However, we can benefit from the results of both techniques: the visualization capabilities of the SOM model and the interpretability of the multivariate logit-based model.

DISCOVERING STOCK TRADING PREFERENCES BY SELF-ORGANIZING MAPS AND DECISION TREES

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213009000299 ◽

2009 ◽

Vol 18 (04) ◽

pp. 603-611 ◽

Cited By ~ 4

Author(s):

CHIH-FONG TSAI ◽

YUAH-CHIAO LIN ◽

YI-TING WANG

Keyword(s):

Individual Investors ◽

Self Organizing Map ◽

Tree Model ◽

Stock Trading ◽

Self Organizing Maps ◽

Two Factors ◽

Group A ◽

Trading Activities ◽

High Level ◽

Self Organizing

Stock trading activities are very popular in many countries. Generally, investors with various backgrounds have different preferences over the stocks they trade. In the literature, a number of studies examine institutions' holding preferences for certain stock characteristics when choosing a security portfolio. However, very few studies investigate the stock trading preferences of individual investors. In this paper, we focus on two factors which affect the portfolio choices of investors: stock characteristics and investor features. In particular, a self-organizing map (SOM) is used to group a chosen dataset into a certain number of clusters. Then, the decision tree model is used to extract useful rules from the clusters which contain the most trading records in the sample. We find that if the investors are female, less wealthy, and trade stocks less frequently, they will be more careful and conservative. On the other hand, if the investors are male, have a high level of wealth, and trade stocks very often, they tend to choose stocks with high EPS, high market-to-book, and high prices.

SOM of SOMs: Self-organizing Map Which Maps a Group of Self-organizing Maps

Artificial Neural Networks: Biological Inspirations – ICANN 2005 - Lecture Notes in Computer Science ◽

10.1007/11550822_61 ◽

2005 ◽

pp. 391-396 ◽

Cited By ~ 24

Author(s):

Tetsuo Furukawa

Keyword(s):

Self Organizing Map ◽

Self Organizing Maps ◽

Self Organizing

Using Self Organizing Maps to Achieve Lithium-Ion Battery Cells Multi-Parameter Sorting Based on Principle Components Analysis

Energies ◽

10.3390/en12152980 ◽

2019 ◽

Vol 12 (15) ◽

pp. 2980 ◽

Cited By ~ 6

Author(s):

Bizhong Xia ◽

Yadi Yang ◽

Jie Zhou ◽

Guanghao Chen ◽

Yifan Liu ◽

...

Keyword(s):

Lithium Ion Battery ◽

Production Line ◽

Lithium Ion ◽

Principal Component ◽

Sorting Algorithm ◽

Self Organizing Map ◽

Self Organizing Maps ◽

Principle Components Analysis ◽

Battery Module ◽

Self Organizing

Battery sorting is an important process in the production of lithium battery modules and battery packs for electric vehicles (EVs). Accurate battery sorting can ensure good consistency of batteries for grouping. This study investigates the mechanism of inconsistency of battery packs and the process of battery sorting on the lithium-ion battery module production line. Combined with the static and dynamic characteristics of lithium-ion batteries, the battery parameters on the production line that can be used as a sorting basis are analyzed, and the parameters of battery mass, volume, resistance, voltage, charge/discharge capacity and impedance characteristics are measured. The battery data are processed by the principal component analysis (PCA) method in statistics, and after analysis, the parameters of batteries are obtained. Principal components are used as sorting variables, and the self-organizing map (SOM) neural network is used to cluster the batteries. Group experiments are carried out on the separated batteries, and state of charge (SOC) consistency of the batteries is achieved to verify that the sorting algorithm and sorting result are accurate.
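
To make the PCA step in this abstract concrete, the sketch below standardizes a table of measurements and extracts the first principal component by power iteration, using only the Python standard library. It is an illustration under assumptions, not the study's procedure: the abstract does not state how many components were kept or how the data were scaled, and the function names are hypothetical. The resulting scores are the kind of values that could then be passed to a SOM as sorting variables.

```python
import math


def standardize(rows):
    """Scale every column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component of rows."""
    z = standardize(rows)
    n, dim = len(z), len(z[0])
    # Covariance matrix of the standardized data (equivalently, the correlation matrix).
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(dim)] for i in range(dim)]
    v = [1.0 / (i + 1) for i in range(dim)]   # arbitrary non-uniform starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm == 0.0:                        # degenerate data: no variance at all
            break
        v = [a / norm for a in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in z]
    return v, scores
```

Standardizing first matters here because the measured parameters (mass, volume, resistance, voltage, capacity) have different units and scales; without it, the component would be dominated by whichever parameter has the largest numeric range.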

Self-organizing maps for storage and transfer of knowledge in reinforcement learning

Adaptive Behavior ◽

10.1177/1059712318818568 ◽

2018 ◽

Vol 27 (2) ◽

pp. 111-126 ◽

Cited By ~ 5

Author(s):

Thommen George Karimpanal ◽

Roland Bouffanais

Keyword(s):

Reinforcement Learning ◽

Self Organizing Map ◽

Value Functions ◽

Transfer Of Knowledge ◽

Network Growth ◽

Self Organizing Maps ◽

Task Knowledge ◽

Novel Approach ◽

Learning Agent ◽

Self Organizing

The idea of reusing or transferring information from previously learned tasks (source tasks) for the learning of new tasks (target tasks) has the potential to significantly improve the sample efficiency of a reinforcement learning agent. In this work, we describe a novel approach for reusing previously acquired knowledge by using it to guide the exploration of an agent while it learns new tasks. In order to do so, we employ a variant of the growing self-organizing map algorithm, which is trained using a measure of similarity that is defined directly in the space of the vectorized representations of the value functions. In addition to enabling transfer across tasks, the resulting map is simultaneously used to enable the efficient storage of previously acquired task knowledge in an adaptive and scalable manner. We empirically validate our approach in a simulated navigation environment and also demonstrate its utility through simple experiments using a mobile micro-robotics platform. In addition, we demonstrate the scalability of this approach and analytically examine its relation to the proposed network growth mechanism. Furthermore, we briefly discuss some of the possible improvements and extensions to this approach, as well as its relevance to real-world scenarios in the context of continual learning.