Growing Hierarchical Self-Organizing Map Using Category Utility

Author(s):  
Kazushi Murakoshi ◽  
Satoshi Fujikawa

To automatically obtain a hierarchical knowledge representation from data, an unsupervised learning method has been developed that overcomes two problems of the growing hierarchical self-organizing map (GHSOM) method, which uses the quantization error (the deviation of the input data) as the evaluation measure for growing maps. First, proper control of the growth process of each map is difficult because of the use of the quantization error. Second, the clusters in the hierarchical structure may be excessively subdivided. The improved GHSOM method instead uses the category utility (CU), a measure used in conceptual clustering for predicting the preferred level of categorization. The CU is useful for organizing the clustering so that people can understand it effortlessly. The basic principle of the method is that the growth and unification processes are appropriately and autonomously controlled by the CU. Computer experiments showed that the proposed method can automatically construct an appropriate hierarchical and topological knowledge representation for high-dimensional input data through unsupervised learning. They also showed that it is easier to use and more effective than the conventional GHSOM method, which uses the quantization error as its evaluation measure.
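As a rough illustration of the measure named above, the following sketch computes the category utility of a partition in its common conceptual-clustering form for nominal attributes: CU = (1/K) Σ_k P(C_k) Σ_i Σ_j [P(A_i = V_ij | C_k)^2 − P(A_i = V_ij)^2]. This is an assumption about the general form only; the abstract does not say which variant the authors use for high-dimensional input data, and the function name and data layout are hypothetical.

```python
from collections import Counter


def category_utility(clusters):
    """Category utility of a partition of nominal-valued records.

    `clusters` is a list of non-empty clusters; each cluster is a list of
    records, and each record is a tuple of nominal attribute values.
    (Hypothetical helper; not the authors' implementation.)
    """
    records = [r for c in clusters for r in c]
    n = len(records)
    n_attrs = len(records[0])

    def sum_sq_probs(rows):
        # Sum over attributes i and values j of P(A_i = V_ij)^2 within `rows`.
        total = 0.0
        for i in range(n_attrs):
            counts = Counter(r[i] for r in rows)
            total += sum((c / len(rows)) ** 2 for c in counts.values())
        return total

    baseline = sum_sq_probs(records)
    # Weight each cluster's gain in predictability by its relative size.
    gain = sum(len(c) / n * (sum_sq_probs(c) - baseline) for c in clusters)
    return gain / len(clusters)


good = [[("a", "x"), ("a", "x")], [("b", "y"), ("b", "y")]]
mixed = [[("a", "x"), ("b", "y")], [("a", "x"), ("b", "y")]]
print(category_utility(good), category_utility(mixed))
```

A partition whose clusters make attribute values more predictable scores higher than one that mixes them, which is the sense in which the CU can favour an appropriate level of subdivision.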

Author(s):  
Melody Y. Kiang ◽  
Dorothy M. Fisher ◽  
Michael Y. Hu ◽  
Robert T. Chi

This chapter presents an extended Self-Organizing Map (SOM) network and demonstrates how it can be used to forecast market segment membership. Kohonen's SOM network is an unsupervised learning neural network that maps n-dimensional input data to a lower-dimensional (usually one- or two-dimensional) output map while maintaining the original topological relations. We apply an extended version of the SOM network, which further groups the nodes on the output map into a user-specified number of clusters, to a residential market data set from AT&T. Specifically, the extended SOM is used to group survey respondents according to their attitudes towards modes of communication. We then compare the extended SOM network solutions with a two-step procedure that uses the factor scores from factor analysis as inputs to K-means cluster analysis. Results using the AT&T data indicate that the extended SOM network performs better than the two-step procedure.
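The mapping from n-dimensional inputs to a two-dimensional output map can be sketched as follows. This is a minimal, generic Kohonen SOM in plain Python, not the extended network of the chapter: the grid size, the linearly decaying learning rate and neighbourhood width, and the function names are all assumptions made for illustration, and the further grouping of map nodes into clusters is not shown.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (Euclidean)."""
    return min(range(len(weights)), key=lambda u: math.dist(weights[u], x))


def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a rows x cols SOM on `data`, a list of equal-length lists."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighbourhood
            br, bc = grid[best_matching_unit(weights, x)]
            for u, (r, c) in enumerate(grid):
                d2 = (r - br) ** 2 + (c - bc) ** 2   # squared grid distance
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                weights[u] = [w + lr * h * (xi - w)
                              for w, xi in zip(weights[u], x)]
            step += 1
    return weights, grid
```

After training, `grid[best_matching_unit(weights, x)]` gives the map position of an input `x`; inputs that are close in the original space tend to land on the same or neighbouring nodes.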


Author(s):  
Prashant Tiwari ◽  
SH Upadhyay

Assessing the performance degradation of ball bearings is of great importance for increasing the efficiency and reliability of rotating mechanical systems. The high dimensionality of the feature space introduces considerable noise and buries the fault information hidden in the feature data. This paper proposes a novel health assessment method that combines two compatible methods, namely curvilinear component analysis and the self-organizing map network. The novelty lies in applying a vector quantization approach to the sub-manifolds in the feature space and extracting the fault signatures through a nonlinear mapping technique. Curvilinear component analysis is a nonlinear mapping tool that can effectively represent the average manifold of highly folded information while preserving the local topology of the data. To achieve reliable and accurate bearing performance degradation assessment, the work proceeds in four steps. First, ensemble empirical mode decomposition is used to decompose the vibration signals into useful intrinsic mode functions. Second, two fault features, singular values and energy entropies, are extracted from the envelopes of the intrinsic mode function signals. Third, the feature vectors extracted under healthy conditions, further reduced with curvilinear component analysis, are used to train the self-organizing map model. Finally, the reduced test feature vectors are supplied to the trained self-organizing map and the confidence value is obtained. The effectiveness of the proposed technique is validated on three run-to-failure test signals with different types of defects. The results indicate that the proposed technique detects weak degradation earlier than widely used indicators such as root mean square, kurtosis, the self-organizing map-based minimum quantization error, and the minimum quantization error based on principal component analysis.
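The minimum quantization error mentioned above as a baseline indicator can be illustrated with a short sketch: given the weight vectors of a map trained on healthy-condition features, the indicator for a test feature vector is its distance to the nearest weight vector. The function name and the toy codebook are hypothetical, and the abstract does not specify how the paper derives its confidence value, so that step is not shown.

```python
import math


def min_quantization_error(codebook, x):
    """Distance from feature vector x to its nearest codebook vector.

    `codebook` stands in for the weight vectors of a SOM trained on
    healthy-condition features (hypothetical values below).
    """
    return min(math.dist(w, x) for w in codebook)


healthy_codebook = [[0.0, 0.0], [1.0, 0.0]]
print(min_quantization_error(healthy_codebook, [0.0, 0.0]))  # on a healthy unit
print(min_quantization_error(healthy_codebook, [4.0, 4.0]))  # far from healthy units
```

A value near zero means the test vector resembles the healthy training data; a growing value over a run-to-failure record suggests the features are drifting away from the healthy state.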


Author(s):  
Ambarwati Ambarwati ◽  
Edi Winarko

Abstract: News is a source of information that people look forward to every day, and people read news in the categories that interest them. If a computer can group news automatically, it becomes easier for people to read news in the category they want. Automatic grouping of news articles is attractive because organizing news articles manually takes considerable time and cost. The purpose of this research is to build an application for clustering news articles using the Self Organizing Map algorithm. News articles are used as input data, and the system then processes the data so that the articles can be clustered. The processing performed by the system covers preprocessing, feature extraction, clustering, and visualization. The system developed is able to display the clustering results of the Self Organizing Map algorithm and visualizes them with smoothed data histograms in the form of an island map of the news articles. In addition, the system can display the document collection from the five news categories for each year and a histogram of the words that occur frequently in each news article. The system was tested by entering news articles; the system processed them and returned the resulting clusters of the articles entered. Keywords: Indonesian news clustering, clustering based on word histograms, news clustering using SOM


Data Mining ◽  
2011 ◽  
pp. 199-219 ◽  
Author(s):  
Hsin-Chang Yang ◽  
Chung-Hong Lee

Recently, many approaches have been devised for mining various kinds of knowledge from texts. One important application of text mining is to identify themes and the semantic relations among these themes for text categorization. Traditionally, these themes were arranged in a hierarchical manner to achieve effective searching and indexing as well as easy comprehension for human beings. The determination of category themes and their hierarchical structures was mostly done by human experts. In this work, we developed an approach to automatically generate category themes and reveal the hierarchical structure among them. We also used the generated structure to categorize text documents. A self-organizing map was trained on the document collection to form two feature maps. We then analyzed these maps and obtained the category themes and their structure. Although the test corpus contains documents written in Chinese, the proposed approach can be applied to documents written in any language, as long as such documents can be transformed into a list of separated terms.


2020 ◽  
Vol 6 (3) ◽  
pp. 65-70
Author(s):  
R. FARAH DINI QOYYIMAH ◽  
Erfan Rohadi, ST., M. Eng., Ph.D ◽  
Rizky Ardiansyah, S.Kom, MT

Infrastructure and information systems are resources that help the government empower communities, both economically and in terms of public satisfaction. The Communication and Informatics Office (Dinas Komunikasi dan Informatika) of the Probolinggo City Government is no exception. To improve the quality of infrastructure development in a more coordinated way, an information system was built that maps infrastructure and information systems using the SOM clustering algorithm. The Self Organizing Map (SOM) is one of the methods in artificial neural networks that uses unsupervised learning. This research produced a website that provides information to users, namely officials of the Probolinggo City Communication and Informatics Office, for evaluating the development and equitable distribution of infrastructure and information systems. The calculations show that the Self-Organizing Map method can be applied to clustering for the distribution of IT infrastructure. It yields 3 clusters: cluster 1, with good infrastructure distribution, contains 1 region; cluster 2, with fairly good infrastructure distribution, contains 23 regions; and cluster 3, with poor infrastructure distribution, contains 5 regions. The distribution of IT in Probolinggo City can therefore be judged fairly good. Testing showed that the clustering produced with the Self-Organizing Map reached an accuracy of 62.06897%. Keywords: Clustering, Self Organizing Map (SOM)

