A Web text mining approach based on self-organizing map

2014 ◽

Vol 13 (02) ◽

pp. 387-406 ◽

Cited By ~ 3

Author(s):

Hsin-Chang Yang ◽

Chung-Hong Lee

Keyword(s):

Text Mining ◽

Promising Result ◽

Web Pages ◽

Self Organizing Map ◽

Web Page ◽

Social Bookmarking ◽

SOM Model ◽

Self Organizing

Social bookmarking websites are popular because they provide platforms on which Web pages are easy to browse and organize. Users can add tags to Web pages to make them easier to understand and retrieve. However, spam tags can also be added to increase the chance that a Web page is referenced, which leads users to pages they are not interested in. In this work, we propose a scheme that automatically detects such tag spam using a text mining approach based on the self-organizing map (SOM) model. We use the SOM to find associations among Web pages and among tags. These associations are then used to discover the relationships between Web pages and tags, and tag spam is detected according to those relationships. Experiments conducted on a set of Web pages collected from a social bookmarking site produced promising results.
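The abstract does not give the details of the SOM training or of how associations are derived, so the following is only a minimal sketch of a generic SOM, not the authors' method. It assumes each Web page is encoded as a binary tag vector; the function names `train_som` and `best_matching_unit`, the grid size, the decay schedules, and the toy data are all hypothetical. Pages that land on the same or neighbouring map units can be treated as associated.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(weights,
               key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)))

def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM on a list of equal-length vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per map unit, initialised at random in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
        for x in rng.sample(data, len(data)):    # visit samples in random order
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights

# Toy page-by-tag matrix (hypothetical data): 1 means the page carries the tag.
pages = {
    "page_a": [1, 1, 0, 0],
    "page_b": [1, 1, 0, 0],
    "page_c": [0, 0, 1, 1],
}
weights = train_som(list(pages.values()), rows=3, cols=3)
page_units = {name: best_matching_unit(weights, vec)
              for name, vec in pages.items()}
```

Pages with identical tag vectors always share a map unit here. A spam detector along the lines of the abstract would additionally need a map over tags and a rule relating the two maps, which this sketch does not attempt.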


HDGSOMr: A High Dimensional Growing Self-Organizing Map Using Randomness for Efficient Web and Text Mining

The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05) ◽

10.1109/wi.2005.70 ◽

2005 ◽

Cited By ~ 10

Author(s):

R. Amarasiri ◽

D. Alahakoon ◽

K. Smith ◽

M. Premaratne

Keyword(s):

Text Mining ◽

High Dimensional ◽

Self Organizing Map ◽

Self Organizing


The Research of Self-Organizing Maps Based on Document Collections

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.430-432.1232 ◽

2012 ◽

Vol 430-432 ◽

pp. 1232-1235

Author(s):

Yi Ding ◽

Xian Fu

Keyword(s):

Mapping Method ◽

General Purpose ◽

Research Field ◽

Self Organizing Map ◽

Computationally Efficient ◽

Self Organizing Maps ◽

Promising Alternative ◽

Discovery Research ◽

Web Text Mining ◽

Self Organizing

Web text mining is a new topic in the knowledge discovery research field. It aims to help people discover knowledge from large quantities of semi-structured or unstructured text on the Web. Several approaches, including pure and hybrid information retrieval (IR) methods, have been proposed to tackle this problem. Among them, combining the Self-Organizing Map (SOM) method with the principles of the vector-space model appears to be a promising alternative to traditional, purely IR-based methods in this problem domain. The encoded documents are organized on another self-organizing map, a document map, on which nearby locations contain similar documents. Special consideration is given to the computation of very large document maps, which is possible on general-purpose computers if the dimensionality of the word category histograms is first reduced with a random mapping method and if computationally efficient algorithms are used to compute the SOMs.
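The random mapping step mentioned above can be illustrated with a small sketch. The abstract does not specify how the mapping matrix is built, so this assumes a dense Gaussian random matrix, which is one common choice; the names `random_projection_matrix` and `project`, the dimensions, and the toy histogram are hypothetical.

```python
import random

def random_projection_matrix(in_dim, out_dim, seed=0):
    """Build an out_dim x in_dim matrix of scaled Gaussian entries (assumed form)."""
    rng = random.Random(seed)
    scale = 1.0 / (out_dim ** 0.5)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]

def project(matrix, vec):
    """Map a high-dimensional vector to the lower-dimensional space."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

# Toy word category histogram with 50 bins (hypothetical data), reduced to 5.
histogram = [float(i % 3) for i in range(50)]
R = random_projection_matrix(in_dim=50, out_dim=5)
reduced = project(R, histogram)
```

The reduced vectors, rather than the full histograms, would then serve as SOM inputs, so each distance computation during training touches 5 values instead of 50 in this toy setting.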
