A Web text mining approach based on self-organizing map

Author(s): Chung-Hong Lee, Hsin-Chang Yang
2014, Vol 13 (02), pp. 387-406
Author(s): Hsin-Chang Yang, Chung-Hong Lee

Social bookmarking websites are popular because they provide platforms on which Web pages are easy to browse and organize. Users can add tags to Web pages to make the pages easier to understand and retrieve. However, spam tags can also be added to increase the chance that a Web page is referenced, which is troublesome for users, who end up accessing Web pages that do not interest them. In this work, we propose a scheme to automatically detect such tag spam using a text mining approach based on the self-organizing map (SOM) model. We used SOM to find the associations among Web pages as well as tags. Such associations were then used to discover the relationships between Web pages and tags. Tag spam can then be detected according to these relationships. Experiments conducted on a set of Web pages collected from a social bookmarking site yielded promising results.
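The clustering step described above can be illustrated with a minimal sketch. The abstract does not give the authors' encoding, map size, training schedule, or detection rule, so everything here is an assumption: the toy term counts, the 2x2 map, the linearly decaying learning rate, and the names `train_som`, `bmu`, and `unit` are all hypothetical. The sketch only shows how a SOM groups similar page vectors on the same or nearby units, and how the tags of the pages on each unit can then be collected.

```python
import math
import random

def unit(v):
    """Scale a vector to unit length so pages of different sizes compare fairly."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def bmu(weights, x):
    """Index of the best-matching unit: the weight vector closest to x."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))

def train_som(data, rows, cols, epochs=200, lr0=0.5, seed=0):
    """Train a small rectangular SOM with a Gaussian neighborhood (assumed schedule)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    coords = [(i // cols, i % cols) for i in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    for t in range(epochs):
        frac = t / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.3)  # neighborhood radius shrinks
        for x in rng.sample(data, len(data)):    # shuffled pass over the data
            br, bc = coords[bmu(weights, x)]
            for i, (r, c) in enumerate(coords):
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                weights[i] = [w + lr * h * (v - w) for w, v in zip(weights[i], x)]
    return weights

# Hypothetical toy data: term counts over ["python", "code", "recipe", "cake"],
# plus the tags users attached to each page.
pages = {
    "p1": ([3, 2, 0, 0], {"programming"}),
    "p2": ([2, 3, 0, 0], {"programming", "cheap-watches"}),
    "p3": ([0, 0, 3, 2], {"cooking"}),
    "p4": ([0, 0, 2, 3], {"cooking"}),
}

vectors = [unit(counts) for counts, _ in pages.values()]
weights = train_som(vectors, rows=2, cols=2)

# Collect the tags of all pages that land on each map unit.
tags_by_unit = {}
for name, (counts, tags) in pages.items():
    tags_by_unit.setdefault(bmu(weights, unit(counts)), set()).update(tags)

for u, tags in sorted(tags_by_unit.items()):
    print(u, sorted(tags))
```

A tag that appears on a unit whose other pages rarely carry it (here, the made-up `cheap-watches`) would be a natural candidate for closer inspection, but that rule is only an illustration and is not taken from the paper.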


2012, Vol 430-432, pp. 1232-1235
Author(s): Yi Ding, Xian Fu

Web text mining is a new topic in knowledge discovery research. It aims to help people discover knowledge from large quantities of semi-structured or unstructured text on the Web. Several approaches, including pure and hybrid information retrieval (IR) methods, have been proposed to tackle this problem. Among them, combining the self-organizing map (SOM) method with the principles of the vector-space model appears to be a promising alternative to traditional, purely IR-based methods in this problem domain. The encoded documents are organized on another self-organizing map, a document map, on which nearby locations contain similar documents. Special consideration is given to the computation of very large document maps, which is feasible on general-purpose computers if the dimensionality of the word category histograms is first reduced with a random mapping method and if computationally efficient algorithms are used to compute the SOMs.
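The random mapping step mentioned above can be sketched as follows. The abstract does not specify the projection matrix or the dimensions used, so this is one common variant under stated assumptions: each input dimension is assigned a random Gaussian vector normalized to unit length, and a histogram is projected as the count-weighted sum of those vectors. The function names (`random_mapping_matrix`, `project`, `cosine`) and the toy histograms are hypothetical.

```python
import math
import random

def random_mapping_matrix(in_dim, out_dim, seed=0):
    """One random unit-length vector of size out_dim per input dimension (assumed variant)."""
    rng = random.Random(seed)
    columns = []
    for _ in range(in_dim):
        col = [rng.gauss(0.0, 1.0) for _ in range(out_dim)]
        norm = math.sqrt(sum(c * c for c in col))
        columns.append([c / norm for c in col])
    return columns

def project(hist, matrix):
    """Map a histogram to the low-dimensional space, skipping zero counts."""
    out = [0.0] * len(matrix[0])
    for j, count in enumerate(hist):
        if count:
            for k, r in enumerate(matrix[j]):
                out[k] += count * r
    return out

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical sparse histograms over 1000 categories.
IN_DIM, OUT_DIM = 1000, 100
doc_a = [0] * IN_DIM
doc_b = [0] * IN_DIM
doc_c = [0] * IN_DIM
for i in range(20):
    doc_a[i] = 1 + i % 3        # doc_a and doc_b share the same categories
    doc_b[i] = 1 + (i + 1) % 3
    doc_c[500 + i] = 1 + i % 3  # doc_c uses disjoint categories

matrix = random_mapping_matrix(IN_DIM, OUT_DIM)
low_a, low_b, low_c = (project(d, matrix) for d in (doc_a, doc_b, doc_c))

# Compare similarities before and after the projection.
print("a~b original:", cosine(doc_a, doc_b), "projected:", cosine(low_a, low_b))
print("a~c original:", cosine(doc_a, doc_c), "projected:", cosine(low_a, low_c))
```

The projected similarities are only approximations of the original ones, and the error generally shrinks as the output dimension grows. The reduced vectors, not the full histograms, would then serve as input to SOM training, which is what makes very large document maps cheaper to compute.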


2012, Vol 132 (10), pp. 1589-1594
Author(s): Hayato Waki, Yutaka Suzuki, Osamu Sakata, Mizuya Fukasawa, Hatsuhiro Kato

2011, Vol 131 (1), pp. 160-166
Author(s): Yutaka Suzuki, Mizuya Fukasawa, Osamu Sakata, Hatsuhiro Kato, Asobu Hattori, ...

2018, Vol 9 (3), pp. 209-221
Author(s): Seung-Yoon Back, Sang-Wook Kim, Myung-Il Jung, Joon-Woo Roh, Seok-Woo Son
