FINDING REPRESENTATIVE WEB PAGES BASED ON A SOM AND A REVERSE CLUSTER ANALYSIS

Enhancing the content and structure of a web site is an important task that helps retain existing visitors and attract new ones (or customers). Web mining helps to improve a web site's organization and content by applying data mining algorithms. In particular, when web mining is performed with a Self-Organizing Feature Map (SOFM or SOM), an analysis phase carried out by experts is always needed. To help analysts perform this phase after the SOFM has been trained, many post-processing techniques have been developed (component planes, labels, etc.); however, none of these techniques is useful when web mining is used for off-line enhancement of a web site. This paper presents an algorithm called Reverse Cluster Analysis (RCA). It aims to identify important web pages based on a SOFM when performing web text mining (WTM) and web usage mining (WUM). We successfully applied this technique to a real web site to show its effectiveness. We have extended previous work by comparing it with another unsupervised technique, a survey of administrators, and an extended survey.
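
The abstract does not spell out the steps of RCA, so the following is only a generic sketch of the underlying idea: train a small SOM on page feature vectors, then map each winning neuron back to the page closest to it. The function names, the 2x2 grid, the decay schedule, and the toy page vectors are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(weights, v):
    """Grid cell whose weight vector is closest to v."""
    return min(weights, key=lambda cell: sq_dist(weights[cell], v))

def train_som(vectors, rows=2, cols=2, epochs=200, lr0=0.5, sigma0=1.0, seed=0):
    """Train a tiny SOM (hypothetical parameters) on a list of feature vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One weight vector per grid cell, initialised with random values in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.01)  # neighbourhood radius shrinks
        for v in rng.sample(vectors, len(vectors)):
            bmu = best_matching_unit(weights, v)
            for cell, w in weights.items():
                grid_d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (v[i] - w[i])
    return weights

def representative_pages(weights, pages):
    """For each neuron that wins at least one page, pick the page closest to it."""
    winners = {best_matching_unit(weights, vec) for vec in pages.values()}
    return {min(pages, key=lambda name: sq_dist(weights[cell], pages[name]))
            for cell in winners}

# Toy feature vectors (made up); real input would combine text and usage features.
pages = {
    "home":     [1.0, 0.0, 0.1],
    "products": [0.9, 0.1, 0.0],
    "contact":  [0.0, 1.0, 0.9],
    "about":    [0.1, 0.9, 1.0],
}
weights = train_som(list(pages.values()))
reps = representative_pages(weights, pages)
print(sorted(reps))
```

Going from neurons back to concrete pages is what makes the output usable for off-line site changes: an analyst gets a short list of page names rather than a map to interpret.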

Using a Self Organizing Feature Map for Extracting Representative Web Pages from a Web Site

International Journal of Computational Intelligence Research ◽

10.5019/j.ijcir.2006.59 ◽

2006 ◽

Vol 2 (2) ◽

Cited By ~ 2

Author(s):

Sebastián Ríos ◽

Juan D. Velásquez ◽

Hiroshi Yasuda ◽

Terumasa Aoki

Keyword(s):

Web Site ◽

Web Pages ◽

Feature Map ◽

Self Organizing

A Model for Extracting Most Desired Web Pages

Transforming Businesses With Bitcoin Mining and Blockchain Applications - Advances in Finance, Accounting, and Economics ◽

10.4018/978-1-7998-0186-3.ch007 ◽

2020 ◽

pp. 119-145

Author(s):

Jayanti Mehra ◽

Ramjeevan Singh Thakur

Keyword(s):

Data Mining ◽

Path Length ◽

Web Mining ◽

Statistical Information ◽

Web Pages ◽

Systems Practice ◽

Clustering And Classification ◽

Using Data ◽

Access Logs ◽

Manipulation Process

Weblog analysis takes raw data from access logs and studies it to extract statistical information. This information covers a variety of data about website activity, such as the average number of hits, the total number of user visits, failed and successful cached hits, average viewing time, and average path length over a website; analytical information such as page-not-found errors and server errors; server information, which includes entry and exit pages, single-access pages, and top visited pages; and requester information, such as which search engines are used, keywords, and top referring sites. In general, the website administrator uses this kind of knowledge to make the system perform better, to help in the manipulation process of the site, and to support marketing decisions. Most advanced web mining systems use this kind of information to derive more complex interpretations with data mining procedures such as association rules, clustering, and classification.
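
A few of the statistics listed above can be illustrated with a minimal sketch. It assumes the access log is in Common Log Format and treats distinct client hosts as a rough stand-in for visitors; the sample lines and the `summarize` function are made up for illustration and do not come from the chapter.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "METHOD path PROTO" status bytes
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)

def summarize(lines):
    """Compute simple hit, visitor, and error statistics from log lines."""
    hits = 0
    pages = Counter()       # requests per path
    not_found = Counter()   # 404 responses per path
    server_errors = 0       # count of 5xx responses
    hosts = set()           # distinct client hosts (approximation of visitors)
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that do not match the assumed format
        hits += 1
        status = int(m.group("status"))
        path = m.group("path")
        hosts.add(m.group("host"))
        pages[path] += 1
        if status == 404:
            not_found[path] += 1
        elif status >= 500:
            server_errors += 1
    return {
        "hits": hits,
        "distinct_hosts": len(hosts),
        "top_pages": pages.most_common(3),
        "not_found": dict(not_found),
        "server_errors": server_errors,
    }

# Hypothetical sample log lines.
sample = [
    '10.0.0.1 - - [10/Oct/2020:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2020:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2020:13:57:12 +0000] "GET /missing.html HTTP/1.1" 404 209',
    '10.0.0.3 - - [10/Oct/2020:13:58:40 +0000] "GET /report HTTP/1.1" 500 0',
]
stats = summarize(sample)
print(stats)
```

Path length and viewing time would additionally require grouping requests into sessions, which this sketch leaves out.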

BUILDING A KNOWLEDGE BASE FOR IMPLEMENTING A WEB-BASED COMPUTERIZED RECOMMENDATION SYSTEM

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213007003552 ◽

2007 ◽

Vol 16 (05) ◽

pp. 793-828 ◽

Cited By ~ 10

Author(s):

JUAN D. VELÁSQUEZ ◽

VASILE PALADE

Keyword(s):

Knowledge Base ◽

Web Site ◽

Web Mining ◽

Recommendation System ◽

The Internet ◽

Web Pages ◽

Web Based ◽

Web Logs ◽

Mining Tools ◽

The Web

Understanding the web user browsing behaviour in order to adapt a web site to the needs of a particular user represents a key issue for many commercial companies that do their business over the Internet. This paper presents the implementation of a Knowledge Base (KB) for building web-based computerized recommender systems. The Knowledge Base consists of a Pattern Repository that contains patterns extracted from web logs and web pages, by applying various web mining tools, and a Rule Repository containing rules that describe the use of discovered patterns for building navigation or web site modification recommendations. The paper also focuses on testing the effectiveness of the proposed online and offline recommendations. An ample real-world experiment is carried out on a web site of a bank.

Water trophicity of Utricularia microhabitats identified by means of SOFM as a tool in ecological modeling

Acta Societatis Botanicorum Poloniae ◽

10.5586/asbp.2007.029 ◽

2011 ◽

Vol 76 (3) ◽

pp. 255-261 ◽

Cited By ~ 2

Author(s):

Piotr Kosiba ◽

Andrzej Stankiewicz

Keyword(s):

Water Quality ◽

Cluster Analysis ◽

Ecological Modeling ◽

The Self ◽

Topological Map ◽

Upper Silesia ◽

Hierarchical Tree ◽

Feature Map ◽

Self Organizing

The study objects were 48 microhabitats of five Utricularia species in Lower and Upper Silesia (Poland). The aim of the paper was to focus on the application of the Self-Organizing Feature Map in the assessment of water trophicity in Utricularia microhabitats, and to describe how SOFM can be used for the study of ecological subjects. This method was compared with the hierarchical tree plot of cluster analysis to check whether these techniques give similar results. In effect, both the topological map of the SOFM and the dendrogram of the cluster analysis show differences between Utricularia species microhabitats with respect to water quality, from eutrophic for U. vulgaris to dystrophic for U. minor and U. intermedia. The methods used give similar results, which constitutes a validation of the SOFM method in this type of study.

Web Pages Classification with Parliamentary Optimization Algorithm

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194017500188 ◽

2017 ◽

Vol 27 (03) ◽

pp. 499-513 ◽

Cited By ~ 6

Author(s):

Soner Kiziloluk ◽

Ahmet Bedri Ozer

Keyword(s):

Optimization Algorithm ◽

Web Mining ◽

Web Pages ◽

Data Sets ◽

Web Page ◽

Web Documents ◽

Web Page Classification ◽

Using Data ◽

Extract Information ◽

Page Classification

In recent years, data on the Internet has grown exponentially, attaining enormous dimensions. This situation makes it difficult to obtain useful information from such data. Web mining is the process of using data mining techniques such as association rules, classification, clustering, and statistics to discover and extract information from Web documents. Optimization algorithms play an important role in such techniques. In this work, the parliamentary optimization algorithm (POA), which is one of the latest social-based metaheuristic algorithms, has been adopted for Web page classification. Two different data sets (Course and Student) were selected for experimental evaluation, and HTML tags were used as features. The data sets were tested using different classification algorithms implemented in WEKA, and the results were compared with those of the POA. The POA was found to yield promising results compared to the other algorithms. This study is the first to propose the POA for effective Web page classification.
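
The abstract does not say how HTML tags were turned into features. One possible reading, shown here purely as an assumption, is a vector of tag counts over a fixed tag vocabulary; the study may instead use the text found inside particular tags. The `TagCounter` class, the `tag_features` function, and the sample page are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

class TagCounter(HTMLParser):
    """Counts how often each start tag occurs in a document."""
    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1

def tag_features(html, vocabulary):
    """Return one count per tag in `vocabulary`, in the same order."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return [parser.counts[tag] for tag in vocabulary]

# Hypothetical page and tag vocabulary.
page = ("<html><head><title>CS101</title></head><body>"
        "<h1>Course</h1><p>Syllabus</p><p>Schedule</p>"
        "<a href='notes.html'>Notes</a></body></html>")
vocabulary = ["title", "h1", "p", "a", "img"]
features = tag_features(page, vocabulary)
print(features)
```

A vector like this could then be passed to any classifier; the choice of vocabulary is a design decision the sketch does not try to settle.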

Self-organizing feature map for cluster analysis in multi-disease diagnosis

Expert Systems with Applications ◽

10.1016/j.eswa.2010.02.084 ◽

2010 ◽

Vol 37 (9) ◽

pp. 6359-6367 ◽

Cited By ~ 15

Author(s):

Ke Zhang ◽

Yi Chai ◽

Simon X. Yang

Keyword(s):

Cluster Analysis ◽

Disease Diagnosis ◽

Feature Map ◽

Self Organizing

Genetic-algorithms-based approach to self-organizing feature map and its application in cluster analysis

1998 IEEE International Joint Conference on Neural Networks Proceedings. IEEE World Congress on Computational Intelligence (Cat. No.98CH36227) ◽

10.1109/ijcnn.1998.682372 ◽

2002 ◽

Cited By ~ 1

Author(s):

Mu-Chun Su ◽

Hsiao-Te Chang

Keyword(s):

Cluster Analysis ◽

Genetic Algorithms ◽

Feature Map ◽

Self Organizing

Web Mining System for Mobile-Phone Marketing

Mobile Computing ◽

10.4018/978-1-60566-054-7.ch220 ◽

2009 ◽

pp. 2924-2935

Author(s):

Miao-Ling Wang ◽

Hsiao-Fan Wang

Keyword(s):

Information Retrieval ◽

Mobile Phone ◽

Mobile Phones ◽

Web Site ◽

Text Categorization ◽

Web Mining ◽

Mining System ◽

Retrieval Technique ◽

Web Text Mining ◽

The Web

With the ever-increasing and ever-changing flow of information available on the Web, information analysis has never been more important. Web text mining, which includes text categorization, text clustering, association analysis and prediction of trends, can assist us in discovering useful information in an effective and efficient manner. In this chapter, we have proposed a Web mining system that incorporates both online efficiency and off-line effectiveness to provide the “right” information based on users’ preferences. A Bi-Objective Fuzzy c-Means algorithm and information retrieval technique, for text categorization, clustering and integration, was employed for analysis. The proposed system is illustrated via a case involving the Web site marketing of mobile phones. A variety of Web sites exist on the Internet and a common type involves the trading of goods. In this type of Web site, the question to ask is: If we want to establish a Web site that provides information about products, how can we respond quickly and accurately to queries? This is equivalent to asking: How can we design a flexible search engine according to users’ preferences? In this study, we have applied data mining techniques to cope with such problems, by proposing, as an example, a Web site providing information on mobile phones in Taiwan. In order to efficiently provide useful information, two tasks were considered during the Web design phase. One related to off-line analysis: this was done by first carrying out a survey of frequent Web users, students between 15 and 40 years of age, regarding their preferences, so that Web customers’ behavior could be characterized. Then the survey data, as well as the products offered, were classified into different demand and preference groups. The other task was related to online query: this was done through the application of an information retrieval technique that responded to users’ queries.
Based on the ideas above, the remainder of the chapter is organized as follows: first, we present a literature review, introduce some concepts, and review existing methods relevant to our study; then the proposed Web mining system is presented and a case study of a mobile-phone marketing Web site is illustrated; finally, a summary and conclusions are offered.
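
The Bi-Objective variant used in the chapter is not described in the abstract, so the sketch below shows only the standard single-objective fuzzy c-means algorithm as background. The function name, parameter defaults, and toy preference scores are assumptions for illustration.

```python
import math
import random

def fuzzy_c_means(points, c=2, m=2.0, iters=100, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([x / total for x in row])
    centers = []
    for _ in range(iters):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        for i in range(n):
            d = [max(math.dist(points[i], centers[j]), 1e-12) for j in range(c)]
            for j in range(c):
                u[i][j] = 1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0))
                                    for k in range(c))
    return centers, u

# Hypothetical two-feature preference scores forming two loose groups.
points = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
          [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
centers, memberships = fuzzy_c_means(points)
labels = [row.index(max(row)) for row in memberships]
print(labels)
```

Unlike hard clustering, each point keeps a degree of membership in every group, which suits survey respondents whose preferences straddle more than one demand group.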

Web Mining to Identify People of Similar Background

Handbook of Research on Text and Web Mining Technologies ◽

10.4018/978-1-59904-990-8.ch023 ◽

2010 ◽

pp. 369-385

Author(s):

Quanzhi Li ◽

Yi-fang Brook Wu

Keyword(s):

Web Site ◽

Web Mining ◽

Experimental Results ◽

Web Pages ◽

Major Research ◽

Research Issues ◽

New Approach ◽

Representation Method ◽

Textual Content ◽

The Web

This chapter presents a new approach of mining the Web to identify people of similar background. To find similar people from the Web for a given person, two major research issues are person representation and matching persons. In this chapter, a person representation method which uses a person’s personal Web site to represent this person’s background is proposed. Based on this person representation method, the main proposed algorithm integrates textual content and hyperlink information of all the Web pages belonging to a personal Web site to represent a person and match persons. Other algorithms are also explored and compared to the main proposed algorithm. The evaluation methods and experimental results are presented.

Web Mining System for Mobile-Phone Marketing

Business Applications and Computational Intelligence ◽

10.4018/978-1-59140-702-7.ch007 ◽

2011 ◽

pp. 113-130

Author(s):

Miao-Ling Wang ◽

Hsiao-Fan Wang

Keyword(s):

Information Retrieval ◽

Mobile Phone ◽

Mobile Phones ◽

Web Site ◽

Text Categorization ◽

Web Mining ◽

Mining System ◽

Retrieval Technique ◽

Web Text Mining ◽

The Web

With the ever-increasing and ever-changing flow of information available on the Web, information analysis has never been more important. Web text mining, which includes text categorization, text clustering, association analysis and prediction of trends, can assist us in discovering useful information in an effective and efficient manner. In this chapter, we have proposed a Web mining system that incorporates both online efficiency and off-line effectiveness to provide the “right” information based on users’ preferences. A Bi-Objective Fuzzy c-Means algorithm and information retrieval technique, for text categorization, clustering and integration, was employed for analysis. The proposed system is illustrated via a case involving the Web site marketing of mobile phones. A variety of Web sites exist on the Internet and a common type involves the trading of goods. In this type of Web site, the question to ask is: If we want to establish a Web site that provides information about products, how can we respond quickly and accurately to queries? This is equivalent to asking: How can we design a flexible search engine according to users’ preferences? In this study, we have applied data mining techniques to cope with such problems, by proposing, as an example, a Web site providing information on mobile phones in Taiwan. In order to efficiently provide useful information, two tasks were considered during the Web design phase. One related to off-line analysis: this was done by first carrying out a survey of frequent Web users, students between 15 and 40 years of age, regarding their preferences, so that Web customers’ behavior could be characterized. Then the survey data, as well as the products offered, were classified into different demand and preference groups. The other task was related to online query: this was done through the application of an information retrieval technique that responded to users’ queries. 
Based on the ideas above, the remainder of the chapter is organized as follows: first, we present a literature review, introduce some concepts, and review existing methods relevant to our study; then the proposed Web mining system is presented and a case study of a mobile-phone marketing Web site is illustrated; finally, a summary and conclusions are offered.
