A Discrete Artificial Bees Colony Inspired Biclustering Algorithm

2012 ◽  
Vol 3 (1) ◽  
pp. 30-42 ◽  
Author(s):  
R. Rathipriya ◽  
K. Thangavel

Biclustering is a promising data mining technique for identifying local patterns in data. Applied to web usage data, biclustering algorithms can identify groups of users whose behavior is correlated across a subset of the pages of a web site. Recently, many biclustering methods based on meta-heuristics have been proposed. Most use the Mean Squared Residue as the merit function, but interesting and relevant patterns such as shifting and scaling patterns may not be detected using this measure. It is nevertheless important to discover this type of pattern, since web users commonly show similar behavior even though their interest levels vary in range or magnitude. In this paper, a new correlation-based fitness function is designed to extract shifting and scaling browsing patterns. The proposed work uses a discrete version of the Artificial Bee Colony optimization algorithm for biclustering of web usage data to produce optimal biclusters (i.e., highly correlated biclusters). It is demonstrated on a real dataset, and the results show that the proposed approach can find significant biclusters of high quality and has better convergence performance than Binary Particle Swarm Optimization (BPSO).
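To illustrate why a correlation-based measure can capture patterns that the Mean Squared Residue misses, the following minimal sketch compares the two on toy data. The abstract does not give the paper's actual fitness function, so the average pairwise Pearson correlation used here is only an assumed stand-in, and the matrices and function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def mean_squared_residue(rows):
    """Mean Squared Residue of a bicluster given as a list of equal-length rows."""
    n_rows, n_cols = len(rows), len(rows[0])
    row_means = [mean(r) for r in rows]
    col_means = [mean(r[j] for r in rows) for j in range(n_cols)]
    overall = mean(row_means)
    total = sum(
        (rows[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return total / (n_rows * n_cols)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def average_row_correlation(rows):
    """Assumed stand-in fitness: mean Pearson correlation over all row pairs."""
    return mean(pearson(x, y) for x, y in combinations(rows, 2))


# Hypothetical page-interest profile shared by three users.
base = [1.0, 3.0, 2.0, 5.0]
# Pure shifting pattern: each user adds a constant to the profile.
shifted = [base, [v + 10 for v in base], [v + 4 for v in base]]
# Scaling plus shifting: each user also multiplies the profile.
scaled_and_shifted = [base, [2 * v + 1 for v in base], [5 * v + 3 for v in base]]

for name, data in [("shifted", shifted), ("scaled+shifted", scaled_and_shifted)]:
    print(name, mean_squared_residue(data), average_row_correlation(data))
```

On this toy data the residue is zero only for the purely shifted rows, while the row correlation stays at 1 (up to rounding) for both matrices, which is the behavior a correlation-based fitness is meant to exploit.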

Author(s):  
SUPRIYA KUMAR DE ◽  
P. RADHA KRISHNA

Clustering of data in a high-dimensional space is of great interest in many data mining applications. In this paper, we propose a method for clustering web usage data in a high-dimensional space based on a concept hierarchy model. In this method, the relationships present in the web usage data are mapped into a fuzzy proximity relation of user transactions. We also describe an approach for presenting the preference set of URLs to a new user transaction based on its match score with the clusters. The study demonstrates that our approach is general and effective for mining web data for web personalization.


2008 ◽  
pp. 2004-2021
Author(s):  
Jenq-Foung Yao ◽  
Yongqiao Xiao

Web usage mining aims to discover useful patterns in web usage data; these patterns provide useful information about users' browsing behavior. This chapter examines different types of web usage traversal patterns and the related techniques used to uncover them, including Association Rules, Sequential Patterns, Frequent Episodes, Maximal Frequent Forward Sequences, and Maximal Frequent Sequences. As a necessary step for pattern discovery, the preprocessing of the web logs is described. Some important issues, such as privacy and sessionization, are raised, and possible solutions are also discussed.
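Sessionization, mentioned above as a preprocessing issue, is often handled with an inactivity timeout. The sketch below shows that common heuristic only; the abstract does not say which method the chapter uses, and the 30-minute threshold and sample timestamps are assumptions.

```python
from typing import List


def sessionize(timestamps: List[int], timeout: int = 1800) -> List[List[int]]:
    """Split one visitor's request times (in seconds) into sessions.

    A new session starts whenever the gap since the previous request
    exceeds `timeout` seconds (30 minutes by default, an assumed value).
    """
    sessions: List[List[int]] = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= timeout:
            sessions[-1].append(t)  # still within the current session
        else:
            sessions.append([t])  # gap too long: open a new session
    return sessions


# Hypothetical request times for a single visitor.
clicks = [0, 60, 400, 4000, 4100, 9000]
sessions = sessionize(clicks)
print(sessions)
```

Real web logs would first need requests grouped per visitor (for example by IP address and user agent) before a timeout split like this is applied.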


2004 ◽  
pp. 335-358 ◽  
Author(s):  
Yongqiao Xiao ◽  
Jenq-Foung (J.F.) Yao

Web usage mining aims to discover useful patterns in web usage data; these patterns provide useful information about users' browsing behavior. This chapter examines different types of web usage traversal patterns and the related techniques used to uncover them, including Association Rules, Sequential Patterns, Frequent Episodes, Maximal Frequent Forward Sequences, and Maximal Frequent Sequences. As a necessary step for pattern discovery, the preprocessing of the web logs is described. Some important issues, such as privacy and sessionization, are raised, and possible solutions are also discussed.


Author(s):  
JIA HU ◽  
NING ZHONG

In a commercial website or portal, Web information fusion usually follows one of two approaches: one integrates Web content, structure, and usage data for surfing behavior analysis; the other integrates Web usage data with traditional customer, product, and transaction data for purchasing behavior analysis. In this paper, we propose a unified model based on Web farming technology for collecting clickstream logs across the whole user interaction process. We emphasize that collecting clickstream logs at the application layer helps to integrate Web usage data seamlessly with other customer-related data sources. We extend the Web log standard to model the clickstream format, and extend Web mining to Web farming, moving from passively collecting data and analyzing customer behavior to actively influencing the customer's decision making. The proposed model can be developed as a common plugin for most existing commercial websites and portals.


2019 ◽  
Vol 8 (3) ◽  
pp. 3881-3886

Because Phasor Measurement Units (PMUs) are expensive and must be placed optimally, a meta-heuristic approach using Binary Particle Swarm Optimization (BPSO) and Binary Artificial Bee Colony optimization (BABC) is applied to the optimal allocation of PMUs in a power system. The resulting PMU locations take into account basic system conditions such as network configuration, critical generators, and loads. The pattern of locations when Zero-Injection Buses (ZIBs) are included is also discussed. Redundancy against PMU loss is formulated so as to retain complete observability of the power system. The channel limitations of the device are also taken into consideration for better results in real-time systems. Optimal PMU locations for the IEEE 30-bus and 14-bus systems with channel limits are compared under all of the above considerations. The number of PMU locations decreases as the channel limit increases. In simulation, Binary Artificial Bee Colony Optimization yields fewer PMU locations with improved observability than Binary Particle Swarm Optimization.
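The observability constraint behind PMU placement can be sketched as follows, under the usual simplification that a PMU observes its own bus and every directly connected bus. This sketch ignores zero-injection buses, channel limits, and redundancy, uses exhaustive search instead of BPSO or BABC (feasible only for tiny networks), and runs on a hypothetical 5-bus network rather than an IEEE test system.

```python
from itertools import combinations


def observed_buses(adjacency, pmu_buses):
    """Buses observed when PMUs sit on `pmu_buses` (the bus itself plus neighbours)."""
    observed = set()
    for bus in pmu_buses:
        observed.add(bus)
        observed.update(adjacency[bus])
    return observed


def minimal_placements(adjacency):
    """All smallest PMU sets giving full observability, found by brute force."""
    buses = sorted(adjacency)
    for size in range(1, len(buses) + 1):
        found = [
            set(combo)
            for combo in combinations(buses, size)
            if observed_buses(adjacency, combo) == set(buses)
        ]
        if found:
            return found
    return []


# Hypothetical 5-bus network given as bus -> set of neighbouring buses.
network = {
    1: {2},
    2: {1, 3, 5},
    3: {2, 4},
    4: {3, 5},
    5: {2, 4},
}

placements = minimal_placements(network)
print(placements)
```

A meta-heuristic such as BPSO or BABC would encode each candidate placement as a bit vector over the buses and use a check like `observed_buses` inside its fitness function.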


Author(s):  
R. Rathipriya ◽  
K. Thangavel ◽  
J. Bagyamani

Data mining extracts hidden information from a database that the user did not know existed. Biclustering is a data mining technique that helps marketers target campaigns more accurately and align them more closely with the needs, wants, and attitudes of customers and prospects. The biclustering results can be tuned to find users' browsing patterns relevant to current business problems. This paper presents a new application of biclustering to web usage data using a combination of heuristic and meta-heuristic algorithms. Two-way K-means clustering is used to generate seeds from preprocessed web usage data, and a Greedy Heuristic is used iteratively to refine the set of seeds; this is fast but often yields locally optimal solutions. In this paper, a Genetic Algorithm is used as a global optimizer that can be coupled with the greedy method to identify globally optimal target user groups based on their coherent browsing patterns. The performance of the proposed work is evaluated by conducting experiments on msnbc, a clickstream dataset from the UCI repository. Results show that the proposed work performs well in extracting optimal target user groups from web usage data, which can be used for focused marketing campaigns.


2019 ◽  
Vol 8 (S3) ◽  
pp. 12-15
Author(s):  
B. Harika ◽  
T. Sudha

Information on the internet increases rapidly from day to day, and usage of the web increases with it, so there is a need to discover interesting patterns from the web. The process of extracting and mining useful information from web documents using data mining techniques is called Web Mining. Web Mining is broadly classified into three types: Web Content Mining, Web Structure Mining, and Web Usage Mining. This paper focuses mainly on Web Usage Mining, which applies data mining techniques to analyse and discover interesting knowledge from Web Usage Data. User activity is captured and stored at different levels, namely the server level, proxy level, and user level; this is called Web Usage Data. The usage data stored at the server side is the Web Server Log, which records users' browsing behavior and their requests based on their clicks. The Web Server Log is a primary source for Web Usage Mining. This paper also discusses various existing pre-processing techniques, the analysis of web log files, and how clustering is applied to group users by their browsing behavior on the content that interests them.


2011 ◽  
Vol 2 (1) ◽  
pp. 1-17 ◽  
Author(s):  
Yannis Marinakis ◽  
Magdalene Marinaki ◽  
Nikolaos Matsatsinis ◽  
Constantin Zopounidis

Nature-inspired methods are used in various fields for solving a number of problems. This study uses a nature-inspired method, artificial bee colony optimization that is based on the foraging behaviour of bees, for a financial classification problem. Financial decisions are often based on classification models, which are used to assign a set of observations into predefined groups. One important step toward the development of accurate financial classification models involves the selection of the appropriate independent variables (features) that are relevant to the problem. The proposed method uses a discrete version of the artificial bee colony algorithm for the feature selection step while nearest neighbour based classifiers are used for the classification step. The performance of the method is tested using various benchmark datasets from UCI Machine Learning Repository and in a financial classification task involving credit risk assessment. Its results are compared with the results of other nature-inspired methods.
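A discrete artificial bee colony search over feature subsets can be sketched as below. This is not the authors' algorithm: the single-bit-flip neighbourhood, the parameter values, and the toy fitness (which stands in for the accuracy of a nearest neighbour classifier) are all assumptions made for illustration.

```python
import random


def binary_abc(fitness, n_bits, n_sources=6, limit=5, cycles=40, seed=0):
    """Maximise `fitness` over 0/1 feature masks with a simplified bee colony loop."""
    rng = random.Random(seed)
    sources = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_sources)]
    scores = [fitness(s) for s in sources]
    trials = [0] * n_sources  # failed improvement attempts per food source
    best_i = max(range(n_sources), key=scores.__getitem__)
    best, best_score = sources[best_i][:], scores[best_i]

    def try_neighbour(i):
        """Flip one random bit of source i; keep the change only if it improves."""
        nonlocal best, best_score
        candidate = sources[i][:]
        candidate[rng.randrange(n_bits)] ^= 1
        score = fitness(candidate)
        if score > scores[i]:
            sources[i], scores[i], trials[i] = candidate, score, 0
            if score > best_score:
                best, best_score = candidate[:], score
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(n_sources):  # employed bees: one visit per source
            try_neighbour(i)
        low = min(scores)
        weights = [s - low + 1e-9 for s in scores]  # better sources attract more onlookers
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_neighbour(i)
        for i in range(n_sources):  # scouts: abandon exhausted sources
            if trials[i] > limit:
                sources[i] = [rng.randint(0, 1) for _ in range(n_bits)]
                scores[i], trials[i] = fitness(sources[i]), 0
                if scores[i] > best_score:
                    best, best_score = sources[i][:], scores[i]
    return best, best_score


RELEVANT = {0, 2, 5}  # hypothetical indices of the truly useful features


def toy_fitness(mask):
    """Stand-in for classifier accuracy: reward relevant features, penalise subset size."""
    return sum(1 for i in RELEVANT if mask[i]) - 0.1 * sum(mask)


best_mask, best_value = binary_abc(toy_fitness, n_bits=8)
print(best_mask, best_value)
```

In a real feature selection setting, `toy_fitness` would be replaced by a cross-validated score of the chosen classifier trained on the selected columns.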


Author(s):  
R. Rathipriya ◽  
K. Thangavel ◽  
J. Bagyamani

Biclustering has the potential to make significant contributions in the fields of information retrieval, web mining, and so forth. In this paper, the authors analyze the complex association between users and pages of a web site by using a biclustering algorithm. This method automatically identifies groups of users that show similar browsing patterns under a specific subset of the pages. A mutation operator from Genetic Algorithms is incorporated into Binary Particle Swarm Optimization (BPSO) for biclustering of web usage data. This hybridization can increase the diversity of the population and help the particles escape effectively from local optima. It detects optimized user profile groups according to coherent browsing behavior. Experiments are performed on a benchmark clickstream dataset to test the effectiveness of the proposed algorithm. The results show that the proposed algorithm has higher performance than existing PSO methods. The interpretation of these biclustering results is useful for marketing and sales strategies.
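The hybridization described above can be sketched as a single particle update: the standard sigmoid-based binary PSO rule followed by a bit-flip mutation. The inertia weight, acceleration coefficients, velocity clamp, and mutation rate below are assumed values, not those of the paper, and the bit vector is only a hypothetical encoding of selected users and pages.

```python
import math
import random


def bpso_step(position, velocity, personal_best, global_best, rng,
              inertia=0.7, c1=1.5, c2=1.5, v_max=4.0, mutation_rate=0.05):
    """One binary PSO update of a particle, followed by bit-flip mutation."""
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        # Velocity is pulled toward the personal and global best bits.
        v = inertia * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-v_max, min(v_max, v))  # clamp to keep probabilities away from 0 and 1
        prob_one = 1.0 / (1.0 + math.exp(-v))  # sigmoid maps velocity to P(bit = 1)
        bit = 1 if rng.random() < prob_one else 0
        if rng.random() < mutation_rate:  # mutation borrowed from genetic algorithms
            bit ^= 1
        new_position.append(bit)
        new_velocity.append(v)
    return new_position, new_velocity


rng = random.Random(1)
# Hypothetical 6-bit particle, e.g. marking which users and pages are in the bicluster.
position = [0, 1, 0, 0, 1, 1]
velocity = [0.0] * 6
new_position, new_velocity = bpso_step(
    position, velocity,
    personal_best=[1, 1, 0, 0, 1, 0],
    global_best=[1, 1, 1, 0, 0, 0],
    rng=rng,
)
print(new_position, new_velocity)
```

The mutation step is what lets a particle leave a bit pattern that both its personal and the global best agree on, which is how the hybrid adds diversity.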

