Mining fuzzy association rules with 2-tuple linguistic terms in stock market data by using genetic algorithm

Author(s): Hadi Lafzi Ghazi, Mohammad Saniee Abadeh


2015, Vol. 14 (06), pp. 1215-1242
Author(s): Chun-Hao Chen, Tzung-Pei Hong, Yeong-Chyi Lee, Vincent S. Tseng

Since transactions may contain quantitative values, many approaches have been proposed to derive membership functions for mining fuzzy association rules using genetic algorithms (GAs), a process known as genetic-fuzzy data mining. However, existing approaches assume that the number of linguistic terms is predefined. This study therefore proposes a genetic-fuzzy mining approach that extracts an appropriate number of linguistic terms and their membership functions for the given items, which are then used in fuzzy data mining. The proposed algorithm adjusts membership functions using GAs and then uses them to fuzzify the quantitative transactions. Each individual in the population represents a possible set of membership functions for the items and is divided into two parts: control genes (CGs) and parametric genes (PGs). CGs are encoded as binary strings and determine whether membership functions are active. Each set of membership functions for an item is encoded as PGs with a real-number schema. In addition, seven fitness functions are proposed, each of which evaluates the goodness of the obtained membership functions and serves as the evolutionary criterion in the GA. After the GA process terminates, a better set of association rules with a suitable set of membership functions is obtained. Experiments are conducted to show the effectiveness of the proposed approach.
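To make the two-part chromosome concrete, the following is a minimal Python sketch, not the authors' implementation: triangular membership functions, the MAX_TERMS bound, and the value ranges are assumptions chosen for illustration.

```python
# Illustrative sketch of the two-part chromosome: binary control genes (CGs)
# decide which membership functions are active, while real-valued parametric
# genes (PGs) hold their parameters. Triangular functions and MAX_TERMS are
# assumptions, not the paper's exact encoding.
import random

MAX_TERMS = 5  # assumed upper bound on linguistic terms per item

def random_individual(num_items, value_range=(0.0, 10.0)):
    """One individual: control genes (binary) plus parametric genes (real)."""
    lo, hi = value_range
    control = [[random.randint(0, 1) for _ in range(MAX_TERMS)]
               for _ in range(num_items)]
    params = [[(random.uniform(lo, hi), random.uniform(0.1, (hi - lo) / 2))
               for _ in range(MAX_TERMS)]
              for _ in range(num_items)]
    return control, params

def triangular(x, center, half_width):
    """Membership degree of x in a triangular fuzzy set (center, half-width)."""
    return max(0.0, 1.0 - abs(x - center) / half_width)

def fuzzify(value, item_index, individual):
    """Fuzzify a quantitative value using only the item's active terms."""
    control, params = individual
    return [triangular(value, c, w)
            for active, (c, w) in zip(control[item_index], params[item_index])
            if active]
```

Only the terms whose control gene is 1 contribute to fuzzification, which is how the number of linguistic terms per item is effectively evolved alongside the membership function parameters.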


2012, Vol. 182-183, pp. 2003-2007
Author(s): Yi Ming Bai, Xian Yao Meng, Xin Jie Han

In this paper, we introduce a novel technique for mining fuzzy association rules in quantitative databases. Unlike other data mining techniques, which can only discover association rules over discrete values, the algorithm reveals the relationships among different quantitative values by traversing the partition grids and produces the corresponding fuzzy association rules. Fuzzy association rules employ linguistic terms to represent the revealed regularities and exceptions in quantitative databases. After the fuzzy rule base is built, we utilize the support degree from data mining to reduce the number of rules and retain the useful ones. Throughout this paper, we use a set of real data from a wine database to demonstrate the ideas and test the models.
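As a rough illustration of the support-based pruning step, the sketch below assumes fuzzy support is the average over all records of the minimum membership degree of a rule's terms; the record layout, function names, and threshold are assumptions, not the paper's definitions.

```python
# Hedged sketch of support-based rule pruning over fuzzified records.
# Each record maps (attribute, linguistic_term) pairs to membership degrees.

def fuzzy_support(records, rule_terms):
    """Average of the minimum membership degree of the rule's terms per record."""
    total = sum(min(rec.get(t, 0.0) for t in rule_terms) for rec in records)
    return total / len(records)

def prune_rules(rules, records, min_support=0.2):
    """Keep only rules whose fuzzy support reaches the assumed threshold."""
    return [r for r in rules if fuzzy_support(records, r) >= min_support]

# Example: two fuzzified wine records and one candidate rule.
records = [
    {("alcohol", "high"): 0.8, ("acidity", "low"): 0.6},
    {("alcohol", "high"): 0.3, ("acidity", "low"): 0.9},
]
rule = [("alcohol", "high"), ("acidity", "low")]
print(fuzzy_support(records, rule))  # (min(0.8, 0.6) + min(0.3, 0.9)) / 2 = 0.45
```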


Author(s): Wai-Ho Au

The mining of fuzzy association rules has been proposed in the literature recently. Many of the ensuing algorithms are designed to run on only a single processor or machine. They can be further enhanced by taking advantage of the scalability of parallel or distributed computer systems. The increasing ability to collect data and the resulting huge data volumes make the exploitation of parallel or distributed systems increasingly important to the success of fuzzy association rule mining algorithms. This chapter proposes a new distributed algorithm, called DFARM, for mining fuzzy association rules from very large databases. Unlike many existing algorithms that adopt the support-confidence framework, in which an association is considered interesting if it satisfies some user-specified minimum percentage thresholds, DFARM embraces an objective measure to distinguish interesting associations from uninteresting ones. This measure is defined as a function of the difference between the actual and the expected number of tuples characterized by different linguistic variables (attributes) and linguistic terms (attribute values). Given a database, DFARM first divides it into several horizontal partitions and assigns them to different sites in a distributed system. It then has each site scan its own database partition to obtain the number of tuples characterized by different linguistic variables and linguistic terms (i.e., the local counts) and exchange the local counts with all the other sites to find the global counts. Based on the global counts, the values of the interestingness measure are computed, and the sites can uncover interesting associations. By repeating this process of counting, exchanging counts, and calculating the interestingness measure, DFARM unveils the interesting associations hidden in the data. We implemented DFARM in a distributed system and used a popular benchmark data set to evaluate its performance. The results show that it has very good size-up, speedup, and scale-up performance. We also evaluated the effectiveness of the proposed interestingness measure on two synthetic data sets. The experimental results show that it is very effective in differentiating between interesting and uninteresting associations.
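The counting-and-exchange cycle can be sketched in Python as follows; the pair-level counting and the simple actual-minus-expected value stand in for the chapter's exact interestingness measure and are assumptions made for illustration.

```python
# Hedged sketch of DFARM's counting-and-exchange step. The interestingness
# value below is a simple difference between the actual and the expected
# number of tuples under independence; the chapter's exact formula may differ.
from collections import Counter

def local_counts(partition):
    """One site's pass over its horizontal partition.
    Each tuple is a dict: linguistic variable -> linguistic term."""
    singles, pairs = Counter(), Counter()
    for row in partition:
        items = sorted(row.items())
        singles.update(items)
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                pairs[(items[i], items[j])] += 1
    return len(partition), singles, pairs

def merge(all_sites):
    """Exchange step: sum the local counts from every site into global counts."""
    n, singles, pairs = 0, Counter(), Counter()
    for size, s, p in all_sites:
        n += size
        singles.update(s)
        pairs.update(p)
    return n, singles, pairs

def interestingness(pair, n, singles, pairs):
    """Actual minus expected count of tuples characterized by both terms."""
    a, b = pair
    expected = singles[a] * singles[b] / n
    return pairs[pair] - expected
```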


