A Multi-Methodological Approach to Rare Association Rule Mining
Rare association rule mining has received a great deal of attention in the past few years. In this chapter, the authors propose a multi-methodological approach to rare association rule mining that integrates three strands of research in this area. First, the authors use statistical techniques such as the Fisher test to determine whether itemsets co-occur by chance. Second, they use clustering as a pre-processing technique to improve the quality of the rare rules generated. Third, they weight itemsets to ensure upward closure, thereby checking unbounded growth of the rule base. Their results show that clustering isolates heterogeneous segments from each other, thus promoting the discovery of rules that would otherwise remain undiscovered. Likewise, itemset weighting tends to improve rule quality by promoting the generation of rules with rarer itemsets, which would not be possible with a simple weighting scheme that assigns an equal weight to all itemsets. The use of clustering also enabled the authors to study in detail an important sub-class of rare rules, which they term absolute rare rules. Absolute rare rules are those that are rare not just in the dataset as a whole but also in the cluster from which they are derived.
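To make the role of the Fisher test concrete, the following minimal sketch computes a one-sided Fisher exact p-value for the co-occurrence of two items from transaction counts. It is an illustration under stated assumptions, not the authors' implementation: the function name `fisher_cooccurrence_p`, its parameters, and the choice of a one-sided test are all hypothetical.

```python
from math import comb


def fisher_cooccurrence_p(n_total, n_a, n_b, n_ab):
    """One-sided Fisher exact p-value for item co-occurrence.

    Hypothetical helper (not from the chapter). Returns the probability of
    observing at least n_ab transactions containing both items, given
    n_total transactions, n_a containing item A and n_b containing item B,
    under the null hypothesis that A and B occur independently.
    """
    if not (0 <= n_ab <= min(n_a, n_b) and max(n_a, n_b) <= n_total):
        raise ValueError("inconsistent counts")
    # Number of ways to choose which n_b transactions contain B.
    denom = comb(n_total, n_b)
    # Hypergeometric upper tail: k co-occurrences, for k = n_ab .. min(n_a, n_b).
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_ab, min(n_a, n_b) + 1)
    )
    return tail / denom
```

For example, with 10 transactions in which A and B each appear 3 times and always together, `fisher_cooccurrence_p(10, 3, 3, 3)` gives 1/120 (about 0.0083), so the co-occurrence is unlikely to be due to chance even though the itemset has low support.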