A New Distributed Mining Association Rules Algorithm in Distributed Database System
Traditional algorithms first create the local candidate sets and then determine, through communication between the nodes, whether each local frequent item set is also a global frequent item set. The main difference between the proposed algorithm and the traditional ones is that it first generates all the local frequent item sets at each node and then sends them to the top node. At the top node, the local frequent item sets are handled according to four cases. In the fastest case, the determination can be made as soon as the round-trip communications complete. In addition, all operations of the algorithm are carried out not on the traditional data storage but on a new data storage in which the item serves as the key. This storage saves space, especially for sparse data. Because each item maps to its set of transactions, the support of an item set can be calculated by intersecting the transaction sets, which is much easier than accessing the database. Finally, the association rules in the distributed database are mined.
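The item-keyed storage and the intersection-based support count can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the paper's implementation: the function names `build_item_index` and `support_count`, the use of integer transaction IDs, and the sample transactions are all hypothetical.

```python
from collections import defaultdict


def build_item_index(transactions):
    """Turn {transaction_id: items} into {item: set of transaction_ids}.

    Assumption: this item-keyed layout stands in for the storage
    described in the abstract, where the item acts as the key.
    """
    index = defaultdict(set)
    for tid, items in transactions.items():
        for item in items:
            index[item].add(tid)
    return dict(index)


def support_count(index, itemset):
    """Count the transactions that contain every item in `itemset`.

    The count is the size of the intersection of the per-item
    transaction sets, so the original transactions are not rescanned.
    """
    tid_sets = [index.get(item, set()) for item in itemset]
    if not tid_sets:
        return 0
    return len(set.intersection(*tid_sets))


# Hypothetical sample data for one node.
transactions = {
    1: {"a", "b", "c"},
    2: {"a", "c"},
    3: {"b", "d"},
    4: {"a", "b", "c", "d"},
}
index = build_item_index(transactions)
print(support_count(index, {"a", "c"}))
print(support_count(index, {"b", "d"}))
```

For sparse data, each item stores only the IDs of the transactions in which it actually occurs, which is one plausible reading of the space saving mentioned above.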